This repository was archived by the owner on Jan 11, 2026. It is now read-only.

Commit b37fdf7
logregress: added docs
1 parent 09f54d5 commit b37fdf7

1 file changed: docs/ml/logregress.md (90 additions, 1 deletion)
@@ -48,4 +48,93 @@ When you have the final values from your derivative calculation, you can use it

## The code

The data used here is the [Breast Cancer Wisconsin (Diagnostic) Data Set](https://www.kaggle.com/datasets/uciml/breast-cancer-wisconsin-data), which has been modified to look like [this](https://gitlab.com/adwaithrajesh/linear-ml-test/-/blob/main/data/bcancer.csv): the id column is dropped, and the diagnosis labels are encoded as M=0 and B=1.
```c
#define INCLUDE_MAT_CONVERSIONS
#include "ds/mat.h"
#include "ml/logisticregress.h"
#include "model/metrics.h"
#include "model/train_test_split.h"
#include "parsers/csv.h"

int main(void) {
    // Parse the 569-row, 31-column CSV (label + 30 features).
    CSV *csv_reader = csv_init(569, 31, ',');
    csv_parse(csv_reader, "data/bcancer.csv");

    // Column 0 is the label; columns 1..30 are the features.
    Mat *X = csv_get_mat_slice(csv_reader, (Slice){1, 31});
    Mat *Y = csv_get_mat_slice(csv_reader, (Slice){0, 1});
    Mat *X_train, *X_test, *Y_train, *Y_test;

    // Hold out 30% of the data for testing; 101 is the random seed.
    train_test_split(X, Y, &X_train, &X_test, &Y_train, &Y_test, 0.3, 101);

    logregress_set_max_iter(2000);
    LogisticRegressionModel *model = logregress_init();
    logregress_fit(model, X_train, Y_train);

    // printf("prediction: %lf\n", logregress_predict(model, (double[]){15.22, 30.62, 103.4, 716.9, ... , 0}, 30));
    Array *preds = logregress_predict_many(model, X_test);
    Array *y_true = mat_get_col_arr(Y_test, 0);

    logregress_print(model);

    printf("confusion matrix: \n");
    Mat *conf_mat = model_confusion_matrix(y_true, preds);
    mat_print(conf_mat);

    arr_free(y_true);
    arr_free(preds);
    logregress_free(model);
    mat_free_many(7, X, Y, X_test, X_train, Y_test, Y_train, conf_mat);
    csv_free(csv_reader);
}
```
```console
LogisticRegressionModel(bias: 0.5159147, loss: -12.4263621, weights: 0x5556e8a732c0)
weights:
1546.6922009
1139.6829595
8552.1648900
2522.0044946
11.8724211
-19.3345598
-44.9646156
-18.4984994
23.8378678
10.1676564
0.2338315
103.3839701
-139.7864354
-4498.8563443
0.2662770
-6.5798244
-8.6158697
-1.6938180
1.6508702
-0.3857419
1650.7843571
1445.0283208
8312.7672485
-4024.9280673
13.2972726
-72.4527931
-111.8298475
-26.6204266
28.0612275
5.4099162
confusion matrix:
57.00 10.00
2.00 101.00
```

Now, what does the confusion matrix generated by sklearn look like?

```python
array([[ 59,   7],
       [  3, 102]])
```

We are pretty close.
Check out the Python implementation [here](https://gitlab.com/adwaithrajesh/linear-ml-test/-/blob/main/notebooks/log.ipynb).
