Burak4627/C-Machine-Learning

C Machine Learning

Multi-threaded Telnet server that loads a CSV dataset, preprocesses it, trains a linear regression model, and serves per-client predictions over a socket connection.

What It Does

  • Loads one of the packaged datasets (housing, student performance, or generic MLR) and inspects columns.
  • Encodes categorical columns and normalizes numeric/target columns using parallel preprocessing threads.
  • Trains a plain OLS model with one worker thread per coefficient; streams progress logs to the connected client.
  • Exposes an interactive Telnet session where clients pick a dataset, see metadata, and send feature values for on-demand predictions.
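
As a rough illustration of the "one worker thread per coefficient" idea, the sketch below has each thread assemble one row of the augmented normal-equation system [XᵀX | Xᵀy], after which the main thread solves it by Gaussian elimination. The names `RowTask`, `assemble_row`, and `ols_fit`, the row-major layout, and the choice of solver are assumptions made for this example; the actual code in regression.c may be organized differently.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical layout: X is row-major, n rows by p columns,
 * with column 0 set to 1.0 for the intercept. */
typedef struct {
    const double *X;   /* n x p design matrix */
    const double *y;   /* n target values */
    size_t n, p;
    size_t row;        /* coefficient row this worker assembles */
    double *A;         /* p x (p+1) augmented system [X^T X | X^T y] */
} RowTask;

static double absd(double v) { return v < 0.0 ? -v : v; }

/* Worker: fill one row of the augmented system. Rows do not overlap,
 * so the workers need no locking. */
static void *assemble_row(void *arg)
{
    RowTask *t = arg;
    double *out = t->A + t->row * (t->p + 1);
    for (size_t k = 0; k <= t->p; k++) out[k] = 0.0;
    for (size_t i = 0; i < t->n; i++) {
        double xij = t->X[i * t->p + t->row];
        for (size_t k = 0; k < t->p; k++)
            out[k] += xij * t->X[i * t->p + k];
        out[t->p] += xij * t->y[i];
    }
    return NULL;
}

/* Solve (X^T X) beta = X^T y. Returns 0 on success, -1 on failure. */
int ols_fit(const double *X, const double *y, size_t n, size_t p, double *beta)
{
    size_t w = p + 1, started = 0;
    int rc = -1;
    double *A = malloc(p * w * sizeof *A);
    pthread_t *tid = malloc(p * sizeof *tid);
    RowTask *task = malloc(p * sizeof *task);
    assert(n > 0 && p > 0);
    if (!A || !tid || !task) goto done;

    /* One thread per coefficient, as described in the list above. */
    for (; started < p; started++) {
        task[started] = (RowTask){ X, y, n, p, started, A };
        if (pthread_create(&tid[started], NULL, assemble_row, &task[started]) != 0)
            break;
    }
    for (size_t j = 0; j < started; j++) pthread_join(tid[j], NULL);
    if (started < p) goto done;

    /* Forward elimination with partial pivoting. */
    for (size_t c = 0; c < p; c++) {
        size_t piv = c;
        for (size_t r = c + 1; r < p; r++)
            if (absd(A[r * w + c]) > absd(A[piv * w + c])) piv = r;
        if (absd(A[piv * w + c]) < 1e-12) goto done;   /* singular system */
        for (size_t k = 0; piv != c && k < w; k++) {
            double tmp = A[c * w + k];
            A[c * w + k] = A[piv * w + k];
            A[piv * w + k] = tmp;
        }
        for (size_t r = c + 1; r < p; r++) {
            double f = A[r * w + c] / A[c * w + c];
            for (size_t k = c; k < w; k++) A[r * w + k] -= f * A[c * w + k];
        }
    }
    /* Back substitution. */
    for (size_t c = p; c-- > 0; ) {
        double s = A[c * w + p];
        for (size_t k = c + 1; k < p; k++) s -= A[c * w + k] * beta[k];
        beta[c] = s / A[c * w + c];
    }
    rc = 0;
done:
    free(task); free(tid); free(A);
    return rc;
}
```

Calling `ols_fit` on four rows generated from y = 1 + 2x should recover an intercept near 1 and a slope near 2. Build with `gcc -pthread`.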

Project Layout

  • main.c — entry point; starts the server loop.
  • server.c/.h — TCP listener on port 60000 (change PORT_NUMBER in main.c); handles client workflow and messaging.
  • csv_loader.c/.h — CSV loading, header parsing, type detection.
  • preprocessing.c/.h — per-column categorical encoding and numeric normalization (threaded).
  • regression.c/.h — normal equation assembly and OLS solve (threaded), plus prediction helpers.
  • data_structs.h — shared dataset and metadata structures.
  • utils.c/.h — fatal error helper.
  • Sample data: Housing.csv, Student_Performance.csv, multiple_linear_regression_dataset.csv.
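
For readers unfamiliar with the server side, here is a minimal sketch of a TCP listener with one thread per client, roughly the shape a file like server.c might take. Only the port number 60000 and the `PORT_NUMBER` name come from this README; `open_listener`, `serve_forever`, `client_thread`, the backlog of 8, and the prompt text are hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define PORT_NUMBER 60000   /* port named in this README */

/* Create a listening TCP socket on the given port (0 lets the OS choose).
 * Returns the descriptor, or -1 on failure. */
int open_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    int yes = 1;   /* allow quick restarts of the server */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 8) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Per-client worker: sends a placeholder prompt, then hangs up.
 * A real session would read the filename and run the workflow here. */
static void *client_thread(void *arg)
{
    int client = *(int *)arg;
    free(arg);
    const char *msg = "Enter a dataset filename:\r\n";   /* hypothetical text */
    (void)send(client, msg, strlen(msg), 0);
    close(client);
    return NULL;
}

/* Accept loop: one detached thread per connection. */
void serve_forever(int listener)
{
    for (;;) {
        int *client = malloc(sizeof *client);
        if (!client) break;
        *client = accept(listener, NULL, NULL);
        if (*client < 0) { free(client); break; }
        pthread_t tid;
        if (pthread_create(&tid, NULL, client_thread, client) != 0) {
            close(*client);
            free(client);
        } else {
            pthread_detach(tid);
        }
    }
}
```

A caller would typically run `serve_forever(open_listener(PORT_NUMBER))` after checking that the descriptor is not -1. Passing the client descriptor through a heap-allocated `int` avoids a race in which the accept loop overwrites the value before the new thread has read it.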

Build

Requires a POSIX-like environment with gcc, make, and pthread support.

make          # builds ./day4_build
make clean    # removes objects and binary

Run

  1. Ensure the CSV files are present in the working directory.
  2. Start the server:
./day4_build
  3. From another terminal, connect via Telnet:
telnet localhost 60000
  4. Follow the prompts:
    • Enter a dataset filename (e.g., Housing.csv).
    • Wait for preprocessing/training logs and the dataset summary.
    • Enter feature values as prompted to receive normalized and real-scale predictions.
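
The "normalized and real-scale" pair in the last step can be pictured as follows. This sketch assumes min-max scaling, which the README does not confirm; `ColumnScale`, `fit_scale`, `to_normalized`, `to_real_scale`, and `predict_normalized` are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-column scaling record (min-max scaling assumed). */
typedef struct { double min, max; } ColumnScale;

/* Scan a column once to find its range. */
ColumnScale fit_scale(const double *v, size_t n)
{
    assert(n > 0);
    ColumnScale s = { v[0], v[0] };
    for (size_t i = 1; i < n; i++) {
        if (v[i] < s.min) s.min = v[i];
        if (v[i] > s.max) s.max = v[i];
    }
    return s;
}

/* Map a raw value into [0, 1]; a constant column maps to 0. */
double to_normalized(ColumnScale s, double raw)
{
    double range = s.max - s.min;
    return range > 0.0 ? (raw - s.min) / range : 0.0;
}

/* Inverse mapping, used to report a prediction in the target's units. */
double to_real_scale(ColumnScale s, double norm)
{
    return s.min + norm * (s.max - s.min);
}

/* Linear prediction in normalized space.
 * beta[0] is the intercept; beta[j + 1] belongs to feature j. */
double predict_normalized(const double *beta, const ColumnScale *feature_scale,
                          const double *raw, size_t n_features)
{
    double y = beta[0];
    for (size_t j = 0; j < n_features; j++)
        y += beta[j + 1] * to_normalized(feature_scale[j], raw[j]);
    return y;
}
```

Under these assumptions, the client's raw feature values are scaled with the training-time ranges, the model predicts in normalized space, and `to_real_scale` with the target column's range converts that prediction back to the original units.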

Notes

  • Type detection samples a few rows and forces the last column to be the target.
  • Categorical encoding has special handling for yes/no and furnishing status; other categories are mapped by observed values.
  • Thread counts are based on feature count (regression) and column roles (preprocessing); globals in main.c expose limits if you need to tune them.
  • Running on Linux/Unix is recommended for socket/pthread compatibility.
  • Built for educational purposes (day-by-day ML-in-C exploration).
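
The first two notes can be illustrated with a small sketch of sampled type detection and yes/no encoding. The sample size of 5, the `ColumnRole` enum, and the function names are assumptions for this example, not taken from csv_loader.c or preprocessing.c.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical column roles. */
typedef enum { COL_NUMERIC, COL_CATEGORICAL, COL_TARGET } ColumnRole;

#define SAMPLE_ROWS 5   /* assumed sample size; the real value may differ */

/* A cell counts as numeric if strtod consumes it entirely
 * (trailing whitespace is tolerated). */
static int cell_is_numeric(const char *cell)
{
    char *end;
    (void)strtod(cell, &end);
    if (end == cell) return 0;             /* nothing was parsed */
    while (isspace((unsigned char)*end)) end++;
    return *end == '\0';
}

/* Decide a column's role from its first few cells.
 * The last column is always treated as the target. */
ColumnRole detect_role(const char *const *cells, size_t n_cells, int is_last)
{
    if (is_last) return COL_TARGET;
    size_t limit = n_cells < SAMPLE_ROWS ? n_cells : SAMPLE_ROWS;
    for (size_t i = 0; i < limit; i++)
        if (!cell_is_numeric(cells[i])) return COL_CATEGORICAL;
    return COL_NUMERIC;
}

/* Special-case encoding: "yes" -> 1, "no" -> 0, anything else -> -1
 * so the caller can fall back to a generic value mapping. */
int encode_yes_no(const char *cell)
{
    if (strcmp(cell, "yes") == 0) return 1;
    if (strcmp(cell, "no") == 0) return 0;
    return -1;
}
```

One consequence of sampling only a few rows is that a column whose first cells look numeric but which contains text further down would be misclassified, so clean input data matters.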

Troubleshooting

  • If a dataset is missing, the server exits early; verify file names match exactly.
  • On Windows, build and run under WSL or a POSIX layer to satisfy the socket/pthread APIs.
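
To check a filename before connecting, a tiny helper like the one below can confirm that a dataset is readable from the current working directory. `dataset_is_readable` is a hypothetical name and is not part of the project.

```c
#include <stdio.h>

/* Returns 1 if the file can be opened for reading, 0 otherwise. */
int dataset_is_readable(const char *path)
{
    FILE *f = fopen(path, "r");
    if (!f) return 0;
    fclose(f);
    return 1;
}
```

On case-sensitive file systems, which is the usual situation on Linux, `Housing.csv` and `housing.csv` are different files, so the check should use the exact spelling listed under Sample data.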
