Learning

This project serves a dual purpose. Throughout undergrad, my computational training was applied (comp-bio/structural-bio research plus informatics and modelling coursework), with algorithmic implementations coming mainly from high-level modelling libraries. As I continue to grow into a data scientist, I am looking to build my experience in numerical computing. My two main goals at the moment are to:

Implement algorithms from scratch, using only scientific computing libraries for vectorization and dataframe-oriented frameworks like Pandas.
Write out the derivations for the models I use day-to-day and build on them to expand my theoretical statistical foundations.

The result should be a robust library of foundational biostatistical and machine learning methods derived in the docs, implemented in my backend, and used in the dashboard currently under development.

Implementations so far and where the code lives

Outlined below are the different models I have implemented so far.

Implementations

Foundations

GLMs

Mixed models

GLMMs

Code

There are examples at the bottom of each Python module, inside the "main" block, that can be run to try out each implementation, for anyone curious. There are, of course, unlisted dependencies (i.e. numpy, scipy, scikit-learn), but this is not really meant for public use at the moment, so for now I leave it up to the user to pip/conda install their way to success.
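To illustrate the layout described above, here is a minimal sketch of a module with a from-scratch implementation and a runnable example in its "main" block. The function name `fit_ols` and the synthetic data are assumptions made for illustration; they are not taken from the actual modules in this repo.

```python
import numpy as np


def fit_ols(X, y):
    """Return least-squares coefficients for y ~ X, intercept first.

    Hypothetical helper for illustration only; not the repo's actual API.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        # Treat a 1-D input as a single predictor column.
        X = X[:, None]
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


if __name__ == "__main__":
    # Synthetic data (an assumption for the demo): y = 1.5 + 2.0 * x + noise.
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y = 1.5 + 2.0 * x + rng.normal(scale=0.1, size=200)
    print("estimated [intercept, slope]:", fit_ols(x, y))
```

Running the file directly (for example `python some_module.py`, where the file name is a placeholder) executes only the demo at the bottom; importing the module does not.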

The front end will eventually contain case studies that I picked out of personal interest. They will all be done using my own package, which is a neat way to work on both the front end and back end at once!

Future directions

Front end

I also plan on adding some kind of TypeScript front-end GUI/chart displayer, mostly to get some practice using JavaScript/TypeScript.

There is actually already a univariate regression implemented in TypeScript, written before I realized that there aren't many good vectorized math packages for TypeScript on Node.js (aside from TensorFlow, which came with its own suite of problems). It's really not meant for that anyway, but it has given me a solid foundation thus far.

Perhaps I will also add some SQLite for database operations. Although we would just be moving CSVs around inside folders, it would be a proof of concept.
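As a rough idea of what that proof of concept could look like, here is a minimal sketch that copies a CSV into a SQLite table using only the Python standard library. The helper name `load_csv_into_sqlite`, the table name, and the sample columns are all hypothetical, and every column is stored as TEXT to keep the sketch short.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(conn, table, csv_file):
    """Copy rows from an open CSV file object into a new SQLite table.

    Hypothetical helper: the first CSV row is used as column names and
    every column is stored as TEXT.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    cols = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    # The remaining rows of the reader are inserted one by one.
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()


if __name__ == "__main__":
    # In-memory stand-ins for a CSV file and a database file (demo data only).
    demo_csv = io.StringIO("subject,dose,response\na,1.0,0.3\nb,2.0,0.7\n")
    conn = sqlite3.connect(":memory:")
    load_csv_into_sqlite(conn, "trials", demo_csv)
    print(conn.execute("SELECT subject, dose FROM trials").fetchall())
```

In real use the `io.StringIO` object would be replaced by `open(path, newline="")` and `":memory:"` by a database file path.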

Future algorithms and models

There are a number of models and algorithms I'm interested in implementing as I go along. Here is a small list of what I am aiming for.

In the near future:

k-means
PCA
negative binomial regression
Markov Models
