MIFS – parallelized Mutual Information based Feature Selection module

danielhomola Blog 2 Comments

TL;DR: I wrapped up three mutual information based feature selection methods in a scikit-learn-like module. You can find it on my GitHub. It is very easy to use: you can run example.py, or import it into your project and apply it to your data like any other scikit-learn method.

Mutual information based filter methods: The following bit is adapted …

Linear algebra notes and LaTeX

danielhomola Blog 2 Comments

TL;DR: I wanted to take a linear algebra course. I also wanted to learn LaTeX. I did both, and wrote a 70-something-page document from my notes on the Linear Algebra Foundations and Frontiers MOOC by The University of Texas at Austin. It still isn’t completely finished, and I’m sure there are tons of typos in it, but here it is: LAFF notes. …

BorutaPy – an all relevant feature selection method

danielhomola Blog 14 Comments

TL;DR: There’s a pretty clever all-relevant feature selection method, which was conceived by Witold R. Rudnicki and developed by Miron B. Kursa at the ICM UW. Here is its website. While working on my PhD project I read their paper and really liked the method, but didn’t quite like how slow it was. It’s based on R’s Random Forest implementation, which runs …

Sending emails from Python through a Gmail account

danielhomola Blog Leave a Comment

In my current research I’m building a new research tool for the integration and visualisation of genomic data. The application starts with a file-upload form, then runs a pretty complex pipeline on our server and cluster that can take hours to complete. So once it’s finished I need to notify the users about any errors, warnings, send them the result files and …
