This repository holds all the IPython notebooks and datasets for the book "Learning scikit-learn: Machine Learning in Python", by Raúl Garreta and Guillermo Moncecchi. The chapters are self-explanatory, so you do not have to own the book to understand the code. Enjoy, and let us know about any bugs you find, or any additions or explanations you would like to see!
Unfortunately, due to editing issues, the code printed in the book has many, many typos. That is why we recommend checking (and downloading!) these notebooks for up-to-date, corrected versions of the algorithms presented. The code included here will also be tested regularly against the latest versions of scikit-learn and related libraries.
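Because the notebooks track recent releases of scikit-learn and related libraries, it can help to check which versions you have installed before running them or reporting a bug. The snippet below is only a minimal sketch: the package list and the `installed_versions` helper are our assumptions for illustration, not something the notebooks ship, and it relies on `importlib.metadata` from the Python 3.8+ standard library.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed package list (hypothetical); adjust it to whatever the
# notebooks you run actually import.
PACKAGES = ["scikit-learn", "numpy", "scipy", "matplotlib", "ipython"]


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Pasting the printed versions into a bug report makes it much easier to tell whether a problem comes from a notebook or from a library change.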
- Chapter 1 - A Gentle Introduction to Machine Learning (2nd ed.!)
- Chapter 2 - Supervised Learning - Image Recognition with Support Vector Machines
- Chapter 2 - Supervised Learning - Regression
- Chapter 2 - Supervised Learning - Text Classification with Naive Bayes
- Chapter 2 - Supervised Learning - Explaining Titanic Hypothesis with Decision Trees
- Chapter 3 - Unsupervised Learning - Clustering Handwritten Digits
- Chapter 3 - Unsupervised Learning - Principal Component Analysis
- Chapter 4 - Advanced Features - Feature Engineering and Selection
- Chapter 4 - Advanced Features - Model Selection