Author: Ajit Jaokar
After testing this idea over the last few months, we have formally launched the concept.
The idea of ‘Data Science Coding in a weekend’ originated from meetups we conducted in London.
The approach is simple but effective.
We choose a complex section of code and try to learn it in detail over the weekend.
We work backwards, i.e. we drill down into the concepts behind the main ideas.
This led to the philosophy which I articulated in “Learn machine learning coding basics in a weekend: a new approach”
and to the first free book, “Classification and regression in a weekend”.
The “in a weekend” series of books on Data Science Central can be seen as an online version of our London-based meetups. All the books share a single community HERE. Like a meetup, the books are free to use. The code is open source. We have drawn upon many sources, which we have referenced in the books.
For this first book, the steps in the code are:
Regression
Load and describe the data
Exploratory Data Analysis
Exploratory data analysis – numerical
Exploratory data analysis – visual
Analyse the target variable
Compute the correlation
Pre-process the data
Dealing with missing values
Treatment of categorical values
Remove the outliers
Normalise the data
Split the data
Choose a Baseline algorithm
Define / instantiate the baseline model
Fit the baseline model to our training set
Define the evaluation metric
Predict scores against our test set and assess how good the model is
Refine our dataset with additional columns
Test Alternative Models
Choose the best model and optimise its parameters
Grid search
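The regression steps above can be sketched in a few lines of Python. This is only an illustrative outline, not the code from the book: it assumes pandas and scikit-learn, uses a small synthetic dataset with made-up column names (area, rooms, zone, price), and picks LinearRegression as the baseline and RandomForestRegressor as the alternative model purely as examples. The step of refining the dataset with additional columns is left out to keep the sketch short.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Load and describe the data (synthetic stand-in; column names are hypothetical)
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "area": rng.normal(100, 20, n),
    "rooms": rng.integers(1, 6, n).astype(float),
    "zone": rng.choice(["north", "south", "east"], n),
})
df["price"] = 3.0 * df["area"] + 10.0 * df["rooms"] + rng.normal(0, 5, n)
df.loc[rng.choice(n, 15, replace=False), "rooms"] = np.nan  # simulate missing values

# Exploratory data analysis: numerical summary and correlation with the target
print(df.describe())
print(df[["area", "rooms", "price"]].corr()["price"])

# Remove the outliers: keep rows whose target lies within 3 standard deviations
z = (df["price"] - df["price"].mean()) / df["price"].std()
df = df[z.abs() < 3]

# Split the data
X, y = df.drop(columns="price"), df["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

def make_preprocess():
    """Impute and normalise numeric columns, one-hot encode categorical ones."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    return ColumnTransformer([
        ("num", numeric, ["area", "rooms"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["zone"]),
    ])

def rmse(model):
    """Evaluation metric: root mean squared error on the held-out test set."""
    return float(np.sqrt(mean_squared_error(y_test, model.predict(X_test))))

# Baseline: instantiate, fit on the training set, score on the test set
baseline = Pipeline([("prep", make_preprocess()), ("model", LinearRegression())])
baseline.fit(X_train, y_train)
baseline_rmse = rmse(baseline)

# Alternative model, tuned with a small grid search
candidate = Pipeline([
    ("prep", make_preprocess()),
    ("model", RandomForestRegressor(random_state=0)),
])
grid = GridSearchCV(
    candidate,
    {"model__n_estimators": [50, 100], "model__max_depth": [None, 5]},
    cv=3,
    scoring="neg_mean_squared_error",
)
grid.fit(X_train, y_train)
tuned_rmse = rmse(grid.best_estimator_)

print("baseline RMSE:", baseline_rmse)
print("tuned RMSE:", tuned_rmse, "with", grid.best_params_)
```

Putting the pre-processing inside a Pipeline means the imputer and scaler are fitted on the training folds only, so the test set does not leak into the grid search. On real data you would replace the synthetic DataFrame with your own loading step and adjust the column lists.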
Classification
Load the data
Exploratory data analysis
Analyse the target variable
Check if the data is balanced
Check the correlations
Split the data
Choose a Baseline algorithm
Train and Test the Model
Choose an evaluation metric
Refine our dataset
Feature engineering
Test Alternative Models
Ensemble models
Choose the best model and optimise its parameters
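The classification steps can be sketched in the same way. Again, this is a hedged illustration rather than the book's code: it assumes pandas and scikit-learn, generates a synthetic, mildly imbalanced dataset with make_classification, and the choice of LogisticRegression as baseline, a soft-voting ensemble, the F1 score as metric and the small parameter grid are all assumptions made for the example.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load the data (synthetic stand-in with hypothetical feature names f0..f7)
X_raw, y_raw = make_classification(
    n_samples=500, n_features=8, n_informative=4,
    weights=[0.7, 0.3], random_state=0,
)
df = pd.DataFrame(X_raw, columns=[f"f{i}" for i in range(X_raw.shape[1])])
df["target"] = y_raw

# Analyse the target variable: check if the data is balanced
class_share = df["target"].value_counts(normalize=True)
print(class_share)

# Check the correlations between each feature and the target
print(df.corr()["target"].drop("target").sort_values())

# Split the data, stratifying so both sets keep the same class balance
X, y = df.drop(columns="target"), df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Baseline algorithm: train, then test with the chosen evaluation metric (F1)
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
baseline.fit(X_train, y_train)
baseline_f1 = f1_score(y_test, baseline.predict(X_test))

# Alternative model: a soft-voting ensemble of three classifiers
ensemble = VotingClassifier(
    estimators=[
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("rf", RandomForestClassifier(random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)
ensemble_f1 = f1_score(y_test, ensemble.predict(X_test))

# Choose a model and optimise its parameters with a small grid search
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [50, 100], "max_depth": [None, 5]},
    cv=3,
    scoring="f1",
)
grid.fit(X_train, y_train)
tuned_f1 = f1_score(y_test, grid.best_estimator_.predict(X_test))

print("baseline F1:", baseline_f1)
print("ensemble F1:", ensemble_f1)
print("tuned F1:", tuned_f1, "with", grid.best_params_)
```

F1 is used here instead of plain accuracy because the classes are not balanced; with a 70/30 split, always predicting the majority class would already score 70% accuracy while being useless for the minority class.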
The second book – coming by next week – is entitled “Azure machine learning in a weekend”.
I introduced the book in the blog post “Azure machine learning concepts – an introduction”. Most of us start learning development using a language like Python or R. But when you work professionally, you typically end up working with a Cloud platform. The top three Cloud platforms today in terms of market share are AWS, Azure and GCP (Google). These platforms are similar. Our goal is to learn how to develop for these platforms. We start with Azure and continue with Google next month.
We welcome your comments on the books and the approach.
You can download the first free book, “Classification and regression in a weekend”, and join the community HERE.