As of now, the Python 3 readiness tracker shows that all 360 of the top 360 Python packages support 3.x. So in the second edition of Mastering Machine Learning with Python in Six Steps, all the code examples have been fully updated to Python 3, a great deal of time has been spent fixing the editorial errors from the first edition, and an example of de-noising a signal using the wavelet transform has been added.
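As a flavor of the newly added topic, here is a minimal, self-contained sketch of wavelet de-noising using a hand-rolled single-level Haar transform and soft thresholding. This is an illustration only, not the book's actual example; the book's code may use a dedicated library such as PyWavelets, and all function names below are my own.

```python
import numpy as np

def haar_dwt(x):
    # Single-level Haar transform: split a signal of even length into
    # approximation (low-frequency) and detail (high-frequency) coefficients.
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    return approx, detail

def haar_idwt(approx, detail):
    # Inverse single-level Haar transform (exact reconstruction).
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2)
    x[1::2] = (approx - detail) / np.sqrt(2)
    return x

def soft_threshold(coeffs, thresh):
    # Shrink coefficients toward zero; small (mostly noise) details vanish.
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - thresh, 0.0)

def denoise(signal, levels=3):
    # Decompose, threshold the detail coefficients, then reconstruct.
    details = []
    approx = np.asarray(signal, dtype=float)
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    # Universal threshold, with noise level estimated from the
    # finest-scale details via the median absolute deviation.
    sigma = np.median(np.abs(details[0])) / 0.6745
    thresh = sigma * np.sqrt(2 * np.log(len(signal)))
    details = [soft_threshold(d, thresh) for d in details]
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx

# Demo: a noisy sine wave (length must be divisible by 2**levels).
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 512)
clean = np.sin(2 * np.pi * 5 * t)
noisy = clean + rng.normal(0.0, 0.3, t.size)
denoised = denoise(noisy)
```

After thresholding, the reconstruction error of `denoised` against the clean signal is substantially smaller than that of the raw noisy input.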

The field of machine learning has seen phenomenal growth in the recent past, and I have been lucky to be part of some exciting projects applying machine learning to solve business problems. At this juncture, I’m happy to announce the launch of my book “Mastering Machine Learning with Python in Six Steps”, which is my answer to the quote “If there’s a book that you want to read, but it hasn’t been written yet, then you must write it” – Toni Morrison.


This book is a practical guide that takes you from novice to master in machine learning with Python 3 in six steps. The six-step path is based on the “six degrees of separation” theory, which states that everyone and everything is a maximum of six steps away. Note that the theory deals with the quality of connections rather than merely their existence. Great effort has therefore gone into designing an effective yet simple six-step path that moves gradually from fundamentals to advanced topics, helping a beginner progress from little or no knowledge of machine learning in Python all the way to becoming a master practitioner. This book is also helpful for current machine learning practitioners who want to learn advanced topics such as hyperparameter tuning, various ensemble techniques, Natural Language Processing (NLP), deep learning, and the basics of reinforcement learning.



Each topic has two parts: the first covers the theoretical concepts, and the second covers practical implementation with different Python packages. The traditional math-first approach to machine learning, i.e., learning all the mathematics and then working out how to apply it to solve problems, demands a great deal of time and effort, which has proven inefficient for working professionals looking to switch careers. Hence the focus of this book is on simplification: the theory and math behind each algorithm are covered only to the extent required to get you started.

I recommend that readers work through the book rather than just read it; real learning happens only through active participation. Hence, all the code presented in the book is available in the form of Jupyter (IPython) notebooks, so you can try the examples yourself and later extend them to your own advantage or interest as required. You can find the full code on the Apress GitHub repository here! or on my GitHub repository here!

Who This Book Is For:

This book will serve as a great resource for learning machine learning concepts and implementation techniques for:

  • Python developers or data engineers looking to expand their knowledge or career into the machine learning area.
  • Current non-Python (R, SAS, SPSS, MATLAB, or any other language) machine learning practitioners looking to expand their implementation skills in Python.
  • Novice machine learning practitioners looking to learn advanced topics such as hyperparameter tuning, various ensemble techniques, Natural Language Processing (NLP), deep learning, and the basics of reinforcement learning.

Table of Contents:

  • Chapter 1: Step 1 – Getting Started in Python (page 1)
  • Chapter 2: Step 2 – Introduction to Machine Learning (page 53)
  • Chapter 3: Step 3 – Fundamentals of Machine Learning (page 117)
  • Chapter 4: Step 4 – Model Diagnosis and Tuning (page 209)
  • Chapter 5: Step 5 – Text Mining and Recommender Systems (page 251)
  • Chapter 6: Step 6 – Deep and Reinforcement Learning (page 297)
  • Chapter 7: Conclusion (page 345)

Link to buy the book:


or from Apress:


Detailed Table of Contents:

  • Chapter 1: Step 1 – Getting Started in Python

    The Best Things in Life Are Free
    The Rising Star
    Python 2.7.x or Python 3.4.x?
    Windows Installation
    OSX Installation
    Linux Installation
    Python from Official Website
    Running Python
    Key Concepts
    Python Identifiers
    My First Python Program
    Code Blocks (Indentation & Suites)
    Basic Object Types
    When to Use List vs. Tuples vs. Set vs. Dictionary
    Comments in Python
    Multiline Statement
    Basic Operators
    Control Structure
    User-Defined Functions
    File Input/Output
    Exception Handling

  • Chapter 2: Step 2 – Introduction to Machine Learning

    History and Evolution
    Artificial Intelligence Evolution
    Different Forms
    Data Mining
    Data Analytics
    Data Science
    Statistics vs. Data Mining vs. Data Analytics vs. Data Science
    Machine Learning Categories
    Supervised Learning
    Unsupervised Learning
    Reinforcement Learning
    Frameworks for Building Machine Learning Systems
    Knowledge Discovery Databases (KDD)
    Cross-Industry Standard Process for Data Mining
    SEMMA (Sample, Explore, Modify, Model, Assess)
    KDD vs. CRISP-DM vs. SEMMA
    Machine Learning Python Packages
    Data Analysis Packages
    Machine Learning Core Libraries

  • Chapter 3: Step 3 – Fundamentals of Machine Learning

    Machine Learning Perspective of Data
    Scales of Measurement
    Nominal Scale of Measurement
    Ordinal Scale of Measurement
    Interval Scale of Measurement
    Ratio Scale of Measurement
    Feature Engineering
    Dealing with Missing Data
    Handling Categorical Data
    Normalizing Data
    Feature Construction or Generation
    Exploratory Data Analysis (EDA)
    Univariate Analysis
    Multivariate Analysis
    Supervised Learning– Regression
    Correlation and Causation
    Fitting a Slope
    How Good Is Your Model?

    Polynomial Regression
    Multivariate Regression
    Multicollinearity and Variance Inflation Factor (VIF)
    Interpreting the OLS Regression Results
    Regression Diagnosis
    Nonlinear Regression
    Supervised Learning – Classification
    Logistic Regression
    Evaluating a Classification Model Performance
    ROC Curve
    Fitting Line
    Stochastic Gradient Descent
    Multiclass Logistic Regression
    Generalized Linear Models
    Supervised Learning – Process Flow
    Decision Trees
    Support Vector Machine (SVM)
    k Nearest Neighbors (kNN)
    Time-Series Forecasting
    Unsupervised Learning Process Flow
    Clustering
    Finding Value of k
    Hierarchical Clustering
    Principal Component Analysis (PCA)

  • Chapter 4: Step 4 – Model Diagnosis and Tuning

    Optimal Probability Cutoff Point
    Which Error Is Costly?
    Rare Event or Imbalanced Dataset
    Known Disadvantages
    Which Resampling Technique Is the Best?
    Bias and Variance
    K-Fold Cross-Validation
    Stratified K-Fold Cross-Validation
    Ensemble Methods
    Feature Importance
    Extremely Randomized Trees (ExtraTree)
    How Does the Decision Boundary Look?
    Bagging – Essential Tuning Parameters
    Example Illustration for AdaBoost
    Gradient Boosting
    Boosting – Essential Tuning Parameters
    Xgboost (eXtreme Gradient Boosting)
    Ensemble Voting – Machine Learning’s Biggest Heroes United
    Hard Voting vs. Soft Voting
    Hyperparameter Tuning
    Noise Reduction from IoT Data
  • Chapter 5: Step 5 – Text Mining and Recommender Systems

    Text Mining Process Overview
    Data Assemble (Text)
    Social Media
    Step 1 – Get Access Key (One-Time Activity)
    Step 2 – Fetching Tweets
    Data Preprocessing (Text)
    Convert to Lower Case and Tokenize
    Removing Noise
    Part of Speech (PoS) Tagging
    Bag of Words (BoW)
    Term Frequency-Inverse Document Frequency (TF-IDF)
    Data Exploration (Text)
    Frequency Chart
    Word Cloud
    Lexical Dispersion Plot
    Co-occurrence Matrix
    Model Building
    Text Similarity
    Text Clustering
    Latent Semantic Analysis (LSA)
    Topic Modeling
    Latent Dirichlet Allocation (LDA)
    Non-negative Matrix Factorization
    Text Classification
    Sentiment Analysis
    Deep Natural Language Processing (DNLP)
    Recommender Systems
    Content-Based Filtering
    Collaborative Filtering (CF)

  • Chapter 6: Step 6 – Deep and Reinforcement Learning

    Artificial Neural Network (ANN)
    What Goes Behind, When Computers Look at an Image?
    Why Not a Simple Classification Model for Images?
    Perceptron – Single Artificial Neuron
    Multilayer Perceptrons (Feedforward Neural Network)
    Load MNIST Data
    Key Parameters for scikit-learn MLP
    Restricted Boltzman Machines (RBM)
    MLP Using Keras
    Dimension Reduction Using Autoencoder
    De-noise Image Using Autoencoder
    Convolution Neural Network (CNN)
    CNN on CIFAR10 Dataset
    CNN on MNIST Dataset
    Recurrent Neural Network (RNN)
    Long Short-Term Memory (LSTM)
    Transfer Learning
    Reinforcement Learning

  • Chapter 7: Conclusion

    Start with Questions/Hypothesis Then Move to Data!
    Don’t Reinvent the Wheels from Scratch
    Start with Simple Models
    Focus on Feature Engineering
    Beware of Common ML Imposters
    Happy Machine Learning