At the time of writing, Python 3 Readiness (http://py3readiness.org/) shows that all 360 of the top 360 Python packages support 3.x. So in the second edition of Mastering Machine Learning with Python in Six Steps, all the code examples have been fully updated to Python 3, a great deal of time has been spent fixing the editorial errors from the first edition, and an example of de-noising a signal using the wavelet transform has been added.
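To give a flavor of what wavelet de-noising does, here is a minimal sketch (not the book's code, and deliberately simpler than a full wavelet-transform example): a single-level Haar transform with soft thresholding of the detail coefficients, in pure NumPy. Libraries such as PyWavelets provide proper multi-level transforms; the function name and threshold value below are illustrative assumptions.

```python
import numpy as np

def haar_denoise(signal, threshold):
    """De-noise a 1-D signal of even length via a one-level Haar DWT."""
    x = np.asarray(signal, dtype=float)
    # Forward transform: pairwise sums (approximation) and differences
    # (detail), scaled by 1/sqrt(2) to preserve energy.
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    # Soft-threshold the detail coefficients, where the noise concentrates.
    detail = np.sign(detail) * np.maximum(np.abs(detail) - threshold, 0.0)
    # Inverse transform back to the time domain.
    out = np.empty_like(x)
    out[0::2] = (approx + detail) / np.sqrt(2)
    out[1::2] = (approx - detail) / np.sqrt(2)
    return out

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 256)
clean = np.sin(2 * np.pi * 4 * t)
noisy = clean + rng.normal(0, 0.3, t.size)
denoised = haar_denoise(noisy, threshold=0.4)
print("MSE noisy:   ", np.mean((noisy - clean) ** 2))
print("MSE denoised:", np.mean((denoised - clean) ** 2))
```

Because a smooth signal has small detail coefficients while white noise spreads evenly across them, thresholding the detail band removes noise with little damage to the signal, which is the core idea behind wavelet de-noising.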
The area of machine learning has seen phenomenal growth in the recent past, and I have been lucky to be part of some exciting projects applying machine learning to business problems. At this juncture, I'm happy to announce the launch of my book "Mastering Machine Learning with Python in Six Steps", which is my answer to the quote "If there's a book that you want to read, but it hasn't been written yet, then you must write it" – Toni Morrison.
This book is a practical guide that takes you from novice to master of machine learning with Python 3 in six steps. The six-step path is inspired by the "six degrees of separation" theory, which states that everyone and everything is a maximum of six steps away. (Note that the theory deals with the quality of connections, rather than their mere existence.) A great deal of effort has gone into designing a simple yet effective six-step path that moves gradually from fundamentals to advanced topics, helping a beginner with little or no knowledge of machine learning in Python work all the way up to becoming a master practitioner. This book is also helpful for current machine learning practitioners who want to learn advanced topics such as hyperparameter tuning, various ensemble techniques, natural language processing (NLP), deep learning, and the basics of reinforcement learning.
Each topic has two parts: the first covers the theoretical concepts, and the second covers practical implementation with different Python packages. The traditional math-first approach to machine learning, i.e., learning all the mathematics and then working out how to apply it to solve problems, demands a great deal of time and effort, and has proven inefficient for working professionals looking to switch careers. Hence the focus of this book is on simplification: the theory and math behind each algorithm are covered only to the extent required to get you started.
I recommend that readers work through the book rather than just read it; real learning happens only through active participation. All the code presented in the book is therefore available as IPython notebooks, so you can try these examples yourself and later extend them to your own interests as required. You can find the full code in the Apress GitHub repository or in my GitHub repository.
Who This Book Is For:
This book will serve as a great resource for learning machine learning concepts and implementation techniques for:
- Python developers or data engineers looking to expand their knowledge or career into the machine learning area.
- Non-Python (R, SAS, SPSS, MATLAB, or any other language) machine learning practitioners looking to expand their implementation skills in Python.
- Novice machine learning practitioners looking to learn advanced topics such as hyperparameter tuning, various ensemble techniques, natural language processing (NLP), deep learning, and the basics of reinforcement learning.
Table of Contents:
- Chapter 1: Step 1 – Getting Started in Python
- Chapter 2: Step 2 – Introduction to Machine Learning
- Chapter 3: Step 3 – Fundamentals of Machine Learning
- Chapter 4: Step 4 – Model Diagnosis and Tuning
- Chapter 5: Step 5 – Text Mining and Recommender Systems
- Chapter 6: Step 6 – Deep and Reinforcement Learning
- Chapter 7: Conclusion
Link to buy the book from Apress: https://www.apress.com/in/book/9781484249468
Detailed Table of Contents:
Chapter 1: Step 1 – Getting Started in Python
The Best Things in Life Are Free
The Rising Star
Python 2.7.x or Python 3.4.x?
Windows Installation
OSX Installation
Linux Installation
Python from Official Website
Running Python
Key Concepts
Python Identifiers
Keywords
My First Python Program
Code Blocks (Indentation & Suites)
Basic Object Types
When to Use List vs. Tuples vs. Set vs. Dictionary
Comments in Python
Multiline Statement
Basic Operators
Control Structure
Lists
Tuple
Sets
Dictionary
User-Defined Functions
Module
File Input/Output
Exception Handling
Endnotes
Chapter 2: Step 2 – Introduction to Machine Learning
History and Evolution
Artificial Intelligence Evolution
Different Forms
Statistics
Data Mining
Data Analytics
Data Science
Statistics vs. Data Mining vs. Data Analytics vs. Data Science
Machine Learning Categories
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Frameworks for Building Machine Learning Systems
Knowledge Discovery Databases (KDD)
Cross-Industry Standard Process for Data Mining
SEMMA (Sample, Explore, Modify, Model, Assess)
KDD vs. CRISP-DM vs. SEMMA
Machine Learning Python Packages
Data Analysis Packages
NumPy
Pandas
Matplotlib
Machine Learning Core Libraries
Endnotes
Chapter 3: Step 3 – Fundamentals of Machine Learning
Machine Learning Perspective of Data
Scales of Measurement
Nominal Scale of Measurement
Ordinal Scale of Measurement
Interval Scale of Measurement
Ratio Scale of Measurement
Feature Engineering
Dealing with Missing Data
Handling Categorical Data
Normalizing Data
Feature Construction or Generation
Exploratory Data Analysis (EDA)
Univariate Analysis
Multivariate Analysis
Supervised Learning – Regression
Correlation and Causation
Fitting a Slope
How Good Is Your Model?
Polynomial Regression
Multivariate Regression
Multicollinearity and Variance Inflation Factor (VIF)
Interpreting the OLS Regression Results
Regression Diagnosis
Regularization
Nonlinear Regression
Supervised Learning – Classification
Logistic Regression
Evaluating a Classification Model Performance
ROC Curve
Fitting Line
Stochastic Gradient Descent
Regularization
Multiclass Logistic Regression
Generalized Linear Models
Supervised Learning – Process Flow
Decision Trees
Support Vector Machine (SVM)
k Nearest Neighbors (kNN)
Time-Series Forecasting
Unsupervised Learning – Process Flow
Clustering
K-means
Finding Value of k
Hierarchical Clustering
Principal Component Analysis (PCA)
Endnotes
Chapter 4: Step 4 – Model Diagnosis and Tuning
Optimal Probability Cutoff Point
Which Error Is Costly?
Rare Event or Imbalanced Dataset
Known Disadvantages
Which Resampling Technique Is the Best?
Bias and Variance
Bias
Variance
K-Fold Cross-Validation
Stratified K-Fold Cross-Validation
Ensemble Methods
Bagging
Feature Importance
RandomForest
Extremely Randomized Trees (ExtraTree)
How Does the Decision Boundary Look?
Bagging – Essential Tuning Parameters
Boosting
Example Illustration for AdaBoost
Gradient Boosting
Boosting – Essential Tuning Parameters
Xgboost (eXtreme Gradient Boosting)
Ensemble Voting – Machine Learning’s Biggest Heroes United
Hard Voting vs. Soft Voting
Stacking
Hyperparameter Tuning
GridSearch
RandomSearch
Noise Reduction from IoT Data
Endnotes
Chapter 5: Step 5 – Text Mining and Recommender Systems
Text Mining Process Overview
Data Assemble (Text)
Social Media
Step 1 – Get Access Key (One-Time Activity)
Step 2 – Fetching Tweets
Data Preprocessing (Text)
Convert to Lower Case and Tokenize
Removing Noise
Part of Speech (PoS) Tagging
Stemming
Lemmatization
N-grams
Bag of Words (BoW)
Term Frequency-Inverse Document Frequency (TF-IDF)
Data Exploration (Text)
Frequency Chart
Word Cloud
Lexical Dispersion Plot
Co-occurrence Matrix
Model Building
Text Similarity
Text Clustering
Latent Semantic Analysis (LSA)
Topic Modeling
Latent Dirichlet Allocation (LDA)
Non-negative Matrix Factorization
Text Classification
Sentiment Analysis
Deep Natural Language Processing (DNLP)
Recommender Systems
Content-Based Filtering
Collaborative Filtering (CF)
Endnotes
Chapter 6: Step 6 – Deep and Reinforcement Learning
Artificial Neural Network (ANN)
What Goes Behind, When Computers Look at an Image?
Why Not a Simple Classification Model for Images?
Perceptron – Single Artificial Neuron
Multilayer Perceptrons (Feedforward Neural Network)
Load MNIST Data
Key Parameters for scikit-learn MLP
Restricted Boltzmann Machines (RBM)
MLP Using Keras
Autoencoders
Dimension Reduction Using Autoencoder
De-noise Image Using Autoencoder
Convolution Neural Network (CNN)
CNN on CIFAR10 Dataset
CNN on MNIST Dataset
Recurrent Neural Network (RNN)
Long Short-Term Memory (LSTM)
Transfer Learning
Reinforcement Learning
Endnotes
Chapter 7: Conclusion
Summary
Tips
Start with Questions/Hypothesis Then Move to Data!
Don’t Reinvent the Wheels from Scratch
Start with Simple Models
Focus on Feature Engineering
Beware of Common ML Imposters
Happy Machine Learning