This is not a complete list, as I am still building it. I should note that I prefer books to courses, so I am more knowledgeable about books. Notice that the book list covers a lot more than pure machine learning, since most of my day-to-day work, at least, is not pure machine learning.
Recommended Books
Build a Career in Data Science (Emily Robinson & Jacqueline Nolis)
Benefits:
- Good career advice for different types of jobs and companies
- Underrated interview questions in the book
- Good advice on different data science projects
- Good knowledge of how to build a portfolio
- Mentions my blog post, “How to Build a Data Science Portfolio”
Drawbacks (I am reaching on this):
- It talks a bit about self-learning or learning on the job, which is probably not super realistic.
Deep Learning for Coders with fastai and PyTorch: AI Applications Without a PhD (Jeremy Howard & Sylvain Gugger)
Benefits:
- Heard good things. Haven’t read it yet.
Deep Learning with Python (François Chollet)
Benefits:
- Written by François Chollet, creator of Keras
- Massively popular book
- Second Edition came out relatively recently (2021)
Drawbacks (I am reaching on this):
- Liked the first edition. Haven’t reviewed the second edition yet.
Dive into Deep Learning (many authors)
Benefits:
- Interactive deep learning book with code, math, and discussions
- Implemented with NumPy/MXNet, PyTorch, and TensorFlow
- Adopted at 300 universities from 55 countries
- Open source book that is updated very frequently
Drawbacks (I am reaching on this):
- Haven’t gone through the entire book yet
Effective Pandas Book (Matt Harrison)
Benefits:
- Absolutely excellent visuals
- Very highly reviewed book on pandas
Drawbacks (I am reaching on this):
- Haven’t gone through the entire book yet
Hands-on Machine Learning with Scikit-Learn, Keras & TensorFlow (Aurélien Géron)
Benefits:
- Reading this now
Drawbacks (I am reaching on this):
- Don’t think the book defines basic terminology as much as it should.
Introduction to Computation and Programming using Python (John V. Guttag)
Benefits:
- Python book which goes over object oriented programming, algorithmic complexity, data structures, and statistics
- I like the distribution and confidence interval part of the book (even though it’s a small section)
Drawbacks (I am reaching on this):
- Not super useful for data science, but algorithmic complexity and data structures appear in interviews!
- Uses pylab for some examples and calculates standard deviation from scratch. While this isn’t a deal breaker, it can be confusing for some students.
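For anyone unsure what calculating standard deviation "from scratch" means, here is a minimal sketch of my own (not code from the book). It assumes the population form of the formula, and the function name and sample data are made up for illustration. The standard library's `statistics.pstdev` gives the same result in one call.

```python
import math
import statistics


def population_std(values):
    """Population standard deviation, computed step by step (hypothetical helper)."""
    mean = sum(values) / len(values)
    # Average of the squared distances from the mean
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    return math.sqrt(variance)


# Made-up sample data for illustration
data = [2, 4, 4, 4, 5, 5, 7, 9]

from_scratch = population_std(data)
library = statistics.pstdev(data)  # same population formula from the standard library

print(from_scratch, library)
```

Seeing the two side by side is usually enough to clear up the confusion: the from-scratch version shows the formula, and the library call is what you would normally use in practice.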
Introduction to Machine Learning with Python (Andreas C. Müller & Sarah Guido)
Benefits:
- Very good machine learning introduction for visual learners (the author put a lot of time into making images for the book, which is not a common thing)
- Book gives intuitive explanation of how algorithms work
- Authors are big in the machine learning community
Drawbacks (I am reaching on this):
- First edition published in 2016 so I would like a second edition, but the book is still relevant
- Not a lot of math (in case you want to dive deep into it)
Machine Learning with PyTorch and Scikit-Learn by Sebastian Raschka, Yuxi (Hayden) Liu, & Vahid Mirjalili
Benefits:
- PyTorch/Scikit-Learn based
- Machine Learning to Deep Learning explanations are pretty good
- Nice images supporting concepts
- Will be published in 2022, with the author constantly updating the book’s code
Drawbacks (I am reaching on this):
- Book hasn’t been published yet
Pattern Recognition and Machine Learning (Christopher M. Bishop)
Benefits:
- Very good machine learning theory with relatively intuitive math
Drawbacks (I am reaching on this):
- Book published in 2006 so probably needs some updating
- You need to know linear algebra and calculus to understand a lot of the book
Python Data Science Handbook (Jake VanderPlas)
Benefits:
- Book covers jupyter, numpy, pandas, matplotlib, and some scikit-learn
- Read the book in its entirety online at https://jakevdp.github.io/PythonDataScienceHandbook/
- Book is constantly updated online
Drawbacks (I am reaching on this):
- The machine learning part is too basic
Python Machine Learning (scikit-learn and TensorFlow 2) by Sebastian Raschka & Vahid Mirjalili
Benefits:
- TensorFlow/Keras/Scikit-Learn based
- Machine Learning to Deep Learning explanations are pretty good
- Nice images supporting concepts
- Published in late 2019 with the author constantly updating the book’s code
Drawbacks (I am reaching on this):
- I don’t like the initial explanation of deep learning relative to other sources in this list
The Elements of Statistical Learning (Hastie & Tibshirani & Friedman)
Benefits:
- Great statistical theory for a lot of basic machine learning algorithms
- I like the section on wide data (p bigger than n).
Drawbacks (I am reaching on this):
- Uses R for examples
Recommended Courses
Coursera: Applied Data Science with Python Specialization
Benefits:
- Good for some machine learning/data science introduction in Python
Drawbacks:
- Never took all of the courses myself
- Less basic than Python for Everybody, but still basic
Coursera: Data Science Specialization
Benefits:
- Good start to data science in R with some focus in statistics and machine learning
- You will learn a little Git, which is useful for showcasing your projects
Drawbacks:
- Older specialization
- Probably a bit harder than other specializations on Coursera
Coursera: Deep Learning Specialization
Benefits:
- TensorFlow based
- Great theory
- Everyone knows the content for the course, so interview questions commonly come from it
- Most quizzes are good tests of knowledge
Drawbacks:
- Haven’t finished this yet so not familiar enough to criticize.
Coursera: Python for Everybody Specialization
Benefits:
- Great starter course for people learning Python in a data science context
- A lot of courses derive their material from this specialization (5 courses) on Python
- You can audit without paying (hard to figure out for some)
Drawbacks:
- Course is basic
Coursera: Machine Learning (Andrew Ng)
Benefits:
- Great theory
- Everyone knows the content for the course, so interview questions commonly come from it
- Most quizzes are good tests of knowledge
- You can audit without paying (hard to figure out for some)
Drawbacks:
- I would highly recommend NOT doing the assignments, since they are in Octave/MATLAB, which almost nobody uses in industry
- It is an older class which hasn’t been updated, but this is a minor issue
Coursera: Mathematics for Machine Learning: Linear Algebra
Benefits:
- Haven’t tried it out yet
Drawbacks:
- Haven’t tried it out yet
Coursera: SQL for Data Science (Sadie St. Lawrence)
Benefits:
- Wonderful theory and application of SQL
- Very popular class
- While the first course only covers SQL, you can take other courses in the Learn SQL Basics for Data Science Specialization, which cover A/B testing and Apache Spark
Drawbacks:
- Haven’t had time to finish the course
fastai: Practical Deep Learning for Coders, v3
Benefits:
- fastai based (PyTorch and other library wrappers)
- Learn by doing, probably the fastest way to start doing deep learning
- Free
Drawbacks:
- Haven’t finished this yet so not familiar enough to criticize
LinkedIn Learning: 15 Tips for Landing a Data Science Job
Benefits:
- Among other things, course covers types of data science jobs, how to look for jobs, how to build a portfolio, how to network, and how to do well in interviews
Drawbacks:
- Course is very similar to Get a Remote Data Science Job
LinkedIn Learning: Get a Remote Data Science Job
Benefits:
- Among other things, course covers types of data science jobs, how to look for jobs, how to build a portfolio, how to network, and how to do well in interviews
Drawbacks:
- Course is very similar to 15 Tips for Landing a Data Science Job
LinkedIn Learning: Machine Learning with Scikit-Learn (Python)
Benefits:
- Course covers a variety of algorithms like linear regression, logistic regression, decision trees, bagged trees, random forests, K-Means, and various applications of PCA
- Lots of visuals for each algorithm
Drawbacks:
- Course is only 43 minutes, so students typically say they spend a lot of time with the notebooks that come with the course.
LinkedIn Learning: Python for Data Visualization
Benefits:
- I made this course on pandas, matplotlib, seaborn, etc., and it is about as comprehensive as possible.
Drawbacks:
- Course is about 2 hours
Recommended Websites
For the recommended sites, there are some I would have put on here if there weren’t issues with them (moral, ethical, etc.).
Real Python
Benefits:
- Very highly curated content.
Drawbacks:
- Not everything is free, but there is so much free content!
YouTube Channel: Corey Schafer
Benefits:
Drawbacks:
- SQL tutorials are great; I just wish there were more (a common thing I hear from students in the analytics classes I teach)
Haven’t Looked at, but Heard Good Things
I will take these at some point to see how they are.