An Honest Review of Fast.ai’s Deep Learning Course: Is It Worth It?

You’ll train lots of models right away, but gloss over some of the more academic topics. Bottom line: you will definitely learn loads of useful material.

Jake Krajewski
Nov 24, 2019 · 9 min read
I wonder if using a softmax for my Banana ripeness detector will give me better results… Photo by Juan Rumimpunu on Unsplash

I am currently about three-quarters of the way through fast.ai’s online course, Practical Deep Learning for Coders, v3 (lesson 6 of 8), and I wanted to share some details in case you’re thinking about taking it yourself.

First things first: I heard about this course in a Lex Fridman (@lexfridman) interview with Jeremy Howard (@jeremyphoward). The question to Howard was something like, “What advice do you have for someone who wants to get started in deep learning?” And the reply was something like, “Train lots of models.” It may have also been a bit of a plug for his course, but the course was named “Best Course in AI” by CogX in 2019, so it seemed worth checking out, and I really wanted to get my hands dirty.

“Train lots of models” — Jeremy Howard, Fast.ai

The course itself is completely free: we’re talking hours of relevant video lectures (from 2019), thousands of forum posts, pretrained models, example datasets, code documentation, the full fastai library, you name it. You need to create an account to post on the forums, but that’s about it. What you get is a comprehensive course that can have you running out the door, training and deploying models on the web. They even include idiot-proof methods for creating high-quality datasets using Google image search.
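For the curious, that dataset recipe looks roughly like this in fastai v1 (the folder and CSV names here are my own hypothetical examples):

from fastai.vision import *
from pathlib import Path

# Save image URLs from a Google Images search to a CSV, then download them
dest = Path('data/ripe_bananas')
dest.mkdir(parents=True, exist_ok=True)
download_images('urls_ripe.csv', dest, max_pics=200)

# Delete anything that fails to open or isn't a valid image
verify_images(dest, delete=True, max_size=500)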

The main focus of the course thus far has been on the benefits of transfer learning: using already-trained, highly successful models such as ResNet-34 or ResNet-50 to make short work of learning new tasks with great accuracy. In computer vision, for example, you take a model that is already very accurate on a complex visual benchmark (such as ImageNet) and adapt it to your specific task at hand. This is accomplished with the fastai library, a layer over PyTorch in Python that simplifies and systematizes some of the less constrained, free-form approaches used in constructing deep neural nets.
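To give a flavor of that workflow, here is a minimal sketch using the fastai v1 API the course is built on (the data path and folder layout are my own hypothetical example):

from fastai.vision import *

# Load labeled images from train/valid folders, with standard augmentations
data = ImageDataBunch.from_folder('data/bananas', ds_tfms=get_transforms(), size=224)

# Start from ResNet-34 pretrained on ImageNet; at first, only the new head trains
learn = cnn_learner(data, models.resnet34, metrics=accuracy)
learn.fit_one_cycle(4)

# Then unfreeze the pretrained layers and fine-tune the whole network
learn.unfreeze()
learn.fit_one_cycle(2, max_lr=slice(1e-5, 1e-3))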

Teaching philosophy

Howard’s teaching philosophy is based on the “whole game” approach from David Perkins’s book Making Learning Whole. The general metaphor Howard uses: imagine learning everything about soccer (football), from the way the ball is manufactured, its materials and design, the best type of grass to play on, the history of the sport, and the physics behind the ball, all before ever kicking one and just playing.

Photo by Jeffrey F Lin on Unsplash

Clearly, the better approach would be to just let you have a ball and play around a bit. With this in mind, he zips you through the basics, flies past specifics and has you training state-of-the-art image classifiers with “world class” results.

As the lessons go on (each is approximately two hours), the complexity ramps up as well. Once you’re able to train a decently performing model, Howard provides clear methods for digging into the code behind fast.ai’s speediness (it is indeed fast), and brings up charts and illustrations explaining what is going on behind the scenes.

One part I really liked was when he brought up a spreadsheet in Excel and drew up tables representing a model’s inputs, parameters, and predictions, showing us how to train a basic neural network layer in a spreadsheet! (Spoiler alert: spreadsheets are not the fastest at machine learning.) It was like popping the hood on a neural network and watching it change as it figured out how to best come up with accurate predictions.
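To make that concrete, the arithmetic the spreadsheet acts out is essentially the following, sketched here in plain NumPy with made-up toy data rather than Howard’s actual worksheet:

import numpy as np

# Toy data: 100 examples with 3 features, targets from a known linear rule
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.3

w, b, lr = np.zeros(3), 0.0, 0.1  # one layer's weights, bias, and learning rate
for step in range(200):
    pred = X @ w + b              # forward pass
    err = pred - y
    # Gradients of the mean squared error, which the spreadsheet
    # effectively computes cell by cell
    w -= lr * 2 * X.T @ err / len(y)
    b -= lr * 2 * err.mean()

print(w, b)  # converges toward [2.0, -1.0, 0.5] and 0.3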

At one point, though, he did throw some heavy shade at Google Sheets, which brings me to a minor point about the teaching style that is worth mentioning.

At odds with the academic approach?

Howard is unassuming about his success. At times it can feel like the slightest bit of humblebragging when he describes how a model he trained “just yesterday morning” achieved better accuracy than the best current research papers. And this happens quite a lot, apparently: on several occasions, a model he is teaching us how to train ends up beating the best result he can dig up in all the research papers on the web. The results do seem to back him up. So is it strange that he seems to have a little chip on his shoulder toward researchers and academia?

Photo by Jason Leung on Unsplash

More than once in his lectures, he talks about researchers, machine learning PhDs, and advanced mathematics as if they belonged to some “other” group that uses Greek letters and complicated names such as “ReLU” to obscure otherwise simple concepts. Or he suggests that if you were taught something in a traditional setting, you will now have to unlearn it because they were wrong. It is almost as if to say: they’re wrong, we’re right, and we didn’t have to spend thousands of dollars going through the system to get there. I get that this may be part of “democratizing” machine learning and removing the academic stigma from the subject, but math and research are beautiful, helpful tools, not hindrances. They have their places, Jeremy; we all do.

There’s only one other nitpick I have, and that’s the consistent choice of overly terse names, both in the lecture notebooks and in the fastai library itself. For example, the training notebooks live at an abbreviated file location, “/nbs/dl1/”, which I assume stands for /notebooks/deeplearning1. The lessons are peppered with variable and attribute names like ‘ds_tfms’, ‘fnames’, ‘pat’, ‘c’, and ‘arch’. That is fine in and of itself, but in a course targeted at novice data science students, variables ought to be more self-descriptive, and less optimized for loading a microsecond faster on a slim web connection or for hyper-efficient finger speed while coding.

(An aside: there is a lot of opinion about variable naming conventions, but it tends to be more “Pythonic” to be descriptive in your names. As stated in Data Scientists: Your Variable Names Are Awful. Here’s How to Fix Them: “your code will be read more times than it is written,” so it should be easy to understand.)

A code example from the course

During a collaborative filtering task using the MovieLens 100K dataset from GroupLens (@grouplens), which consists of “100,000 ratings from 1000 users on 1700 movies,” we trained a single neural network layer to predict how a user would rate a particular film (as long as the film existed before 1998, because that’s when the dataset was created).
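For reference, the training side of that exercise looks roughly like this in fastai v1; the variable and column names follow the course notebook, and the exact hyperparameters here are illustrative:

from fastai.collab import *

# rating_movie is a DataFrame with userId, title, and rating columns
data = CollabDataBunch.from_df(rating_movie, seed=42, valid_pct=0.1, item_name='title')

# Embedding-dot-bias model: 40 latent factors per user and per movie
learn = collab_learner(data, n_factors=40, y_range=[0, 5.5])
learn.fit_one_cycle(3, 5e-3)

# Pull out each movie's learned bias term for inspection after training
movie_bias = learn.bias(top_movies, is_item=True)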

After training our model, Howard walked us through different ways of interpreting and looking at the data. In this case, we were looking for patterns from some bias values generated during training. Here’s a sample of what the code looked like:

g = rating_movie.groupby(title)['rating'].count()
top_movies = g.sort_values(ascending=False).index.values[:1000]
...
item0 = lambda o:o[0]
sorted(movie_ratings, key=item0)[:15]
...
sorted(movie_ratings, key=lambda o: o[0], reverse=True)[:15]
movie_comp = [(f, i) for f,i in zip(fac1, top_movies)]

Remember, this is targeted toward beginners. Some of these lines could raise an eyebrow even for those who know their way around Python (note: none of the code above is from the fastai library itself), and it is difficult to describe what this code is doing, beyond sorting something, especially without comments.

Here is the same functionality after cleaning it up a little:

# Group ratings by movie title
unique_titles = rated_movies.groupby(title)
# Count the ratings for each title
rating_counts = unique_titles['rating'].count()
# Keep the 1000 most-rated titles, sorted descending
most_rated = (rating_counts.sort_values(ascending=False)
              .index.values[:1000])
...
# Sort by the bias column (the third element of each tuple),
# most negative first, then most positive first
sort_by_bias_col = lambda x: x[2]
neg_bias = sorted(bias_rating_stats, key=sort_by_bias_col)
pos_bias = sorted(bias_rating_stats, key=sort_by_bias_col, reverse=True)
print(neg_bias[:15])
print(pos_bias[:15])

Now anyone can read through this code and get an idea of what is actually going on, which is especially useful if we want to revisit it later.

In each of the course notebook files, I instinctively found myself cleaning up the code, and although the course is certainly not lacking in excellent content, making sense of it initially is sometimes painful. (Not to be a choosing beggar: the content is free, it is extremely challenging and mentally stimulating, and it’s a privilege to have access to this kind of knowledge in the first place!)

Besides, in the end it is extremely helpful to meticulously tear something down and put it back together when you are not really sure what is going on. It’s the best way to improve comprehension. So if that’s your strategy, Jeremy: kudos!

Is it worth it?

I’d say that, for what it is intended to be, this is the best way I’ve found to “train lots of models,” as Jeremy so aptly put it. You really do hit the ground running, and you keep running. One idea builds on the next, and there isn’t any way you could come out the tail end of this course without a substantive knowledge of deep learning in PyTorch and Python, not to mention the fastai library.

However, insults to academia aside, there are times when complicated concepts get glossed over, or when walls of complexity arise and a student might have to pause and study independently to keep up, somewhere like Khan Academy or fast.ai’s Computational Linear Algebra course, which Jeremy plugs now and again (it may be worth checking out if the quality is similar to this course).

These are lessons you are going to watch twice at minimum; some parts take a third and fourth viewing, and some you just have to stop at and do some heavy lifting. What’s great about fast.ai is that all the tools you need to figure things out on your own are there (and I mean all of them). For starters, they’ll walk you through setting up your terminal, what a GPU even is, how to get set up on Google’s cloud, Amazon Web Services, or Azure, the differences between Colab, Jupyter, and SageMaker notebooks, how to deploy a model on the web, how to make sense of documentation, etc. Finally, the forums are densely packed with useful information from students past and present, and you’ll likely find everything you need with a quick search.

Notable alumni

Additionally, every now and then Jeremy mentions notable alumni who have passed through the course, and some of the impressive things they’ve done or are currently working on at companies you’ve definitely heard of if you’re into this space (you’ll have to take the course to find out more about them).

If you’re interested in acquiring the practical skills needed to rapidly implement a useful array of deep learning tasks, this course is definitely worth your time. You’ll be lacking some of the theoretical intuition you might develop in a more academically focused course, such as the Deep Learning Specialization at deeplearning.ai taught by Andrew Ng (@AndrewYNg), but you might also end up richer in your ability to immediately implement what you’ve just learned. Note: you will still probably need to learn some math.

You’re getting top value

Although you’ll be able to train very basic models from day one, you’ll need heaps of drive and plenty of initiative to come out of this course an expert, but it definitely is possible. Fast.ai takes a lot of the complicated things out of building state-of-the-art neural networks with world-class results, but you are not spoon-fed in these lectures; they take dedication to truly understand. Like all things, you’ll get out what you put in. Fast.ai has one of the best, most densely informative courses out there, and for the price (free), you absolutely cannot beat the value Jeremy Howard and the fast.ai crew have created for you.
