A look at MOOCs

I’ve followed a few online courses, or MOOCs, recently and I thought I’d give a quick summary of my experiences.


For the uninitiated, MOOCs are Massive Open Online Courses.  As the name suggests they are courses made available openly over the web.  Although there is a bias towards technical content such as learning to code, they can cover a whole variety of topics from science fiction to the politics of water management.  Teaching is generally via a set of recorded lectures, along with accompanying exercises.  Discussion forums allow for interaction between students and lecturers or teaching assistants.

They have been around for a while now, and a few years ago they gained a lot of media attention and hype suggesting they would seriously change the educational landscape.  That hasn’t really happened, and it’s clear that they are not about to replace traditional class-based university teaching any time soon.  However, they can be very useful in the context in which I was looking at them: as a means of continuing education on technical topics I wanted to learn more about.

There are a number of platforms for the content, but the one I’ve been using is Coursera, which is the dominant player.  It partners with a large international set of universities and other institutions.

Courses are free, but if you want to get a certificate at the end you need to pay around $50.  Since course completion rates are very low, typically single-digit percentages, it doesn’t make much sense to pay for this until you’ve done the course, if at all.  You can get all the content without paying for it, at least on the courses I’ve looked at.  I actually did pay for the certificate on those I’ve completed, partly because it’s evidence to anyone who might want to know that you have an interest in a topic, and partly because I felt that if I’d completed the course then it was an endeavour worth supporting for roughly the price of a textbook.

These are the courses I’ve had a look at.

Machine Learning – Stanford University

This is an extremely popular course, and its success is what prompted the course professor Andrew Ng to co-found the Coursera platform.

The content covers various supervised and unsupervised learning methods such as logistic regression, neural networks, support vector machines and k-means clustering.  It discusses some of the practical uses of these methods and illustrates their pitfalls and trade-offs.

The content is delivered in the form of 11 weeks’ worth of video lectures, each week having between one and two hours of content.  Assessment is in the form of a small multiple-choice exam you need to pass each week, plus a coding assignment.  The coding is in Matlab or its open-source equivalent Octave.

The course is very well structured, with lectures often building nicely on what has gone before.  The coding assignments are well thought out.  You don’t really need to have a great deal of coding experience: a large amount of framework code is provided and you just need to fill in the relevant model details, often only requiring two or three lines of matrix manipulation.  I thought this was a very good way to cement an understanding of the topic at hand.

My only slight frustration was with the exam questions each week.  The questions often weren’t fully specified, relying on assumptions or notation that were implicit from the lectures.  Fine if you’d watched the lecture just beforehand, or had taken detailed enough notes, less so otherwise.

Overall I would certainly recommend it to anyone with an interest in the area.

Data Manipulation at Scale: Systems and Algorithms – University of Washington

This is an introduction to data science, and forms part of the university’s “Data Science at Scale” specialization consisting of three courses and a project.

The course starts with an overview of the data science field, the problems it aims to address and what differentiates it from related fields such as statistics and machine learning.  It goes on to cover linear algebra, relational databases, the MapReduce model and NoSQL systems.

It’s a much shorter course, only four weeks, although there is probably more video content per week.  This isn’t great value if you pay for a completion certificate, as you’d need to do all four parts (the three courses and the project), and pay each time, in order to complete the specialization.

There are three programming exercises, two in Python and one just in SQL.  The most interesting was taking a Twitter feed sample and applying sentiment analysis to it; you were able to get reasonable results very quickly.  However, the exercises felt like they were more of a coding challenge than a lesson in data science.  I was able to complete them pretty swiftly as I have a Python coding background, but I imagine they would be more time-consuming without this.
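To give a flavour of why results come quickly, here is a minimal sketch of word-list sentiment scoring.  It is not the course’s actual assignment code: the word scores, the sample tweets and the function name tweet_sentiment are all made up for illustration, and a real exercise would load a much larger lexicon and a real feed sample.

```python
# Hypothetical word scores, invented for this sketch; a real lexicon
# would contain thousands of entries.
SCORES = {"good": 2, "great": 3, "love": 3, "bad": -2, "awful": -3, "hate": -3}


def tweet_sentiment(text, scores=SCORES):
    """Sum the scores of known words; unknown words count as zero."""
    total = 0
    for word in text.lower().split():
        # Strip common punctuation and hashtag/mention symbols from the edges.
        total += scores.get(word.strip(".,!?#@"), 0)
    return total


# Made-up sample tweets standing in for a real feed sample.
tweets = [
    "I love this course, great lectures!",
    "Awful traffic today.",
    "Just had lunch",
]
results = [tweet_sentiment(t) for t in tweets]
```

A positive total suggests a positive tweet and a negative total a negative one.  The approach is crude (it ignores negation and sarcasm, for instance), which fits my impression that the exercise was more about the coding than the data science.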

I liked the lectures, given by Bill Howe, who I thought was very articulate in his delivery.  While I learned a bit from the course, overall it could do with being a bit more focused.  It spent quite a lot of time discussing themes and the history of certain technologies, and only occasionally drilled into the details.

Computational Investing, Part I – Georgia Tech

Unlike the two courses above, this one is a “self-study” course, meaning that there are no completion deadlines; there are suggested due-dates but they aren’t enforced.  Perhaps partly for this reason, but also because I wasn’t too excited by the content, I stalled after a few weeks.

The content covers equity pricing topics such as the efficient market hypothesis, the efficient frontier and the capital asset pricing model.  There are coding assignments in Python.
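For readers who haven’t met it, the capital asset pricing model boils down to a one-line formula: the expected return on an asset is the risk-free rate plus the asset’s beta times the market’s excess return.  The sketch below is my own illustration rather than anything from the course assignments, and the function name and input figures are invented.

```python
def capm_expected_return(risk_free, beta, market_return):
    """Expected return under CAPM: r_f + beta * (r_m - r_f)."""
    return risk_free + beta * (market_return - risk_free)


# Illustrative inputs only: 2% risk-free rate, beta of 1.5, 8% market return.
expected = capm_expected_return(0.02, 1.5, 0.08)
```

With these made-up inputs the expected return comes out at roughly 11%, and a beta of exactly 1 simply gives back the market return.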

As I’ve been exposed to a lot of this elsewhere I didn’t think I was learning too much.  I got the impression that the aim was to teach people how to trade their own money, rather than teach principles at a more theoretical level.

The lecturer, Tucker Balch, has an interesting background and both real-world and academic experience, but I didn’t find his delivery as convincing as that of the lecturers on the previous two courses I mentioned.  I think it’s unlikely I’ll resume my study.

Econometrics: Methods and Applications – Erasmus University Rotterdam

This course sounded like an opportunity to further my statistical knowledge in an interesting domain, but my enrollment was very short-lived.

In the previously mentioned courses I occasionally became frustrated that the pace was a bit slow.  The opposite was true here.

Rather like the machine learning course, it started out with linear regression.  However, rather than explaining the concepts with motivational examples and then introducing the relevant maths step-by-step, it launched headlong into pages of linear algebra that scrolled past on the screen almost too quickly to read.  I would be surprised if anyone who didn’t already have the material at their fingertips could have absorbed the content.


It seems that the jury remains out on the future of MOOCs, but they certainly represent a very valuable learning opportunity.  I’ll definitely be enrolling in further courses, but it is worth doing a bit of research on the full nature of the content before committing the study time.

