In this blog post I look in some detail at an article* that analyses the behaviour of some 150,000 registrants for the inaugural edX course, 6.002x: Circuits and Electronics, which was offered in the spring of 2012. What makes the article interesting is that the analysis is based on the log files for the course, making it an exemplary case of learning analytics in action (although the authors don't use that term at all). The authors first take the data of all registrants into account, then narrow their focus to the relatively few (about 10,000) who managed to earn a course certificate.
Not surprisingly, the vast majority of registrants (76%) spent the fewest hours on the course (most of them only a single hour), whereas a small minority (7%) spent the most hours (most of them about 100) (fig. 2a in the article). This suggests that registrants leave the course at a roughly constant rate, which produces a negative exponential distribution of course participation durations (frequency versus total time spent in the course). This is consistent with what we know about MOOC drop-out behaviour.
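To make the reasoning step explicit: a constant per-hour probability of leaving yields a geometric (in the continuous limit, exponential) distribution of time spent. Here is a minimal simulation sketch in Python; the hourly departure rate is a hypothetical value of my own, not a figure from the article:

```python
import numpy as np

# Constant departure rate: each hour, every remaining registrant leaves
# with the same probability. The resulting participation durations are
# geometrically distributed (exponentially, in the continuous limit).
rng = np.random.default_rng(0)
n_registrants = 150_000
p_leave_per_hour = 0.5  # hypothetical rate, not taken from the article

durations = rng.geometric(p_leave_per_hour, size=n_registrants)  # hours spent

# Under an exponential model, the rate is estimated by 1 / mean(duration).
rate_hat = 1.0 / durations.mean()
print(f"estimated departure rate: {rate_hat:.2f} per hour")

# Frequency versus total time spent falls off by a constant factor per hour:
hours, counts = np.unique(durations, return_counts=True)
print(counts[:4] / n_registrants)  # roughly [0.5, 0.25, 0.125, 0.0625]
```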
The authors then focus on the final 7% who manage to stay in the course and earn a certificate, analysing their study behaviour in detail in an effort to find out what works in the course and what works less well. These results are of course difficult to generalise, as they are specific to the course in question. Nevertheless, there are a few interesting points of a general nature. First, activity peaks in a weekly rhythm on weekends (fig. 3a). Second, few students access many resources; most access only a few. However, there are marked differences between resources. The labwork and homework are carried out dutifully by the majority of the students (80% do almost all of it), the videos and the coursebook are consulted thoroughly by only a minority (5% check them all), and usage of tutorials and lecture questions takes the middle ground (fig. 5a). This suggests that students are guided by the learning activities (lab and homework), consulting the videos and coursebook only when needed. The importance of activities in structuring the learning is further supported by the students' engagement in the discussions. This varies in step with the time spent on homework, levelling off at about 1 hour per week (homework takes 1 to 3 hours weekly) around the mid-term exam. The importance of the discussions also emerges from the finding that, when doing homework, a student's most frequent next activity is entering the discussion forum (fig. 6a).
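That last finding comes straight from the log files: order each student's events in time and count which activity follows which. Here is a minimal sketch of such a transition count in Python, using an invented toy log (the authors' actual log schema is not given in the article):

```python
from collections import Counter, defaultdict

# Hypothetical event log: (user_id, timestamp, activity) tuples, as one
# might extract from the course log files. Names are illustrative only.
events = [
    ("u1", 1, "homework"), ("u1", 2, "forum"), ("u1", 3, "video"),
    ("u2", 1, "homework"), ("u2", 2, "forum"),
    ("u3", 1, "video"),    ("u3", 2, "homework"), ("u3", 3, "book"),
]

# Group each user's activities in time order.
by_user = defaultdict(list)
for user, ts, activity in sorted(events, key=lambda e: (e[0], e[1])):
    by_user[user].append(activity)

# Count transitions from each activity to the next one.
transitions = Counter()
for activities in by_user.values():
    transitions.update(zip(activities, activities[1:]))

# Most frequent activity following homework:
after_homework = Counter({b: n for (a, b), n in transitions.items()
                          if a == "homework"})
print(after_homework.most_common(1))  # -> [('forum', 2)]
```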
Overall, this is an interesting and useful study, as my highlights hopefully show. I have two minor qualms with it, both concerning missed opportunities. First, the analysis focuses on those registrants who passed the exam and earned a certificate. Although the 10,000 students who managed to do this is a sizable number, it pales in comparison with the 150,000 who registered in the first place. Acknowledging the impossibility of turning them all into certificate earners, it would have been interesting to know why they turned away from the course. Since this study relies on log files only, and the other 140,000 registrants by their very nature account for few log entries, the kind of analysis attempted here cannot shed light on this question. That is, other analyses using a different methodology are needed too.
Second, and as far as I am concerned more importantly, no attempt is made to frame the discussion in the context of a particular learning theory. Mastery learning is often cited (mainly by Coursera, I believe) as the underlying philosophy of these kinds of MOOCs. Do the data have anything to say about its viability as a learning theory for MOOCs? Perhaps the focus on lab and homework suggests as much (although one would have to know their exact nature). However, the apparent importance of the forum discussions in the course suggests that the social construction of knowledge plays an important role too. I am sure analysing the nature of the discourse would reveal what function the discussions have had. Such data are available, but require an entirely different approach than the one taken by the authors.
I should emphasise that these qualms do not detract from the value of this study; it deserves to be widely read, particularly by people who are engaged in learning analytics (who might otherwise miss it, as that term is not used).
Reference
Seaton, D. T., Bergner, Y., Chuang, I., Mitros, P., & Pritchard, D. E. (2014). Who does what in a Massive Open Online Course? Communications of the ACM, 57(4), 58–65. Retrieved from http://cacm.acm.org/magazines/2014/4/173221-who-does-what-in-a-massive-open-online-course/fulltext