Analysis in the Time of Big Questions


Using the Experience API makes previously uncollected, locked up, and distant learning-related data immediate and accessible. Now questions previously futile to ask can be addressed, and even questions previously accessible but unasked are being tackled, as freeing data frees innovation. Many of those questions are straightforward - summarizing, breaking down, or otherwise reorganizing data. “How many people with property X took action Y?” “How many people in time range Q took action Z immediately after action W?” “Who are the ten people who have the highest scores on activity 42?”
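To make the flavor of these analytics questions concrete, here is a minimal sketch in Python. It assumes a flattened, hypothetical record format (`actor`, `verb`, `activity`, `score`); real Experience API statements are nested JSON, and the helper names `actors_who` and `top_scorers` are invented for illustration.

```python
# Hypothetical, flattened records for illustration only; real Experience API
# statements are nested JSON (actor, verb, object, result, timestamp, ...).
STATEMENTS = [
    {"actor": "ana", "verb": "completed", "activity": "activity-42", "score": 91},
    {"actor": "ben", "verb": "completed", "activity": "activity-42", "score": 78},
    {"actor": "ana", "verb": "completed", "activity": "activity-42", "score": 85},
    {"actor": "cy", "verb": "attempted", "activity": "activity-42", "score": None},
    {"actor": "ben", "verb": "completed", "activity": "activity-7", "score": 99},
]


def actors_who(statements, verb, activity):
    """Return the set of actors with at least one matching statement."""
    return {
        s["actor"]
        for s in statements
        if s["verb"] == verb and s["activity"] == activity
    }


def top_scorers(statements, activity, n=10):
    """Return up to n (actor, best_score) pairs, highest score first."""
    best = {}
    for s in statements:
        if s["activity"] != activity or s["score"] is None:
            continue
        # Keep each actor's best score on this activity.
        best[s["actor"]] = max(s["score"], best.get(s["actor"], s["score"]))
    # Sort by score descending, then by name so ties are deterministic.
    return sorted(best.items(), key=lambda pair: (-pair[1], pair[0]))[:n]
```

Calling `top_scorers(STATEMENTS, "activity-42")` answers the "ten highest scores on activity 42" question for this toy data. Everything here is counting, filtering, and sorting, which is what makes such questions easy to specify.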

These questions are important and useful, but they barely skim the possibilities. When people say “analytics,” they usually mean this kind of summarizing and slicing, but the whole space of analysis encompasses far more than analytics.

Analysis answers questions. Analytics questions are interesting, but precisely because they are easy to specify, they skirt the Big questions, the ones we yearn to tackle, even though they often play an important part in a full Big answer. “Who has achieved mastery?” “Which students are in danger of failing without a course correction?” “What steps will help prevent the loss of institutional capacity?” “Is my organization’s job preparedness improved by this training program?”

Trying to answer a Big question completely is a fool’s errand, for a human as much as a computer. But a partial answer, or the answer to a narrower but still Big question, can be a step forward in comprehending a situation.

Returning to the Big questions I listed, Khan Academy took the question of mastery and narrowed it to, roughly: in this subject area, when can we believe with a high probability that a user will get the next question correct? Unlike the more difficult initial question, this one can be tackled using a standard tool from the data analysis toolbox (logistic regression, for the curious; see the resources at the end).
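As a rough illustration of what such a narrowed question looks like in code, here is a minimal Python sketch of a logistic model that turns an answer history into a probability of getting the next question correct. This is not Khan Academy's actual model: the features, the weights in `WEIGHTS`, and the `MASTERY_THRESHOLD` cutoff are all made up for the example, and a real system would fit the weights to data.

```python
import math

# Made-up weights for illustration; a real model would fit these to data.
WEIGHTS = {"bias": -2.0, "accuracy": 2.5, "recent": 2.0, "log_attempts": 0.3}
MASTERY_THRESHOLD = 0.9  # hypothetical cutoff for "high probability"


def features(history):
    """Summarize a list of booleans (oldest answer first) as model inputs."""
    n = len(history)
    accuracy = sum(history) / n if n else 0.0
    # Exponentially weighted average, so recent answers count for more.
    recent = 0.0
    for correct in history:
        recent = 0.7 * recent + 0.3 * (1.0 if correct else 0.0)
    return {
        "bias": 1.0,
        "accuracy": accuracy,
        "recent": recent,
        "log_attempts": math.log1p(n),
    }


def p_next_correct(history):
    """Estimate the probability that the next answer is correct."""
    f = features(history)
    z = sum(WEIGHTS[name] * value for name, value in f.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic (sigmoid) function


def is_mastered(history):
    """Apply the hypothetical threshold to the estimated probability."""
    return p_next_correct(history) >= MASTERY_THRESHOLD
```

With these toy weights, a learner with ten correct answers in a row crosses the threshold, while `[False, True, True]` scores higher than `[True, True, False]` even though both contain two correct answers, because the recency feature weights later answers more heavily.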

Another problem with tackling Big questions is that the data analysis approaches involved can be complicated and subtle. Fully understanding them might require a serious background in statistics and other disciplines. Luckily, as with most applications of data analysis, it is not necessary to fully understand an approach to use it, and people with a solid understanding can often create systems that automatically distill key insights for consumption by others.

Again I return to the mastery question and Khan Academy. Most of their students never see anything other than a gauge of mastery as output of the model. That gauge does not move in a way they can fully predict: getting questions right will increase it, but the amount of increase will vary depending on right and wrong answers in the past, and getting questions wrong will decrease it, but by an even less predictable amount. Despite that, switching from a streak-based gauge that was much easier to understand led to huge improvements: fewer questions answered for the same performance, mastery reached in more areas, and so forth. So, while the math involved in the analysis approaches employed to tackle Big questions is more complicated and less intuitive, the result itself can match intuition better than more easily understood analytics-based approaches.

This makes sense. Our brains weigh, compare, combine, and otherwise employ fantastically complicated sets of present, past, and possible future factors to arrive at judgments about Big questions. An appropriate data analysis approach applied to a well-chosen, sufficiently narrow question can achieve similar results in a tiny fraction of the time a human would take to assess the same situation.

So, I hope I’ve made the case for data analysis approaches that reach beyond analytics. You can already see these approaches at work when Khan Academy calculates mastery, and behind Purdue University’s red light / green light Signals, and where WaxLRS detects bad questions, and I believe that the arrival of the Experience API means a burgeoning spring for more.

What are some Big questions you think you might be able to start answering with Experience API data?

Resources

Khan Academy Mastery Details

Purdue University Signals Overview

Question Analysis in WaxLRS, documentation to eventually be on this page
