From ADL Initiative Team Member... Nikolaus Hruska: What Should I Track in My Learning Experiences to Build Learning Analytics?

January 30, 2012

Nikolaus Hruska

Facebook's new "Graph Search" technology allows queries into the "Social Graph" such as, "Which of my friends like road biking, live in Pittsburgh, and like 'Bianchi' (a bicycle brand with a loyal following)?"

By looking at the questions we need to answer about performance, we can start to determine the data we will NEED to collect in order to make actionable decisions about our learning activities within our organizations.

Where Do I Start?

What is it exactly that your organization DOES – what do you produce (or consume)? What are your metrics or key performance indicators and how will you use these to make decisions going forward? For simplicity here, we'll say your learning organization's purpose is to create and consume all types of digital media resources – books, articles, websites, elearning courses, checklists, mobile and web applications, videos, podcasts, etc. You will need to collect information about the ways in which your users interact with these tools in order to get meaningful learning analytics.

You can use semantic web specifications to determine the type of media being consumed. You can track the creation, sharing, curation and consumption of different media using the Experience API. By gathering the performance data deliberately, you'll be able to apply learning analytics to affect the performance of your users.

Don't Judge a "Book" by its Cover…

The following is a simplified example book listing for the Steve Jobs Biography:

It may look simple, but when testing the page using Google's Rich Snippets Testing Tool, we see that the page author has supplied additional semantic information for the "book" (as defined by the vocabulary the page uses) in the form of "name," "publisher," and "ISBN" properties. In addition, there is an embedded "offer" to buy the book, two 5-star "ratings" with the reviewers' comments, and an "aggregateRating" of "4" from 179 reviewers.
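As a rough sketch, the semantic properties named above could be held in memory like this. The dictionary layout and the `summarize` helper are assumptions made for illustration, and values the article does not state (publisher, ISBN, offer details) are left empty rather than guessed.

```python
# Hypothetical in-memory form of the semantic data described for the listing.
# Values the article does not state are left as None rather than guessed.
book = {
    "type": "Book",
    "name": "Steve Jobs Biography",
    "publisher": None,  # property is present on the page; value not given here
    "isbn": None,       # property is present on the page; value not given here
    "offers": [{"type": "Offer"}],
    "reviews": [
        {"type": "Review", "ratingValue": 5},
        {"type": "Review", "ratingValue": 5},
    ],
    "aggregateRating": {"ratingValue": 4, "reviewCount": 179},
}


def summarize(item):
    """Return a one-line summary built from the aggregate rating."""
    rating = item["aggregateRating"]
    return f'{item["name"]}: {rating["ratingValue"]} stars from {rating["reviewCount"]} reviewers'


print(summarize(book))
```

Storing the listing this way keeps the rating and review counts available for later queries without fetching the page again.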

Read the Book

Now let's imagine that I have a news reader-type application on my tablet for all of my assignments. My professor has just assigned me a book to read via my class's RSS feed. Given just a single URL, my mobile reader loads the semantic information and determines that the URL describes a "book." After I have finished reading the book, my tablet reports the following activity stream statement to my Learning Record Store (LRS): "Nikolaus read [Steve Jobs Biography]." Because the content defines its own semantics, my reader app was able to infer the appropriate verb ("read") to correspond with the specified activityType (a book). If our learning system stores the semantic information for the book as well, we can do some interesting data mining (more on this later).
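A minimal sketch of that inference step might look like the following. The type URLs, the type-to-verb table (including "watched" and the "experienced" fallback), and the function names are all hypothetical; the article does not preserve the real identifiers, and a real LRS would expect a fuller statement format.

```python
# Placeholder identifiers: the real type URLs are not preserved in this article.
BOOK_TYPE = "http://example.org/types/Book"
VIDEO_TYPE = "http://example.org/types/Video"

# Hypothetical mapping from a content type to the verb a reader app reports.
VERB_FOR_TYPE = {
    BOOK_TYPE: "read",
    VIDEO_TYPE: "watched",
}


def build_statement(actor, item):
    """Build an actor-verb-object statement for an item the user finished."""
    verb = VERB_FOR_TYPE.get(item["type"], "experienced")  # generic fallback
    return {
        "actor": actor,
        "verb": verb,
        "object": {"id": item["url"], "type": item["type"], "name": item["name"]},
    }


def to_sentence(statement):
    """Render a statement in the article's "Actor verb [Object]" form."""
    return f'{statement["actor"]} {statement["verb"]} [{statement["object"]["name"]}]'


book = {
    "url": "http://example.org/books/steve-jobs-biography",  # placeholder URL
    "type": BOOK_TYPE,
    "name": "Steve Jobs Biography",
}
statement = build_statement("Nikolaus", book)
print(to_sentence(statement))
```

Keeping the verb lookup in a table means that supporting a new media type is a one-line change rather than new reporting logic.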

The Lifecycle of a Book

My tablet's reader app reported that "Nikolaus read [Steve Jobs Biography]." However, in the entire lifecycle of a book, there are several different activities which could be reported on, each corresponding to the ways in which users interact with the book:

Walter authored [Steve Jobs Biography]

David proofread [Steve Jobs Biography]

Margaret approved [Steve Jobs Biography]

Simon & Schuster published [Steve Jobs Biography]

John purchased [Steve Jobs Biography]

Philip shared [Steve Jobs Biography]

Steven liked [Steve Jobs Biography]

Harry reviewed [Steve Jobs Biography]

Getting Answers with Analytics

In all of these activity stream statements, the [Steve Jobs Biography] is a single object with the same objectType ("Book") and the same URL. The statements differ only in the actor and the specific verb being reported. In order to return the entire stream above, we would query all statements with an objectType of "Book" and the book's URL. When we want to distill our results to users who have actually read the book, we can filter the list on the verb "read."
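The query described here can be sketched over an in-memory list of statements. The `query` helper, the record layout, and the book URL are assumptions for illustration; a real Learning Record Store would expose its own query interface.

```python
BOOK_URL = "http://example.org/books/steve-jobs-biography"  # placeholder URL

# (actor, verb) pairs from the lifecycle list, plus the reader's own statement.
LIFECYCLE = [
    ("Walter", "authored"),
    ("David", "proofread"),
    ("Margaret", "approved"),
    ("Simon & Schuster", "published"),
    ("John", "purchased"),
    ("Philip", "shared"),
    ("Steven", "liked"),
    ("Harry", "reviewed"),
    ("Nikolaus", "read"),
]

statements = [
    {"actor": actor, "verb": verb, "object": {"type": "Book", "id": BOOK_URL}}
    for actor, verb in LIFECYCLE
]


def query(records, actor=None, verb=None, object_type=None, object_id=None):
    """Return the records matching every filter that is not None."""
    def matches(record):
        return (
            (actor is None or record["actor"] == actor)
            and (verb is None or record["verb"] == verb)
            and (object_type is None or record["object"]["type"] == object_type)
            and (object_id is None or record["object"]["id"] == object_id)
        )
    return [record for record in records if matches(record)]


# Entire stream for the book, then only the users who actually read it.
stream = query(statements, object_type="Book", object_id=BOOK_URL)
readers = [record["actor"] for record in query(stream, verb="read")]
print(len(stream), readers)
```

Because every filter is optional, the same helper covers the broader questions that follow, such as dropping the URL filter to list every tracked book.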

Which books are my coworkers reading? We could query on the objectType "Book" only (without filtering by URL), and return all of the book titles which are being tracked in our system(s).

Which books has Steven read? We could filter the list by both actor and verb to get the list of books that a single user has read.

Which books has Jane written? We could filter on the actor and the verb "authored." Dropping the actor filter and adding a date range would find all of the books anyone in my organization has written in the past year.

Which books were shared the most? Which books are the highest rated? We could filter on the verbs "liked," "shared," and/or "rated" to find the most popular books among our users.

Which books should I be reading? If we stored the semantic data about each book, we could start to filter or sort books based on the "aggregateRating" and the "reviewCount" properties of the book. Additionally, we could offer book recommendations based on the ratings and reviews you give when compared with the behavior of similar users in the system, or we could fetch the book's description page again to display the latest ratings and reviews.
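The questions above reduce to a few filters, counts, and sorts. The sketch below uses invented sample data and hypothetical helper names, and it flattens "aggregateRating" to a single number for brevity; it is meant only to show the shape of each query.

```python
from collections import Counter

# Invented sample data for illustration only: (actor, verb, book title).
statements = [
    ("Steven", "read", "Book A"),
    ("Steven", "read", "Book B"),
    ("Jane", "authored", "Book C"),
    ("Philip", "shared", "Book A"),
    ("Harry", "liked", "Book A"),
    ("John", "shared", "Book B"),
]

# Invented semantic data stored alongside each book.
catalog = {
    "Book A": {"aggregateRating": 4.0, "reviewCount": 179},
    "Book B": {"aggregateRating": 4.5, "reviewCount": 12},
    "Book C": {"aggregateRating": 4.5, "reviewCount": 40},
}


def books_for(actor, verb):
    """Titles for which the given actor performed the given verb."""
    return [title for a, v, title in statements if a == actor and v == verb]


def most_popular(verbs=("liked", "shared", "rated")):
    """Titles ordered by how many statements use one of the given verbs."""
    counts = Counter(title for _, v, title in statements if v in verbs)
    return [title for title, _ in counts.most_common()]


def top_rated():
    """Titles sorted by aggregateRating, then reviewCount, highest first."""
    return sorted(
        catalog,
        key=lambda t: (catalog[t]["aggregateRating"], catalog[t]["reviewCount"]),
        reverse=True,
    )


print(books_for("Steven", "read"))
print(most_popular())
print(top_rated())
```

Sorting on the stored rating data works only if the semantic information was saved with each statement or can be fetched again from the book's page.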

We can use these same concepts to answer a wide range of questions within our learning organization: Who authors the most articles in my department? Who authored the highest rated video in my biology course? What other videos do they have? Who reads the most books on lizards? Which book is the most common amongst my peers from college? Which blogs do the most talented software developers read? Which industry conferences do my peers attend? Which videos from the Khan Academy are rated highest by my peers? Who wrote the article with the most comments? Who has the highest rated article in their blog? What other articles has that person authored?

Online reputation systems are currently doing a similar type of data mining on activities performed on social media websites to find "experts" in specific topic areas. For example, by looking at who tweets the most content about #iPhone in your Twitter stream, these systems attempt to rank individuals based on the engagement they produce from each tweet – in the form of retweets, favorites, and mentions.

How Does This Fit into the Total Learning Architecture (TLA)?

Each of these search questions is essentially a query, and each returns a list of "Things" ("Thing" being the base class for every type). All "Things" have a URL property and could easily be published as RSS feeds for consumption into my peers' reader apps. My Personal Assistant for Learning (PAL) will need to curate streams like these to push me critical, just-in-time information at the exact moment of need.

My PAL will need to constantly craft different queries and mine streams for new information in real time. Something as simple as RSS could be the input and output, with the framework being supported by the other web standards and community-defined vocabularies. The RSS publication and subscription infrastructure could be utilized to assign (and even sequence) activities to users. In fact, collaborative reading apps like Flipboard, Zite, and Google Currents are essentially RSS readers at their core. However, they become ultra powerful content consumption and curation tools once connected to your personal and professional learning networks.
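To illustrate RSS as the output of a query, the sketch below serializes a list of results as a minimal RSS 2.0 feed using Python's standard library. The `to_rss` function, the feed metadata, and the URLs are placeholders, not part of any PAL or TLA specification.

```python
import xml.etree.ElementTree as ET


def to_rss(title, link, description, things):
    """Serialize a list of {"name", "url"} dicts as a minimal RSS 2.0 feed."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for thing in things:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = thing["name"]
        ET.SubElement(item, "link").text = thing["url"]
    return ET.tostring(rss, encoding="unicode")


# Placeholder query result and URLs, for illustration only.
results = [
    {"name": "Steve Jobs Biography", "url": "http://example.org/books/steve-jobs-biography"},
]
feed = to_rss(
    "Books my coworkers are reading",
    "http://example.org/feeds/coworker-books",
    "Result of a saved query",
    results,
)
print(feed)
```

Any reader app that understands RSS could subscribe to such a feed, which is what makes it a plausible way to assign or sequence activities.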

Exposing the necessary semantic information about your learning activities can enable a wide range of use cases for search and discovery. Technologies such as Open Graph make your pages ready for sharing across your networks, while specifications like Microdata help you agree upon a common semantic vocabulary.

This example is in the domain of consuming digital media. When you move into new domains, you'll need to adjust the vocabulary to fit the activities and objects. You can use the same concepts presented here to determine the data you will need (and will use) for decisions moving ahead, so that you can start to ask the questions that will positively affect the outcomes for your learners.