DATASIM Produces Billions of Simulated xAPI Statements for System Testing

July 01, 2021

Contemporary digital learning technologies generate, store, and share terabytes of learner data, which must flow seamlessly and securely across systems. To enable interoperability and ensure systems can perform at scale, the ADL Initiative is developing the Data and Training Analytics Simulated Input Modeler (DATASIM), a tool for producing simulated learner data that can mimic millions of diverse user interactions.

Figure: DATASIM application screen capture, showing credentials, profiles, alignments, run options, and parameters.

DATASIM is an open-source platform for generating realistic Experience Application Programming Interface (xAPI) data at very large scale. The xAPI statements model realistic behaviors for a cohort of simulated learners, producing tailorable streams of data that can be used to benchmark and stress-test systems. DATASIM requires no specialized hardware, and it includes a user-friendly graphical interface that allows precise control over the simulation parameters and learner attributes.
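For readers unfamiliar with xAPI, the following is a minimal sketch of what a stream of simulated statements for a small cohort might look like. It is not DATASIM's implementation or API: the function name `simulate_statements`, the example.com learner addresses and activity IDs, and the choice of two verbs are all assumptions made for illustration. Only the overall statement shape (actor, verb, object, timestamp) follows the xAPI specification.

```python
import random
import uuid
from datetime import datetime, timedelta, timezone

# Verb IRIs from the ADL xAPI vocabulary. Which verbs a given xAPI profile
# would actually use is an assumption made for this illustration.
VERBS = {
    "launched": "http://adlnet.gov/expapi/verbs/launched",
    "completed": "http://adlnet.gov/expapi/verbs/completed",
}


def simulate_statements(num_learners, per_learner, seed=0):
    """Yield xAPI-shaped statements for a hypothetical cohort of learners.

    A seeded random generator keeps the output reproducible, so the same
    seed always produces the same stream.
    """
    rng = random.Random(seed)
    start = datetime(2021, 7, 1, tzinfo=timezone.utc)
    for learner in range(num_learners):
        clock = start
        for _ in range(per_learner):
            # Advance this learner's clock by a random gap of up to 10 minutes.
            clock += timedelta(seconds=rng.randint(1, 600))
            verb = rng.choice(sorted(VERBS))
            yield {
                "id": str(uuid.UUID(int=rng.getrandbits(128), version=4)),
                "actor": {
                    "objectType": "Agent",
                    "name": f"Learner {learner}",
                    # Hypothetical address; no real learner data is involved.
                    "mbox": f"mailto:learner{learner}@example.com",
                },
                "verb": {"id": VERBS[verb], "display": {"en-US": verb}},
                "object": {
                    "objectType": "Activity",
                    # Hypothetical activity IRI.
                    "id": f"https://example.com/activities/module-{rng.randint(1, 5)}",
                },
                "timestamp": clock.isoformat(),
            }


if __name__ == "__main__":
    for statement in simulate_statements(num_learners=2, per_learner=2):
        print(statement["actor"]["name"], statement["verb"]["display"]["en-US"],
              statement["object"]["id"], statement["timestamp"])
```

Because every value is synthetic, a stream like this can be sent to a system under test without exposing any real learner's information. A production-scale simulator would also need profile-driven statement patterns and far higher throughput than this sketch provides.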

“DATASIM has proven its capability to match or exceed the real-world production of diverse types of xAPI data,” said Shelly Blake-Plock of Yet Analytics, an ADL Initiative vendor supporting DATASIM’s development. “In recent tests we were able to produce over three billion xAPI statements, with a top speed of 107,000 per second. In a test for the ADL Initiative’s Master Object Model xAPI profile, DATASIM generated one billion statements in 3.5 hours.”

Until now, the limited datasets available for testing have posed a barrier to large-scale system implementation. Additionally, available testing datasets are typically not consistent with the emerging data specifications required by the Total Learning Architecture (TLA). DATASIM fills these gaps. It can help stakeholders work with high-volume, high-velocity xAPI data and design xAPI profiles, all without the risk of exposing real learners’ personally identifiable information (PII).

In collaboration with Yet Analytics, the ADL Initiative has matured the DATASIM prototype to Technology Readiness Level 5 (TRL-5), and upcoming demonstrations are planned in support of the US Army’s Synthetic Training Environment Enhanced Learning for Readiness (STEEL-R) project. The Army will use DATASIM to test the STEEL-R data architecture and to develop and validate new xAPI profiles.

The ADL Initiative will host a webinar on 21 July 2021 to provide a project update and tutorial for configuring DATASIM. The information from this webinar will also help organizations that are building, testing, and evaluating new TLA-compliant education/training applications. Information on how to participate in the webinar can be found at the Designing and Testing xAPI Profiles Using the xAPI Profile Server and DATASIM webinar web page. Visit the DATASIM project page for more information on the DATASIM project, including links to the open-source software on GitHub.
