StatPlayground

Project details

During my early years as an HCI researcher, I observed that choosing an appropriate statistical procedure is a task many fellow researchers and students struggle with. I wanted to find out which resources they use to help make this decision.

 

Formative study

Data collection

We interviewed twelve students (undergraduate, graduate, and doctoral) in our HCI lab. Additionally, since three of these students had ongoing analyses during our study, we observed them as they prepared to perform statistical analysis. Interviews lasted 45 minutes on average, and observations lasted one hour on average.

Analysis 

We gathered insights from our study sessions and organized them into an affinity diagram.

Findings

The two main findings are as follows:

  • Formal education is not adequate, and does not prepare students for real-world analysis.
  • Current resources, such as books, lecture notes, websites, and tutorials, do not help students acquire statistical know-how (i.e., the practical knowledge that bridges theory and tool usage).

 

Solution: StatPlayground

As a potential solution, we developed StatPlayground. StatPlayground is a web application that allows users to “play” with hypothetical data in order to learn statistical concepts. Users can apply direct manipulation techniques, such as clicking and dragging a data point, to visualizations in order to change the underlying data characteristics. A change in one data characteristic affects other data characteristics, as well as the resulting statistical procedure. StatPlayground determines these changes and visualizes them in real time. This way, users can see the consequences of their actions and connect changes in data characteristics to changes in the statistics.
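The ripple effect described above can be illustrated with a minimal sketch. It is not StatPlayground's implementation: the function names `summarize` and `drag_point`, the two-group data, and the use of a variance ratio as a rough homoscedasticity indicator are all assumptions made for illustration.

```python
from statistics import mean, median, variance


def summarize(groups):
    """Recompute per-group statistics and the largest-to-smallest variance ratio.

    The variance ratio is used here as a simple (assumed) indicator of
    homoscedasticity: a ratio near 1 means the groups have similar spread.
    """
    per_group = {
        name: {
            "mean": mean(values),
            "median": median(values),
            "variance": variance(values),  # sample variance
        }
        for name, values in groups.items()
    }
    variances = [stats["variance"] for stats in per_group.values()]
    return per_group, max(variances) / min(variances)


def drag_point(groups, group, index, new_value):
    """Return a copy of the data with one point moved, as a drag would do."""
    updated = {name: list(values) for name, values in groups.items()}
    updated[group][index] = new_value
    return updated


# Hypothetical data: two groups with identical spread.
groups = {"A": [10.0, 12.0, 14.0], "B": [11.0, 13.0, 15.0]}
before_stats, before_ratio = summarize(groups)

# "Drag" the last point of group A from 14 to 26 and recompute everything.
after = drag_point(groups, "A", 2, 26.0)
after_stats, after_ratio = summarize(after)
```

In this sketch, moving a single point raises the mean of group A from 12 to 16 and leaves its median at 12, while the variance ratio between the groups jumps from 1 to 19. One edit therefore changes several characteristics at once, which is the kind of dependency the tool is meant to make visible.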

Given below is a screenshot of StatPlayground. Users can create various distributions (a). The data is visualized as box plots. The characteristics of the dataset, such as the histogram of the distribution (d), the experimental design (c), and homoscedasticity (b), are also visualized. StatPlayground selects the appropriate statistical test, performs it, and visualizes the results in the bottom pane (e).
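How a test could be selected from these characteristics can be sketched as a small decision rule. The rule below is a common textbook heuristic for comparing two groups, not the logic StatPlayground actually uses, which this page does not describe. The function name `choose_test` and its parameters are hypothetical.

```python
def choose_test(design, normal, equal_variances):
    """Pick a two-group comparison test from three data characteristics.

    design          -- "between" (independent groups) or "within" (repeated measures)
    normal          -- True if the data can be treated as normally distributed
    equal_variances -- True if the groups are homoscedastic

    This is an assumed textbook heuristic, not StatPlayground's own rule.
    """
    if design not in ("between", "within"):
        raise ValueError("design must be 'between' or 'within'")

    if design == "within":
        # Paired tests work on per-participant differences, so the
        # equal-variance flag is ignored in this sketch.
        return "paired t-test" if normal else "Wilcoxon signed-rank test"

    if not normal:
        return "Mann-Whitney U test"

    return "Student's t-test" if equal_variances else "Welch's t-test"


# Example: independent groups, normal data, unequal variances.
selected = choose_test("between", normal=True, equal_variances=False)
```

Under this heuristic, flipping a single characteristic (for example, homoscedasticity) is enough to change the selected test, which mirrors the dependency between the characteristic panes and the result pane.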

In designing StatPlayground, we applied several key design principles and interaction mechanisms, such as progressive disclosure, fine-grained control of data characteristics, and feedforward (shown below). The feedforward feature, for example, allows users to preview the resulting change during the interaction.

 

Validation with users

We evaluated StatPlayground holistically to see whether it would help our target users learn statistics. We recruited 13 participants to take part in our study. All participants had taken an introductory statistics course and/or performed an analysis before.

We gave our participants a hypothetical dataset and some questions that they needed to answer by using StatPlayground. These questions were story-driven, e.g., “assume that of all participants in a text-entry experiment, one participant has practiced touch-typing; how does she (the outlier) affect your analysis?” We used this question-driven approach to motivate our participants.
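The touch-typist question can also be worked through numerically. The sketch below uses invented typing speeds (in words per minute) and a hypothetical helper, `outlier_shift`, to show how one extreme participant moves the mean far more than the median. It is only an illustration of the statistical point, not part of the study materials.

```python
from statistics import mean, median


def outlier_shift(values, outlier):
    """Return how much the mean and the median change when one value is added."""
    with_outlier = values + [outlier]
    mean_shift = mean(with_outlier) - mean(values)
    median_shift = median(with_outlier) - median(values)
    return mean_shift, median_shift


# Invented typing speeds (words per minute) for five regular participants.
speeds = [30, 32, 34, 36, 38]

# One participant who has practiced touch-typing (the outlier).
touch_typist = 94

mean_shift, median_shift = outlier_shift(speeds, touch_typist)
```

With these numbers, the mean rises by 10 words per minute (from 34 to 44), while the median rises by only 1 (from 34 to 35). This is why a single outlier can change the conclusion of an analysis based on means.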

We found several instances where StatPlayground helped our participants learn statistical know-how, e.g., how the measures of central tendency (mean, median) and spread (variance) relate to the shape of the distribution. StatPlayground was well received by our participants, who considered the software fun and easy to use. However, some users had difficulties using certain features of StatPlayground.