'Dynamic Difficulty Adjustment of Graph Puzzles' (AsPredicted #114800)
Author(s): Anan Schütt (University of Augsburg) - anan.schuett@uni-a.de; Tobias Huber (University of Augsburg) - tobias.huber@thi.de
Pre-registered on 2022/12/01 - 01:25 AM (PT)
1) Have any data been collected for this study already? It's complicated. We have already collected some data but explain in Question 8 why readers may consider this a valid pre-registration nevertheless.
2) What's the main question being asked or hypothesis being tested in this study? Intelligent tutoring systems usually come with a pool of exercises that students can go through. It is not obvious how to sequence these exercises to optimize learning. We implement three methods of exercise selection based on their difficulty level to measure the effect these methods have on learning gain and the affective state of the student.
We compare the conditions constant-increase, self-determined, and a Dynamic Difficulty Adjustment (DDA) algorithm we developed (see Question 4).
In a previous study we found no significant differences between the constant-increase and self-determined conditions.
In this study, we expect higher learning gain and flow for the DDA algorithm.
3) Describe the key dependent variable(s) specifying how they will be measured. We will perform this study using graph puzzles as our exercises. Participants need to select as many vertices (points) in a graph (network) as possible such that no two selected vertices are connected by an edge (line). This is commonly known as the maximum independent set problem. We generate approximately 200 such puzzles and calculate their difficulty ratings using a solver algorithm (a minimal sketch of the task follows below).
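For concreteness, here is a minimal Python sketch of the puzzle task; this is an illustration, not our actual generator or solver, and the edge-list representation is an assumption:

```python
from itertools import combinations

def is_independent(vertices, edges):
    """True if no two selected vertices are connected by an edge."""
    chosen = set(vertices)
    return not any(u in chosen and v in chosen for u, v in edges)

def max_independent_set(n_vertices, edges):
    """Brute-force the maximum independent set of a small graph.

    The problem is NP-hard in general, but puzzle-sized graphs are
    small enough for exhaustive search.
    """
    for size in range(n_vertices, 0, -1):
        for subset in combinations(range(n_vertices), size):
            if is_independent(subset, edges):
                return set(subset)
    return set()

# Example: on the 4-cycle 0-1-2-3-0, the best a player can do is 2 vertices.
print(max_independent_set(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # {0, 2}
```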
Participants will solve graph puzzles in three stages: pre-test, training, and post-test. The pre- and post-tests are the same for all participants (only the puzzle order is randomized), but the training phase differs between conditions. Participants are monetarily incentivised to submit the pre- and post-test exercises correctly and quickly. After the post-test stage, a questionnaire will ask about their experience during the puzzles.
To see how well our Dynamic Difficulty Adjustment method adapts to the participants we will measure two things:
Whether there is a significant difference between the desired success rate (60%) of our DDA algorithm and the actual success rate of participants in this condition during training.
Whether there is a correlation between the ability level that our DDA algorithm assigns to each participant after training and their performance in the post-test.
To see how our DDA algorithm compares to the other conditions, we use:
Normalized learning gain (NLG): calculated from pre- and post-test scores. Each puzzle in the tests counts as 1 point. NLG is calculated for each student and then averaged per condition (see the sketch after this list).
Flow: measured with the questionnaire from Engeser & Rheinberg, "Flow, performance and moderators of challenge-skill balance".
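The exact NLG formula is not spelled out above; a minimal sketch, assuming the common normalized-gain formulation (post - pre) / (max - pre):

```python
def normalized_learning_gain(pre, post, max_score):
    """Per-student NLG: the fraction of possible improvement realized."""
    if pre == max_score:
        return 0.0  # no room for improvement; the raw formula is undefined here
    return (post - pre) / (max_score - pre)

# One test puzzle = 1 point; e.g. 10 puzzles, 4 correct pre, 7 correct post.
print(normalized_learning_gain(4, 7, 10))  # 0.5

# Condition-level NLG: average over the students in that condition.
pre_post = [(4, 7), (2, 8), (6, 6)]
nlgs = [normalized_learning_gain(pre, post, 10) for pre, post in pre_post]
print(sum(nlgs) / len(nlgs))
```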
4) How many and which conditions will participants be assigned to? Three conditions: constant-increase in difficulty, self-determined difficulty and Dynamic Difficulty Adjustment.
constant-increase: The first puzzle of the training phase will be easy; difficulty then gradually increases up to the most difficult puzzles.
self-determined: The first puzzle of the training phase has mid-range difficulty. After each puzzle, the participant is asked which difficulty level they want next.
Dynamic Difficulty Adjustment: Each puzzle is chosen by an algorithm we designed with the goal of selecting puzzles for which each participant has a 60% success rate (an illustrative sketch follows below).
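The pre-registration does not specify the algorithm's internals. Purely to illustrate the 60% target, a hypothetical Elo-style selector might look like the following; the logistic model, scale, and update rule are illustrative assumptions, not our actual method:

```python
import math

def p_success(ability, difficulty):
    """Logistic (Elo-style) model of a participant's success probability."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))

def pick_next_puzzle(ability, difficulty_pool, target=0.6):
    """Pick the puzzle whose predicted success probability is nearest 60%."""
    return min(difficulty_pool, key=lambda d: abs(p_success(ability, d) - target))

def update_ability(ability, difficulty, solved, k=0.4):
    """Shift the ability estimate toward the observed outcome."""
    return ability + k * ((1.0 if solved else 0.0) - p_success(ability, difficulty))

# Example: a mid-ability participant gets a slightly easier-than-even puzzle,
# which the model predicts they solve about 60% of the time.
print(pick_next_puzzle(ability=1.0, difficulty_pool=[0.2, 0.6, 1.0, 1.8]))  # 0.6
```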
5) Specify exactly which analyses you will conduct to examine the main question/hypothesis. To test whether participants in the DDA condition achieved the targeted success rate, we use a one-sample t-test.
To analyze the correlation between the user ability level estimated by our DDA algorithm and the post-test score, we use Spearman's rank correlation.
For Flow and NLG, we will first perform Shapiro-Wilk tests for normality. If a test fails (i.e., the data are not normally distributed), we will use one-tailed Mann-Whitney U tests; otherwise, one-tailed independent Student's t-tests will be used (see the sketch below).
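A minimal sketch of this analysis plan in SciPy, with placeholder data standing in for the real measurements:

```python
from scipy import stats

# Placeholder data; the real values come from the experiment.
dda_success_rates = [0.55, 0.62, 0.58, 0.64, 0.61, 0.57]
estimated_ability = [1.2, 0.4, 2.1, 1.7, 0.9, 1.4]
post_test_scores = [7, 4, 9, 8, 5, 7]
nlg_dda = [0.5, 0.4, 0.6, 0.3, 0.45, 0.55]
nlg_other = [0.3, 0.2, 0.5, 0.4, 0.35, 0.25]

# One-sample t-test: did DDA participants hit the 60% success target?
t_stat, p_val = stats.ttest_1samp(dda_success_rates, popmean=0.6)

# Spearman rank correlation: estimated ability vs. post-test performance.
rho, p_val = stats.spearmanr(estimated_ability, post_test_scores)

# NLG (and analogously Flow): Shapiro-Wilk per group, then the matching test.
_, p_a = stats.shapiro(nlg_dda)
_, p_b = stats.shapiro(nlg_other)
if min(p_a, p_b) < 0.05:  # not normally distributed
    stat, p_val = stats.mannwhitneyu(nlg_dda, nlg_other, alternative="greater")
else:
    stat, p_val = stats.ttest_ind(nlg_dda, nlg_other, alternative="greater")
```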
6) Describe exactly how outliers will be defined and handled, and your precise rule(s) for excluding observations. We remove participants if they (a filtering sketch follows this list):
- failed the attention test
- did not solve a single exercise during the training phase
- had a contiguous gap of inactivity of more than 3 minutes during the puzzle phases
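Purely as a sketch of how these rules could be applied, assuming hypothetical per-participant records with the fields shown:

```python
MAX_GAP_SECONDS = 3 * 60  # contiguous inactivity threshold

def keep(participant):
    """Apply the three exclusion rules to one hypothetical record."""
    if not participant["passed_attention_check"]:
        return False
    if participant["n_training_solved"] == 0:
        return False
    # Interaction timestamps (seconds); flag any gap longer than 3 minutes.
    ts = sorted(participant["event_timestamps"])
    if any(b - a > MAX_GAP_SECONDS for a, b in zip(ts, ts[1:])):
        return False
    return True

# Example record that passes all three checks.
record = {"passed_attention_check": True, "n_training_solved": 5,
          "event_timestamps": [0, 40, 95, 160, 300]}
print(keep(record))  # True
```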
7) How many observations will be collected or what will determine sample size? No need to justify decision, but be precise about exactly how the number will be determined. Data will be collected from 30 participants in each condition; outliers will be excluded afterwards.
8) Anything else you would like to pre-register? (e.g., secondary analyses, variables collected for exploratory purposes, unusual analyses planned?) We will ask participants about their field of study, educational level, their experience/skill with strategy games, puzzles, and graph theory, their motivation in the task, and their need for cognition. We are interested in exploratory correlations of these values with initial performance and learning.
We will also exploratively examine the time and the number of extra clicks required to solve the puzzles. These values will be viewed separately for easy, medium, and hard puzzles, because we believe the measured values from different difficulty levels are not directly comparable.
In addition, we will exploratively compare learning gain across different pre-test scores, and also compare post-test success rate with performance during training. Furthermore, we are exploratively interested in how success rate and difficulty change over the course of the training phase.
We already collected the data for the self-determined and the constant-increase conditions in a previous experiment. We use the data from that experiment to train our DDA algorithm. We also already ran 5 participants with our DDA algorithm to verify that its behavior is reasonable.