#89,392 | AsPredicted

'Online text-picture validation study with longer text-segments'
(AsPredicted #89,392)


Author(s)
Anne Schüler (Leibniz-Institut für Wissensmedien Tübingen) - a.schueler@iwm-tuebingen.de
Pauline Frick (Leibniz-Institut für Wissensmedien) - pauline.frick@uni-tuebingen.de
Pre-registered on
2022/02/28 01:29 (PT)

1) Have any data been collected for this study already?
No, no data have been collected for this study yet.

2) What's the main question being asked or hypothesis being tested in this study?
This study is closely related to the study "Online text-picture validation study" (#73263). We will again investigate whether text-picture combinations are validated automatically, this time using longer text segments than in the previous study.
Participants see either valid text-picture combinations (a short story consisting of four sentences ending with a sentence matching the presented picture) or invalid text-picture combinations (a short story consisting of four sentences ending with a sentence not matching the presented picture). After that, participants have to react to the probe word "wrong" by pressing the key D or to the probe word "right" by pressing the key K. Importantly, participants are instructed to react only to the probe word, regardless of the previous text-picture combination.
We hypothesize that reaction times are shorter after congruent stimuli (text-picture combination is valid, probe word is right or text-picture combination is invalid, probe word is wrong) than after incongruent stimuli (text-picture combination is invalid, probe word is right or text-picture combination is valid, probe word is wrong). This should result in a significant interaction between validity and probe word, meaning that for valid sentence-picture combinations, faster reactions to the probe word "right" than to the probe word "wrong" are expected, whereas for invalid sentence-picture combinations, faster reactions to the probe word "wrong" than to the probe word "right" are expected.
Moreover, we hypothesize that the error rate after congruent stimuli (text-picture combination is valid, probe word is right or text-picture combination is invalid, probe word is wrong) is lower than after incongruent stimuli (text-picture combination is invalid, probe word is right or text-picture combination is valid, probe word is wrong). This should result in a significant interaction between validity and probe word, meaning that for valid sentence-picture combinations, fewer errors in response to the probe word "right" than to the probe word "wrong" are expected, whereas for invalid sentence-picture combinations, fewer errors in response to the probe word "wrong" than to the probe word "right" are expected.
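The congruency definition used in both hypotheses can be written as a small rule. The following is only an illustrative sketch; the function name and the string labels are assumptions made for this example and are not part of the study materials.

```python
def is_congruent(validity: str, probe_word: str) -> bool:
    """Return True when the probe word matches the validity of the preceding
    text-picture combination (valid + right, or invalid + wrong)."""
    if validity not in ("valid", "invalid"):
        raise ValueError(f"unknown validity: {validity!r}")
    if probe_word not in ("right", "wrong"):
        raise ValueError(f"unknown probe word: {probe_word!r}")
    return (validity == "valid") == (probe_word == "right")


# List the four cells of the 2 x 2 design with their congruency label.
for validity in ("valid", "invalid"):
    for probe_word in ("right", "wrong"):
        label = "congruent" if is_congruent(validity, probe_word) else "incongruent"
        print(f"{validity:8s} + {probe_word:5s} -> {label}")
```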

3) Describe the key dependent variable(s) specifying how they will be measured.
Reaction time: Time, measured in milliseconds, until a participant presses the key D or K after seeing the probe word.
Error rate: How many times a participant presses the key D after seeing the probe word "right" or presses the key K after seeing the probe word "wrong".
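As a minimal sketch of how a response could be scored under the key mapping described in section 2 (D for "wrong", K for "right"): the function names and the trial representation below are hypothetical, and expressing the error count as a proportion of trials is an assumption of this example.

```python
# Key mapping taken from the task description: D = "wrong", K = "right".
CORRECT_KEY = {"wrong": "D", "right": "K"}


def is_error(probe_word: str, key_pressed: str) -> bool:
    """A response is an error when the pressed key does not match the probe word."""
    return key_pressed.upper() != CORRECT_KEY[probe_word]


def error_rate(trials: list[tuple[str, str]]) -> float:
    """Proportion of erroneous responses in a list of (probe_word, key_pressed) pairs."""
    if not trials:
        raise ValueError("no trials given")
    errors = sum(is_error(probe, key) for probe, key in trials)
    return errors / len(trials)


# Hypothetical responses: the second and fourth trial are errors.
example_trials = [("right", "K"), ("right", "D"), ("wrong", "D"), ("wrong", "K")]
print(error_rate(example_trials))
```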

4) How many and which conditions will participants be assigned to?
There are four within-subjects conditions resulting from crossing the factor text-picture validity (valid vs. invalid) and the factor probe word (right vs. wrong).
We will use four counterbalanced lists of items: Participants are randomly assigned to one of those 4 lists. Every list contains 60 text-picture combinations followed by the probe word (right or wrong). The number of valid and invalid text-picture combinations, the number of the probe words, and the number of congruent and incongruent trials are the same in all lists. Moreover, after 16 randomly selected trials participants have to answer either a yes-no question regarding the story's content (text attention check) or decide if they have just seen this picture (picture attention check). The list will be presented in two blocks of 30 trials each.
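One way such lists could be constructed is a Latin-square rotation of the four conditions over the items. This is a sketch under assumptions, not the procedure actually used: the rotation scheme, the equal split of 15 items per condition, the random choice between text and picture checks, and all names below are hypothetical.

```python
import random

CONDITIONS = [("valid", "right"), ("valid", "wrong"),
              ("invalid", "right"), ("invalid", "wrong")]
N_ITEMS = 60
N_LISTS = 4
BLOCK_SIZE = 30
N_ATTENTION_CHECKS = 16


def build_list(list_index: int, seed=None) -> list[list[dict]]:
    """Build one counterbalanced list and return it as two blocks of 30 trials."""
    rng = random.Random(seed)
    trials = []
    for item in range(N_ITEMS):
        # Rotating the condition by list index gives every item each
        # condition exactly once across the four lists (assumed scheme).
        validity, probe_word = CONDITIONS[(item + list_index) % N_LISTS]
        trials.append({"item": item, "validity": validity,
                       "probe_word": probe_word, "attention_check": None})
    rng.shuffle(trials)
    # Mark 16 randomly selected trials as being followed by an attention check.
    for position in rng.sample(range(N_ITEMS), N_ATTENTION_CHECKS):
        trials[position]["attention_check"] = rng.choice(["text", "picture"])
    return [trials[:BLOCK_SIZE], trials[BLOCK_SIZE:]]


# Random assignment of a participant to one of the four lists.
assigned_list = random.randrange(N_LISTS)
blocks = build_list(assigned_list)
print(len(blocks), [len(block) for block in blocks])
```

Note that this sketch balances conditions across the whole list but does not enforce balance within each block; a real implementation might shuffle within pre-balanced blocks instead.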

5) Specify exactly which analyses you will conduct to examine the main question/hypothesis.
Reaction time: A linear mixed effects model with item and participant as random effects and probe word and validity (including their interaction) as fixed effects will be fitted. If the interaction of validity * probe word is significant, pairwise single comparisons will be conducted. We will compare reaction times for probe words (wrong vs. right) for invalid as well as valid stimuli.
Error rate: A generalized linear mixed effects model with item and participant as random effects and probe word and validity (including their interaction) as fixed effects will be fitted. If the interaction of validity * probe word is significant, pairwise single comparisons will be conducted. We will compare error rates for probe words (wrong vs. right) for invalid as well as valid stimuli.
If the interactions are not significant, we will run additional analyses to test whether the single comparisons reach significance.

6) Describe exactly how outliers will be defined and handled, and your precise rule(s) for excluding observations.
Exclusion of participants: The data of participants who do not agree to have their data processed will be deleted. Participants have to speak German fluently and be right-handed. Participants who report serious technical issues during the study will be excluded as well. Moreover, participants with an error rate above 40% in the Stroop task will be excluded. Participants with an error rate above 40% in either the picture or the text attention check will be excluded as well.
Exclusion of trials: Trials with a reaction time under 10 ms or above 5 s will be excluded. For the reaction time analysis, trials with incorrect responses will be excluded.
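The exclusion rules above can be expressed as two simple filters. This is a minimal sketch; the function names, argument names, and the representation of error rates as proportions are assumptions made for illustration. The language and handedness requirements are omitted here.

```python
MIN_RT_MS = 10               # trials faster than 10 ms are excluded
MAX_RT_MS = 5000             # trials slower than 5 s are excluded
MAX_CHECK_ERROR_RATE = 0.40  # "above 40%" leads to exclusion


def keep_trial(rt_ms: float, correct: bool, for_rt_analysis: bool) -> bool:
    """Apply the trial-level rules; incorrect trials are dropped only
    for the reaction time analysis."""
    if rt_ms < MIN_RT_MS or rt_ms > MAX_RT_MS:
        return False
    if for_rt_analysis and not correct:
        return False
    return True


def exclude_participant(consent: bool, technical_issues: bool,
                        stroop_error_rate: float,
                        text_check_error_rate: float,
                        picture_check_error_rate: float) -> bool:
    """Return True when any participant-level exclusion rule applies."""
    if not consent or technical_issues:
        return True
    if stroop_error_rate > MAX_CHECK_ERROR_RATE:
        return True
    return (text_check_error_rate > MAX_CHECK_ERROR_RATE
            or picture_check_error_rate > MAX_CHECK_ERROR_RATE)


print(keep_trial(650.0, correct=False, for_rt_analysis=True))
print(exclude_participant(True, False, 0.10, 0.45, 0.00))
```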

7) How many observations will be collected or what will determine sample size?
No need to justify decision, but be precise about exactly how the number will be determined.

We conducted a power simulation based on the data of the previous study (#73263). Validity and probe word are contrast coded (invalid = 1, valid = -1; wrong = 1, right = -1) and reaction time data are log-transformed.
The simulations revealed that 200 participants yield sufficient power. Due to possible data loss, we aim for 250 participants.
The parameters for the reaction time analysis were set as follows:
Fixed effects: estimate intercept = 6.397, estimate probe word = 0.01, estimate validity = 0.005, estimate probe word * validity = -0.008
SD of random effects: SD participant = 0.2, SD item = 0.02, SD residual = 0.3
Note that we estimated a smaller interaction effect of probe word * validity and a higher residual variance compared to the results of the previous study.
Based on 2000 simulations, 60 items, and 200 participants, the estimated power is 0.835.
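To illustrate the data-generating step of such a simulation, the sketch below draws one data set of log reaction times from the parameters listed above. It covers only the generation of data, not the power estimate itself, which would additionally require fitting a mixed model to each of the simulated data sets. The random-intercepts-only structure, the rotation used to assign conditions, and all names are assumptions of this example; if the natural logarithm is assumed, the intercept of 6.397 corresponds to roughly 600 ms.

```python
import random
import statistics

# Parameters as listed for the reaction time analysis (log scale).
FIXED = {"intercept": 6.397, "probe": 0.01, "validity": 0.005, "interaction": -0.008}
SD_PARTICIPANT, SD_ITEM, SD_RESIDUAL = 0.2, 0.02, 0.3


def simulate_log_rts(n_participants: int = 200, n_items: int = 60, seed: int = 0):
    """Return rows of (participant, item, validity, probe, log_rt) for one data set."""
    rng = random.Random(seed)
    item_effects = [rng.gauss(0, SD_ITEM) for _ in range(n_items)]
    rows = []
    for participant in range(n_participants):
        participant_effect = rng.gauss(0, SD_PARTICIPANT)
        for item in range(n_items):
            # Assumed balanced assignment: rotate the four cells over items.
            cell = (item + participant) % 4
            validity = 1 if cell >= 2 else -1    # invalid = 1, valid = -1
            probe = 1 if cell % 2 == 1 else -1   # wrong = 1, right = -1
            log_rt = (FIXED["intercept"]
                      + FIXED["probe"] * probe
                      + FIXED["validity"] * validity
                      + FIXED["interaction"] * probe * validity
                      + participant_effect
                      + item_effects[item]
                      + rng.gauss(0, SD_RESIDUAL))
            rows.append((participant, item, validity, probe, log_rt))
    return rows


rows = simulate_log_rts()
print(len(rows), round(statistics.mean(row[4] for row in rows), 3))
```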
Error rate: The parameters were set as follows:
Fixed effects: estimate intercept = -3.79, estimate probe word = -0.077, estimate validity = 0.153, estimate probe word * validity = -0.13
SD of random effects: SD participant = 0.944, SD item = 0.0011
Based on 2000 simulations, 60 items, and 200 participants, the estimated power is 0.708.
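To make the error rate parameters easier to read, the sketch below converts them into predicted error probabilities for the four design cells. It assumes a logit link (not stated explicitly above) and sets the random effects to zero, so the values describe an average participant and item only; the function names are hypothetical.

```python
import math

# Fixed effects as listed for the error rate analysis (assumed logit scale).
INTERCEPT, B_PROBE, B_VALIDITY, B_INTERACTION = -3.79, -0.077, 0.153, -0.13


def inverse_logit(x: float) -> float:
    """Map a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predicted_error_probability(validity: int, probe: int) -> float:
    """Contrast codes: invalid = 1, valid = -1; wrong = 1, right = -1."""
    eta = (INTERCEPT + B_PROBE * probe + B_VALIDITY * validity
           + B_INTERACTION * probe * validity)
    return inverse_logit(eta)


for validity_label, validity in (("valid", -1), ("invalid", 1)):
    for probe_label, probe in (("right", -1), ("wrong", 1)):
        p = predicted_error_probability(validity, probe)
        print(f"{validity_label:8s} {probe_label:5s} {p:.4f}")
```

Under these assumptions the intercept alone corresponds to an error probability of roughly 2%, and the two incongruent cells receive higher predicted error probabilities than their congruent counterparts.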

8) Anything else you would like to pre-register?
(e.g., secondary analyses, variables collected for exploratory purposes, unusual analyses planned?)

Participants will be recruited via Prolific. Screening criteria will be language (first language = German, German as a fluent language, Germany as current place of residence), age (between 18 and 35), no reading-related disorder, right-handedness, and no participation in our previous related studies.
Transformation of data: Since reaction time data are often not normally distributed, the data will be log-transformed. If this transformation is not adequate, another transformation or no transformation will be chosen.
Additional analysis: The 60 items are presented in two separate blocks (30 items each). If the interaction of validity * probe word * block reaches significance, the two blocks will be analysed separately. Moreover, as an exploratory analysis we will compare reaction times/error rates for validity (valid vs. invalid) for the probe word "right" as well as the probe word "wrong".
We will additionally measure participants' reaction time to the probe words (wrong and right) before and after the experimental blocks. This reaction time will serve as a baseline and will be used for exploratory analyses only.

Version of AsPredicted Questions: 2.00