'Preliminary study on how people use ChatGPT for a collaborative writing task' (AsPredicted #132,502)
Author(s) This pre-registration is currently anonymous to enable blind peer-review. It has 3 authors.
Pre-registered on 2023/05/17 07:56 (PT)
1) Have any data been collected for this study already? No, no data have been collected for this study yet.
2) What's the main question being asked or hypothesis being tested in this study? Overview:
Main question: This study investigates how people interact with the OpenAI-developed chatbot ChatGPT when collaboratively writing an opinionated text. Potential influencing factors on the part of the users are also of interest.
Design: Participants first receive an explanation of what ChatGPT is and how the chatbot can be used, provide some demographic information, and complete the Affinity for Technology Interaction (ATI) Scale (Franke et al., 2019). They are then presented with a short statement on risky alcohol consumption and are instructed to discuss a prohibition of alcohol in public by using ChatGPT to write a text of at least 600 and at most 1000 words. Moreover, they are instructed to imagine while writing that their text will be published in the comment section of a local daily newspaper and that the text should both provide information on the topic and take a personal position on it. Following text creation, the participants are asked questions about their general usage behavior regarding text- and voice-based dialog systems (type and frequency of use, familiarity, self-assessed competence in using such systems) and about their usage behavior specifically regarding ChatGPT. Finally, they answer questions and complete validated questionnaires about their experience of working with ChatGPT: (a) Bot Usability Scale (BUS-11; Borsci et al., 2022), (b) description of their own approach to the interaction, (c) confidence in the functions of ChatGPT, (d) willingness to collaborate with ChatGPT again, (e) perceived competence of ChatGPT, (f) perceived contribution of self to the text, (g) trust in ChatGPT responses: Human-Computer Trust Questionnaire (Madsen & Gregor, 2000) and analysis of the amount of text copied from ChatGPT, (h) perceived increase in knowledge about their own usage behavior and about the functioning of ChatGPT, (i) Robotic Social Attributes Scale (RoSAS; Carpinella et al., 2017). After the main part of the study, a sub-sample will participate in a survey of about 10 minutes' duration conducted by the experimenter.
This survey comprises questions regarding participants' approach to the task, and a question on whether they would have approached the task the same way if they had worked on the task alone, i.e., without using ChatGPT. The study will be conducted on site. The participants' entire screen activity will be logged during the writing task.
Hypotheses:
The study is exploratory in nature. We will examine relationships among user characteristics, variables related to users' experience of ChatGPT, and measures related to the text composition (e.g., text length, behavioral measures from the conversation log).
3) Describe the key dependent variable(s) specifying how they will be measured. Behavioral measures: proportion of copied text from ChatGPT, number of prompts to ChatGPT, duration of the conversation
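The behavioral measures above are not operationalized further in this pre-registration. As one possible sketch (an assumption for illustration, not the authors' procedure), the proportion of copied text could be estimated as the share of words in the final text that are covered by at least one sequence of `n` consecutive words also occurring verbatim in ChatGPT's responses. The window size `n`, the lowercase whitespace tokenization, and the log format `(timestamp, role, text)` are all hypothetical choices.

```python
from datetime import datetime


def copied_proportion(final_text: str, chatgpt_text: str, n: int = 5) -> float:
    """Share of words in final_text covered by an n-word sequence that also
    occurs verbatim in chatgpt_text (hypothetical operationalization)."""
    final_words = final_text.lower().split()
    bot_words = chatgpt_text.lower().split()
    if len(final_words) < n:
        return 0.0
    # All n-word sequences that occur in the ChatGPT output.
    bot_ngrams = {tuple(bot_words[i:i + n]) for i in range(len(bot_words) - n + 1)}
    covered = [False] * len(final_words)
    for i in range(len(final_words) - n + 1):
        if tuple(final_words[i:i + n]) in bot_ngrams:
            # Mark every word of the matching window as copied.
            for j in range(i, i + n):
                covered[j] = True
    return sum(covered) / len(final_words)


def conversation_measures(log):
    """log: list of (iso_timestamp, role, text) tuples, with role either
    "user" or "assistant" (assumed log format)."""
    n_prompts = sum(1 for _, role, _ in log if role == "user")
    times = [datetime.fromisoformat(ts) for ts, _, _ in log]
    duration_s = (max(times) - min(times)).total_seconds() if times else 0.0
    return {"n_prompts": n_prompts, "duration_s": duration_s}
```

A smaller `n` counts more incidental overlap (common phrases) as copied, while a larger `n` misses lightly edited passages, so the threshold would need to be fixed before analysis.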
Self-report measures:
- usefulness of ChatGPT/satisfaction: BUS-11 (Borsci et al., 2022), items rated on a 5-point Likert scale (1 = strongly disagree; 5 = strongly agree); mean of all items (percentage)
- approach: "How did you go about composing the text? Did you use any particular strategy?"
- trust in ChatGPT: amount of text adopted (copied) from ChatGPT, and Human-Computer Trust Questionnaire (Madsen & Gregor, 2000); 25 items rated on a 7-point Likert scale (1 = do not agree at all; 7 = completely agree); ratings of the five facets of trust are averaged
- future interaction with ChatGPT: "Would you collaborate with ChatGPT again for a writing task?"; "Would you collaborate with ChatGPT again for a writing task on the same topic?", both rated on a 5-point Likert scale (1 = extremely unlikely; 2 = somewhat unlikely; 3 = don't know; 4 = somewhat likely; 5 = extremely likely)
- metaknowledge: "How competent do you rate ChatGPT for the following aspects of a writing task?" each aspect rated on a 5-point Likert scale: 1 = not at all competent, 2 = mostly not competent, 3 = neither incompetent nor competent, 4 = mostly competent, 5 = very competent. "For which aspects of a writing task do you rate ChatGPT's competence higher than your own competence?" (multiple selections possible), also option "for none of the listed aspects" and possibility to enter text.
- perceived contribution of self: "How low/high do you estimate your contribution to the finished text?", scale from 0%-100%
- gain in knowledge: "How would you rate your level of knowledge about banning alcohol in public after collaborative writing with ChatGPT compared to before collaborative writing with ChatGPT?", on 5-point Likert scale (1 = much worse; 2 = somewhat worse; 3 = about the same; 4 = somewhat better; 5 = much better); "Were you able to learn anything about your own user behavior in using the chatbot by collaborating with ChatGPT?", "Were you able to learn anything about how ChatGPT works by collaborating with it?", both on 5-point Likert scale (1 = nothing; 2 = rather little; 3 = somewhat; 4 = rather a lot; 5 = a lot)
- retrospective changes: "In retrospect, would you do anything differently in writing the text with ChatGPT?" (Yes/No); if yes: "Please describe in keywords how you would proceed differently."
- Sub-sample: "How did you go about the task?"; "Would you describe your approach as more trial and error, or did you apply a particular strategy? If so, which one?"; "Would you proceed the same way on your own (without using ChatGPT) when working on the task?"; "Reflecting back on working on the task: Did ChatGPT convince you over time or, on the contrary, cause skepticism? If either applies, what caused this?"; "In retrospect, would you do anything differently? If so, what?"
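The scoring rules for the two multi-item questionnaires could be sketched as follows. This is a minimal illustration under stated assumptions: the BUS-11 percentage is taken here as the item mean relative to the scale maximum, and the trust score as the mean of the per-facet means. The pre-registration fixes neither convention, and the item-to-facet mapping `facets` is a hypothetical input that would have to follow the published questionnaire.

```python
from statistics import mean


def bus11_percentage(ratings, scale_max=5):
    """Mean of the 11 BUS-11 item ratings (1-5), expressed as a percentage
    of the scale maximum (assumed conversion)."""
    if len(ratings) != 11:
        raise ValueError("BUS-11 expects exactly 11 item ratings")
    return mean(ratings) / scale_max * 100


def trust_scores(ratings, facets):
    """ratings: dict mapping item id -> rating (1-7).
    facets: dict mapping facet name -> list of item ids (hypothetical mapping).
    Returns the per-facet means and the mean across facets."""
    facet_means = {
        name: mean(ratings[item] for item in items)
        for name, items in facets.items()
    }
    overall = mean(facet_means.values())
    return facet_means, overall
```

Averaging facet means rather than all 25 items gives each facet equal weight; with five items per facet the two approaches coincide.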
4) How many and which conditions will participants be assigned to? There will be no different conditions in the study. Except for the final survey, all participants will undergo the same study procedure.
5) Specify exactly which analyses you will conduct to examine the main question/hypothesis. We plan to examine associations in the data using correlation analyses, mainly considering variables related to user characteristics, users' experience of ChatGPT, and measures related to the text composition. Additionally, we plan to perform regression analyses and fit linear mixed-effects models.
Furthermore, we plan to perform qualitative text analyses, such as qualitative content analysis.
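As a minimal sketch of the planned correlation analyses, the following computes Pearson and Spearman coefficients for two variables. The choice of coefficient is an assumption (the pre-registration does not name one); Spearman is shown because many of the measures are Likert-type ratings. The example variables in the usage note are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences.
    Raises ZeroDivisionError if either variable has zero variance."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For instance, `spearman(ati_scores, n_prompts)` would relate a user characteristic to a behavioral measure, where both lists hold one value per participant in the same order. Because the study is exploratory and many pairs of variables may be examined, some correction for multiple comparisons would likely be needed.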
6) Describe exactly how outliers will be defined and handled, and your precise rule(s) for excluding observations. We will exclude from the analyses participants who write fewer than the required minimum of 600 words of text. Furthermore, we will include attention checks. Participants who fail to complete the attention checks will be excluded from analyses.
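The two exclusion rules can be expressed as a simple filter. This is a sketch only: the `Participant` record and its field names are hypothetical, and counting words by splitting on whitespace is an assumed convention that the pre-registration does not specify.

```python
from dataclasses import dataclass

MIN_WORDS = 600  # required minimum text length stated in the exclusion rule


@dataclass
class Participant:
    participant_id: str            # hypothetical field names
    final_text: str
    attention_checks_passed: bool


def word_count(text: str) -> int:
    """Number of whitespace-separated tokens (assumed counting rule)."""
    return len(text.split())


def is_excluded(p: Participant, min_words: int = MIN_WORDS) -> bool:
    """True if the text is shorter than min_words or the attention checks
    were not passed."""
    return word_count(p.final_text) < min_words or not p.attention_checks_passed


def analysis_sample(participants):
    """Participants retained for analysis after applying both rules."""
    return [p for p in participants if not is_excluded(p)]
```

Fixing the word-counting rule in advance matters near the threshold, since different tools can disagree by a few words on hyphenated terms or numbers.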
7) How many observations will be collected or what will determine sample size? No need to justify decision, but be precise about exactly how the number will be determined. Using similar studies for orientation, we plan to recruit 80-100 participants. For the sub-sample we plan at least 15-20 participants.
8) Anything else you would like to pre-register? (e.g., secondary analyses, variables collected for exploratory purposes, unusual analyses planned?) -