#130613 | AsPredicted

'Judgments of research from large language models'
(AsPredicted #130613)


Author(s)
This pre-registration is currently anonymous to enable blind peer-review.
It has 2 authors.
Pre-registered on
04/29/2023 05:27 AM (PT)

1) Have any data been collected for this study already?
No, no data have been collected for this study yet.

2) What's the main question being asked or hypothesis being tested in this study?
Large language models (LLMs) are increasingly used by researchers to conduct parts of their research. It is not yet clear how laypeople judge this change. We will conduct an experiment in which a researcher delegates part of the research process to either a person or an LLM.

We hypothesize that delegating to an LLM will be judged less favourably than delegating to a human.

3) Describe the key dependent variable(s) specifying how they will be measured.
We will have two dependent variables.
The first DV will be based on (a) ratings of how morally acceptable it is to delegate part of a research project and (b) ratings of how trustworthy a person who delegates is. As the DV, we will use the average rating across these two items, as well as responses to the individual items.
The second DV will be ratings of whether delegating will lead to correct output that would stand up to scientific scrutiny. This will be used for exploratory purposes.

Each participant will rate five forms of research activity:
1. Idea generation
2. Prior literature synthesis
3. Data identification and preparation
4. Testing and interpreting the theoretical framework
5. Statistical result analysis

4) How many and which conditions will participants be assigned to?
Participants will be randomly allocated to one of two conditions, in which the research assistant is either a human (a PhD student) or a machine (a large language model).

5) Specify exactly which analyses you will conduct to examine the main question/hypothesis.
To test the hypothesis, we will conduct a mixed-model analysis in which the type of delegatee (human or LLM) will be the between-subjects factor and the part of the research process will be the within-subjects factor. We will include the controls described in Question 8 of this document.
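
For concreteness, a minimal sketch of such an analysis in Python (using statsmodels) is shown below. The variable names, the synthetic data, and the choice of software are illustrative assumptions; the pre-registration does not specify them. In the actual analysis, the controls listed under Question 8 would be added as covariates in the model formula.

# Minimal sketch of the planned mixed-model analysis.
# Column names and the synthetic data below are illustrative only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulate a long-format dataset: each participant rates all five activities.
n_participants = 40
activities = ["ideas", "literature", "data_prep", "theory_testing", "stats_analysis"]
rows = []
for pid in range(n_participants):
    delegatee = "LLM" if pid % 2 else "human"       # between-subjects factor
    participant_effect = rng.normal(0, 0.5)         # random intercept per participant
    for act in activities:                          # within-subjects factor
        rating = 4 + (-0.3 if delegatee == "LLM" else 0) + participant_effect + rng.normal(0, 1)
        rows.append({"participant_id": pid, "delegatee": delegatee,
                     "activity": act, "rating": rating})
df = pd.DataFrame(rows)

# Mixed model: delegatee x activity fixed effects, random intercept per participant.
# The controls from Question 8 would be added to this formula in the real analysis.
model = smf.mixedlm("rating ~ C(delegatee) * C(activity)", data=df,
                    groups=df["participant_id"])
result = model.fit()
print(result.summary())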

6) Describe exactly how outliers will be defined and handled, and your precise rule(s) for excluding observations.
We will exclude participants who incorrectly answer either of our attention-check questions. We expect an exclusion rate of approximately 5%.
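
A minimal sketch of this exclusion rule, assuming hypothetical attention-check columns coded True (correct) or False (incorrect); the column names and example data are illustrative only:

import pandas as pd

# Hypothetical wide-format data: one row per participant, two attention-check columns.
participants = pd.DataFrame({
    "participant_id": [1, 2, 3, 4],
    "attention_check_1": [True, True, False, True],
    "attention_check_2": [True, True, True, False],
})

# Keep only participants who answered both attention checks correctly.
kept = participants[participants["attention_check_1"] & participants["attention_check_2"]]
exclusion_rate = 1 - len(kept) / len(participants)
print(f"Excluded {exclusion_rate:.0%} of participants")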

7) How many observations will be collected or what will determine sample size?
No need to justify decision, but be precise about exactly how the number will be determined.

We will collect data from 438 participants, which will give us 90% power to detect a small (d = 0.2) effect.
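
The pre-registration does not state how the power analysis was carried out. The sketch below shows one common way to approximate power for a design of this kind: treat the five repeated ratings per participant as increasing the effective sample size under a compound-symmetry (equal-correlation) assumption, then apply a standard two-sample power calculation. The intraclass correlation used here is an illustrative assumption rather than a value from the pre-registration, so the printed power depends on it.

# Illustrative power approximation for the planned design.
# Assumptions (not from the pre-registration): compound symmetry across the
# five repeated ratings and an intraclass correlation (ICC) chosen for illustration.
from statsmodels.stats.power import TTestIndPower

n_total = 438           # planned sample size
n_per_group = n_total // 2
k = 5                   # repeated ratings per participant (five research activities)
icc = 0.3               # assumed correlation between a participant's ratings (illustrative)

# Variance reduction from averaging k equally correlated measurements per participant.
design_effect = (1 + (k - 1) * icc) / k
effective_n_per_group = n_per_group / design_effect

power = TTestIndPower().power(effect_size=0.2,
                              nobs1=effective_n_per_group,
                              alpha=0.05,
                              ratio=1.0)
print(f"Approximate power under these assumptions: {power:.2f}")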

8) Anything else you would like to pre-register?
(e.g., secondary analyses, variables collected for exploratory purposes, unusual analyses planned?)

As an exploratory analysis, we will investigate in which part of the research process the difference between the two types of delegatee (human or LLM) is greatest. As mentioned earlier, we will also use ratings of whether delegating leads to accurate and high-quality output in an exploratory analysis.

We will ask the following questions; answers to them will be used as controls:
"Have you heard about ChatGPT, a conversational bot (chatbot)?"
"How many studies on Prolific that you participated in the past concerned (or explicitly used) ChatGPT or GPT? Select 0 if you haven't participated in any study (apart from this one) or select the appropriate option otherwise."
"Have you interacted with ChatGPT?"

We will also ask participants about their gender and age.

Version of AsPredicted Questions: 2.00