'Do species named after celebrities receive more attention on Wikipedia?'
(AsPredicted #76224)


Author(s)
Katie Blake (University of Exeter) - klblake25@gmail.com
Diogo Veríssimo (University of Oxford) - diogo.gasparverissimo@zoo.ox.ac.uk
Adam Gleave (Independent Researcher) - adamgleave97@gmail.com
Pre-registered on
10/06/2021 05:32 AM (PT)

1) Have any data been collected for this study already?
It's complicated. We have already collected some data but explain in Question 8 why readers may consider this a valid pre-registration nevertheless.

2) What's the main question being asked or hypothesis being tested in this study?
This study aims to explore whether naming a species after a celebrity can increase the amount of attention it receives on Wikipedia.
We have three hypotheses:
1. Species whose scientific names are etymologically related to celebrities will receive more attention on Wikipedia than their closest relatives that are not named after celebrities.
2. The average daily Wikipedia page views of a celebrity will influence the difference in attention received between species named after that celebrity and their closest relatives not named after celebrities.
3. The taxonomic group to which a species named after a celebrity belongs will influence the difference in attention it receives compared with its closest relatives that are not named after celebrities.

3) Describe the key dependent variable(s) specifying how they will be measured.
The dependent variable in our research is average daily Wikipedia page views. For each species, this is calculated as the total number of page views over the period divided by the number of days between 1st January 2015 and 1st July 2021 (this period is used because it is the range of data we have access to using the Wikipedia web API).
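The calculation described above can be sketched as follows. This is a minimal illustration, not the study's actual code: the function name and the example total are hypothetical, and it assumes the window is counted as the difference between the two dates (end date exclusive), which the pre-registration does not specify.

```python
from datetime import date

# Study window as stated in the pre-registration.
START = date(2015, 1, 1)
END = date(2021, 7, 1)


def average_daily_views(total_views: int, start: date = START, end: date = END) -> float:
    """Return total page views divided by the number of days in the window.

    Assumption: the window length is (end - start), i.e. the end date
    itself is not counted as a day of data.
    """
    n_days = (end - start).days
    if n_days <= 0:
        raise ValueError("end must be after start")
    return total_views / n_days


if __name__ == "__main__":
    # Hypothetical total page views for one species page, not real data.
    print(average_daily_views(23730))
```

Under this assumption the window is 2373 days long; counting the end date as well would make it 2374 and change each average very slightly.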

4) How many and which conditions will participants be assigned to?
There are no human participants in our study. Species are assigned to one of two groups: named after a celebrity or not named after a celebrity.
Furthermore, each species is categorised into one of six taxonomic groups: Amphibian, Bird, Fish, Invertebrate, Mammal, and Reptile.

5) Specify exactly which analyses you will conduct to examine the main question/hypothesis.
We will fit generalized linear mixed models in R, with celebrity naming (named versus not named after a celebrity) as a fixed effect and taxonomic group as a random effect. If the distribution of the response variable is highly skewed, we will also investigate whether a log transformation is needed.
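The skewness check mentioned above could look something like the sketch below. It does not reproduce the mixed models themselves, which will be fitted in R. The choice of a moment-based skewness statistic, the use of log(1 + x), the function names, and the example values are all assumptions made for illustration; the pre-registration does not state how skewness will be assessed.

```python
import math


def sample_skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # all values identical, so no skew to report
    return m3 / m2 ** 1.5


def log_transform(values):
    """Apply log(1 + x) so that a page with zero average views stays defined."""
    return [math.log1p(v) for v in values]


# Hypothetical average daily page views, with one very popular page.
views = [1, 1, 2, 2, 3, 3, 4, 5, 8, 400]

if __name__ == "__main__":
    print("skewness, raw:   ", sample_skewness(views))
    print("skewness, logged:", sample_skewness(log_transform(views)))
```

Comparing the two printed values gives a rough sense of whether the transformation pulls in a long right tail; any cut-off for "highly skewed" would still need to be chosen by the analyst.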

6) Describe exactly how outliers will be defined and handled, and your precise rule(s) for excluding observations.
The following will be excluded for celebrity-related species: subspecies and genera named after celebrities, species where only common names are etymologically derived from celebrity names or it is unclear how their scientific name is related to a celebrity's name, species whose scientific name is etymologically derived from multiple celebrities, and any species which are not taxonomically valid.
For non-celebrity-related species, only those clearly not named after celebrities will be used, and we will not compare two species both named after a celebrity.

7) How many observations will be collected or what will determine sample size?
No need to justify decision, but be precise about exactly how the number will be determined.

The final sample size for included pairs of celebrity and non-celebrity-related species will be determined by the number of species which have available Wikipedia pages and are included in the Open Tree of Life database available at opentreeoflife.org. Both types of species are required to have a Wikipedia page to compare page views, and the celebrities whom species are named after are required to have at least 1 view on average per day for their etymologically-related species to be included for analysis. It is anticipated that there will be at least 4000 pairs of species to compare.
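The inclusion rules in this answer can be expressed as a simple filter. The sketch below is illustrative only: the record layout, field names, and example pairs are hypothetical, and it assumes that the checks for Wikipedia pages and Open Tree of Life membership have already been carried out elsewhere and stored as flags.

```python
from dataclasses import dataclass

# Threshold stated in the pre-registration: at least 1 view per day on average.
MIN_CELEBRITY_DAILY_VIEWS = 1.0


@dataclass
class SpeciesPair:
    """One celebrity-named species and its closest non-celebrity relative."""
    celebrity_species: str
    relative_species: str
    celebrity_species_has_page: bool  # celebrity-named species has a Wikipedia page
    relative_has_page: bool           # relative has a Wikipedia page
    both_in_open_tree: bool           # both species appear in the Open Tree of Life
    celebrity_avg_daily_views: float  # average daily views of the celebrity's page


def is_eligible(pair: SpeciesPair) -> bool:
    """Apply the inclusion criteria listed in the answer to Question 7."""
    return (
        pair.celebrity_species_has_page
        and pair.relative_has_page
        and pair.both_in_open_tree
        and pair.celebrity_avg_daily_views >= MIN_CELEBRITY_DAILY_VIEWS
    )


# Hypothetical records with placeholder names, not real data.
pairs = [
    SpeciesPair("Species A", "Species B", True, True, True, 1500.0),
    SpeciesPair("Species C", "Species D", True, False, True, 900.0),
    SpeciesPair("Species E", "Species F", True, True, True, 0.4),
]

eligible = [p for p in pairs if is_eligible(p)]

if __name__ == "__main__":
    print(len(eligible), "of", len(pairs), "pairs retained")
```

The final sample size is then simply the number of pairs that pass the filter.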

8) Anything else you would like to pre-register?
(e.g., secondary analyses, variables collected for exploratory purposes, unusual analyses planned?)

Data collection has begun but is not yet complete, and no analyses have been carried out. We therefore consider this a valid pre-registration, as we have not yet tested our hypotheses.

Version of AsPredicted Questions: 2.00