Wikipedia’s relationship to academia and academics

I was recently quoted in an article by Bethany Brookshire for Science News about the relationship between academia and Wikipedia. I was asked to comment on a recent paper by MIT Sloan’s Neil Thompson and Douglas Hanley, who investigated the relationship between Wikipedia articles and scientific papers using examples from chemistry and econometrics. There are a bunch of studies on a similar topic (if you’re interested, here is a good place to start) and I’ve been working on this topic – but from a very different angle – for a qualitative study to be published soon. I thought I would share my answers to the interview questions here since many of them are questions that friends and colleagues ask regularly about citing Wikipedia articles and about quality issues on Wikipedia.

Have you ever edited Wikipedia articles? What do you think of the process?

Some, yes. Being a successful editor on English Wikipedia is a complicated process, particularly if you’re writing about topics that are either controversial or outside the purview of the majority of Western editors. Editing is complicated not only because it is technical (even with the excellent new tools that have been developed to support editing without having to learn wiki markup) – most of the complications come with knowing the norms, the rules and the power dynamics at play.

You’ve worked previously with Wikipedia on things like verification practices. What are the verification practices currently?

That’s a big question 🙂 Verification practices involve a complicated set of norms, rules and technologies. Editors may (or may not) verify their statements by checking sources, but the power of Wikipedia’s claim-making practice lies in the norm of questioning unsourced claims using the “citation needed” tag and in the ability of any other editor to remove claims that they believe to be incorrect. This, of course, does not guarantee that every claim on Wikipedia is factually correct, but it does enable the dynamic labelling of unverified claims and the ability to set verification tasks in an iterative fashion.

Many people in academia view Wikipedia as an unreliable source and do not encourage students to use it. What do you think of this?

Academic use of sources is a very contextual practice. We refer to sources in our own papers and publications not only when we are supporting the claims they contain, but also when we dispute them. That’s the first point: even if Wikipedia were generally unreliable, that would not be a good reason for denying its use. The second point is that Wikipedia can be a very reliable source for particular types of information. Affirming the claims made in a particular article, if that was our goal in using it, would require verifying the information that we are reinforcing through citation and citing the particular version (the “oldid” in Wikipedia terms) that we are referring to. Wikipedia can be used very soundly by academics and students – we just need to do so carefully and with an understanding of the context of citation – something we should be doing generally, not only on Wikipedia.
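For anyone who wants to cite a particular version in practice, here is a minimal sketch in Python of building a link to one fixed revision. It assumes the usual index.php permanent-link pattern that Wikipedia uses for old revisions; the function name permanent_link and the revision number in the example are hypothetical and only there for illustration.

```python
from urllib.parse import urlencode


def permanent_link(title: str, oldid: int, lang: str = "en") -> str:
    """Build a URL pointing at one fixed revision of a Wikipedia article.

    `oldid` is the revision identifier; `lang` is the language edition.
    """
    # Wikipedia page titles use underscores in place of spaces.
    query = urlencode({"title": title.replace(" ", "_"), "oldid": oldid})
    return f"https://{lang}.wikipedia.org/w/index.php?{query}"


# Hypothetical revision ID, used only to show the shape of the link.
print(permanent_link("Chemistry", 123456789))
```

Citing a link of this kind, rather than the bare article URL, means a reader sees the text as it stood when it was cited and not whatever the article says today.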

You work in a highly social media savvy field. What is the general attitude of your colleagues toward Wikipedia as a research resource? Do you think it differs from the attitudes of other academics?

I would say that Wikipedia is widely recognized by academics, including those of my colleagues who don’t specifically conduct Wikipedia research, as a source that is fine to visit but not to cite.

What did you think of this particular paper overall?

I thought that it was a really good paper. Excellent research design and very solid analysis. The only weakness, I would argue, would be that there are quite different results for chemistry and econometrics and that those differences aren’t adequately accounted for. More on that below.

The authors were attempting a causational study by adding Wikipedia articles (while leaving some written but unadded) and looking at how the phrases translated to the scientific literature six months later. Is this a long enough period of time?

This seems to be an appropriate amount of time to study, but there are probably quite important differences between fields of study that might influence results. The volume of publication (social scientists and humanities scholars tend to produce much lower volumes of publications, spread over longer periods, than those in natural science and engineering subjects, for example), the volume of explanatory or definitional material in publications (requiring greater use of the literature), the extent to which academics in the particular field consult and contribute to Wikipedia – all might affect how different fields of study influence and are influenced by Wikipedia articles.

Do you think the authors achieved evidence of causation here?

Yes. But again, causation in a single field, i.e. chemistry.

Is it important to know whether Wikipedia is influencing the scientific literature? Why or why not?

Yes. It is important to know whether Wikipedia is influencing scientific literature – particularly because we need to know where power to influence knowledge is located (in order to ensure that it is being fairly governed and maintained for the development of accurate and unbiased public knowledge).

Do you think papers like this will impact how scientists view and use Wikipedia?

As far as I know, this is the first paper that attributes a strong link between what is on Wikipedia and the development of science. I am sure that it will influence how scientists and other academics view and use Wikipedia – particularly in driving initiatives where scientists contribute to Wikipedia either directly or via initiatives such as PLoS’s Topic Pages.

Is there anything especially important to emphasize?

The most important thing is to emphasize the differences between fields, which I think need to be better explained. I definitely think that certain types of academic research are more in line with Wikipedia’s way of working, forms and styles of publication and epistemology and that it will not have the same influence on other fields.

What I’m talking about in 2016

Authority and authoritative sources, critical data studies, digital methods, the travel of facts online, bot politics and social media and politics. These are some of the things I’m talking about in 2016. (Just in case you thought the #sunselfies only indicated fun and aimless loafing).  

15 January Fact factories: How Wikipedia’s logics determine what facts are represented online. Wikipedia 15th birthday event, Oxford Internet Institute. [Webcast, OII event page, OII’s Medium post, The Conversation article]

29 January Wikipedia and me: A story in four acts. TEDx Leeds University. [Video, TEDx Leeds University site]

Abstract: This is a story about how I came to be involved in Wikipedia and how I became a critic. It’s a story about hope and friendship and failure, and what to do afterwards. In many ways this story represents the relationship that many others like me have had with the Internet: a story about enormous hope and enthusiasm followed by disappointment and despair. Although similar, the uniqueness of these stories is in the final act – the act where I tell you what I now think about the future of the Internet after my initial despair. This is my Internet love story in four acts: 1) Seeing the light 2) California rulz 3) Doubting Thomas 4) Critics unite. 

17 February Add data to methods and stir. Digital Methods Summer School. CCI, Queensland University of Technology, Brisbane [QUT Digital Methods Summer School website]

Abstract: Are engagements with real humans necessary to ethnographic research? In this presentation, I argue for methods that connect data traces to the individuals who produce them by exploring examples of experimental methods featured on the site ‘EthnographyMatters.net’, such as live fieldnoting, collaborative mapmaking and ‘sensory postcards’.  This presentation will serve as an inspiration for new work that expands beyond disciplinary and methodological boundaries and connects the stories we tell about our things with the humans who create them.  


How Wikipedia’s Dr Jekyll became Mr Hyde: Vandalism, sock puppetry and the curious case of Wikipedia’s decline

This is a (very) short paper that I will be presenting at Internet Research in Denver this week. I want to write something longer about the story because I feel that it is in many ways emblematic of so many of us who lived through our own Internet bubble: when everything seemed possible and there was nothing to lose. This is (a small slice of) Drork’s story.

Richard Mansfield starring in The Strange Case of Dr. Jekyll and Mr. Hyde. Wikipedia. Public Domain.

Abstract: This paper concerns the rise and fall of Wikipedia editor ‘drork’, who was blocked indefinitely from the English version of the encyclopedia after seven years of constructive contributions, movement leadership and intense engagement. It acts as a companion piece to the recent statistical analyses of patterns of conflict and vandalism on Wikipedia to reflect on the question of why someone who was once committed to the encyclopedia may want to vandalize it. The paper compares two perspectives on the experience of being a Wikipedian: on the one hand, a virtuous experience that enables positive character formation, as more commonly espoused, and on the other, an experience dominated by in-fighting, personal attacks and the use of Wikipedia to express political goals. It concludes by arguing that the latter behavior is necessary in order to survive as a Wikipedian editing in these highly conflict-ridden areas.

Introduction

Recent scholarship has painted two competing pictures of what Wikipedia and Wikipedians are “like” and what they are motivated by. On the one hand, Benkler and Nissenbaum argue that because people contribute to projects like Wikipedia with motivations “ranging from the pure pleasure of creation, to a particular sense of purpose, through to the companionship and social relations that grow around a common enterprise”, the practice of commons-based peer production fosters virtue and enables “positive character formation” (Benkler and Nissenbaum, 2006). On the other hand, we have heard more recently about how “free and open” communities like Wikipedia have become a haven for aggressive, intimidating behavior (Reagle, 2013) and that reversions of newcomers’ contributions have been growing steadily and may be contributing to Wikipedia’s decline (Halfaker, Geiger, Morgan, & Riedl, in press).

Why Wikipedia is no ‘proxy for culture’ (Part 1 of 3)

First posted at EthnographyMatters.net

Last month’s Wired magazine showed an infographic with a headline that read: ‘History’s most influential people, ranked by Wikipedia reach’ with a group of 20 men arranged in hierarchical order — from Jesus at number 1 to Stalin at number 20. Curious, I wondered how ‘influence’ and ‘Wikipedia reach’ were being decided. According to the article, ‘Rankings (were) based on parameters such as the number of language editions in which that person has a page, and the number of people known to speak those languages’. What really surprised me was not the particular arrangement of figures on this page but the conclusions that were being drawn from it.

According to the piece, César Hidalgo, head of the Media Lab’s Macro Connections group, who researched the data, made the following claims about the data gathered from Wikipedia:

a) “It shows you how the world perceives your own national culture.”

b) “It’s a socio-cultural mirror.”

c) “We use historical characters as proxies for culture.”

And finally, perhaps most surprising is this final line in the story:

Using this quantitative approach, Hidalgo is now testing hypotheses such as whether cultural development is structured or random. “Can you have a Steve Jobs in a country that has not generated enough science or technology?” he wonders. “Ultimately we want to know how culture assembles itself.”

It is difficult to comment on the particular method used by this study because there is little more than the diagram and a few paragraphs of analysis, and the journalist may have misquoted him, but I wanted to draw attention to the statements being made because I think they represent the growing phenomenon of big data analysts using Wikipedia data to make assumptions about ‘culture’.

WikiSym Redefined

Ward Cunningham, inventor of the wiki, at the first WikiSym in 2005 which was co-located with ACM OOPSLA in San Diego, California. Pic by Peter Kaminski CC BY on Flickr.

There has been much reflecting and soul-searching about the future of WikiSym in the past year (and probably before that as well). Many felt that the conference was becoming dominated by Wikipedia research and that it needed to grow to encompass more research in the open source, open data and open content realm. I felt that the conference needed to attract more social scientists and qualitative researchers in order to reach a more detailed understanding of how Wikipedia is being integrated into everyday life.

Despite the negatives, everyone felt that WikiSym was and still is the best place for people who do research about Wikipedia and other wikis to gather and that there was a lot of promise in broadening our mandate. This is why I feel so excited about co-chairing a new dedicated Wikipedia track at next year’s WikiSym in Hong Kong along with Mark Graham, also at the Oxford Internet Institute. And that’s why I was also happy that Dirk Riehle, veteran of WikiSym, is at the helm again next year, leading an effort to redesign the event around a changing research landscape.

There are a few key differences to next year’s event:

1. WikiSym 2013 will be held jointly with a new conference called ‘OpenSym’ and the entire event will consist of four tracks dedicated to different research trajectories:

  • Open collaboration (wikis, social media, etc.) research (WikiSym 2013), chaired by Jude Yew of National University of Singapore
  • Wikipedia research (WikiSym 2013), chaired jointly by myself and Mark Graham of the Oxford Internet Institute at the University of Oxford
  • Free, libre, and open source software research (OpenSym 2013), chaired jointly by Jesus M. Gonzalez-Barahona and Gregorio Robles of Universidad Rey Juan Carlos
  • Open access, data, and government research (OpenSym 2013), chaired by Anne Fitzgerald of Queensland University of Technology

This means that Mark and I can focus on getting the very best of Wikipedia research to WikiSym and on thinking hard about what is missing and what needs to be encouraged in the years to come.

Language, identity and Wikipedia: Some perspectives from the Cairo “Wikipedia in the Arab World” workshop

Mark Graham talks about the stated goal of Wikipedia to become the “sum of all human knowledge” while Ahmed Medat waits to translate into Arabic

It was the end of the final day of our workshop on the outskirts of Cairo and we were all feeling that curious mixture of inspiration, energy and exhaustion that follows those meetings where a world of ideas and people and things are thrown together in a concentrated few days. Mark Graham asked each of us if we’d like to say a few parting words and the participants spoke about how they enjoyed meeting Wikipedians from so many places in the Middle East, that they were happy to come to an event with academics and that they were excited about doing something to make a change in the real world. The majority of participants spoke in English – what was for many of them a third or fourth language – while some had their Arabic translated on the fly by other participants.

I was surprised when we got round to Mohamed Amarochan, Wikipedian, Mozilla hacker and blogger from Morocco, and he said that he would like to speak in Arabic. I knew that Mohamed had a really good command of English because I’d spent a fascinating ride with him from the airport on the way to the workshop where we commiserated with one another about visa hardships. When he chose to speak in Arabic and allowed others to translate into English, I realized that Mohamed was making an important statement about how small decisions, like which language you choose to speak in a conversation like this one, have big consequences.

As Clive Holes writes, ‘How we speak is an important part of who we are: in a sense, speech is the oral counterpart of how we dress. Both are intimately linked to our sense of self, and of how we present ourselves to, and are seen by, others.’ (Holes, 2011)

The politics of truth: Who wins on Wikipedia? A study of what Wikipedia deletes and who it bans

Below is the research proposal that I wrote when I applied to the Oxford Internet Institute (OII) DPhil Programme in November last year. I’m guessing it’s going to evolve some (especially since I’m wanting to add some statistical work surrounding citations and translations between languages), but I’m really excited about it as it stands. The wonderful Dr Mark Graham is my supervisor at the OII and I’m lucky to also have Dr Chris Davies as my college advisor (I’m at Kellogg College here). Thank you to the OII for putting me forward for the Clarendon Award and to one of my heroes, Bishop Desmond Tutu, for inspiring part of the award that got me here. Thanks, lastly and mostly, to Dror for inspiring me 🙂 With all these thanks it sounds like I’m at the end. But it’s only the beginning. I’m looking forward to comments and suggestions on how I might discover the answers to this question. I think I’ll certainly hear them in the months and years to come.


Abstract: Wikipedia is, in many ways, the poster child of the Internet Age. It has been singled out as the ultimate working example of the collaborative power of the Internet (Shirky, Tapscott) and what Yochai Benkler calls ‘commons-based peer production’ to describe how the Internet has created radical new opportunities for how we make and exchange information, knowledge, and culture (Benkler, 2009). Part of its popularity comes from its power to influence and inform. As the sixth largest website in the world, with over million users and 90,000 active editors, Wikipedia is becoming one of the most influential reference works in history.

For every broad statement about Wikipedia, however, there are examples on the ground that hint at an alternative reality. The ideal that commentators (many of whom are not involved in editing the encyclopaedia on a daily basis) project is of a unified group of rational, detached, individual editors building a neutral, free encyclopaedia that is “the sum of all human knowledge”. But the organic nature of the encyclopaedia, its culture, politics and architecture have produced and continue to produce an encyclopaedia in which particular tactics, identities and relationships, many of which are in defiance of original rules, often prevail over reasoned and rational dialogue. Wikipedia still has a number of “dark spots”: from uneven geographies of articles written about places (Graham, 2011), to low numbers of female contributors (Lam et al, 2011) and vastly different levels of quality (Duguid, 2006). But there are other dark spots too – spots within the encyclopaedia itself: knowledges that are silenced, perspectives that are marginalised and people that are banned.

Who wins and who loses in this open environment? How do culture, politics, regulations, architecture and identity influence who wins or loses? And what does this mean for the way we think about online collaboration, its power and pitfalls?