How Wikipedia’s Dr Jekyll became Mr Hyde: Vandalism, sock puppetry and the curious case of Wikipedia’s decline

This is a (very) short paper that I will be presenting at Internet Research in Denver this week. I want to write something longer about the story because I feel that it is, in many ways, emblematic of so many of us who lived through our own Internet bubble: when everything seemed possible and there was nothing to lose. This is (a small slice of) Drork’s story.

Richard Mansfield starring in The Strange Case of Dr. Jekyll and Mr. Hyde. Wikipedia. Public Domain.
Abstract: This paper concerns the rise and fall of Wikipedia editor ‘drork’, who was blocked indefinitely from the English version of the encyclopedia after seven years of constructive contributions, movement leadership and intense engagement. It acts as a companion piece to the recent statistical analyses of patterns of conflict and vandalism on Wikipedia, reflecting on the question of why someone who was once committed to the encyclopedia may want to vandalize it. The paper compares two perspectives on the experience of being a Wikipedian: on the one hand, a virtuous experience that enables positive character formation, as more commonly espoused, and on the other, an experience dominated by in-fighting, personal attacks and the use of Wikipedia to express political goals. It concludes by arguing that the latter behavior is necessary in order to survive as a Wikipedian editing in these highly conflict-ridden areas.


Recent scholarship has painted two competing pictures of what Wikipedia and Wikipedians are "like" and what they are motivated by. On the one hand, Benkler and Nissenbaum argue that because people contribute to projects like Wikipedia with motivations "ranging from the pure pleasure of creation, to a particular sense of purpose, through to the companionship and social relations that grow around a common enterprise", the practice of commons-based peer production fosters virtue and enables "positive character formation" (Benkler and Nissenbaum, 2006). On the other hand, we have heard more recently about how "free and open" communities like Wikipedia have become a haven for aggressive, intimidating behavior (Reagle, 2013) and that reversions of newcomers' contributions have been growing steadily and may be contributing to Wikipedia's decline (Halfaker, Geiger, Morgan, & Riedl, in-press).

Why Wikipedia is no ‘proxy for culture’ (Part 1 of 3)

Last month’s Wired magazine showed an infographic with a headline that read: ‘History’s most influential people, ranked by Wikipedia reach’ with a group of 20 men arranged in hierarchical order, from Jesus at number 1 to Stalin at number 20. Curious, I wondered how ‘influence’ and ‘Wikipedia reach’ were being decided. According to the article, ‘Rankings (were) based on parameters such as the number of language editions in which that person has a page, and the number of people known to speak those languages’. What really surprised me was not the particular arrangement of figures on this page but the conclusions that were being drawn from it.

According to the piece, César Hidalgo, head of the Media Lab’s Macro Connections group, who researched the data, made the following claims about the data gathered from Wikipedia:

a) “It shows you how the world perceives your own national culture.”

b) “It’s a socio-cultural mirror.”

c) “We use historical characters as proxies for culture.”

And finally, perhaps most surprising is this final line in the story:

Using this quantitative approach, Hidalgo is now testing hypotheses such as whether cultural development is structured or random. “Can you have a Steve Jobs in a country that has not generated enough science or technology?” he wonders. “Ultimately we want to know how culture assembles itself.”

It is difficult to comment on the particular method used by this study because there is little more than the diagram and a few paragraphs of analysis, and the journalist may have misquoted him. Still, I wanted to draw attention to the statements being made because I think they represent the growing phenomenon of big data analysts using Wikipedia data to make assumptions about 'culture'.

WikiSym Redefined

Ward Cunningham, inventor of the wiki, at the first WikiSym in 2005 which was co-located with ACM OOPSLA in San Diego, California. Pic by Peter Kaminski CC BY on Flickr.
There has been much reflecting and soul-searching about the future of WikiSym in the past year (and probably before that as well). Many felt that the conference was becoming dominated by Wikipedia research and that it needed to grow to encompass more research in the open source, open data and open content realm. I felt that the conference needed to attract more social scientists and qualitative researchers in order to reach a more detailed understanding of how Wikipedia is being integrated into everyday life.

Despite the negatives, everyone felt that WikiSym was and still is the best place for people who do research about Wikipedia and other wikis to gather and that there was a lot of promise in broadening our mandate. This is why I feel so excited about co-chairing a new dedicated Wikipedia track at next year’s WikiSym in Hong Kong along with Mark Graham, also at the Oxford Internet Institute. And that’s why I was also happy that Dirk Riehle, veteran of WikiSym, is at the helm again next year, leading an effort to redesign the event around a changing research landscape.

There are a few key differences to next year’s event:

1. WikiSym 2013 will be held jointly with a new conference called ‘OpenSym’ and the entire event will consist of four tracks dedicated to different research trajectories:

  • Open collaboration (wikis, social media, etc.) research (WikiSym 2013), chaired by Jude Yew of National University of Singapore
  • Wikipedia research (WikiSym 2013), chaired jointly by myself and Mark Graham of the Oxford Internet Institute at the University of Oxford
  • Free, libre, and open source software research (OpenSym 2013), chaired jointly by Jesus M. Gonzalez-Barahona and Gregorio Robles of Universidad Rey Juan Carlos
  • Open access, data, and government research (OpenSym 2013), chaired by Anne Fitzgerald of Queensland University of Technology

This means that Mark and I can focus on getting the very best of Wikipedia research to WikiSym and in thinking hard about what is missing and what needs to be encouraged in the years to come.

Language, identity and Wikipedia: Some perspectives from the Cairo “Wikipedia in the Arab World” workshop

Mark Graham talks about the stated goal of Wikipedia to become the “sum of all human knowledge” while Ahmed Medat waits to translate into Arabic

It was the end of the final day of our workshop on the outskirts of Cairo and we were all feeling that curious mixture of inspiration, energy and exhaustion that follows those meetings where a world of ideas and people and things are thrown together in a concentrated few days. Mark Graham asked each of us if we’d like to say a few parting words and the participants spoke about how they enjoyed meeting Wikipedians from so many places in the Middle East, that they were happy to come to an event with academics and that they were excited about doing something to make a change in the real world. The majority of participants spoke in English – what was for many of them a third or fourth language – while some had their Arabic translated on the fly by other participants.

I was surprised when we got round to Mohamed Amarochan, Wikipedian, Mozilla hacker and blogger from Morocco, and he said that he would like to speak in Arabic. I knew that Mohamed had a really good command of English because I’d spent a fascinating ride with him from the airport on the way to the workshop where we commiserated with one another about visa hardships. When he chose to speak in Arabic and allowed others to translate into English, I realized that Mohamed was making an important statement about how small decisions, like which language you choose to speak in a conversation like this one, have big consequences.

As Clive Holes writes, 'How we speak is an important part of who we are: in a sense, speech is the oral counterpart of how we dress. Both are intimately linked to our sense of self, and of how we present ourselves to, and are seen by, others.' (Holes, 2011)

The politics of truth: Who wins on Wikipedia? A study of what Wikipedia deletes and who it bans

Below is the research proposal that I wrote when I applied to the Oxford Internet Institute (OII) DPhil Programme in November last year. I’m guessing it’s going to evolve some (especially since I want to add some statistical work surrounding citations and translations between languages), but I’m really excited about it as it stands. The wonderful Dr Mark Graham is my supervisor at the OII and I’m lucky to also have Dr Chris Davies as my college advisor (I’m at Kellogg College here). Thank you to the OII for putting me forward for the Clarendon Award and to one of my heroes, Bishop Desmond Tutu, for inspiring part of the award that got me here. Thanks, lastly and mostly, to Dror for inspiring me 🙂 With all these thanks it sounds like I’m at the end. But it’s only the beginning. I’m looking forward to comments and suggestions on how I might discover the answers to this question. I think I’ll certainly hear them in the months and years to come.

Abstract: Wikipedia is, in many ways, the poster child of the Internet Age. It has been singled out as the ultimate working example of the collaborative power of the Internet (Shirky, Tapscott) and what Yochai Benkler calls ‘commons-based peer production’ to describe how the Internet has created radical new opportunities for how we make and exchange information, knowledge, and culture (Benkler, 2009). Part of its popularity comes from its power to influence and inform. As the sixth largest website in the world, with over million users and 90,000 active editors, Wikipedia is becoming one of the most influential reference works in history.

For every broad statement about Wikipedia, however, there are examples on the ground that hint at an alternative reality. The ideal that commentators (many of whom are not involved in editing the encyclopaedia on a daily basis) project is of a unified group of rational, detached, individual editors building a neutral, free encyclopaedia that is “the sum of all human knowledge”. But the organic nature of the encyclopaedia, its culture, politics and architecture have produced and continue to produce an encyclopaedia in which particular tactics, identities and relationships, many of which are in defiance of original rules, often prevail over reasoned and rational dialogue. Wikipedia still has a number of “dark spots”: from uneven geographies of articles written about places (Graham, 2011), to low numbers of female contributors (Lam et al, 2011) and vastly different levels of quality (Duguid, 2006). But there are other dark spots too – spots within the encyclopaedia itself: knowledges that are silenced, perspectives that are marginalised and people that are banned.

Who wins and who loses in this open environment? How do culture, politics, regulations, architecture and identity influence who wins or loses? And what does this mean for the way we think about online collaboration, its power and pitfalls?

Where does ethnography belong? Thoughts on WikiSym 2012

On the first day of WikiSym last week, as we started preparing for the open space track and the crowd was being petitioned for new sessions over lunch, I suddenly thought that it might be a good idea for researchers who used ethnographic methods to get together to talk about the challenges we were facing and the successes we were having. So I took the mic and asked how many people used ethnographic methods in their research. After a few raised their hands, I announced that lunch would be spent talking about ethnography for those who were interested. Almost a dozen people – many of whom are big data analysts – came to listen and talk at a small Greek restaurant in the center of Linz. I was impressed that so many quantitative researchers came to listen and try to understand how they might integrate ethnographic methods into their research. It made me excited about the potential of ethnographic research methods in this community, but by the end of the conference, I was worried about the assumptions on which much of the research on Wikipedia is based, and about what this means for the way that we understand Wikipedia in the world.

WikiSym (Wiki Symposium) is the annual meeting of researchers, practitioners and wiki engineers to talk about everything to do with wikis and open collaboration. Founded by the father of the wiki, Ward Cunningham, and others, the conference started off as a place where wiki engineers would gather to advance the field. Seven years later, WikiSym is dominated by big data quantitative analyses of English Wikipedia.

Some participants were worried about the movement away from engineering topics (like designing better wiki platforms), while others were worried about the fact that Wikipedia (and its platform, MediaWiki) dominates the proceedings, leaving other equally valuable sites like Wikia and platforms like TikiWiki under-studied.

So, in the spirit of the times, I drew up a few rough analyses of papers presented.

It would be interesting to look at this for other years to see whether the recent Big Data trend is having an impact on Wikipedia research and whether research related to Wikipedia (rather than other open collaboration communities) is on the rise. One thing I did notice was that the demo track was a lot larger this year than the previous two years. Hopefully that is a good sign for the future because it is here that research is put into practice through the design of alternative tools. A good example is Jodi Schneider's research on Wikipedia deletions that she then used to conceptualize alternative interfaces that would simplify the process and help to ensure that each article would be dealt with more fairly.

“Writing up rather than writing down”: Becoming Wikipedia Literate

Fail Whale by Flickr CC BY NC SA

Stuart Geiger and I will be presenting our paper about Wikipedia literacy in Linz, Austria for WikiSym 2012 (link below). It’s in the short paper series, and in it we introduce the concept of “trace literacy”, a multi-faceted theory of literacy that sheds light on what new knowledges and organizational forms are required to improve participation in Wikipedia’s communities. The paper focuses on three short case studies about the misunderstandings resulting from article deletions in the past year and relates them to three key problems that literacy practitioner and scholar Richard Darville outlined in his English literacy research. Two of the case studies are from interviews that we did with Kenyan Wikipedians, and the other concerns the Haymarket affair article controversy. Literacy, we believe, has a lot more to do with users being able to understand the complex traces left by experienced editors and how, where and when to argue their case, than simply learning how MediaWiki syntax works.

“Writing up rather than writing down”: Becoming Wikipedia Literate H. Ford and S. Geiger, WikiSym ’12, Aug 27–29, 2012, Linz, Austria