Below is the research proposal that I wrote when I applied to the Oxford Internet Institute (OII) DPhil Programme in November last year. I’m guessing it’s going to evolve some (especially since I’m wanting to add some statistical work surrounding citations and translations between languages), but I’m really excited about it as it stands. The wonderful Dr Mark Graham is my supervisor at the OII and I’m lucky to also have Dr Chris Davies as my college advisor (I’m at Kellogg College here). Thank you to the OII for putting me forward for the Clarendon Award and to one of my heroes, Bishop Desmond Tutu, for inspiring part of the award that got me here. Thanks, lastly and mostly, to Dror for inspiring me :) With all these thanks it sounds like I’m at the end. But it’s only the beginning. I’m looking forward to comments and suggestions on how I might discover the answers to this question. I think I’ll certainly hear them in the months and years to come.
Abstract: Wikipedia is, in many ways, the poster child of the Internet Age. It has been singled out as the ultimate working example of the collaborative power of the Internet (Shirky, Tapscott) and of what Yochai Benkler calls ‘commons-based peer production’, a term he uses to describe how the Internet has created radical new opportunities for how we make and exchange information, knowledge, and culture (Benkler, 2007). Part of its popularity comes from its power to influence and inform. As the sixth largest website in the world, with over million users and 90,000 active editors, Wikipedia is becoming one of the most influential reference works in history.
For every broad statement about Wikipedia, however, there are examples on the ground that hint at an alternative reality. The ideal that commentators (many of whom are not involved in editing the encyclopaedia on a daily basis) project is of a unified group of rational, detached, individual editors building a neutral, free encyclopaedia that is “the sum of all human knowledge”. But the organic nature of the encyclopaedia, its culture, politics and architecture have produced and continue to produce an encyclopaedia in which particular tactics, identities and relationships, many of which are in defiance of original rules, often prevail over reasoned and rational dialogue. Wikipedia still has a number of “dark spots”: from uneven geographies of articles written about places (Graham, 2011), to low numbers of female contributors (Lam et al., 2011) and vastly different levels of quality (Duguid, 2006). But there are other dark spots too – spots within the encyclopaedia itself: knowledges that are silenced, perspectives that are marginalised and people who are banned.
Who wins and who loses in this open environment? How do culture, politics, regulations, architecture and identity influence who wins or loses? And what does this mean for the way we think about online collaboration, its power and pitfalls?
I hope to answer these questions in an ethnographic study of Wikipedia’s marginalised knowledges, its deleted pages and banned users. Ethnography, a rich bouquet of methods that stresses the importance of theory grounded in the everyday experience of users, offers an opportunity for methodological innovation and a new lens for looking at Wikipedia’s missing pieces. Using Burrell’s “field site as network” (2009), I will construct a field site that retains coherence even though it traverses multiple language editions and places. Beginning with the story of “drork”, a banned editor of the Hebrew, English and Arabic versions, I will go on to use “trace ethnography” (Geiger and Ribes, 2011) to analyse and visualize trends in deleted articles, history pages, banned users, related talk pages, bots and mailing lists, and iteratively zoom back in to the ground-level reality to interview other banned users and those involved in deleted pages discussions.
In doing so, I hope to shed light on what I believe is one of the most important questions of our time: does new technology offer an opportunity for people to collaboratively develop mutually accommodating truths? What new power relations are being built around these new knowledges and this new “visibility” (Tkacz, 2007) of marginalised knowledges that Wikipedia architecture represents? Understanding where Wikipedia is failing, who it bans and what it deletes as Wikipedians who edit multiple versions of the encyclopedia navigate its rules, architecture, norms and market to build pages together is a critical piece in that puzzle.
Rationale: In the introduction to “Critical Point of View: A Wikipedia Reader”, published earlier this year, Geert Lovink and Nathaniel Tkacz reflect on Wikipedia’s tenth anniversary celebrations, declaring that, although there might be new voices that comment on Wikipedia, the terms of debate about the encyclopaedia are still very narrow (Lovink and Tkacz, 2011, p.10). They write that at this juncture in Wikipedia’s history what is missing is an informed, radical critique from the inside. ‘What does Wikipedia research look like when the focus is no longer on the novelties of (open) collaboration or on whether Wikipedia is trustworthy and accurate?’ (Lovink and Tkacz, 2011, p.11)
Much of the research on Wikipedia has, until fairly recently, focused on the encyclopaedia as a novel phenomenon – a surprising project in which thousands of volunteers contribute their labour for free and in which anyone can edit a page and contribute to its growth. The question seemed to be: why does Wikipedia work? Or, as Yochai Benkler asks: ‘Why can fifty thousand volunteers successfully coauthor Wikipedia, the most serious online alternative to the Encyclopedia Britannica, and then turn around and give it away for free?’ (Benkler, 2007, p.5)
But Wikipedia is no longer novel: it has become deeply ingrained into everyday life. And while the question about why Wikipedia works seems to have been answered in the light of such novelty, the question about why and where it doesn’t work seems to have been less well analysed.
The entry point to this research investigating Wikipedia’s dark spots is the story of a banned editor, drork. drork is one of the thousands of editors that Wikipedia loses each month, editors who are not being replaced as rapidly as they were in earlier years. In November 2009, Felipe Ortega published his PhD thesis, in which he found that the English Wikipedia had lost 49,000 editors during the first three months of 2009, as opposed to 4,900 editors during the same period in 2008, prompting claims that Wikipedia may be in decline. Recent numbers have confirmed this slow decline but we still know very little about why Wikipedia is losing more editors than it is gaining.
A long-time editor of the English, Hebrew and Arabic Wikipedias on Middle East topics, and a founding member and former board member of Wikimedia Israel, drork was for many years a model Wikipedian. A linguist by training, drork epitomized many of the ideals of Wikipedia: he was committed to the ideals of the encyclopedia, his language skills enabled him to edit across three different language versions in an area rife with disputes, he was transparent and open in his editing (drork is one of the few editors who uses his full name on his user page) and he played a strong leadership role beyond editing, speaking to the media about his experiences editing Middle East topics and assisting with outreach projects to bring offline Wikipedia to countries in Africa.
After about seven years of editing, drork came to believe that Wikipedia is no longer concerned with the “truth” in the tradition of the Enlightenment that it had originally purported to serve. As he lost patience with the edit wars, cabals, ganging up and victimization, drork’s passion and impatience finally got the better of him. In early 2010, in a series of Wikipedia “trials”, he was banned from editing the English encyclopedia for 24 hours and then for six months for disruptive behaviour. Unable to keep from what he admits had become an obsession, drork kept editing under a number of pseudonyms (what Wikipedians call “sockpuppets”) and was repeatedly banned until his recent lifetime ban.
drork’s story is a rich point of entry for researching how consensus is reached and where it seems never to be reached, how deadlocks are exacerbated by the norms and architecture of Wikipedia as well as what kinds of tactics are used to win in articles covering political disputes. His story also sheds light on a very different reality from the picture of the objective, detached, individual editor to show how important identity, place and tactics are to who succeeds in establishing the predominant version of an article on Wikipedia.
This research will be novel not only because it centers on the grounded stories of its editors. It will be novel also because I will traverse a number of different language versions in trying to understand power relations between the different encyclopaedias (most of the previous studies have focused on one version of the encyclopaedia or dealt with the main versions separately). It will be novel because of its focus on deleted articles, marginalised knowledges and banned users, an area that, to my knowledge, has not been studied in depth.
Methodology: At this year’s annual WikiSym, Wikipedia researcher Cliff Lampe critiqued current research initiatives. ‘We’re studying sites, we should be studying people. Wikipedians didn’t come to Wikipedia without baggage. Their cultural and ethnic background is important but the problem is that it’s messy and it’s hard to get that data.’
Many have used statistics and high-level visualizations to try to understand Wikipedia’s missing pieces and people (Ortega, Chi) but there are still no comprehensive on-the-ground studies about why editors leave and why articles are deleted. With much of the current research trying to obtain a coherent, all-encompassing view of the encyclopedia, many of the details and divergences have been lost. Up until fairly recently, research on Wikipedia has assessed and analysed the encyclopaedia as a single, monolithic community that behaves in certain ways, is motivated by specific variables, and acts according to rules set out in its policies.
But on closer inspection, Wikipedia is more of a city than a single community. Like a city, there is constant migration from some areas to highly populated ones: while overall growth of editors is stagnating, the number of editors working on trending news topics is on the rise, and these topics account for a large portion of all traffic and edits to Wikipedia. Like a city, there are places where rules are strictly enforced and places where one can treat a red light as a yield sign: where some editors are banned because of verbal abuse but where others who do the same carry on unchecked. Like a city, Wikipedia needs to be understood in terms of where principles are flourishing and producing and where editors are struggling to keep a clean house, or where all the sane people have left and a militia is holding down the neighbourhood.
Understanding what Wikipedia deletes and who it bans is critical to understanding what it takes to win and have your edits prevail on Wikipedia. In my recent work in this area, I have written about the experiences of Kenyan Wikipedians attempting to write an article about a local superhero in ‘The Missing Wikipedians’ (Ford, 2011) and being continuously reverted. More recently, I undertook a study of deleted articles on the English Wikipedia with Stuart Geiger, in which we found that the vast majority of the hundreds of articles that are deleted by Wikipedia administrators are not spam, vandalism or “patent nonsense” but rather articles which could be considered encyclopedic but do not fit the project’s standards (Geiger and Ford, 2011). The next step is to drill down into understanding, through a comprehensive on-the-ground study, why Wikipedians leave and why articles are deleted.
I would begin this project by setting the boundaries of the field site and defining the subjects and objects of research. But constructing a field-site becomes problematic when studying online communities like Wikipedia where subjects are not conveniently co-located in a consistent physical space. How does one study multi-sited online communities while still taking account of the powerful role of place in defining how technologies are adopted, rejected, absorbed, reflected and shaped?
Some of the most interesting new methods to solve this methodological conundrum have emerged from the field of ethnography which Jenna Burrell defines as “a complex of epistemological framings, methodological techniques, and writing practices (that) has spread into many domains and disciplines beyond its roots in cultural anthropology.” (Burrell, 2009, p.181) In order to reconcile such spatial complexities, Burrell ‘conceived of (her) field site as a network composed of fixed and moving points including space, people and objects’ (Burrell, 2009, p.189).
According to Burrell, there are two key advantages to conceptualizing the field site in this way. First, it enables one to develop unconventional understandings of social practices because it is a structure that can be constructed using the observable connections performed by participants. Second, the “field site as network” produces a continuous space that does not presume proximity or even spatiality in a physical sense.
‘Continuity does not imply homogeneity or unity; it implies connection. The continuity of a network is evident in the way that one point can (through one or more steps) connect to any other point. In a “field site as network,” the point of origin, the destination(s), the space between, and what moves or is carried along these paths is of interest.’ (Burrell, 2009, p.190)
Using Burrell’s field site as network and Marcus’s idea to “follow the person” (Marcus, 1998), I will begin the research by conceptualizing my field site as a network composed of fixed and moving points including spaces, people and objects, as Burrell suggests. I will use grounded theory, an iterative methodology that emphasizes the generation of theory from data in the process of conducting research, as a way of staying close to the lived experience of the community, building levels of abstraction directly from the data, developing theories iteratively and then gathering further data to check and refine the emerging analytic categories (Charmaz, 2006).
The story of banned editor drork is my starting point. From there, I will conduct (further) interviews with him and those he has interacted with, using “trace ethnography” (Geiger and Ribes, 2011) to analyze the pages that played a part in his banning, moving on to analyze and visualize the corpus of deleted pages on Wikipedia and interviewing other banned editors and editors involved in deleted pages.
The people under study include those who have been banned or who have left Wikipedia as well as those involved in deletion discussions (AFDs) and speedily deleted pages (CSDs). Spaces include those in which banned editors have been engaged (talk pages from articles related to the Middle East, arbitration and sockpuppet investigations pages) and those that have been deleted (AFD discussions, CSDs, talk pages). Objects include the traces left behind by bots involved in administrative tasks (Geiger, 2011) as well as traces on Twitter as people discuss what is deleted. I am particularly interested, here, in new thinking around the particular “agency” of non-humans on Wikipedia and their role in directing what is not seen.
Conclusion: Wikipedia, by creating a new knowledge form, is also creating new politics. In a 2007 article entitled ‘Power, visibility, Wikipedia’, Nathaniel Tkacz argues that a new power relation, that of “visibility”, is at play in Wikipedia. Drawing from Deleuze’s (2006) reading of the work of Michel Foucault, Tkacz shows how the edit and history functions within Wikipedia’s architecture reveal what is being silenced, and how the discussion pages become ‘a place for marginalized knowledges’ (Tkacz, 2007, p.14).
‘The task that lies ahead,’ he writes, ‘is to map the relations of power in greater detail and thus provide a new diagram of power to match this new visibility.’ (Tkacz, 2007, p.17) Understanding what Wikipedia silences is critical to understanding these new power relations and answering this question has important implications for questions we still have about peer produced or crowdsourced information. What types of personalities and identities prevail in structureless (Freeman, 1970) organisations? What is the role of architecture, identity, norms and regulations in these new relations of power? And what does this mean for the way we think about online collaboration, its power and pitfalls?
I believe that my current research and rich experience as an “insider” working in the open content and open source software communities for the past decade stand me in good stead to try to answer these questions and to present groundbreaking research that enables us to move beyond statements about the novelty of peer production towards a new generation of research that has Wikipedia users and other knowledge stakeholders at its center.
Benkler, Y. (2007). The Wealth of Networks: How Social Production Transforms Markets and Freedom (p. 528). Yale University Press. Retrieved from http://www.amazon.com/Wealth-Networks-Production-Transforms-Markets/dp/0300125771
Charmaz, K. (2006). Constructing Grounded Theory: A Practical Guide through Qualitative Analysis (1st ed.). Sage Publications Ltd.
Duguid, P. (2006). Limits of self-organization: Peer production and “laws of quality.” First Monday, 11(10). Retrieved from http://frodo.lib.uic.edu/ojsjournals/index.php/fm/article/view/1405
Ford, H. (2011). ‘The Missing Wikipedians’ in Geert Lovink and Nathaniel Tkacz (eds), Critical Point of View: A Wikipedia Reader, Amsterdam: Institute of Network Cultures, 2011. ISBN: 978-90-78146-13-1.
Freeman, J. (1970). The tyranny of structurelessness. Retrieved from http://www.midiaindependente.org/media/2001/07/203242.pdf
Geiger, R.S. (2011). ‘The lives of bots’ in Geert Lovink and Nathaniel Tkacz (eds), Critical Point of View: A Wikipedia Reader, Amsterdam: Institute of Network Cultures, 2011. ISBN: 978-90-78146-13-1.
Geiger, R. S., & Ford, H. (2011). Participation in Wikipedia’s Article Deletion Processes. WikiSym 7th international symposium on wikis and open collaboration.
Geiger, R.S., & Ribes, D. (2011). Trace Ethnography: Following Coordination Through Documentary Practices. In Proceedings of the 44th Annual Hawaii International Conference on Systems Sciences. Retrieved from http://www.stuartgeiger.com/trace-ethnography-hicss-geiger-ribes.pdf
Lih, A. (2009). The Wikipedia Revolution: How a Bunch of Nobodies Created the World’s Greatest Encyclopedia (p. 272). Hyperion. Retrieved from http://www.amazon.com/Wikipedia-Revolution-Nobodies-Greatest-Encyclopedia/dp/1401303714
Ortega, F. (2007). Quantitative analysis of the Wikipedia community of users. Proceedings of the 2007, 75-86. New York, New York, USA: ACM Press. doi:10.1145/1296951.1296960
Reagle, J. (2010). Good Faith Collaboration: The Culture of Wikipedia (p. 264). The MIT Press. Retrieved from http://www.amazon.com/Good-Faith-Collaboration-Foundations-Information/dp/0262014475/ref=sr_1_1?s=books&ie=UTF8&qid=1321268259&sr=1-1
Tapscott, D. (2010). Wikinomics: How Mass Collaboration Changes Everything (p. 368). Portfolio Trade; Expanded edition. Retrieved from http://www.amazon.com/Wikinomics-Mass-Collaboration-Changes-Everything/dp/B004J8HXOA/ref=pd_sim_b_7
Tkacz, N. (2007). Power, Visibility, Wikipedia [online]. Southern Review: Communication, Politics & Culture, 40(2), 5-19. Retrieved from http://search.informit.com.au/documentSummary;dn=946871107175453;res=IELHSS. ISSN: 0038-4526.
Wikimedia project at a glance. (n.d.). Retrieved November 14, 2011, from http://stats.wikimedia.org/EN/SummaryEN.htm
 Now about 90,000 active editors
 in comparison, the project lost only 4,900 editors during the same period in 2008
 Interview, 16 October, 2011
 WikiSym 2011, 7th International Symposium on Wikis and Open Collaboration