Crowd Wisdom

I just posted the article about Ushahidi and its future challenges that was published in the Index on Censorship last month (‘Crowd Wisdom’ by Heather Ford in Index on Censorship December 2012, vol. 41, no. 4, 33-39, doi: 10.1177/0306422012465800). I wrote about Ushahidi’s emergence as a powerful tool used in countries around the world to document elections, disasters and food – among others – and the coming challenges as the majority of Ushahidi implementations remain ‘small data’ projects and as tools move towards automatic verification, something only possible with ‘Big Data’.

Why Wikipedia is no ‘proxy for culture’ (Part 1 of 3)

First posted at EthnographyMatters.net

Last month’s Wired magazine showed an infographic with a headline that read: ‘History’s most influential people, ranked by Wikipedia reach’ with a group of 20 men arranged in hierarchical order, from Jesus at number 1 to Stalin at number 20. Curious, I wondered how ‘influence’ and ‘Wikipedia reach’ were being decided. According to the article, ‘Rankings (were) based on parameters such as the number of language editions in which that person has a page, and the number of people known to speak those languages’. What really surprised me was not the particular arrangement of figures on this page but the conclusions that were being drawn from it.

According to the piece, César Hidalgo, head of the Media Lab’s Macro Connections group, who researched the data, made the following claims about the data gathered from Wikipedia:

a) “It shows you how the world perceives your own national culture.”

b) “It’s a socio-cultural mirror.”

c) “We use historical characters as proxies for culture.”

And finally, perhaps most surprising is this final line in the story:

Using this quantitative approach, Hidalgo is now testing hypotheses such as whether cultural development is structured or random. “Can you have a Steve Jobs in a country that has not generated enough science or technology?” he wonders. “Ultimately we want to know how culture assembles itself.”

It is difficult to comment on the particular method used by this study because there is little more than the diagram and a few paragraphs of analysis, and the journalist may have misquoted him, but I wanted to draw attention to the statements being made because I think they represent the growing phenomenon of big data analysts using Wikipedia data to make assumptions about ‘culture’.

WikiSym Redefined

Ward Cunningham, inventor of the wiki, at the first WikiSym in 2005 which was co-located with ACM OOPSLA in San Diego, California. Pic by Peter Kaminski CC BY on Flickr.

There has been much reflecting and soul-searching about the future of WikiSym in the past year (and probably before that as well). Many felt that the conference was becoming dominated by Wikipedia research and that it needed to grow to encompass more research in the open source, open data and open content realm. I felt that the conference needed to attract more social scientists and qualitative researchers in order to reach a more detailed understanding of how Wikipedia is being integrated into everyday life.

Despite the negatives, everyone felt that WikiSym was and still is the best place for people who do research about Wikipedia and other wikis to gather and that there was a lot of promise in broadening our mandate. This is why I feel so excited about co-chairing a new dedicated Wikipedia track at next year’s WikiSym in Hong Kong along with Mark Graham, also at the Oxford Internet Institute. And that’s why I was also happy that Dirk Riehle, veteran of WikiSym, is at the helm again next year, leading an effort to redesign the event around a changing research landscape.

There are a few key differences to next year’s event:

1. WikiSym 2013 will be held jointly with a new conference called ‘OpenSym’ and the entire event will consist of four tracks dedicated to different research trajectories:

  • Open collaboration (wikis, social media, etc.) research (WikiSym 2013), chaired by Jude Yew of National University of Singapore
  • Wikipedia research (WikiSym 2013), chaired jointly by myself and Mark Graham of the Oxford Internet Institute at the University of Oxford
  • Free, libre, and open source software research (OpenSym 2013), chaired jointly by Jesus M. Gonzalez-Barahona and Gregorio Robles of Universidad Rey Juan Carlos
  • Open access, data, and government research (OpenSym 2013), chaired by Anne Fitzgerald of Queensland University of Technology

This means that Mark and I can focus on getting the very best of Wikipedia research to WikiSym and on thinking hard about what is missing and what needs to be encouraged in the years to come.

Language, identity and Wikipedia: Some perspectives from the Cairo “Wikipedia in the Arab World” workshop

Mark Graham talks about the stated goal of Wikipedia to become the “sum of all human knowledge” while Ahmed Medat waits to translate into Arabic

It was the end of the final day of our workshop on the outskirts of Cairo and we were all feeling that curious mixture of inspiration, energy and exhaustion that follows those meetings where a world of ideas and people and things are thrown together in a concentrated few days. Mark Graham asked each of us if we’d like to say a few parting words and the participants spoke about how they enjoyed meeting Wikipedians from so many places in the Middle East, that they were happy to come to an event with academics and that they were excited about doing something to make a change in the real world. The majority of participants spoke in English – what was for many of them a third or fourth language – while some had their Arabic translated on the fly by other participants.

I was surprised when we got round to Mohamed Amarochan, Wikipedian, Mozilla hacker and blogger from Morocco, and he said that he would like to speak in Arabic. I knew that Mohamed had a really good command of English because I’d spent a fascinating ride with him from the airport on the way to the workshop where we commiserated with one another about visa hardships. When he chose to speak in Arabic and allowed others to translate into English, I realized that Mohamed was making an important statement about how small decisions, like which language you choose to speak in a conversation like this one, have big consequences.

As Clive Holes writes, ‘How we speak is an important part of who we are: in a sense, speech is the oral counterpart of how we dress. Both are intimately linked to our sense of self, and of how we present ourselves to, and are seen by, others.’ (Holes, 2011)

The politics of truth: Who wins on Wikipedia? A study of what Wikipedia deletes and who it bans

Below is the research proposal that I wrote when I applied to the Oxford Internet Institute (OII) DPhil Programme in November last year. I’m guessing it’s going to evolve some (especially since I’m wanting to add some statistical work surrounding citations and translations between languages), but I’m really excited about it as it stands. The wonderful Dr Mark Graham is my supervisor at the OII and I’m lucky to also have Dr Chris Davies as my college advisor (I’m at Kellogg College here). Thank you to the OII for putting me forward for the Clarendon Award and to one of my heroes, Bishop Desmond Tutu, for inspiring part of the award that got me here. Thanks, lastly and mostly, to Dror for inspiring me :) With all these thanks it sounds like I’m at the end. But it’s only the beginning. I’m looking forward to comments and suggestions on how I might discover the answers to this question. I think I’ll certainly hear them in the months and years to come.

Abstract: Wikipedia is, in many ways, the poster child of the Internet Age. It has been singled out as the ultimate working example of the collaborative power of the Internet (Shirky, Tapscott) and of what Yochai Benkler calls ‘commons-based peer production’, a term describing how the Internet has created radical new opportunities for how we make and exchange information, knowledge, and culture (Benkler, 2009). Part of its popularity comes from its power to influence and inform. As the sixth largest website in the world, with over million users and 90,000 active editors, Wikipedia is becoming one of the most influential reference works in history.

For every broad statement about Wikipedia, however, there are examples on the ground that hint at an alternative reality. The ideal that commentators (many of whom are not involved in editing the encyclopaedia on a daily basis) project is of a unified group of rational, detached, individual editors building a neutral, free encyclopaedia that is “the sum of all human knowledge”. But the organic nature of the encyclopaedia, its culture, politics and architecture have produced and continue to produce an encyclopaedia in which particular tactics, identities and relationships, many of which are in defiance of original rules, often prevail over reasoned and rational dialogue. Wikipedia still has a number of “dark spots”: from uneven geographies of articles written about places (Graham, 2011), to low numbers of female contributors (Lam et al, 2011) and vastly different levels of quality (Duguid, 2006). But there are other dark spots too – spots within the encyclopaedia itself: knowledges that are silenced, perspectives that are marginalised and people that are banned.

Who wins and who loses in this open environment? How do culture, politics, regulations, architecture and identity influence who wins or loses? And what does this mean for the way we think about online collaboration, its power and pitfalls?

A new chapter: hFord in oxFord

After four months of travel to visit friends in amazing places and visiting some wild places on my own, I have at last settled down in Oxford for my next adventure: three or four years doing my DPhil here at Oxford University. Sometimes I have to pinch myself to believe it!

This was my itinerary from June to October:

San Francisco – Johannesburg (with family) – Cape Town (with Liv) – Johannesburg – Rome (with Steph) – Falerone (with Steph and James and Jon) – Naples – Ravello – Vescovado di Murlo (with Sarah and Eric and Ellie and Helena) – Rome – Washington D.C. (for Wikimania) – Rome – Tel Aviv (with Elad) – Jerusalem – Tiberias – Ashdod – Tel Aviv – Berlin (with Vicky and Alex) – Münster (with Judy and Meinfred) – Baden-Baden – Berlin – Linz (for WikiSym) – Johannesburg – Exeter (with mom) – Padstow – Penzance – Torquay – Oxford – Painswick – Oxford (me, just me)

So many adventures were had. It wasn’t easy (it’s no surprise that the word ‘travel’ comes from the word ‘travail’, to toil, or labor) but I was surprised at how I felt like I could do this forever – wander from one place to the next, visiting friends and peeking in on their lives. Because of the visa insanity and the fact that I need a lobotomy, I didn’t have a camera (not even my iPhone!) for most of the trip. I really wanted to capture everything and so I drew a lot. This, below, was one of my favorite moments:

Image

Where does ethnography belong? Thoughts on WikiSym 2012

First posted at Ethnographymatters

On the first day of WikiSym last week, as we started preparing for the open space track and the crowd was being petitioned for new sessions over lunch, I suddenly thought that it might be a good idea for researchers who used ethnographic methods to get together to talk about the challenges we were facing and the successes we were having. So I took the mic and asked how many people used ethnographic methods in their research. After a few raised their hands, I announced that lunch would be spent talking about ethnography for those who were interested. Almost a dozen people – many of whom are big data analysts – came to listen and talk at a small Greek restaurant in the center of Linz. I was impressed that so many quantitative researchers came to listen and try to understand how they might integrate ethnographic methods into their research. It made me excited about the potential of ethnographic research methods in this community, but by the end of the conference, I was worried about the assumptions on which much of the research on Wikipedia is based, and about what this means for the way that we understand Wikipedia in the world.

WikiSym (Wiki Symposium) is the annual meeting of researchers, practitioners and wiki engineers to talk about everything to do with wikis and open collaboration. Founded by the father of the wiki, Ward Cunningham and others, the conference started off as a place where wiki engineers would gather to advance the field. Seven years later, WikiSym is dominated by big data quantitative analyses of English Wikipedia.

Some participants were worried about the movement away from engineering topics (like designing better wiki platforms), while others were worried about the fact that Wikipedia (and its platform, MediaWiki) dominates the proceedings, leaving other equally valuable sites like Wikia and platforms like TikiWiki under-studied.

So, in the spirit of the times, I drew up a few rough analyses of papers presented.

It would be interesting to look at this for other years to see whether the recent Big Data trend is having an impact on Wikipedia research and whether research related to Wikipedia (rather than other open collaboration communities) is on the rise. One thing I did notice was that the demo track was a lot larger this year than the previous two years. Hopefully that is a good sign for the future because it is here that research is put into practice through the design of alternative tools. A good example is Jodi Schneider’s research on Wikipedia deletions that she then used to conceptualize alternative interfaces that would simplify the process and help to ensure that each article would be dealt with more fairly.

“Writing up rather than writing down”: Becoming Wikipedia Literate

Fail Whale by Flickr CC BY NC SA

Stuart Geiger and I will be presenting our paper about Wikipedia literacy in Linz, Austria for WikiSym 2012 (link below). It’s in the short paper series, and in it we introduce the concept of “trace literacy”, a multi-faceted theory of literacy that sheds light on what new knowledges and organizational forms are required to improve participation in Wikipedia’s communities. The paper focuses on three short case studies about the misunderstandings resulting from article deletions in the past year and relates them to three key problems that the literacy practitioner and scholar Richard Darville outlined in his English literacy research. Two of the case studies are from interviews that we did with Kenyan Wikipedians, and the other concerns the Haymarket affair article controversy. Literacy, we believe, has a lot more to do with users being able to understand the complex traces left by experienced editors and how, where and when to argue their case, than simply learning how MediaWiki syntax works.

“Writing up rather than writing down”: Becoming Wikipedia Literate H. Ford and S. Geiger, WikiSym ’12, Aug 27–29, 2012, Linz, Austria

Beyond reliability: An ethnographic study of Wikipedia sources

First published on Ethnographymatters.net and Ushahidi.com 

Almost a year ago, I was hired by Ushahidi to work as an ethnographic researcher on a project to understand how Wikipedians managed sources during breaking news events. Ushahidi cares a great deal about this kind of work because of a new project called SwiftRiver that seeks to collect and enable the collaborative curation of streams of data from the real time web about a particular issue or event. If another Haiti earthquake happened, for example, would there be a way for us to filter out the irrelevant, the misinformation and build a stream of relevant, meaningful and accurate content about what was happening for those who needed it? And on Wikipedia’s side, could the same tools be used to help editors curate a stream of relevant sources as a team rather than individuals?

Original designs for voting a source up or down in order to determine “veracity”

When we first started thinking about the problem of filtering the web, we naturally thought of a ranking system which would rank sources according to their reliability or veracity. The algorithm would consider a variety of variables involved in determining accuracy as well as whether sources have been chosen, voted up or down by users in the past, and eventually be able to suggest sources according to the subject at hand. My job would be to determine what those variables are i.e. what were editors looking at when deciding whether to use a source or not?
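To make the idea of vote-based ranking concrete, here is a minimal Python sketch. It is an illustration only, not SwiftRiver's or Ushahidi's actual algorithm: the Source class, the veracity_score function and the prior_votes smoothing parameter are all hypothetical names, and the only variable considered is the history of up and down votes.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A hypothetical source record with its voting history."""
    name: str
    upvotes: int = 0
    downvotes: int = 0


def veracity_score(source: Source, prior_votes: int = 2) -> float:
    """Smoothed share of upvotes, between 0 and 1.

    prior_votes is an assumed smoothing term: it acts like that many
    imaginary votes split evenly, so an unrated source scores 0.5
    instead of causing a division by zero.
    """
    total = source.upvotes + source.downvotes
    return (source.upvotes + prior_votes / 2) / (total + prior_votes)


def rank_sources(sources: list[Source]) -> list[Source]:
    """Return sources ordered from highest to lowest score."""
    return sorted(sources, key=veracity_score, reverse=True)


if __name__ == "__main__":
    candidates = [
        Source("often-flagged blog", upvotes=1, downvotes=5),
        Source("new eyewitness account"),
        Source("established newswire", upvotes=8, downvotes=2),
    ]
    for s in rank_sources(candidates):
        print(f"{s.name}: {veracity_score(s):.2f}")
```

A score like this only captures past votes; any other variable that editors weigh when choosing a source would have to be added as a further input.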

What does it mean to be a participant observer in a place like Wikipedia?

This post first appeared on Ethnography Matters on May 1.

The vision of an ethnographer physically going to a place, establishing themselves in the activities of that place, talking to people and developing deeper understandings seems so much simpler than the same activities in multifaceted spaces like Wikipedia. Researching how Wikipedians manage and verify information in rapidly evolving news articles for my latest ethnographic assignment, I sometimes wish I could simply go to the article as I would to a place, sit down and have a chat to the people around me.

Wikipedia conversations are asynchronous (sometimes with whole weeks or months between replies among editors) and it has proven extremely complicated to work out who said what when, let alone to contact the editors and have live conversations with them. I’m beginning to realise how much physical presence is a part of the trust building exercise. If I want to connect with a particular Wikipedia editor, I can only email them or write a message on their talk page, and I often don’t have a lot to go on when I’m doing these things. I often don’t know where they’re from or where they live or who they really are beyond the clues they give me on their profile pages.