February 2013: The Openness Edition

First published on ethnographymatters.net.

Last month on Ethnography Matters, we started a monthly thematic focus in which each of the EM contributing editors elicits posts about a particular theme. I kicked us off with a theme entitled ‘The Openness Edition’, in which we investigated what openness means for the ethnographic community. I ended up editing some wonderful posts on the topic of openness last month, from Rachelle Annechino’s great post questioning what “informed consent” means in health research, to Jenna Burrell’s post about open access journals related to ethnography and Sarah Kendzior’s stimulating piece about the legitimacy and place of Internet research by anthropologists. We also had two really wonderful pieces sharing methods for more open, transparent research: one by Juliano Spyer (YouTube “video tags” as an open survey tool) and one by Jeff Hall, Elizabeth Gin and An Xiao, an inspiring piece about how they facilitated story-building exercises with homeless youth in Boyle Heights (complete with PDF instructions!). Below is the editorial that I wrote at the beginning of the month, where I try to tease out some of the complexities of my own relationship with the open access/open content movement. Comments welcome!

On Saturday the 12th of January, almost a month ago, I woke to news of Aaron Swartz’s death the previous day. In the days that followed, I experienced the mixed emotions that accompany such horrific moments: sadness for him and the pain he must have gone through in struggling with depression and anxiety, anger at those who had waged an exaggerated legal campaign against him, uncertainty as I posted about his death on Facebook and felt like I was trying to claim some part of him and his story, and finally resolution that I needed to clarify my own policy on open access.

Why Wikipedia is no ‘proxy for culture’ (Part 1 of 3)

First posted at EthnographyMatters.net

Last month’s Wired magazine showed an infographic with a headline that read: ‘History’s most influential people, ranked by Wikipedia reach’ with a group of 20 men arranged in hierarchical order, from Jesus at number 1 to Stalin at number 20. Curious, I wondered how ‘influence’ and ‘Wikipedia reach’ were being decided. According to the article, ‘Rankings (were) based on parameters such as the number of language editions in which that person has a page, and the number of people known to speak those languages’. What really surprised me was not the particular arrangement of figures on this page but the conclusions that were being drawn from it.

According to the piece, César Hidalgo, head of the Media Lab’s Macro Connections group, who researched the data, made the following claims about the data gathered from Wikipedia:

a) “It shows you how the world perceives your own national culture.”

b) “It’s a socio-cultural mirror.”

c) “We use historical characters as proxies for culture.”

And finally, perhaps most surprising is this final line in the story:

Using this quantitative approach, Hidalgo is now testing hypotheses such as whether cultural development is structured or random. “Can you have a Steve Jobs in a country that has not generated enough science or technology?” he wonders. “Ultimately we want to know how culture assembles itself.”

It is difficult to comment on the particular method used by this study because there is little more than the diagram and a few paragraphs of analysis, and the journalist may have misquoted him, but I wanted to draw attention to the statements being made because I think they represent the growing phenomenon of big data analysts using Wikipedia data to make assumptions about ‘culture’.

Beyond reliability: An ethnographic study of Wikipedia sources

First published on Ethnographymatters.net and Ushahidi.com 

Almost a year ago, I was hired by Ushahidi to work as an ethnographic researcher on a project to understand how Wikipedians managed sources during breaking news events. Ushahidi cares a great deal about this kind of work because of a new project called SwiftRiver that seeks to collect and enable the collaborative curation of streams of data from the real time web about a particular issue or event. If another Haiti earthquake happened, for example, would there be a way for us to filter out the irrelevant content and the misinformation, and build a stream of relevant, meaningful and accurate content about what was happening for those who needed it? And on Wikipedia’s side, could the same tools be used to help editors curate a stream of relevant sources as a team rather than as individuals?

Original designs for voting a source up or down in order to determine “veracity”

When we first started thinking about the problem of filtering the web, we naturally thought of a ranking system which would rank sources according to their reliability or veracity. The algorithm would consider a variety of variables involved in determining accuracy, as well as whether sources had been chosen or voted up or down by users in the past, and would eventually be able to suggest sources according to the subject at hand. My job would be to determine what those variables were, i.e. what were editors looking at when deciding whether to use a source or not?

What does it mean to be a participant observer in a place like Wikipedia?

This post first appeared on Ethnography Matters on May 1.

The vision of an ethnographer physically going to a place, establishing themselves in the activities of that place, talking to people and developing deeper understandings seems so much simpler than the same activities in multifaceted spaces like Wikipedia. Researching how Wikipedians manage and verify information in rapidly evolving news articles in my latest ethnographic assignment, I sometimes wish I could simply go to the article as I would to a place, sit down and have a chat with the people around me.

Wikipedia conversations are asynchronous (sometimes with whole weeks or months between replies among editors) and it has proven extremely complicated to work out who said what when, let alone to contact the editors and have live conversations with them. I’m beginning to realise how much physical presence is a part of the trust building exercise. If I want to connect with a particular Wikipedia editor, I can only email them or write a message on their talk page, and I often don’t have a lot to go on when I’m doing these things. I often don’t know where they’re from or where they live or who they really are beyond the clues they give me on their profile pages.

Update on the Wikipedia sources project

This post first appeared on the Ushahidi blog.

Last month I presented the first results of the WikiSweeper project, an ethnographic research project to understand how Wikipedia editors track, evaluate and verify sources on rapidly evolving pages of Wikipedia, the results of which will inform ongoing development of the SwiftRiver (then Sweeper) platform. Wikipedians are some of the most sophisticated managers of online sources and we were excited to learn how they collaboratively decide which sources to use and which to dismiss in the first days of the 2011 Egyptian Revolution. In the past few months, I’ve interviewed users from the Middle East, Kenya, Mexico and the United States, studied hundreds of ‘talk pages’ from the article and analysed edits, users and references from the article, and compared these findings to what Wikipedia policy says about sources. In the end, I came up with four key findings that I’m busy refining for the upcoming report:

1. The source (the original version of the article and its author) of the page can play a significant role: Wikipedia policy indicates that characteristics of the book, author and publishers of an article’s citations all affect reliability. But the 2011 Egyptian Revolution article showed how influential the Wikipedia editor who writes the first version of the page can be. Making Wikipedia editors’ reputations, edit histories, etc. more easily readable is a critical component of understanding points of view while editing and reading rapidly evolving Wikipedia articles.