What does it mean to be a participant observer in a place like Wikipedia?

This post first appeared on Ethnography Matters on May 1.

The vision of an ethnographer physically going to a place, establishing themselves in the activities of that place, talking to people and developing deeper understandings seems so much simpler than the same activities in multifaceted spaces like Wikipedia. In my latest ethnographic assignment, researching how Wikipedians manage and verify information in rapidly evolving news articles, I sometimes wish I could simply go to the article as I would to a place, sit down and have a chat with the people around me.

Wikipedia conversations are asynchronous (sometimes with whole weeks or months between replies among editors), and it has proven extremely complicated to work out who said what when, let alone to contact editors and have live conversations with them. I’m beginning to realise how much physical presence is part of the trust-building exercise. If I want to connect with a particular Wikipedia editor, I can only email them or write a message on their talk page, and I often don’t have a lot to go on when I do. I often don’t know where they’re from, where they live or who they really are beyond the clues they give me on their profile pages. Continue reading

Update on the Wikipedia sources project

This post first appeared on the Ushahidi blog.

Last month I presented the first results of the WikiSweeper project, an ethnographic research project to understand how Wikipedia editors track, evaluate and verify sources on rapidly evolving pages, the results of which will inform ongoing development of the SwiftRiver (then Sweeper) platform. Wikipedians are some of the most sophisticated managers of online sources, and we were excited to learn how they collaboratively decided which sources to use and which to dismiss in the first days of the 2011 Egyptian Revolution. In the past few months, I’ve interviewed users from the Middle East, Kenya, Mexico and the United States, studied hundreds of ‘talk pages’, analysed the article’s edits, users and references, and compared these findings to what Wikipedia policy says about sources. In the end, I came up with four key findings that I’m busy refining for the upcoming report:

1. The source of the page (the original version of the article and its author) can play a significant role: Wikipedia policy indicates that characteristics of the book, author and publisher of an article’s citations all affect reliability. But the 2011 Egyptian Revolution article showed how influential the Wikipedia editor who writes the first version of the page can be. Making Wikipedia editors’ reputations, edit histories and so on more easily readable is a critical part of understanding points of view while editing and reading rapidly evolving Wikipedia articles. Continue reading

DataEDGE: A conversation about the future of data science

First posted at the Google Policy blog.

With all the hype around “Big Data” lately, you may be inclined to shrug it off as a business fad. But there is more to it than a buzzword. Data science is emerging as a new field, changing the ways that companies get to know their customers, governments their citizens, and relief organizations their constituents. It is a field that will demand entirely new skill sets and information professionals trained to collect, curate, combine, and analyze massive amounts of data.

Today, we create data both actively—as we socialize, conduct business, and organize online—and passively—via a host of remote sensing devices. McKinsey projects a 40% growth in global data generated annually. Companies and organizations are racing to find new ways to make sense of this data and use it to drive decision-making. In the health sector, that includes investigating the clinical and cost effectiveness of new drugs using large datasets. (McKinsey estimates that the efficient and effective use of data could provide as much as $300 billion in value to the United States healthcare sector.) In the public sector, it could mean using historical unemployment data to reduce the amount of time it takes unemployed workers to find new employment. And in the retail sector, it leads to tools that help suppliers understand demand in stores so they know when to restock items. Continue reading

A sociologist’s guide to trust and design

This post first appeared on Ethnography Matters

Trust. The word gets bandied about a lot when talking about the Web today. We want people to trust our systems. Companies are supposedly building “trusted computing” and “designing for trust”.

But, as sociologist Coye Cheshire, Professor at the School of Information at UC Berkeley, will tell you, trust is something that happens between people, not things. When we talk about trust in systems, we’re often actually talking about the related concepts of reliability or credibility.

Designing for trustworthiness

Take trustworthiness, for example. Trustworthiness is a characteristic that we infer from other characteristics. It’s an assessment of a person’s future behaviour, and it’s theoretically linked to concepts like perceived competence and motivation. When we think about whom to ask to watch our bags at the airport, for example, we look around and base our decision to trust someone on perceived competence (do they look like they could apprehend someone who tried to steal something?) and/or motivation (do they look like they need my bag or the things inside it?). Continue reading

Online reputation: it’s contextual

This post was the first in a new category for Ethnography Matters called “A day in the life”. In it, I describe a day at a workshop on online reputation that I attended, reporting on presentations and conversations with folks from Reddit and Stack Overflow, highlighting four key features of successful online reputation systems that came out of their talks.

A screenshot of Reddit.com’s subreddit ‘SnackExchange’, showing its point system

We want to build a reputation system for our new SwiftRiver product at Ushahidi, where members can vote on bits of relevant content related to a particular event. This meant that I was really excited to spend the day yesterday at the start of a fascinating workshop on online reputation organised by a new non-profit organisation called Hypothesis. It seems that Hypothesis is attempting to build a layer on top of the Web that enables users, when encountering new information, to immediately find the best thinking about that information. In the words of Hypothesis founder Dan Whaley, “The idea is to develop a system that lets us see quality insights and information” in order to “improve how we make decisions.” So, for example, when visiting the workshop web page, you might be able to see that people like me (if I “counted” on the reputation quality scale) have written something about that workshop, or about very specific aspects of it, and be able to find out what they (and perhaps even I) think about it. Continue reading
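To make the voting idea a little more concrete, here is a minimal sketch of how up- and down-votes on bits of content might roll up into a per-member reputation score. It is purely illustrative: none of the class names, fields or rules below come from SwiftRiver or Hypothesis.

```python
# A minimal, hypothetical sketch of content voting feeding a reputation score.
# Not SwiftRiver's or Hypothesis's actual design; all names are invented.
from collections import defaultdict


class ReputationLedger:
    """Tracks up/down votes on content items and derives a per-member reputation."""

    def __init__(self):
        self.votes = defaultdict(dict)  # content_id -> {voter_id: +1 or -1}
        self.authors = {}               # content_id -> member who posted it

    def submit(self, content_id, author_id):
        self.authors[content_id] = author_id

    def vote(self, content_id, voter_id, value):
        if value not in (+1, -1):
            raise ValueError("only up-votes (+1) or down-votes (-1) are allowed")
        self.votes[content_id][voter_id] = value  # one vote per member per item

    def content_score(self, content_id):
        return sum(self.votes[content_id].values())

    def reputation(self, member_id):
        # A member's reputation is simply the sum of the scores of their content.
        return sum(self.content_score(cid)
                   for cid, author in self.authors.items() if author == member_id)
```

Even a toy like this surfaces the questions the workshop grappled with: whose votes count, and does reputation earned in one context carry over to another?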

Can Ushahidi Rely on Crowdsourced Verifications?

First published on PBS Idea Lab

During the aftermath of the Chilean earthquake last year, the Ushahidi-Chile team received two reports — one through the platform, the other via Twitter — that indicated an English-speaking foreigner was trapped under a building in Santiago.

“Please send help,” the report read. “i am buried under rubble in my home at Lautaro 1712 Estación Central, Santiago, Chile. My phone doesnt work.”

A few hours later, a second, similar report was sent to the platform via Twitter: “RT @biodome10: plz send help to 1712 estacion central, santiago chile. im stuck under a building with my child. #hitsunami #chile we have no supplies.”


An investigation a few days later revealed that both reports were false and that the Twitter user was impersonating a journalist working for the Dallas Morning News. But the revelation did not come in time to stop two police deployments in Santiago, which leaped to the rescue before realizing that the area had not been affected by the quake and that the couple living there was alive and well.

Is false information like this just a necessary by-product of “crowdsourced” environments like Ushahidi? Or do we need to do more to help deployment teams, emergency personnel and users better assess the accuracy of reports hosted on our platform?

Ushahidi is a non-profit tech company that develops free and open-source software for information collection, visualization and interactive mapping. We’ve just published an initial study of how Ushahidi deployment teams manage and understand verification on the platform. Doing this research has surfaced a couple of key challenges about the way that verification currently works, as well as a few easy wins that might add some flexibility into the system. It’s also revealed some questions as we look to improve the platform’s ability to do verification on large quantities of data in the future.

What We’ve Learned

We’ve learned that we need to add more flexibility into the system, enabling deployment teams to choose whether or not they want to use the “verified” and “unverified” tagging functionality. We’ve learned that the binary terms we’re currently using don’t capture other attributes of reports that are necessary for establishing both trust and “actionability” (i.e., the ability to act on the information). For example, the “unverified” tag does not capture whether a report is considered an act of “misinformation” or just incomplete, lacking the contextual clues necessary to determine whether it is accurate.
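As an illustration of what richer attributes might look like, here is a rough sketch of a report model that carries a status beyond the binary tag, along with contextual notes. The status names and fields are my own inventions, not part of the Ushahidi platform.

```python
# A hypothetical report model that goes beyond a binary verified/unverified flag.
# Status values and fields are illustrative, not part of the Ushahidi platform.
from dataclasses import dataclass, field
from enum import Enum


class VerificationStatus(Enum):
    UNREVIEWED = "unreviewed"          # nobody has assessed the report yet
    INCOMPLETE = "incomplete"          # plausible, but missing contextual clues
    MISINFORMATION = "misinformation"  # judged to be deliberately false
    VERIFIED = "verified"              # confirmed by the deployment team


@dataclass
class Report:
    text: str
    source: str  # e.g. SMS, Twitter, web form
    status: VerificationStatus = VerificationStatus.UNREVIEWED
    context_notes: list = field(default_factory=list)  # the who/what/where/when/why

    def is_actionable(self) -> bool:
        # Actionability is a separate judgement from verification: a team might
        # act on an incomplete report but not on suspected misinformation.
        return self.status != VerificationStatus.MISINFORMATION
```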

We need to develop more flexibility to accommodate these different attributes, but we also need to think beyond these final determinations and understand that users might want contextual information (rather than a final determination on its verification status) to determine for themselves whether a report is trustworthy or not. After all, verification tags mean nothing unless those who must make decisions based on that information trust the team doing the verification.

The fact that many deployments are set up by teams of concerned citizens who may never have worked together before, and who are therefore unknown to the user organizations, makes this an important requirement. Here, we’re thinking of the administering deployment team’s job as providing information about the context of a report (answering the who, what, where, when, how and why of traditional journalism, perhaps) and inviting others to help flesh out this information, rather than acting as a “black box” in which the process for determining whether something is verified is opaque to users.

As an organization that is all about “crowdsourcing,” we’re taking a step back and thinking about how the crowd (i.e., people who are not known to the system) might assist in either providing more context for reports or verifying unverified reports. When I talk about the “crowd” here I’m referring to a system that’s permeable to interactions by those we don’t yet know. It’s important to note here that, although Ushahidi is talked about as an example of crowdsourcing, this doesn’t mean that the entire process of submission, publishing, tagging and commenting is open for all. Although anyone can start a map and send a report to the map, only administrators can approve and publish reports or tag a report as “verified.”

How Will Crowdsourcing Verification Work?

If we were to open up this process to “the crowd,” we’d have to think very carefully about the options for facilitating verification by the crowd, many of which won’t work in every deployment. Variables like scale, location and persistence differ in each deployment and affect where crowdsourcing verification will work and where it will do more harm than good.

Crowdsourcing verification can mean many different things. It could mean flagging reports that need more context and asking for more information from the crowd. But who makes the final decision that enough information has been provided to change the status of that information?

We could think of using the crowd to determine when a statistically significant portion of a community agrees with changing the status of a report to “verified.” But is this option limited to cases where a large volume of people are interested in (and informed about) an issue, and could a volume-based indicator like this be gamed, especially in political contexts?
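To see how quickly those questions bite, here is a hedged sketch of such a volume-based rule, with an invented quorum and agreement threshold (neither number comes from any real deployment):

```python
# Illustrative rule for promoting a report to "verified" by crowd agreement.
# The quorum and threshold are arbitrary placeholders, not Ushahidi values.
def crowd_agrees(votes_for: int, votes_against: int,
                 min_votes: int = 30, min_agreement: float = 0.8) -> bool:
    total = votes_for + votes_against
    if total < min_votes:
        # Too few voters: the rule stays silent rather than changing the status.
        return False
    return votes_for / total >= min_agreement
```

Even a rule this simple has to hard-code a quorum and a threshold, and a coordinated group could clear both, which is exactly the gaming worry raised above.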

Crowdsourcing verification could also mean giving users the opportunity to add free-form tags that highlight the context of the data and then surfacing the tags that prove popular. But again, might this only be accurate when large numbers of users are involved and the number of reports is low? Do we employ an algorithm to rank the quality of reports based on the history of their authors? It’s tempting to imagine that an algorithm alone will solve the data-volume challenge, but algorithms do not work in many cases (especially when reports may be sent by people who have no history of using these tools), and if they’re untrusted, they might push users to hack the system to enable their own processes.
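If we did try ranking by author history, a naive version might look like the sketch below. The weighting is invented purely for illustration, and it shows the weakness just described: it has almost nothing to say about first-time reporters.

```python
# A naive, hypothetical ranking of reports by their author's track record.
def author_history_score(verified_count: int, total_count: int) -> float:
    # Unknown or first-time authors get a neutral prior of 0.5 rather than zero,
    # but the score still tells us very little about them.
    if total_count == 0:
        return 0.5
    return verified_count / total_count


def rank_reports(reports):
    # reports: iterable of (report_id, author_verified_count, author_total_count)
    return sorted(reports,
                  key=lambda r: author_history_score(r[1], r[2]),
                  reverse=True)
```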

An Enduring Question

Verification by the crowd is indeed a large and enduring question for all crowdsourced platforms, not just Ushahidi. The question is how we can facilitate better quality information in a way that reduces harms. One thing is certain: The verification challenge is both technical and social, and no algorithm, however clever, will entirely solve the problem of inaccurate or falsified information.

Thinking about the ecosystem of deployment teams, emergency personnel, users and concerned citizens and how they interact — rather than merely about a monolithic crowd — is the first place to look in understanding what verification strategy makes the most sense. After all, verification is not the ultimate goal here. Getting the right information to the right people at the right time is.


Image of the Basílica del Salvador in the aftermath of the Chilean earthquake courtesy of flickr user b1mbo.

Why the muggle doesn’t like the term “bounded crowdsourcing”

Patrick Meier just wrote a post explaining why the term he coined, “bounded crowdsourcing”, is ‘important for crisis mapping and beyond’. He likens “bounded crowdsourcing” to “snowball sampling”, in which a few trusted individuals invite other individuals whom they ‘fully trust and can vouch for… And so on and so forth at an exponential rate if desired’.

I like the idea of trusted networks of people working together (indeed, this technique has been used for decades in the activism community), but I have some problems with the term that has been “coined”. I guess I will be called a “muggle”, but I am willing to take the plunge because a) I have never been called a “muggle” and I would like to know what it feels like, and b) the “crowdsourcing” term is one I feel is worthy of a duel.

Firstly, I don’t agree with the way that Meier likens “crowdsourcing” work like Ushahidi’s to statistical methods. I see why he’s trying to make the comparison (to prove crowdsourcing’s value, perhaps?), but I think it is inaccurate and actually devalues the work involved in building an Ushahidi instance. Working on an Ushahidi deployment is not the same as answering a question through statistical methods. With statistical methods, a researcher (or group of researchers) tries to answer a question or test a hypothesis: ‘Do the majority of Hispanic Americans want Obama to win a second term?’, for example. Or ‘What do Kenyans think is the best place to go on holiday?’

But Ushahidi has never been about gaining a statistically significant answer to a question or hypothesis. It was designed as a way for a group of concerned citizens to provide a platform for people to report on what is happening to them or around them. Sure, in many cases we can get a general feel for the mood of a place by looking at reports, but the lack of a single question (and the power differential between those asking and those being asked), the prevalence of unstructured reports and the skew towards reporters most likely to reply using the technology (or attempting to game the system) make the differences much greater than the similarities.

The other problem is that the term lacks a useful definition. Meier seems to suggest that the “bounded” part refers to the fact that the work is not completely open and is limited to a network of trusted individuals. It would be more useful to understand under what conditions, and for what types of work, different levels of openness are useful, because no crowdsourcing project is entirely “unbounded”. Meier says that he ‘introduced the concept of bounded crowdsourcing to the field of crisis mapping in response to concerns over the reliability of crowd sourced information’. But if this means that “crowdsourced” information is unreliable, then it would be useful to understand how and when it is unreliable.

If we take the very diverse types of work required of an Ushahidi deployment, we might say that they include customising the design, building the channels (SMS short codes, Twitter hashtags, etc.), designating the themes, advertising the map, curating the reports, verifying the reports and finding related media reports, among others. Once we’ve broken down the different types of work, we can decide what level of openness is required for each of these job types. I certainly don’t want to restrict the advertising of my map to the world, so I want to keep that as “unbounded” as possible. I want to ensure that there are enough people with some “ownership” of the map to keep them supporting and talking about it, so I want to give them some jobs that keep them involved. Tagging reports as “verified” is probably a more sensitive activity, because it requires a set of transparent rulesets and is one of the key ways that others come to trust the map or not, so I want to ensure that trusted people, or at least those over whom I have some recourse, do this type of work. I also want to get feedback on themes and hashtags to keep the map close to the people, since in the end a map is only as good as the network that supports it. Now, if I have different levels of openness for different areas of work, is my project an example of “bounded” or “unbounded” crowdsourcing?
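One way to make that breakdown explicit is simply to write each type of work down next to the level of openness it needs. The levels and assignments below are my own illustration, not Ushahidi policy or anything Meier proposes:

```python
# An illustrative mapping of deployment work to levels of openness.
# Both the levels and the assignments are one person's sketch, not a prescription.
from enum import Enum


class Openness(Enum):
    OPEN_TO_ALL = 1      # anyone, no vetting
    TRUSTED_NETWORK = 2  # vouched-for volunteers ("bounded", in Meier's terms)
    ADMINS_ONLY = 3      # the core deployment team


WORK_OPENNESS = {
    "advertise the map": Openness.OPEN_TO_ALL,
    "submit reports": Openness.OPEN_TO_ALL,
    "suggest themes and hashtags": Openness.TRUSTED_NETWORK,
    "curate reports": Openness.TRUSTED_NETWORK,
    "find related media reports": Openness.TRUSTED_NETWORK,
    "tag reports as verified": Openness.ADMINS_ONLY,
    "customise the design and build the channels": Openness.ADMINS_ONLY,
}
```

Laid out like this, the question of whether the project as a whole is “bounded” or “unbounded” stops being a very useful one.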

Although I am always in favor of adding new words to the English language, I feel that the term “bounded crowdsourcing” is unhelpful in leading us towards any greater understanding of the nuances of online work like this. Actually, I’m always surprised at the use of the term “crowdsourcing” over “peer production” in the crisis mapping community, since crowdsourcing implies monetarily or commercially incentivised work rather than the non-monetary incentives that characterised peer production projects like Wikipedia (see an expanded definition + examples here). I can’t imagine anyone ever “coining” the term “bounded peer production” (but I seem to be continually surprised, so I shouldn’t completely discount it from happening), and I think that this is indicative of the problems with the term.

So, yes, if we’re talking about different ways of improving the reliability of information produced on the Ushahidi platform, I’m excited to learn more about using trusted networks. I just think that if a term is being coined, it should be one that advances our understanding of what the theory is here. Is it that if you restrict the number of people who can take part in writing reports, you get a more reliable result? Where do you restrict? What kind of work should be open? What do we mean by open? Automatic acceptance of Twitter reports with a certain hashtag? Or an email address that you can use to request membership? Is there a certain number that you should limit a team to (as the Skype example suggests)?

This “muggle” thinks that the term doesn’t get us any further towards understanding these (really important) questions. The “muggle” will now squeeze her eyes shut and duck.