I just posted the article about Ushahidi and its future challenges that was published in the Index on Censorship last month (‘Crowd Wisdom’ by Heather Ford in Index on Censorship, December 2012, vol. 41, no. 4, 33-39, doi: 10.1177/0306422012465800). I wrote about Ushahidi’s emergence as a powerful tool used in countries around the world to document elections, disasters and food, among other things, and about the coming challenges: the majority of Ushahidi implementations remain ‘small data’ projects, while tools are moving towards automatic verification, something only possible with ‘Big Data’.
First published on PBS Idea Lab
During the aftermath of the Chilean earthquake last year, the Ushahidi-Chile team received two reports — one through the platform, the other via Twitter — that indicated an English-speaking foreigner was trapped under a building in Santiago.
“Please send help,” the report read. “i am buried under rubble in my home at Lautaro 1712 Estación Central, Santiago, Chile. My phone doesnt work.”
A few hours later, a second, similar report was sent to the platform via Twitter: “RT @biodome10: plz send help to 1712 estacion central, santiago chile. im stuck under a building with my child. #hitsunami #chile we have no supplies.”
An investigation a few days later revealed that both reports were false and that the Twitter user was impersonating a journalist working for the Dallas Morning News. But the revelation came too late to stop two police deployments in Santiago, which rushed to the rescue before realizing that the area had not been affected by the quake and that the couple living there was alive and well.
Is false information like this just a necessary by-product of “crowdsourced” environments like Ushahidi? Or do we need to do more to help deployment teams, emergency personnel and users better assess the accuracy of reports hosted on our platform?
Ushahidi is a non-profit tech company that develops free and open-source software for information collection, visualization and interactive mapping. We’ve just published an initial study of how Ushahidi deployment teams manage and understand verification on the platform. Doing this research has surfaced a couple of key challenges about the way that verification currently works, as well as a few easy wins that might add some flexibility into the system. It’s also revealed some questions as we look to improve the platform’s ability to do verification on large quantities of data in the future.
What We’ve Learned
We’ve learned that we need to add more flexibility into the system, enabling deployment teams to choose whether or not they want to use the “verified” and “unverified” tagging functionality. We’ve also learned that the binary terms we’re currently using don’t capture other attributes of reports that are necessary for establishing both trust and “actionability” (i.e., the ability to act on the information). For example, the “unverified” tag does not capture whether a report is considered to be an act of “misinformation” or is simply incomplete, lacking the contextual clues needed to determine whether it is accurate.
We need to develop more flexibility to accommodate these different attributes, but we also need to think beyond these final determinations and understand that users might want contextual information (rather than a final determination on its verification status) to determine for themselves whether a report is trustworthy or not. After all, verification tags mean nothing unless those who must make decisions based on that information trust the team doing the verification.
The fact that many deployments are set up by teams of concerned citizens who may have never worked together before and who are therefore unknown to the user organizations makes this an important requirement. Here, we’re thinking of the job of the administering deployment team providing information about the context of a report (answering the who, what, where, when, how and why of traditional journalism perhaps) and inviting others to help flesh out this information, rather than being a “black box” in which the process for determining whether something is verified or not is opaque to users.
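To make the idea of moving beyond a binary tag concrete, here is a minimal sketch of a report record that separates “incomplete” from “suspected misinformation” and carries the who, what, where, when, how and why alongside the status. It is purely illustrative: the `Status` values, the `Report` class and the `missing_context` helper are hypothetical names, not part of the Ushahidi platform or its data model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    """Hypothetical statuses that go beyond verified/unverified."""
    VERIFIED = "verified"
    INCOMPLETE = "incomplete"  # not enough context to judge either way
    SUSPECTED_MISINFORMATION = "suspected_misinformation"


# The six questions of traditional journalism, used as context fields.
CONTEXT_KEYS = ("who", "what", "where", "when", "how", "why")


@dataclass
class Report:
    text: str
    status: Status = Status.INCOMPLETE
    context: dict = field(default_factory=dict)

    def missing_context(self):
        """Return the context questions that nobody has answered yet."""
        return [key for key in CONTEXT_KEYS if not self.context.get(key)]


report = Report(
    text="Please send help. i am buried under rubble in my home",
    context={
        "what": "person reported trapped under rubble",
        "where": "Lautaro 1712, Estación Central, Santiago",
    },
)
print(report.status.value, report.missing_context())
```

A record like this lets a deployment team publish what is known and what is still missing, so that users can judge a report for themselves instead of relying on a single opaque tag.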
As an organization that is all about “crowdsourcing,” we’re taking a step back and thinking about how the crowd (i.e., people who are not known to the system) might assist in either providing more context for reports or verifying unverified reports. When I talk about the “crowd” here I’m referring to a system that’s permeable to interactions by those we don’t yet know. It’s important to note here that, although Ushahidi is talked about as an example of crowdsourcing, this doesn’t mean that the entire process of submission, publishing, tagging and commenting is open for all. Although anyone can start a map and send a report to the map, only administrators can approve and publish reports or tag a report as “verified.”
How Will Crowdsourcing Verification Work?
If we were to open up this process to “the crowd,” we’d have to think really carefully about the options we might have for facilitating verification by the crowd, many of which won’t work in every deployment. Variables like scale, location and persistence differ in each deployment and can affect where and when crowdsourcing of verification will work and where it will do more harm than good.
Crowdsourcing verification can mean many different things. It could mean flagging reports that need more context and asking for more information from the crowd. But who makes the final decision that enough information has been provided to change the status of that information?
We could think of using the crowd to determine when a statistically significant portion of a community agrees with changing the status of a report to “verified.” But is this option limited to cases where large numbers of people are interested in (and informed about) an issue? And could a volume-based indicator like this be gamed, especially in political contexts?
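One way to read “statistically significant” is sketched below, under stated assumptions: each known voter gets one yes/no vote, and a report only counts as crowd-verified when enough people have voted and the lower bound of the Wilson score interval on the “yes” share clears a threshold. The Wilson interval is a technique swapped in here for illustration; the function names, the minimum of 20 voters and the two-thirds threshold are all hypothetical.

```python
import math


def wilson_lower_bound(yes, total, z=1.96):
    """Lower bound of the Wilson score interval for the share of 'yes' votes."""
    if total == 0:
        return 0.0
    p = yes / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / (1 + z * z / total)


def crowd_agrees(votes, min_voters=20, threshold=2 / 3):
    """votes maps voter_id -> True/False, so each voter counts only once."""
    total = len(votes)
    if total < min_voters:
        return False  # too few voters to say anything meaningful
    yes = sum(1 for vote in votes.values() if vote)
    return wilson_lower_bound(yes, total) >= threshold


small = {f"user{i}": True for i in range(3)}
large = {f"user{i}": i < 90 for i in range(100)}  # 90 yes, 10 no
print(crowd_agrees(small), crowd_agrees(large))
```

Note what this does not solve: counting one vote per identifier offers no protection if a motivated group registers many identifiers, which is exactly the gaming risk raised above.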
Crowdsourcing verification could also mean giving users the opportunity to use free-form tags to highlight the context of the data and then surfacing the tags that are popular. But again, might this only be accurate when large numbers of users are involved and where the numbers of reports are low? Do we employ an algorithm to rank the quality of reports based on the history of their authors? It’s tempting to imagine that an algorithm alone will solve the data volume challenges, but algorithms do not work in many cases (especially when reports may be sent by people who don’t have a history of using these tools), and if they’re untrusted, they might force users to hack the system to enable their own processes.
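As a rough illustration of surfacing popular free-form tags, the sketch below counts how many distinct users applied each tag to a report and only surfaces tags that reach a minimum number of users. The `popular_tags` function, its parameters and the sample tags are hypothetical, and the cut-off of three users is an arbitrary assumption.

```python
from collections import Counter


def popular_tags(tag_events, min_users=3, top_n=5):
    """tag_events is an iterable of (user_id, tag) pairs for a single report."""
    # Normalise tags and drop repeats so one user cannot inflate a tag.
    unique = {(user, tag.strip().lower()) for user, tag in tag_events}
    counts = Counter(tag for _, tag in unique)
    # Most-used first; ties are broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(tag, n) for tag, n in ranked if n >= min_users][:top_n]


events = (
    [(f"user{i}", "Needs location") for i in range(4)]
    + [(f"user{i}", "duplicate") for i in range(3)]
    + [("user9", "spam")] * 5  # one user repeating a tag five times
)
print(popular_tags(events))
```

The example also shows the limitation discussed above: with only a handful of users, very few tags will ever reach the cut-off, so the approach depends on having enough participants per report.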
An Enduring Question
Verification by the crowd is indeed a large and enduring question for all crowdsourced platforms, not just Ushahidi. The question is how we can facilitate better quality information in a way that reduces harms. One thing is certain: The verification challenge is both technical and social, and no algorithm, however clever, will entirely solve the problem of inaccurate or falsified information.
Thinking about the ecosystem of deployment teams, emergency personnel, users and concerned citizens and how they interact — rather than merely about a monolithic crowd — is the first place to look in understanding what verification strategy makes the most sense. After all, verification is not the ultimate goal here. Getting the right information to the right people at the right time is.
Image of the Basílica del Salvador in the aftermath of the Chilean earthquake courtesy of Flickr user b1mbo.
Cross-posted from blog.ushahidi.com
As Ushahidi ethnographer, my job is to do on-the-ground research on users’ experience with our technology in particular contexts. Something that we’ve been thinking about a great deal as we develop SwiftRiver is the process of verification, the ways in which technology and society work together to create useful, trustworthy and actionable information, as well as where the technology in particular contexts might be failing.
With over 20,000 installations of Ushahidi and Crowdmap since January 2009, Ushahidi has been used in a number of different contexts: from earthquake support in Haiti, to reports of sexism in Egypt, to election monitoring in the Sudan. In each of these cases, a map is publicized and individuals are encouraged to send reports to it. The process of verifying information reported by the crowd has taken on a variety of different forms depending on the needs and affordances of the environment and the community supporting it.
The memo I just published on Scribd introduces the concept of verification, how it has evolved at Ushahidi and in sample deployments, alternative ways of thinking about verification and some suggestions for further research. Its goal is to inform, rather than prescribe to, the developers and designers building the next generation of Ushahidi and SwiftRiver software to meet the needs of our users.
Ushahidi support for verification has until now been limited to a fairly simple backend categorisation system by which administrators tag reports as “verified” or “unverified”. But this is proving unmanageable for large quantities of data and may not be the most effective way of portraying the nuanced levels of verification that can practically be achieved with crowdsourced data.
What research needs to be done to test verification alternatives so that Ushahidi and Crowdmap deployers are provided with due diligence tools that can advance trust in their deployments? Can we do this in a way that doesn’t add any new barriers to entry to those who need to have their voice heard on Ushahidi? How can we ensure that this solution is as close as possible to the needs, incentive systems and motivations of deployers and users? What is the next step for Ushahidi verification?