What does it mean to be a participant observer in a place like Wikipedia?

This post first appeared on Ethnography Matters on May 1.

The vision of an ethnographer physically going to a place, establishing themselves in the activities of that place, talking to people and developing deeper understandings seems so much simpler than the same activities in multifaceted spaces like Wikipedia. Researching how Wikipedians manage and verify information in rapidly evolving news articles in my latest ethnographic assignment, I sometimes wish I could simply go to the article as I would to a place, sit down and have a chat with the people around me.

Wikipedia conversations are asynchronous (sometimes with whole weeks or months between replies among editors) and it has proven extremely complicated to work out who said what when, let alone to contact the editors and have live conversations with them. I’m beginning to realise how much physical presence is a part of the trust building exercise. If I want to connect with a particular Wikipedia editor, I can only email them or write a message on their talk page, and I often don’t have a lot to go on when I’m doing these things. I often don’t know where they’re from or where they live or who they really are beyond the clues they give me on their profile pages.

Update on the Wikipedia sources project

This post first appeared on the Ushahidi blog.

Last month I presented the first results of the WikiSweeper project, an ethnographic research project to understand how Wikipedia editors track, evaluate and verify sources on rapidly evolving pages of Wikipedia, the results of which will inform ongoing development of the SwiftRiver (then Sweeper) platform. Wikipedians are some of the most sophisticated managers of online sources and we were excited to learn how they collaboratively decide which sources to use and which to dismiss in the first days of the 2011 Egyptian Revolution. In the past few months, I’ve interviewed users from the Middle East, Kenya, Mexico and the United States, studied hundreds of ‘talk pages’ from the article and analysed edits, users and references from the article, and compared these findings to what Wikipedia policy says about sources. In the end, I came up with four key findings that I’m busy refining for the upcoming report:

1. The source <original version of the article and its author> of the page can play a significant role: Wikipedia policy indicates that characteristics of the book, author and publishers of an article’s citations all affect reliability. But the 2011 Egyptian Revolution article showed how influential the Wikipedia editor who edits the first version of the page can be. Making Wikipedia editors’ reputation, edit histories etc. more easily readable is a critical component to understanding points of view while editing and reading rapidly evolving Wikipedia articles.

DataEDGE: A conversation about the future of data science

First posted at the Google Policy blog.

With all the hype around “Big Data” lately, you may be inclined to shrug it off as a business fad. But there is more to it than a buzzword. Data science is emerging as a new field, changing the ways that companies get to know their customers, governments their citizens, and relief organizations their constituents. It is a field which will demand entirely new skill sets and information professionals trained to collect, curate, combine, and analyze massive amounts of data.

Today, we create data both actively—as we socialize, conduct business, and organize online—and passively—via a host of remote sensing devices. McKinsey projects a 40% growth in global data generated annually. Companies and organizations are racing to find new ways to make sense of this data and use it to drive decision-making. In the health sector, that includes investigating the clinical and cost effectiveness of new drugs using large datasets. (McKinsey estimates that the efficient and effective use of data could provide as much as $300 billion in value to the United States healthcare sector.) In the public sector, it could mean using historical unemployment data to reduce the amount of time it takes unemployed workers to find new employment. And in the retail sector, it leads to tools that help suppliers understand demand in stores so they know when they should restock items.

A sociologist’s guide to trust and design

This post first appeared on Ethnography Matters

Trust. The word gets bandied about a lot when talking about the Web today. We want people to trust our systems. Companies are supposedly building “trusted computing” and “designing for trust”.

But, as sociologist Coye Cheshire, Professor at the School of Information at UC Berkeley, will tell you, trust is a thing that happens between people, not things. When we talk about trust in systems, we’re actually often talking about the related concepts of reliability or credibility.

Designing for trustworthiness

Take trustworthiness, for example. Trustworthiness is a characteristic that we infer based on other characteristics. It’s an assessment of a person’s future behaviour and it’s theoretically linked to concepts like perceived competence and motivations. When we think about whom to ask to watch our bags at the airport, for example, we look around and base our decision to trust someone on perceived competence (do they look like they could apprehend someone if someone tried to steal something?) and/or motivation (do they look like they need my bag or the things inside it?)

Online reputation: it’s contextual

This post was the first in a new category for Ethnography Matters called “A day in the life”. In it, I describe a day at a workshop on online reputation that I attended, reporting on presentations and conversations with folks from Reddit and Stack Overflow, highlighting four key features of successful online reputation systems that came out of their talks.

A screenshot from Reddit.com’s sub-Reddit, “SnackExchange”, showing the point system

We want to build a reputation system for our new SwiftRiver product at Ushahidi where members can vote on bits of relevant content related to a particular event. This meant that I was really excited about being able to spend the day yesterday at the start of a fascinating workshop on online reputation organised by a new non-profit organisation called Hypothesis. It seems that Hypothesis is attempting to build a layer on top of the Web that enables users, when encountering new information, to immediately find the best thinking about that information. In the words of Hypothesis founder, Dan Whaley, “The idea is to develop a system that lets us see quality insights and information” in order to “improve how we make decisions.” So, for example, when visiting the workshop web page, you might be able to see that people like me (if I “counted” on the reputation quality scale) have written something about that workshop or about very specific aspects of the workshop and be able to find out what they (and perhaps even I) think about it.

Can Ushahidi Rely on Crowdsourced Verifications?

First published on PBS Idea Lab

During the aftermath of the Chilean earthquake last year, the Ushahidi-Chile team received two reports — one through the platform, the other via Twitter — that indicated an English-speaking foreigner was trapped under a building in Santiago.

“Please send help,” the report read. “i am buried under rubble in my home at Lautaro 1712 Estación Central, Santiago, Chile. My phone doesnt work.”

A few hours later, a second, similar report was sent to the platform via Twitter: “RT @biodome10: plz send help to 1712 estacion central, santiago chile. im stuck under a building with my child. #hitsunami #chile we have no supplies.”


An investigation a few days later revealed that both reports were false and that the Twitter user was impersonating a journalist working for the Dallas Morning News. But this revelation was not in time to stop two police deployments in Santiago that leaped to the rescue before they realized that the area had not been affected by the quake and that the couple living there was alive and well.

Is false information like this just a necessary by-product of “crowdsourced” environments like Ushahidi? Or do we need to do more to help deployment teams, emergency personnel and users better assess the accuracy of reports hosted on our platform?

Ushahidi is a non-profit tech company that develops free and open-source software for information collection, visualization and interactive mapping. We’ve just published an initial study of how Ushahidi deployment teams manage and understand verification on the platform. Doing this research has surfaced a couple of key challenges about the way that verification currently works, as well as a few easy wins that might add some flexibility into the system. It’s also revealed some questions as we look to improve the platform’s ability to do verification on large quantities of data in the future.

What We’ve Learned

We’ve learned that we need to add more flexibility into the system, enabling deployment teams to choose whether they want to use the “verified” and “unverified” tagging functionality or not. We’ve learned that the binary terms we’re currently using don’t capture other attributes of reports that are necessary to establishing both trust and “actionability” (i.e., the ability to act on the information). For example, the “unverified” tag does not capture whether a report is considered to be an act of “misinformation” or just incomplete, lacking contextual clues necessary to determine whether it is accurate or not.

We need to develop more flexibility to accommodate these different attributes, but we also need to think beyond these final determinations and understand that users might want contextual information (rather than a final determination on its verification status) to determine for themselves whether a report is trustworthy or not. After all, verification tags mean nothing unless those who must make decisions based on that information trust the team doing the verification.

The fact that many deployments are set up by teams of concerned citizens who may have never worked together before and who are therefore unknown to the user organizations makes this an important requirement. Here, we’re thinking of the job of the administering deployment team providing information about the context of a report (answering the who, what, where, when, how and why of traditional journalism perhaps) and inviting others to help flesh out this information, rather than being a “black box” in which the process for determining whether something is verified or not is opaque to users.
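To make the difference between a binary tag and contextual information more concrete, here is a minimal sketch in Python. It is not how Ushahidi stores reports: the status names, the `Report` class and its fields are all hypothetical, chosen only to illustrate a report that records what is still unknown instead of collapsing everything into “verified” or “unverified.”

```python
from dataclasses import dataclass, field
from enum import Enum


class VerificationStatus(Enum):
    """Hypothetical statuses that go beyond a verified/unverified pair."""
    UNREVIEWED = "unreviewed"
    INCOMPLETE = "incomplete"  # lacks the context needed to judge accuracy
    SUSPECTED_MISINFORMATION = "suspected_misinformation"
    VERIFIED = "verified"


# The traditional journalism questions, used here as context slots (an assumption).
CONTEXT_QUESTIONS = ("who", "what", "where", "when", "how", "why")


@dataclass
class Report:
    text: str
    status: VerificationStatus = VerificationStatus.UNREVIEWED
    # None means that nobody has supplied this piece of context yet.
    context: dict = field(
        default_factory=lambda: {key: None for key in CONTEXT_QUESTIONS}
    )

    def missing_context(self) -> list:
        """Return the context questions that are still unanswered."""
        return [key for key, value in self.context.items() if value is None]


# Illustrative usage with made-up report text.
report = Report(text="Example report text")
report.context["where"] = "Santiago"
report.status = VerificationStatus.INCOMPLETE
print(report.status.value, report.missing_context())
```

A structure like this would let a deployment team show users which questions remain open, and invite others to help answer them, while leaving the final judgment about trustworthiness to the people who have to act on the report.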

As an organization that is all about “crowdsourcing,” we’re taking a step back and thinking about how the crowd (i.e., people who are not known to the system) might assist in either providing more context for reports or verifying unverified reports. When I talk about the “crowd” here I’m referring to a system that’s permeable to interactions by those we don’t yet know. It’s important to note here that, although Ushahidi is talked about as an example of crowdsourcing, this doesn’t mean that the entire process of submission, publishing, tagging and commenting is open for all. Although anyone can start a map and send a report to the map, only administrators can approve and publish reports or tag a report as “verified.”

How Will Crowdsourcing Verification Work?

If we had to open up this process to “the crowd” we’d have to think really carefully about the options we might have in facilitating verification by the crowd — many of which won’t work in every deployment. Variables like scale, location and persistence differ in each deployment and can affect where and when crowdsourcing of verification will work and where it will do more harm than good.

Crowdsourcing verification can mean many different things. It could mean flagging reports that need more context and asking for more information from the crowd. But who makes the final decision that enough information has been provided to change the status of that information?

We could think of using the crowd to determine when a statistically significant portion of a community agrees with changing the status of a report to “verified.” But is this option limited to cases where a large volume of people are interested in (and informed about) an issue, and could a volume-based indicator like this be gamed, especially in political contexts?

Crowdsourcing verification could also mean providing users with the opportunity of using free-form tags to highlight the context of the data and then surfacing tags that are popular. But again, might this only be accurate when large numbers of users are involved and where the numbers of reports are low? Do we employ an algorithm to rank the quality of reports based on the history of their authors? It’s tempting to imagine that an algorithm alone will solve the data volume challenges, but algorithms do not work in many cases (especially when reports may be sent by people who don’t have a history of using these tools) and if they’re untrusted, they might force users to hack the system to enable their own processes.
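Two of the options above, agreement among a sufficient number of people and surfacing popular free-form tags, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the thresholds (`min_voters`, `min_share`, `min_users`) and the sample data are all hypothetical, and a simple quorum like this is not a statistical significance test.

```python
from collections import Counter


def crowd_supports_verification(votes, min_voters=20, min_share=0.8):
    """votes maps a voter id to True (looks accurate) or False (does not).

    Returns True only when enough distinct people voted and most of them agree.
    Both thresholds are illustrative assumptions, not tested values.
    """
    if len(votes) < min_voters:
        return False  # too few voters to say anything about the community
    share = sum(1 for vote in votes.values() if vote) / len(votes)
    return share >= min_share


def popular_tags(tags_by_user, min_users=3):
    """tags_by_user maps a user id to the set of free-form tags they applied.

    Each tag counts once per user, so one person cannot inflate a tag alone.
    """
    counts = Counter()
    for tags in tags_by_user.values():
        counts.update({tag.strip().lower() for tag in tags})
    return [tag for tag, n in counts.most_common() if n >= min_users]


# Made-up data: 27 of 30 voters think the report is accurate.
many_votes = {f"user{i}": i % 10 != 0 for i in range(30)}
few_votes = {"a": True, "b": True}
print(crowd_supports_verification(many_votes))  # True
print(crowd_supports_verification(few_votes))   # False: below min_voters

tags = {
    "u1": {"Needs location", "eyewitness"},
    "u2": {"needs location"},
    "u3": {"needs location ", "rumour"},
    "u4": {"eyewitness"},
}
print(popular_tags(tags))  # ['needs location']
```

Counting per user id only helps if ids are hard to create in bulk. Anyone who can register many accounts can still push a report or a tag over the threshold, which is exactly the gaming concern raised above.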

An Enduring Question

Verification by the crowd is indeed a large and enduring question for all crowdsourced platforms, not just Ushahidi. The question is how we can facilitate better quality information in a way that reduces harms. One thing is certain: The verification challenge is both technical and social, and no algorithm, however clever, will entirely solve the problem of inaccurate or falsified information.

Thinking about the ecosystem of deployment teams, emergency personnel, users and concerned citizens and how they interact — rather than merely about a monolithic crowd — is the first place to look in understanding what verification strategy makes the most sense. After all, verification is not the ultimate goal here. Getting the right information to the right people at the right time is.


Image of the Basílica del Salvador in the aftermath of the Chilean earthquake courtesy of flickr user b1mbo.

Why the muggle doesn’t like the term “bounded crowdsourcing”

Patrick Meier just wrote a post explaining why the term he coined, “bounded crowdsourcing” is ‘important for crisis mapping and beyond’. He likens “bounded crowdsourcing” to “snowball sampling”, where a few trusted individuals invite other individuals who they ‘fully trust and can vouch for… And so on and so forth at an exponential rate if desired’.

I like the idea of trusted networks of people working together (actually, it seems that this technique has been used for decades in the activism community) but I have some problems with the term that has been “coined”. I guess I will be called a “muggle” but I am willing to take the plunge because a) I have never been called a “muggle” and I would like to know what it feels like and b) the “crowdsourcing” term is one I feel is worthy of a duel.

Firstly, I don’t agree with the way that Meier likens “crowdsourcing” work like Ushahidi to statistical methods. I see why he’s trying to make the comparison (to prove crowdsourcing’s value, perhaps?) but I think that it is inaccurate and actually de-values the work involved in building an Ushahidi instance. Working on an Ushahidi deployment is not the same as answering a question through statistical methods. With statistical methods, a researcher (or group of researchers) tries to answer a question or test a hypothesis. ‘Do the majority of Hispanic Americans want Obama to win a second term?’ for example. Or ‘What do Kenyans think is the best place to go on holiday?’

But Ushahidi has never been about gaining a statistically significant understanding of a question or hypothesis. It has been designed as a way for a group of concerned citizens to provide a platform for people to report on what was happening to them or around them. Sure, in many cases, we can get a general feel about the mood of a place by looking at reports, but the lack of a single question (and the power differential between those asking and those being asked), the prevalence of unstructured reports and the skewed distribution of reporters towards those most likely to reply using the technology (or attempting to game the system) make the differences much greater than the similarities.

The other problem is that the term lacks a useful definition. Meier seems to suggest that the “bounded” part refers to the fact that the work is not completely open and is limited to a network of trusted individuals. More useful would be to understand under what conditions and for what types of work different levels of openness are useful, because no crowdsourcing project is entirely “unbounded”. Meier says that he ‘introduced the concept of bounded crowdsourcing to the field of crisis mapping in response to concerns over the reliability of crowd sourced information.’ But if this means that “crowdsourced” information is unreliable, then it would be useful to understand how and when it is unreliable.

If we take the very diverse types of work required of an Ushahidi deployment, we might say that they include the need to customize the design, build the channels (sms short codes, twitter hashtags, etc), designate the themes, advertise the map, curate the reports, verify the reports, find related media reports, among others. Once we’ve broken down the different types of work, we can then decide what level of openness is required for each of these job types. I certainly don’t want to restrict the advertising of my map to the world, so I want to keep that as “unbounded” as possible. I want to ensure that there are enough people with some “ownership” of the map to keep them supporting and talking about it, so I want to give them some jobs that keep them involved. Tagging reports as “verified” is probably a more sensitive activity because it requires a transparent set of rules and is one of the key ways that others come to trust the map or not. So I want to ensure that trusted people, or at least those over whom I have some recourse, do this type of work. I also want to get feedback on themes and hashtags to keep it close to the people, since in the end, a map is only as good as the network that supports it. Now if I have different levels of openness for different areas of work, is my project an example of “bounded” or “unbounded” crowdsourcing?

Although I am always in favor of adding new words to the English language, I feel that the term “bounded crowdsourcing” is unhelpful in leading us towards any greater understanding of the nuances of online work like this. Actually, I’m always surprised at the use of the term “crowdsourcing” over “peer production” in the crisis mapping community since crowdsourcing implies monetarily or commercially incentivized work rather than the non-monetary incentives that characterise peer production projects like Wikipedia (see an expanded definition + examples here). I can’t imagine anyone ever “coining” the term “unbounded peer production” (but I seem to be continually surprised, so I shouldn’t completely discount it from happening) and I think that this is indicative of the problems with the term.

So, yes, if we’re talking about different ways of improving the reliability of information produced on the Ushahidi platform, I’m excited to learn more about using trusted networks. I just think that if a term is being coined, it should be one that advances our understanding of what the theory is here. Is it that: if you restrict the numbers of people who can take part in writing reports, you get a more reliable result? Where do you restrict? What kind of work should be open? What do we mean by open? Automatic acceptance of Twitter reports with a certain hashtag? Or an email address that you can use to request membership? Is there a certain number that you should limit a team to (as the Skype example suggests)?

This “muggle” thinks that the term doesn’t get us any further towards understanding these (really important) questions. The “muggle” will now squeeze her eyes shut and duck.

What is the next step for Ushahidi verification?

Cross-posted from blog.ushahidi.com

As Ushahidi ethnographer, my job is to do on-the-ground research on users’ experience with our technology in particular contexts. Something that we’ve been thinking about a great deal as we develop SwiftRiver is the process of verification, the ways in which technology and society work together to create useful, trustworthy and actionable information, as well as where the technology in particular contexts might be failing.

With over 20,000 installations of Ushahidi and Crowdmap since January, 2009, Ushahidi has been used in a number of different contexts – from earthquake support in Haiti, to reports of sexism in Egypt, to election monitoring in the Sudan. In each of these cases, a map is publicized and individuals are encouraged to send reports to it. The process of verifying information reported by the crowd has taken on a variety of different forms depending on the needs and affordances of the environment and the community supporting it.

The memo I just published on scribd introduces the concept of verification, how it has evolved at Ushahidi and in sample deployments, alternative ways of thinking about verification and some suggestions for further research. Its goal is to inform developers and designers as they develop the next generation of Ushahidi and SwiftRiver software to meet the needs of our users rather than prescribing what should be done.

Ushahidi support for verification has until now been limited to a fairly simple backend categorisation system by which administrators tag reports as “verified” or “unverified”. But this is proving unmanageable for large quantities of data and may not be the most effective way of portraying the nuanced levels of verification that can practically be achieved with crowdsourced data.

What research needs to be done to test verification alternatives so that Ushahidi and Crowdmap deployers are provided with due diligence tools that can advance trust in their deployments? Can we do this in a way that doesn’t add any new barriers to entry to those who need to have their voice heard on Ushahidi? How can we ensure that this solution is as close as possible to the needs, incentive systems and motivations of deployers and users? What is the next step for Ushahidi verification?

Wikipedia Isn’t Journalism, But Are Wikipedians Reluctant Journalists?

Cross-posted from PBS Idea Lab

Wikipedia articles on breaking news stories dominate page views on the world’s sixth-largest website. Perhaps more importantly, these articles drive the most significant editor contribution — especially among new editors.


In the first three months of this year, the English Wikipedia articles with the most contributors were the 2011 Tucson shooting, the 2011 Egyptian revolution and the 2011 Tōhoku earthquake and tsunami articles, with 460, 405 and 785 editors respectively contributing to their growth.

Interestingly, a number of Wikipedia policies discourage writing articles on breaking news. One of Wikipedia’s 42 policies, titled “What Wikipedia is not” (or WP:NOT), highlights that the site is, above all, an encyclopedia, not a newspaper (Wikipedia:NotNewspaper). The policy states that although the encyclopedia needs to include current and up-to-date information as well as standalone articles on “significant current events,” not all verifiable events are suitable for inclusion in Wikipedia.

Wikipedia articles are not journalism

According to the policy, “Wikipedia should not offer first-hand news reports on breaking stories” because “Wikipedia is not a primary source.” The encyclopedia has a tenuous relationship with primary sources. Policy states that primary sources, “accounts written by people who are directly involved in an event, offering an insider’s view of an event” are (mostly) inappropriate because Wikipedia strives to represent a “Neutral Point of View” (NPOV), and primary sources can be misused to reflect a fringe theory as mainstream. NPOV is one of the five pillars of Wikipedia and frames to a large degree what is allowed into the encyclopedia and what is left out.

News reports on a breaking news story require that Wikipedians use primary sources to update the rapidly evolving articles on issues like death counts after an earthquake. While journalists are able to use primary sources to make a judgment on the death count at the time of publishing and then do the same using new sources when they write successive stories, Wikipedians must do the same collectively and iteratively as new versions are created every few seconds.

In the Japanese earthquake article, this challenge resulted in contradictory facts about the height of the tsunami and the death tolls in the same article, prompting one editor (“Dcoetzee”) to create templates for the numbers of missing and dead that could be edited once with changes immediately reflected in every part of the page (see Keegan, Gergle and Contractor’s Hot off the Wiki: Dynamics, Practices, and Structures in Wikipedia’s Coverage of the Tohoku Catastrophes).

Wikipedia articles are not news reports

The barrier to entry into Wikipedia articles is notability: Subjects must be notable enough to create enduring articles on the encyclopedia. According to policy, while news reporting covers announcements, sports news or celebrities, the fact that something is “in the news” is not a sufficient basis for inclusion in the encyclopedia. Notability is difficult, perhaps impossible, to predict directly after an event, and can result in historical events being described in purely modern terms or an article being created about something noteworthy at a particular time which later might not meet notability requirements.

Wikipedians call this “recentism” and have a tag to make it transparent to readers that the article might be skewed towards “recent perspectives.” In an essay on “recentism,” Wikipedians describe the phenomenon as “writing or editing without a long-term, historical view, thereby inflating the importance of a topic that has received recent public attention.”

Both the “Wikipedia articles are not: Journalism” and “Wikipedia articles are not: News reports” policies recommend moving timely news subjects to Wikinews, a sister project to Wikipedia that allows use of primary sources and is intended to be a primary source. But Wikinews has suffered from a low contributor base and disagreement among contributors about the best way to build the news portal.

In September, a large portion of the Wikinews contributor base announced on the Foundation-l mailing list that they had forked the project and started “OpenGlobe” after becoming “deeply dissatisfied with Wikinews.”

Wikipedia articles are not who’s who

The third item of “Wikipedia:NotNewspaper” explains that, even when an event is notable, individuals involved in it may not be. This policy speaks to the need for enduring articles that will still be notable in the years after the event. While newspapers are often concerned with explaining events through the people affected by such events, Wikipedia wishes to take the long-term view, attempting to avoid cases that give undue weight to the person or event and thus conflict with NPOV.


The first rough draft of history?

It took just 11 minutes for the Japanese Wikipedia to create an article after the 9.0-magnitude undersea megathrust earthquake occurred off the coast of Japan on March 11. Twenty-one minutes later, the English Wikipedia article was created, and although the wire services reported the earthquake within minutes, The New York Times did not file a full story until more than three hours after the earthquake hit.

Despite the distinct discouragement of reporting on current news items for the reasons mentioned above, Wikipedia has become the site of major activity around large news events like this one. The ability of anyone to edit the encyclopedia and the lack of any restrictions on editing articles, as well as the fact that notability is a relative concept, mean that Wikipedia policy cannot stop the hundreds of editors who flock to the encyclopedia driven by a single purpose to work on a particular page.

But if Wikipedia and not the news media is the first rough draft of history, what does this mean for Neutral Point of View? If Wikipedians are evaluating and synthesizing primary sources rather than sources who have already evaluated the importance of an event, is Wikipedia at risk of becoming subjective? Consensus may be more easily achieved when the event is a natural disaster, but when it’s a war or a revolution and the editors’ motivations are different, then the same architectural flexibilities can lock articles into disagreement.

Wikipedia may be a reluctant journalist, but its influence on the media landscape is unmistakable.

New geographies

Cross-posted from Ethnography Matters

xkcd’s Updated Map of Online Communities

I arrived in Nairobi last night after an absence of about five years. As I left the plane through the walkway, I took a deep breath and inhaled the familiar southern African smell that I always miss so much living in America. I walked through to customs and baggage claim and to my taxi and hotel and became aware of all the things I was noticing: my slight frustration at the absence of instructions about which line to stand in at the immigration hall; the fact that there was not enough room for my place of birth in the immigration paperwork; the fact that, in stark contrast to the Amsterdam Schiphol Airport that I had come from, this airport seems not to have changed in a decade or so.

I noticed how long we had to wait for our bags to come through, the nationalities of the people coming here, how closely they stood next to one another. And my driver, patiently waiting for me, familiar sign in hand. On the car ride to the hotel, I looked at billboards and noticed what was being advertised and who was being represented, the state of repair of the roads and the roadside flowers and how people drive and the smells of food and industry and bodies.

I thought: Is this the collection of noticings that constitutes a place? And if what defines a place is its signposts, its boundaries, the taken-for-granted ways of doing things, the expected and the unexpected, what are the equivalents in online spaces? How do we know that we have left one space and arrived at another? How does the experience of outsiders (or n00bs) differ from that of locals?

This new way of thinking about social media (new for me, at least) came about when I was asked to speak at a conference about the ‘crucial role of social media’ in the Middle East and elsewhere. Buried in the description of the session was the question: ‘Does what happened in the London Riots diminish the power of social media?’ As I thought about what to say and what was expected of me, it struck me that the problem with the current way questions around social media are framed is that they require defining technological artefacts as good or bad, when it might be more appropriate to talk about technology as a place where good and bad things can, and do, happen.

If we frame social media as places, we can understand more fully the role of people in those places, rather than talking about the technical characteristics of Facebook or Wikipedia as determining a particular type of behaviour. Looking only at the “bad” privacy features of Facebook, for example, we are tempted to assume that “privacy is dead” because of the “forced sharing” that is happening through changes in the technology. But this view fails to represent the ways that people self-censor or move to more intimate spaces in order to protect their privacy, something I noticed in my study of privacy in an educational context, for example.

Framing social media as places enables us to realise how we move between platforms (for example, Facebook and Google+) not only because of the new shiny gadgets we find there, but because of the people who inhabit those spaces. It is the flow of people and practices that defines the place as much as it is its landscape and architectural features. Facebook, for example, is defined by particular boundaries (my page, your page, a photograph that belongs to a particular group), taken-for-granted ways of doing things that define deviance and compliance among particular groups (don’t friend your teacher, don’t send too many updates and flood your friends’ streams, don’t tag drunk pictures of friends) and artefacts (the activity stream, wall and photo albums) that, taken together, define the place.

It seems kind of obvious when you think about it, and it isn’t a new way of thinking about technology: we’ve been talking about going online and migrating from different operating systems for a while. But the fact that we’re surprised that Google+ isn’t currently teeming with people, or that more Kenyans aren’t contributing to Swahili Wikipedia, or that women make up such a small percentage of Wikipedia edits suggests that we are thinking too much of social media as things rather than as places. If we thought about Google+ as a big, shiny, new complex, we’d begin to understand that people won’t necessarily move there just because the technology is better when few of their friends are there.

The key aspect that we miss in thinking of social sites as technological artefacts is that we tend to ignore culture and power – two really big and slippery aspects of what makes certain types of people have certain types of conversations in particular online spaces, and of what defines who feels welcome or unwelcome to participate. It has caused us to define Wikipedia or Facebook at a level of granularity that isn’t deep enough to really get an understanding of what is happening there, where the power is located and how we might engineer to encourage particular creations and conversations. This is not just about understanding the affordances of the software. In order to understand Wikipedia collaboration, I can’t only look at the MediaWiki software – in the same way that to understand Kenya, I couldn’t just read about its legal framework or look at the statistics about the country. Being there, experiencing how people speak to me, noticing what the signposts say and what they leave out, is part of the necessarily long journey toward a full understanding of the place.

Perhaps most importantly, it is the culture of a place that will dominate my decision to come back or not. And this, in its essence, is at the heart of what every online community seeks, and is the same reason why it’s so hard to control. The government of Kenya can build better roads and speak on television about being welcoming to tourists, for example, but the majority of the experience of being in Kenya, as a tourist or a local, is outside of government control. Culture, we find out, is a mysterious mix of so many different qualities in varying proportions that act together to define a place. Understanding culture is probably more art than science, but however we learn about it, it’s an important part of what makes us stay in some places to become loyal nationalists or merely return as tourists.