February 2013: The Openness Edition

First published on ethnographymatters.net.

Last month on Ethnography Matters, we started a monthly thematic focus where each of the EM contributing editors would elicit posts about a particular theme. I kicked us off with the theme entitled ‘The Openness Edition’, where we investigated what openness means for the ethnographic community. I ended up editing some wonderful posts on the topic of openness last month – from Rachelle Annechino’s great post questioning what “informed consent” means in health research, to Jenna Burrell’s post about open access journals related to ethnography and Sarah Kendzior’s stimulating piece about the legitimacy and place of Internet research by anthropologists. We also had two really wonderful pieces sharing methods for more open, transparent research by Juliano Spyer (YouTube “video tags” as an open survey tool) and by Jeff Hall, Elizabeth Gin and An Xiao in their inspiring piece about how they facilitated story-building exercises with homeless youth in Boyle Heights (complete with PDF instructions!). Below is the editorial that I wrote at the beginning of the month, where I try to tease out some of the complexities of my own relationship with the open access/open content movement. Comments welcome!

On Saturday the 12th of January, almost a month ago, I woke to news of Aaron Swartz’s death the previous day. In the days that followed, I experienced the mixed emotions that accompany such horrific moments: sadness for him and the pain he must have gone through in struggling with depression and anxiety, anger at those who had waged an exaggerated legal campaign against him, uncertainty as I posted about his death on Facebook and felt like I was trying to claim some part of him and his story, and finally resolution that I needed to clarify my own policy on open access.

DataEDGE: A conversation about the future of data science

First posted at the Google Policy blog.

With all the hype around “Big Data” lately, you may be inclined to shrug it off as a business fad. But there is more to it than a buzzword. Data science is emerging as a new field, changing the ways that companies get to know their customers, governments their citizens, and relief organizations their constituents. It is a field which will demand entirely new skill sets and information professionals trained to collect, curate, combine, and analyze massive amounts of data.

Today, we create data both actively (as we socialize, conduct business, and organize online) and passively (via a host of remote sensing devices). McKinsey projects a 40% growth in global data generated annually. Companies and organizations are racing to find new ways to make sense of this data and use it to drive decision-making. In the health sector, that includes investigating the clinical and cost effectiveness of new drugs using large datasets. (McKinsey estimates that the efficient and effective use of data could provide as much as $300 billion in value to the United States healthcare sector.) In the public sector, it could mean using historical unemployment data to reduce the amount of time it takes unemployed workers to find new employment. And in the retail sector, it leads to tools that help suppliers understand demand in stores so they know when they should restock items.

Why the muggle doesn’t like the term “bounded crowdsourcing”

Patrick Meier just wrote a post explaining why the term he coined, “bounded crowdsourcing”, is ‘important for crisis mapping and beyond’. He likens “bounded crowdsourcing” to “snowball sampling”, where a few trusted individuals invite other individuals whom they ‘fully trust and can vouch for… And so on and so forth at an exponential rate if desired’.

I like the idea of trusted networks of people working together (actually, it seems that this technique has been used for decades in the activism community) but I have some problems with the term that has been “coined”. I guess I will be called a “muggle” but I am willing to take the plunge because a) I have never been called a “muggle” and I would like to know what it feels like and b) the “crowdsourcing” term is one I feel is worthy of a duel.

Firstly, I don’t agree with the way that Meier likens “crowdsourcing” work like Ushahidi to statistical methods. I see why he’s trying to make the comparison (to prove crowdsourcing’s value, perhaps?) but I think that it is inaccurate and actually de-values the work involved in building an Ushahidi instance. Working on an Ushahidi deployment is not the same as answering a question through statistical methods. With statistical methods, a researcher (or group of researchers) tries to answer a question or test a hypothesis. ‘Do the majority of Hispanic Americans want Obama to win a second term?’ for example. Or ‘What do Kenyans think is the best place to go on holiday?’

But Ushahidi has never been about gaining a statistically significant understanding of a question or hypothesis. It has been designed as a way for a group of concerned citizens to provide a platform for people to report on what was happening to them or around them. Sure, in many cases, we can get a general feel about the mood of a place by looking at reports, but the lack of a single question (and the power differential between those asking and those being asked), the prevalence of unstructured reports and the skewed distribution of reporters towards those most likely to reply using the technology (or attempting to game the system) make the differences much greater than the similarities.

The other problem is that the term lacks a useful definition. Meier seems to suggest that the “bounded” part refers to the fact that the work is not completely open and is limited to a network of trusted individuals. More useful would be to understand under what conditions and for what types of work different levels of openness are useful, because no crowdsourcing project is entirely “unbounded”. Meier says that he ‘introduced the concept of bounded crowdsourcing to the field of crisis mapping in response to concerns over the reliability of crowd sourced information.’ But if this means that “crowdsourced” information is unreliable, then it would be useful to understand how and when it is unreliable.

If we take the very diverse types of work required of an Ushahidi deployment, we might say that they include the need to customize the design, build the channels (SMS short codes, Twitter hashtags, etc.), designate the themes, advertise the map, curate the reports, verify the reports and find related media reports, among others. Once we’ve broken down the different types of work, we can then decide what level of openness is required for each of these job types. I certainly don’t want to restrict the advertising of my map to the world, so I want to keep that as “unbounded” as possible. I want to ensure that there are enough people with some “ownership” of the map to keep them supporting and talking about it, so I want to give them some jobs that keep them involved. Tagging reports as “verified” is probably a more sensitive activity because it requires a transparent set of rules and is one of the key ways that others come to trust the map or not. So I want to ensure that trusted people, or at least those over whom I have some recourse, do this type of work. I also want to get feedback on themes and hashtags to keep it close to the people, since in the end, a map is only as good as the network that supports it. Now if I have different levels of openness for different areas of work, is my project an example of “bounded” or “unbounded” crowdsourcing?

Although I am always in favor of adding new words to the English language, I feel that the term “bounded crowdsourcing” is unhelpful in leading us towards any greater understanding of the nuances of online work like this. Actually, I’m always surprised at the use of the term “crowdsourcing” over “peer production” in the crisis mapping community, since crowdsourcing implies monetarily or commercially incentivized work rather than the non-monetary incentives that characterise peer production projects like Wikipedia (see an expanded definition + examples here). I can’t imagine anyone ever “coining” the term “unbounded peer production” (but I seem to be continually surprised, so I shouldn’t completely discount it from happening) and I think that this is indicative of the problems with the term.

So, yes, if we’re talking about different ways of improving the reliability of information produced on the Ushahidi platform, I’m excited to learn more about using trusted networks. I just think that if a term is being coined, it should be one that advances our understanding of what the theory is here. Is it that: if you restrict the numbers of people who can take part in writing reports, you get a more reliable result? Where do you restrict? What kind of work should be open? What do we mean by open? Automatic acceptance of Twitter reports with a certain hashtag? Or an email address that you can use to request membership? Is there a certain number that you should limit a team to (as the Skype example suggests)?

This “muggle” thinks that the term doesn’t get us any further towards understanding these (really important) questions. The “muggle” will now squeeze her eyes shut and duck.

Why I won’t support Creative Commons or Wikipedia this year

It’s that time of the year again. Creative Commons and Wikipedia are working towards their fundraising goals for the coming year and asking users to donate to support the cause.

I spent the last five years working on building a global perspective on the commons and will probably spend the next five working out what I did wrong. I worked directly with both organisations during this time, so it’s really sad for me to say this (and probably not very politically astute), but I feel like the only way we’re ever going to attack the problem of a lack of global agenda and global solidarity is through the funding issue. Here are my reasons in brief:

- Creative Commons (despite pressure from its international volunteers) still has a mostly male, mostly white, almost all American leadership. If CC is really committed to an international agenda, then they must at least attempt to involve a more diverse leadership in planning for the future.

- I know it’s a fundraising campaign but statements like this by Hal Abelson: ‘By supporting Creative Commons, you are helping to realize the promise of the Internet to uplift all of humanity’ leave me speechless. Until we have an international *common* agenda, until ‘all of humanity’ or at least major parts of it have ownership of this agenda (South Africa is the only African country in the CC International stable), we should feel ashamed to make statements like this.

- Wikipedia plans to spend $9.4 million in the 2009-10 financial year (up 53% from last year) and has, at last, a plan for spreading the wealth with a new $295,000 grantmaking program (that’s only 3% of spending that goes to chapters but it’s better than almost 0). The problem is that this money seems to be going only to existing chapters (there are no chapters in Africa). This means that, if you wanted money to go specifically to outreach on the African continent, you couldn’t do it, since you can only donate to Wikipedia or to existing Wikipedia chapters.

I think that one of the worst things that organisations who have global goals can do is to stop people from countries who are left out of the agenda from donating money. Even if it’s just a small amount, CC and Wikipedia are perpetuating the myth that we don’t care about these issues in Africa.

My small contribution has, instead, gone to Global Voices. They spread the small amount of money that they receive pretty widely and their leadership team reaches each region at least.

The ‘Digital Open’ is now open

Friends Jess Hemerly and David Evan Harris have asked Simon Dingle and me (from SA, at least) to be judges in this awesome competition/community initiative from BoingBoing, Sun and the Institute for the Future, where they work. As always, the devil is in the detail, and I really love the details of this competition – great social networking features and badges that will be unlocked when users achieve things like writing 10 comments. Best among the prizes (gear, tech, bags, etc.) is that winners in each category will be featured on BoingBoing Video.

Institute for the Future, in partnership with Sun Microsystems and Boing Boing, invites youth worldwide, age 17 and under, to join us as we explore the frontiers of free and open innovation. The Digital Open: An Innovation Expo for Global Youth will celebrate projects in a variety of areas ranging from the environment, art and music to the more traditional open source domains of software and hardware.

From April 15 until August 15, 2009, we’ll accept text, photos, and videos documenting projects from young people around the world who want to contribute to the growing free and open technology community.

But the Digital Open is more than an online competition. By submitting a project, you’ll become a valuable member of a community of creative young innovators working in the exciting world of free and open technology.

Collaboration is encouraged! In addition to a variety of prizes and achievements you can earn through community participation, the top project in each category will earn a fantastic prize pack and be featured on Boing Boing Video!

The future is yours to make! Get started at http://digitalopen.org.

The organisers are looking for stewards to help get the word out and gather submissions in South Africa (one of the target countries). If you’re interested in helping out, please contact me.

The Joburg

Gil Hockman has started a rad project called ‘the joburg’ – an open calendar for events happening in Johannesburg. It’s a totally community-driven, non-commercial project – factors which I think will make it grow exponentially in the future.

According to Gil,

The way it works is as follows:
Google have a very cunning online setup called Google Calendar (you can link to it very easily with or without a Gmail account). One of the features of Google Calendar is that you can create a calendar and share it with a whole bunch of people. Any of these people can then add events to the calendar.
So we have created a Google Calendar for events that are happening in Joburg and then linked it to this page, which is visible to anyone on the internet. Now, whenever anyone adds an event (a gig, exhibition, show, etc.) to the calendar, this site is automatically updated.
Easy peasy.
This is a free project. No one is paid and no one makes any money. There is one guy who has registered the domain name but it only costs about R150 a year and he’s cool to pay it.
How to add events:
Step 1) Email thething@thejoburg.co.za to say that you want to add an event
Step 2) You will be linked to The Joburg’s Google Calendar (if you do not have a Google Calendar account you will need to sign up for one. This is very easy. All that is required is an email address – and it doesn’t even have to be on Gmail)
Step 3) Log in to your Google Calendar account
Step 4) Click on the appropriate date on the calendar and add your event
If you need any help with this or have any comments or suggestions, email thething@thejoburg.co.za
(ps, this is not a Google project in any way but they do have loads of useful free stuff)

IEC website now available to non-Microsoft users

Great news from Tectonic about the Independent Electoral Commission’s website now being open to non-IE users. Congrats to everyone who made this happen. The hundreds of emails, blog posts and complaints to the South African Human Rights Commission have done the trick.

I love the comment by Friedel Wolff from translate.org.za below:
[Image: Friedel Wolff’s comment]

Writing a feature on this for Global Voices.

The relationship between openness, competition and innovation

Stuart Theobald wrote a great piece for the Sunday Times yesterday on the leak of confidential sections of the Competition Commission inquiry into the South African banking sector on wikileaks.com (which neither I nor Bekka can get to for some reason – check it out and let me know if you’re also having a problem).

Theobald writes: ‘The irony is that putting such information into the public domain may actually help the cause of competition, as the banks can take each other on, knowing much more about their competitors.’

Theobald discusses some of the information about profit margins from the report and notes that ‘banks which co-operated the most are prejudiced the most.’ I’d be really interested in seeing how the industry and consumers react to this information and what the overall effect is going to be, both for those who disclosed details (including FNB and Nedbank) and for those who kept them secret.