Calling for EOIs from research assistants interested in two new projects to study the ethics of knowledge graphs

I’ve just received some funding from UTS for two pilot studies relating to the ethics of knowledge graphs. I’m looking for a research assistant (or two) to work on the two projects described below.

Project 1: Communicating Uncertainty in AI Visualisations: A UTS HASS-Data Science Institute research project (October 1-December 17, 2021)

Uncertainty is inevitable in AI because the historical data used to train algorithmic models are an approximation of facts, and algorithmic models cannot perfectly depict events. Yet AI-generated results are often presented to end users as stable and unchallenged. The knowledge graph, a popular form of knowledge representation in AI and data science, is a case in point. Popularised by Google but deployed by all major platforms, knowledge graphs power the auto-completion of search results, the generation of facts in knowledge panels in search results and the provision of answers to users’ questions in digital assistants, among other applications.

Knowledge graphs can serve as bridges between humans and systems and generate human-readable explanations. But many knowledge graphs present knowledge as undisputed and unwavering, even when they are founded on approximations. The provenance of claims is often missing, which makes facts appear more authoritative but leaves users without a mechanism to trace facts back to their original authors. This failure to communicate uncertainty to end users has significant consequences for the responsible use of AI, including the evaluation of the fairness and transparency of knowledge. Communicating results in ways that accurately reflect the uncertainty of claims, and the reasons for that uncertainty, must be a significant part of any ethical knowledge graph system.
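
As a rough sketch of the kind of representation at issue here (my own illustration, not any particular system’s data model; the claim, source description and confidence value below are hypothetical), a knowledge graph statement can be stored as a triple annotated with provenance and an uncertainty estimate that could then be surfaced to end users rather than hidden:

```python
from dataclasses import dataclass

@dataclass
class Claim:
    """A single knowledge graph statement with provenance and uncertainty."""
    subject: str       # entity the claim is about
    predicate: str     # relationship being asserted
    obj: str           # value or entity asserted
    source: str        # where the claim was extracted from (provenance)
    confidence: float  # 0.0-1.0 estimate of how certain the system is

# Hypothetical example: a contested claim whose provenance and uncertainty
# could be shown to users instead of being presented as settled fact.
claim = Claim(
    subject="Jerusalem",
    predicate="capital_of",
    obj="Israel",
    source="extracted from a single web source",
    confidence=0.55,
)

if claim.confidence < 0.8:
    print(f"Contested or uncertain: {claim.subject} {claim.predicate} {claim.obj} "
          f"(source: {claim.source}, confidence: {claim.confidence:.2f})")
```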

Yet questions remain: 1) How does uncertainty affect users’ perceptions of the fairness and transparency of AI? and 2) What are effective approaches to communicating uncertainty to end users in AI-informed decision making? This project will investigate the knowledge graph as an approach for communicating the uncertainty of AI. Drawing together the expertise of HASS researchers on the social relationships embedded in knowledge systems and data scientists from the DSI on the technical aspects, the project will identify ways to communicate the uncertainty of AI with knowledge graphs, including the provenance of knowledge, and the ways in which this affects user perceptions of fairness and transparency.

The RA will be responsible for working with me on a comparative analysis of knowledge graph results (UX, affordances, policies and practices) in terms of their communication of uncertainty, provenance of data sources and so on. We will also collaborate with the UTS Data Science Institute on experiments on AI-informed decision making under different uncertainty and knowledge graph conditions (led by the Data Science Institute team, including Distinguished Professor Fang Chen and Dr Jianlong Zhou).

Project 2: The ethics of knowledge graphs pilot project (October 4, 2021 until September 30, 2022)

Mentored by Professor Mark Graham and Professor Simon Buckingham-Shum

Knowledge graphs are data structures that store facts about the world so that they can be processed by AI systems. Ordinary people around the world encounter knowledge graphs every day. They power Google’s “knowledge panel”, for example: the fact box on the right-hand side of the page that appears after a Google search and lists facts about the entity that a user is searching for. Knowledge graphs are at the centre of the world’s most powerful information retrieval systems, recommender systems and question-answering systems, including Google’s Home, Apple’s Siri and Amazon’s Alexa.

Knowledge graphs represent a significant moment in the history of knowledge technologies because they replaced the platform as a source of possible avenues for answering users’ queries with a source of singular facts. This change has not been adequately understood and its ethical implications are still largely unexamined. I began to explore knowledge graphs in a 2015 study with my PhD supervisor, Professor Mark Graham from the Oxford Internet Institute. We found that Google represented Jerusalem unequivocally as the capital of Israel (even though it was not recognised as such by the international community at the time) and that, although Google appeared to be sourcing claims in its knowledge graph from Wikipedia, it was representing Jerusalem very differently from the nuanced view on Wikipedia. I have since been collecting stories of knowledge graph failures informally. Systematically exploring such failures presents a significant opportunity to outline the key ethical challenges of knowledge graphs in ways that strengthen practitioners’ ability to counter them.

This project will result in a public database of knowledge graph failures that can be used in ethical impact assessment work and in computer and information science ethics education. It will empower engineers to develop solutions that stay close to the everyday experiences of ordinary users. This research will positively impact the user communities that are served by everyday automated knowledge technologies.

There are three goals of this pilot project.

  1. Produce a public database of knowledge graph failures from user and journalistic stories: 
    A website featuring an online database of knowledge graph failure cases from around the world (tentatively titled “knowledgegraphfail.net”) will be produced and published in July 2022. The database will be populated by news articles and social media posts about events in which knowledge graphs that power fact boxes and digital assistants have produced erroneous claims.
  2. Produce three case studies that can be used in computer and information science ethics classes: 
    Three examples in the database will be expanded, developed and illustrated as featured case studies for use in computer and information science ethics university classes, and licensed under a Creative Commons Attribution-ShareAlike license that will enable onward distribution.
  3. Produce an academic paper about “Ordinary ethics of knowledge graphs for the Web” and present initial findings at a relevant digital media ethics conference. Examples of knowledge graph failures will be classified according to ethical principles identified from the user stories in the database according to the “ordinary ethics” tradition. Ordinary ethics attends to how everyday practices reveal the moral commitments embedded in peoples’ actions, in contrast to the tendency to treat ethics as a form of argument or an abstraction.

I am looking for a Research Assistant to work with me on two projects relating to knowledge and AI from a social science perspective. This will require working two or three days a week from the beginning of October for the next year, with the possibility of concentrating work between terms (but with at least one major output due before the end of the year). The projects require working closely with colleagues in Data Science and developing materials for use in data ethics classes. There are possibilities for co-publishing with the Assistant, depending on their skills and interests.

Skills required:

– Excellent writing skills;

– Experience with qualitative data analysis;

– Experience with project management;

– Must be organised and a self-starter.

Not necessary but a plus:

– Ability to read and summarise technical literature;

– Knowledge of critical data studies, algorithm studies and the ethics of AI literature;

– Experience working in interdisciplinary teams.

Please send your CV and a brief email about why you would make a good candidate to me at Heather.Ford@uts.edu.au by the close of 22 September. Restricted to people with work rights in Australia.

—————————

What’s Wikipedia’s role in deciding who gets honoured?

by Heather Ford, Tamson Pietsch & Kelly Tall

On 26 January, the 2021 Australian Honours were announced. They are intended to recognise the outstanding service and contributions of Australians from all walks of life. But there continue to be questions about the extent to which the system represents the diversity of the Australian nation. As this debate continues, it is important to examine the role that the Honours system plays in shaping who and what is judged as important, and its influence on other systems of recognition.

Who gets recognised in the Australian Honours?

Nominations for the Order of Australia are received from members of the public and are assessed by a panel of representatives who judge a person’s merit in four levels of award: Companion (AC), Officer (AO), Member (AM), and Medal (OAM). However, since its establishment in 1975, the awards have attracted criticism. A 1995 review highlighted several problems that continue to endure: from political partisanship and the under-representation of migrant and Indigenous groups, to a poor gender balance and a geographical distribution that is weighted towards urban recipients.

The Honour a Woman project, founded in 2017, has worked to improve gender equity by supporting nominations and highlighting structural barriers to inclusion, but imbalances remain. The absence of Indigenous design and the continued presentation of the awards on 26 January, both issues identified in 1995 as likely to “contribute to the alienation of indigenous Australians”, remain unchanged. For historians Karen Fox and Samuel Furphy, questions about “the politics of national recognition” and “what it may mean for honours to be ‘truly Australian’” are inherent in the system itself.

These ongoing questions highlight the importance of thinking about how notability and distinction are produced more broadly. The work of the Honour a Woman project points clearly to the dependence of the Australian Honours on other systems of distinction. As their nomination guide points out, building a case for an Honours recipient requires mobilising other forms of recognition, such as in the media or through previous awards.

Wikipedia as a recognition platform 

Wikipedia is one of these recognition platforms. Created and maintained by a community of volunteer editors (Wikipedians) using a wiki-based editing system, it is the eighth most popular website in Australia, attracting 200 million page views every month. Although Wikipedia claims to be neutral, it is not free from issues of unequal representation. Research has revealed systemic asymmetries that prioritise men, the Global North (particularly the United States and Western Europe), and those who were born in the last century. Women, minorities and Indigenous knowledges face significant barriers to entry. Between 84 and 92% of Wikipedia’s editors are male, topics with a predominantly female audience are weakly represented, and female editors have to endure a high degree of emotional labour when they encounter Wikipedia’s macho editing culture. 

Wikipedia judges a person’s notability according to external signals. According to policy, people are notable when they “have gained sufficiently significant attention by the world at large and over a period of time”, and when that attention can be verified according to what editors regard as “reliable sources”. Rather than deliberating in a panel about who should be recognised, Wikipedia editors determine notability individually and try to reach consensus with others when there are disputes. While the Order of Australia focuses on those who have rendered service to the nation, Wikipedia has a wider latitude to also include people who are notable for other reasons.

Creating recognition 

Given the problems of unequal representation faced by both the Order of Australia and Wikipedia, understanding the processes through which notability is created and projected is crucial. Given the power and influence of Wikipedia as a source of information, understanding its relationship to the Order of Australia is critical.

Analysis shows that most Honours recipients are not represented in Wikipedia: 89% of those with an Order of Australia do not have a Wikipedia page. But the higher the level of award, the more likely a recipient is to also have a page on Wikipedia. 85% of AC recipients have a page, but only 4% of those with an OAM do (Fig. 1). The overwhelming majority of those judged notable by the Australian Honours system cannot be found on the world’s most used encyclopedia.

Despite this disparity, the Australian Honours system does influence Wikipedia content creation in a powerful way. The announcements of the awards on 26 January and in June every year act as direct triggers for page creation. The heat map (Fig. 2) shows page creation activity for the weeks leading up to and after the announcement week for each level of the honours. It is notable that this effect is most evident at the AO and AM levels. Recognition by the Order of Australia acts as a stimulus for recognition on Wikipedia.

Finally, it is clear that the Wikipedia pages created as a consequence of the Order announcements are for a very particular kind of recipient. Analysis of citation text shows that women who have received an Order of Australia for service in politics, the military, the media, academia and the law are likely to have pages on Wikipedia before the announcement of their award. However, those women who receive an honour for services to disability support, aged care, nursing and Aboriginal affairs have Wikipedia pages created after the announcement of their award. For these forms of work, which are disproportionately accorded less societal recognition in terms of pay and status, the Order of Australia is crucial in acting as a stimulus for wider recognition on Wikipedia. 

As we examine the asymmetries that characterise both the Order of Australia and Wikipedia, it is important to recognise that they are both systems that produce rather than merely reflect notability. Recognition on one platform can produce recognition on another. Whilst this can often work to reinforce existing inequalities, it is an insight that can potentially be employed to address imbalances and under-representation. The Honours system can be used as a tool that draws attention to those individuals that Wikipedia has not hitherto recognised as notable, just as Wikipedia can be a tool to accord recognition to those the Honours system leaves out.

Ultimately these tools lie in the hands of Wikipedia editors and members of the Australian public, any of whom can submit a nomination to the Order of Australia or add an article to Wikipedia. However, as the Honour a Woman initiative points out, and as Wikipedia researchers have highlighted, the ability to effectively use these tools requires understanding the processes, procedures and policies of these recognition platforms. 

Two courses of action might follow on from this. Campaigns, similar to Honour A Woman, that encourage nominations in the Order of Australia for under-represented groups might also consider authoring Wikipedia pages for nominees. At the same time, Wikipedia editors wishing to increase the diversity of representation, might look to those already recognised within the Australian Honours system (particularly at the AO, AM and OAM levels) and create Wikipedia pages for them.

__

Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International

This post first appeared on UTS’s website. A summary is also on Wikimedia’s blog, written by Pru Mitchell. This article highlights selected results from a larger study on the relationship between Wikipedia and the Order of Australia, undertaken by UTS researchers, Heather Ford and Tamson Pietsch, data analyst, Kelly Tall, and Wikimedia Australia volunteers, Alex Lum, Toby Hudson and Pru Mitchell. We acknowledge the generous support of the UTS School of Communication and Wikimedia Australia in funding this research. It was undertaken on Gadigal lands of the Eora Nation. 

Towards a “knowledge gap index” for Wikipedia

In January 2019, prompted by the Wikimedia Movement’s 2030 strategic direction, the Research team at the Wikimedia Foundation identified the need to develop a knowledge gaps index: a composite index to support decision makers across the Wikimedia movement by providing a framework to encourage structured and targeted brainstorming discussions, and data on the state of the knowledge gaps across the Wikimedia projects that can inform decision making and assist with measuring the long-term impact of large-scale initiatives in the Movement.

Below is the text of my response to the first draft of the release. We were encouraged to submit examples of our own research to support our critique. The Research team responded to many of my comments in their final version, as the first step towards building the knowledge gap index.

This is a really great effort and it’s no small feat to gather all this research together in one frame. My comments all surround one missing piece in this, and that is the issue of power. For me, it is power that is missing from the document as a whole – the recognition that editing Wikipedia is as much about power as it is about internet access or demographic features. Framing Wikipedia’s problems as gaps that need to be filled is a mistake because it doesn’t enable us to see how Wikipedia is a system governed by unequal power dynamics that determine who is able to be an effective contributor. More specific comments below:

  • In Section 3, you leave out technical contributors from your definition of contributor. I understand why you might do this, but I think it is a mistake since, as you note: “software and choices made in its design certainly are highly impactful on what types of contributors feel supported and what content is created.” As argued in my paper with Judy Wajcman (Ford, H., & Wajcman, J. (2017). ‘Anyone can edit’, not everyone does: Wikipedia’s infrastructure and the gender gap. Social Studies of Science, 47(4), 511–527. https://doi.org/10.1177/0306312717692172), gendering on Wikipedia happens at the level of infrastructure and code, and it matters who is developing software tools.
  • In Section 3.1.4, you frame language fluency as less important given that “lower fluency individuals can be important for effective patrolling in small wikis [114], increase the diversity of contributors, and allow for the cross-pollination of content that might otherwise remain locked up in other languages [74]”. But it is important to recognise that there are potential problems when editors from powerful language groups (Europe and North America) contribute to small language encyclopedias (e.g. see Cebuano Wikipedia). https://www.quora.com/Why-are-there-so-many-articles-in-the-Cebuano-language-on-Wikipedia
  • In Section 2.1.7 you write about “ethnicity and race” in the context of “Sociodemographic gaps”. I worry that we have virtually no critical race scholarship of Wikipedia and that the sentence you begin with “Ethnicity and race are very contextual as to what it means about an individual’s status and access to resources” downplays the extent to which Wikipedia is a project that prioritises knowledges from white, European cultures. It seems to be a significant gap in our research, one which this strategy will not solve given the emphasis on metrics as an evaluation tool. I urge the group to discuss *this* severe gap with critical race scholars and to start a conversation about race and Wikipedia.
  • In Section 3.3.3, you write about the “tech skills gap” and the research that has found that “high internet skills are associated with an increase in awareness that Wikipedia can be edited and having edited Wikipedia” so that “edit-a-thons in particular can help to bridge this gap”. In some early work with Stuart Geiger, we noted that it isn’t just tech skills that are required to become an empowered member of the Wikimedia community. Rather, it is about “trace literacy” – “the organisational literacy that is necessary to be an empowered, literate member of the Wikimedia community”. We wrote that “Literacy is a means of exercising power in Wikipedia. Keeping traces obscure helps the powerful to remain in power and to keep new editors from being able to argue effectively or even to know that there is a space to argue or who to argue with in order to have their edits endure.” Our recommendation was that “Wikipedia literacy needs to engage with the social and cultural aspects of article editing, with training materials and workshops providing the space to work through particularly challenging scenarios that new editors might find themselves in and to work out how this fits within the larger organizational structure.” Again, this is about power, not skills. (There is a slideshow of the paper from the OpenSym conference, and the paper is at https://www.opensym.org/ws2012/p21wikisym2012.pdf and https://dl.acm.org/doi/10.1145/2462932.2462954)
  • Section 4.1 looks at “Policy Gaps”, although I’m not sure it is appropriate to talk about policies as gaps here. What’s missing are notability policies, and it is in the notability guidelines that the most power to keep Other voices out is exercised. More work needs to be done to investigate this, but the paper above is a start (and perhaps there are others).
  • Section 4.2.2 talks about “Structured data” as a way of improving knowledge diversity and initiatives such as Abstract Wikipedia aiming to “close (the) gap”. Authors should recognise that structured data is not a panacea and that there have been critiques of these programmes within Wikipedia and by social scientists (see, for example, Ford, H., & Graham, M. (2016). Provenance, power and place: Linked data and opaque digital geographies. Environment and Planning D: Society and Space, 34(6), 957–970. https://doi.org/10.1177/0263775816668857; open access at https://ora.ox.ac.uk/catalog/uuid:b5756cd4-6d1e-4da1-971e-37b384cd18ca/download_file?file_format=pdf&safe_filename=EPD_final.pdf&type_of_work=Journal+article)
  • The authors point to metrics and studies of the underlying causes of Wikipedia’s gaps in order to evaluate where the gaps are and where they come from. It is very important to recognise that metrics alone will not solve the problem, but I’m dismayed to see how little has been cited in terms of causes and interventions, and that the only two papers cited are a literature review and a quantitative study. Quantitative research alone will not enable us to understand the causes of Wikipedia’s inequality problems; qualitative and mixed methods research are, indeed, more appropriate methods for asking why questions here. For example, a study that I conducted with Wikipedians helped us to understand that the usual interventions such as editathons and training would not help to fill targeted gaps in articles relating to the South African education curriculum. Instead, the focus needed to be on bringing outsiders in – not by forcing them to edit directly on wiki (this simply wouldn’t happen), but by finding the forms of negotiation required for engaging new editor groups in the long-term project of filling Wikipedia’s gaps. Again, the focus is on the social and cultural aspects of Wikipedia and an emphasis on power. (See https://journals.sagepub.com/doi/full/10.1177/1461444818760870 and https://osf.io/preprints/socarxiv/qn5xd)
  • In terms of the methodology of this review, I noticed that the focus is on the field of “computational social science”, which “tries to characterise and quantify different aspects of Wikimedia communities using a computational approach”. I strongly urge the authors to look beyond computational social science to social science and humanities venues (including STS journals like Science, Technology and Human Values and Social Studies of Science, as well as media studies venues such as New Media and Society).

Also, I’m unsure what this document means in terms of research strategy, but I recommend three main gaps that could be addressed in a future version of this:

  1. A closer engagement with social science literature including critical data studies, media studies, STS to think about causes of Wikipedia inequality.
  2. A dialogue with critical race scholars in order to chart a research agenda to investigate this significant gap in Wikipedia research.
  3. A moment to consider whether framing the problem in terms of “gaps” is the most effective way of understanding the system-wide inequalities within Wikipedia and Wikimedia.

Finally, I believe that regular demographic surveys of Wikimedia users would be incredibly helpful for research and would move us beyond the data that we can regularly access (i.e. metrics) which, as you point out, does not reveal the diversity of our communities. I wish that I had more time to point to other research here, but I work for a university that is under severe strain at the moment and this was all I managed to find time for. I hope it is useful and I look forward to the next version of this!–Hfordsa (talk) 00:47, 30 September 2020 (UTC)

How can content moderation systems be improved?

I was invited to provide a submission to the UK’s House of Lords Communications and Digital Committee who are running an inquiry into freedom of expression online. In the submission below, I answer the Committee’s questions about content moderation systems: How can content moderation systems be improved? Are users of online platforms sufficiently able to appeal moderation decisions with which they disagree? What role should regulators play? Thanks to Lone Sorensen, Stephen Coleman and Giles Moss for their feedback and advice.

From Wikipedia, CC BY SA

Today, a handful of platforms have become keepers of the public discourse, not only for online communities but for whole nations. Platforms adjudicate the truth by forbidding some types of speech, highlighting some and ignoring others. Moderation decisions are applied undemocratically: without transparency, consistency or justification. As a result, inaccurate claims fester and grow across platforms, fuelling polarisation, hate and distrust of public institutions.

Two solutions are generally proffered in response to the growing threat of misinformation and disinformation. The first is education. Users (or more appropriately, citizen users) need to be aware of how to spot fictitious claims. The second is improved moderation. Platforms need to become better at accurately targeting harmful content while allowing content that furthers public debate and enables artistic and intellectual freedom to flourish.

The problem is that these solutions reinforce platforms as the arbiters of truth, far removed from public deliberation. According to these views, platforms make the decisions (perhaps governments step in to provide an independent appeal process) but citizens are left on the other side of the debate, having to educate themselves about how platforms operate without the power to do anything to resolve inaccuracies and lacunae. If citizens notice misinformation on a platform, they can do little to affect its removal other than to participate in obscure reporting mechanisms with no feedback about what happened to their request. If their own content is removed for going against platform rules, they aren’t provided adequate justification or the opportunity for appeal.  

Instead of seeing users only as recipients of education, we need to recognise citizen users as important agents in the moderation process. Improving moderation is not about enhancing platforms’ ability to accurately classify the quality of information. Moderation is not an end in itself – it needs to be seen as a vehicle for greater accountability. Public accountability offers the opportunity for platforms to think more creatively about how to develop moderation practices in the public interest.

Two key strategies are emerging among innovative organisations, companies and platforms involved in the development of the Internet as a public sphere. The first is enabling greater verifiability, by citizen users, of claims made across digital platforms. The second is ensuring adequate justification for decisions made by platforms. Both of these strategies place the emphasis on the shared problem of information quality and dampen the power of platforms to adjudicate the truth for billions of citizens around the world.

Verifiability is a quality of information that points to its ability to be verified, confirmed or substantiated. Claims that are linked to a source that readers can check to confirm that they were authored by that source are more verifiable than those that are not. Merely existing in an external source, however, is not enough for readers to know that the claim is accurate. The claim may have been accurately cited but the original claim might be erroneous. Without being able to do the original research themselves, readers must be provided with information on which to judge whether the source is reliable or not.

Verifiability sets up a productive dynamic between readers and authors. On Wikipedia, verifiability is a core content policy and crucial to its relative resilience against misinformation. It is defined as the ability for “readers [to be] able to check that any of the information within Wikipedia articles is not just made up.” For editors, verifiability means that “all material must be attributable to reliable, published sources.” [i]

Verifiability is under threat as the Web becomes increasingly automated. Wikipedia researchers have found that factual data from Wikipedia that surfaces as answers to user queries in digital assistants and smart search, notably by Google, does not cite Wikipedia as the source of that data [ii]. Search engines and digital assistants are becoming authoritarian gatekeepers of factual knowledge as their answers are adjudicated by algorithms that often remove the source and provide no mechanism for people to appeal the decisions made. The only way that changes seem to be made is via articles written by powerful media companies, and then predominantly in the US. And even then, in some cases journalists have been told to try to get others to help them train the algorithm to remove erroneous content [iii]. Platforms seem to have little control over the truths their algorithms are discharging.

At the least, verifiability should be ensured by citizen users’ ability to check the source of information being proffered. But verifiability can go much further and evade some of the problems in defining universal rules for what constitutes a “reliable source”. Verifiability should ultimately be about the ability of citizen users to make determinations (individually and collectively) about the trustworthiness of information.

Some work has started on verifying the authorship of images, for example, by surfacing metadata about their provenance. The Content Authenticity Initiative [iv] (CAI), for example, is a partnership between Adobe, publishers such as the BBC and the New York Times, and platforms like Twitter, that enables citizen users to click through images they see in order to find out how those images were edited and the context of their source. In time, the CAI believes that people will be trained to look for data that helps them verify a source whenever they see startling information online, rather than merely accepting it at face value.

Tarleton Gillespie, in his book, “Custodians of the Internet [v]” (2018) suggests that “(p)latforms should make a radical commitment to turning the data they already have back to me in a legible and actionable form, everything they could tell me contextually about why a post is there and how I should assess it.” (p199) Examples include flagging to users when their posts are getting a lot of responses from possible troll accounts (with no profile image and few posts) or labelling heavily flagged content or putting it behind a clickthrough warning. Gillespie writes that these could be taken even further, to what he calls “collective lenses”. Users could categorise videos on YouTube as “sexual, violent, spammy, false, or obscene” and these tags would produce aggregate data by which users could filter their viewing experience (p199-200).

In my upcoming book about Wikipedia, I talk about platforms surfacing data that indicates the stability or instability of factual claims. Pandemics, protests, natural disasters and armed conflict are unexpected catalysts followed by a steep spike in information seeking while very little reliable information is available and consensus has not yet been built. This rift between the demand for and supply of reliable information has created the perfect storm for misinformation [vi]. But rather than labelling facts as either true or false in the context of catalytic events, platforms can flag claims as stable or unstable. Instability is a quality of facts and their relation to breaking news events. Platforms have access to significant amounts of data that can signal instability: edit wars on Wikipedia, or traffic spikes in hashtags, search queries and keywords. Rather than marking claims as either true or false, platforms can educate citizen users about the instability of claims (what Professor Noortje Marres from the University of Warwick calls “experimental facts [vii]”) that are still subject to social contestation.
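
A minimal sketch of what such an instability signal might look like, assuming access to hourly edit counts for a page (the threshold and figures below are illustrative, not an actual platform metric):

```python
from statistics import mean, stdev

def is_unstable(hourly_edit_counts, spike_factor=3.0):
    """Flag a page's claims as unstable when recent editing activity spikes
    well above the historical baseline (a crude proxy for contestation)."""
    history, recent = hourly_edit_counts[:-1], hourly_edit_counts[-1]
    baseline = mean(history)
    spread = stdev(history) if len(history) > 1 else 0.0
    return recent > baseline + spike_factor * max(spread, 1.0)

# Illustrative data: a quiet article that suddenly attracts heavy editing
# after a breaking news event.
edits_per_hour = [1, 0, 2, 1, 1, 0, 1, 14]
if is_unstable(edits_per_hour):
    print("Label: claims on this page are currently unstable and subject to change")
```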

A handful of publishers and platforms are experimenting with flagging posts according to their stability. Wikipedia uses human moderators to flag articles subject to breaking news, warning users that information is subject to rapid change and alteration. But automated data indicating peaks in reading and editing would be a more accurate tool for indicating instability, and one that isn’t subject to editorial politics. Platforms like Instagram automatically append tags to posts about vaccines with information from government health services [viii]. Publishers like the Guardian flag articles that are more than a year old so that users recognise that the information is possibly out of date [ix]. Factual claims in question and answer systems such as Google Knowledge Graph or Amazon’s Alexa could indicate the instability of the answers that they select, and urge citizen users to find more information in reliable, institutional resources.  

Platforms should be mandated by government to enable meaningful verifiability of content they host, giving users more control to make their own determinations. Verifiability is a critical principle for balancing the power of platforms to adjudicate the truth, but this doesn’t solve the problem of platform accountability. Even if platforms surface information to help users better adjudicate content, they still make moderation decisions to block, highlight, filter or frame. The problem is not only that they make decisions that affect the health of nations, but that they do so obscurely, inconsistently and without having to justify their decisions adequately to users.

Critical to the principle of accountability is the right to justification, as I’ve argued [x] with my colleague, Dr Giles Moss from the University of Leeds. Decisions made by platforms need to be adequately justified to those affected by those decisions. The problem is that decisions made by platforms to moderate are obscure and not adequately justified. Facebook, for example, bans users without explaining the reasons [xi]. Twitter flags tweets as “misleading” without explanation [xii]. In addition to enhancing the verifiability of content, platforms must also adequately explain their decisions beyond merely flagging content or notifying users that they have been banned.

Platforms will have to experiment with how to provide adequate justifications at scale. They will have to uncover the principles underlying the algorithms that automatically make many of these decisions. And they will have to reveal that information in meaningful ways. Governments can help by providing principles for platform justifications and enabling the independent review of a selection of decisions – not only for those who have been successful in drawing popular support or media attention [xiii], but randomly selected decisions.

Platforms moderate and will continue to moderate. We can’t prevent them making those decisions but we can improve the accountability by which they make those decisions.


[i] Wikipedia, s.v. “Wikipedia: Verifiability,” last modified January 13, 2010, https://en.wikipedia.org/wiki/Wikipedia:Verifiability.

[ii] McMahon, C., Johnson, I., & Hecht, B. (2017). The Substantial Interdependence of Wikipedia and Google: A Case Study on the Relationship Between Peer Production Communities and Information Technologies. Proceedings of the International AAAI Conference on Web and Social Media, 11(1). Retrieved from https://ojs.aaai.org/index.php/ICWSM/article/view/14883

[iii] https://www.nytimes.com/2017/12/16/business/google-thinks-im-dead.html

[iv] https://contentauthenticity.org/

[v] Gillespie, Tarleton. Custodians of the Internet: Platforms, content moderation, and the hidden decisions that shape social media. Yale University Press, 2018.

[vi] https://datasociety.net/library/data-voids/

[vii] Marres, Noortje. “Why we can’t have our facts back.” Engaging Science, Technology, and Society 4 (2018): 423-443.

[viii] https://about.instagram.com/blog/announcements/continuing-to-keep-people-safe-and-informed-about-covid-19

[ix] https://www.bbc.co.uk/news/technology-47799878?intlink_from_url=&

[x] https://www.elgaronline.com/view/edcoll/9781789903089/9781789903089.00019.xml

[xi] See https://www.facebook.com/help/381336705253343

[xii] E.g. see https://hotair.com/allahpundit/2021/04/18/why-did-twitter-flag-my-pro-vaccine-tweet-as-misinformation-n384095

[xiii] https://oversightboard.com/

Australian Media Literacy Research Symposium

Chris Cooper (Reset Australia), Deliana Iacoban (All Together Now), Arial Bogle (The Australian Strategic Policy Institute’s Cyber Center), myself and James Arvanitakis at the Australian Media Literacy Research Symposium, 13 April 2021, Western Sydney University Parramatta Campus

Last week, the Australian Media Literacy Research Symposium was held simultaneously in Sydney, Canberra and Brisbane. Organised by Tanya Notley, Michael Dezuanni and Sora Park, the symposium brought together representatives from civil society, government, the major platforms and research institutions interested in media literacy in Australia.

I spoke on a panel titled “Using media literacy to confront the impact of disinformation on our democracy” with Arial Bogle (The Australian Strategic Policy Institute’s Cyber Center), Deliana Iacoban (All Together Now) and Chris Cooper (Reset Australia). We had a great discussion about the problems of disinformation and what a national media literacy programme might look like in order to respond to those threats. I talked mostly about my work with Wikipedia and how I’ve been thinking not of systems detecting the truth or falsity of claims but rather their stability or instability in the wake of catalytic events. Below is the video of the panel.

BBC Click on Wikipedia interventions

BBC Click interviewed me for a segment on possible manipulation of Wikipedia by the Chinese state (below). Manipulation of Wikipedia by states is not new. What does seem to be new here, though, is the way in which strategies for intervening in Wikipedia (both through the election of administrators and at individual articles) are so explicitly outlined.

Remember, though, that we can never know who is editing these articles. Even wikiedits bots only pick up edits made from within government IP address ranges. We have no way of knowing whether the person behind that IP address in that session is employed by the government. The point is that there is a lot to be gained from influencing Wikipedia’s representation of people, places, events and things, given Wikipedia’s prioritised role as a data source for digital assistants and search engines.
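
For context, tools in the spirit of the wikiedits bots mentioned above do little more than check whether the IP address recorded for an anonymous edit falls within a published range; a rough sketch (the ranges below are documentation placeholders, not real government allocations):

```python
import ipaddress

# Placeholder range standing in for a published government IP allocation.
WATCHED_RANGES = [ipaddress.ip_network("192.0.2.0/24")]

def is_watched_ip(edit_ip: str) -> bool:
    """True if an anonymous edit's IP falls inside a watched range.
    Note: this says nothing about *who* was using the address."""
    addr = ipaddress.ip_address(edit_ip)
    return any(addr in net for net in WATCHED_RANGES)

print(is_watched_ip("192.0.2.45"))   # True: inside the watched range
print(is_watched_ip("203.0.113.7"))  # False: outside it
```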

It makes sense, then, that institutions (including governments, corporations and other organisations) will try to give weight to their version of the truth by taking advantage of the weak points of the peer produced encyclopedia. Guarding against that kind of manipulation is critical but not a problem that can be easily solved. More thoughts on that soon…

How Wikipedia’s Dr Jekyll became Mr Hyde: Vandalism, sock puppetry and the curious case of Wikipedia’s decline

This is a (very) short paper that I will be presenting at Internet Research in Denver this week. I want to write something longer about the story because it feels emblematic of so many of us who lived through our own Internet bubble: when everything seemed possible and there was nothing to lose. This is (a small slice of) Drork’s story.

Richard Mansfield starring in The Strange Case of Dr. Jekyll and Mr. Hyde. Wikipedia. Public Domain.

Abstract: This paper concerns the rise and fall of the Wikipedia editor ‘drork’, who was blocked indefinitely from the English version of the encyclopedia after seven years of constructive contributions, movement leadership and intense engagement. It acts as a companion piece to recent statistical analyses of patterns of conflict and vandalism on Wikipedia, reflecting on why someone who was once committed to the encyclopedia may want to vandalize it. The paper compares two perspectives on the experience of being a Wikipedian: on the one hand, a virtuous experience that enables positive character formation, as more commonly espoused; and alternatively, an experience dominated by in-fighting, personal attacks and the use of Wikipedia to express political goals. It concludes by arguing that the latter behavior is necessary in order to survive as a Wikipedian editing in these highly conflict-ridden areas.

Introduction

Recent scholarship has painted two competing pictures of what Wikipedia and Wikipedians are “like” and what they are motivated by. On the one hand, Benkler and Nissenbaum argue that because people contribute to projects like Wikipedia with motivations “ranging from the pure pleasure of creation, to a particular sense of purpose, through to the companionship and social relations that grow around a common enterprise”, the practice of commons-based peer production fosters virtue and enables “positive character formation” (Benkler and Nissenbaum, 2006). On the other hand, we have heard more recently about how “free and open” communities like Wikipedia have become a haven for aggressive, intimidating behavior (Reagle, 2013) and that reversions of newcomers’ contributions have been growing steadily and may be contributing to Wikipedia’s decline (Halfaker, Geiger, Morgan, & Riedl, in press).

Isolated vs overlapping narratives: the story of an AFD

Editor’s Note: This month’s Stories to Action edition starts off with Heather Ford’s (@hfordsa) story on her experience of watching a story unfold on Wikipedia and in person. While working as an ethnographer at Ushahidi, Heather was in Nairobi, Kenya when she heard news of Kenya’s army invading Somalia. She found out that the article about this story was being nominated for deletion on Wikipedia because it didn’t meet the encyclopedia’s “notability” criteria. This local story became a way for Heather to understand why there was a disconnect between what Wikipedia editors and Kenyans recognised as “notable”. She argues that, although Wikipedia frowns on using social media as sources, the “word on the street” can be an important way for editors to find out what is really happening and how important the story is when it first comes out. She also talks about how her ethnographic work helped her develop insights for a report that Ushahidi would use in their plans to develop new tools for rapid real-time events.

Heather shared this story at Microsoft’s annual Social Computing Symposium organized by Lily Cheng at NYU’s ITP. Watch the video of her talk, in which she refers to changing her mind on an article she wrote a few years ago, The Missing Wikipedians.

________________________________________________________________

A few of us were on a panel at Microsoft’s annual Social Computing Symposium led by the inimitable Tricia Wang. In an effort to reach across academic (and maybe cultural) divides, Tricia urged us to spend five minutes telling a single story and what that experience made us realize about the project we were working on. It was a wonderful way of highlighting the ethnographic principle of reflexivity, where the ethnographer reflects on their attitudes/thoughts/reactions in response to the experiences that they have in the field. I told this story about the misunderstandings faced by editors across geographical and cultural divides, and how I’ve come to understand Articles for Deletion (AfD) discussions on Wikipedia that are related to Kenya. I’ve also added thoughts that I had after the talk/conference based on what I learned there.

In November 2011, I arrived in Nairobi for a visit to the HQ of Ushahidi and to conduct interviews for a project I was involved with to understand how Wikipedians managed sources during rapidly evolving news events. We were trying to figure out how to build tools to help people who collaboratively curate stories about such events – especially when they are physically distant from one another. When I arrived in Nairobi, I went straight to the local supermarket and bought copies of every local newspaper. It was a big news day in the country because of reports that the Kenyan army had invaded Southern Somalia to try and root out the militant Al Shabaab terrorist group. The newspapers all showed Kenyan military tanks and other scenes from the offensive, matched by the kind of bold headlines that characterize national war coverage the world over.

A quick search on Wikipedia, and I noticed that a page had been created but that it had been nominated for deletion on the grounds that it did not meet Wikipedia’s notability criteria. The nominator noted that the event was not being reported as an “invasion” but rather an “incursion”, and that it was “routine” for troops from neighboring countries to cross the border for military operations.

In the next few days in Nairobi, I became steeped in the narratives around this event – on television, in newspapers, in the bars, on Twitter, and FB. I learned that the story was not actually a story about the invasion of one country by another, and that there were more salient stories that only people living in Kenya were aware of:

  1. This was a story about the Kenyan military trying to prove itself: it was the first time since independence that the military had been involved in an active campaign and the country was watching to see whether they would be able to succeed.
  2. The move had been preceded by a series of harrowing stories about the kidnapping of foreign aid workers and tourists on the border with southern Somalia – one of Kenya’s major tourist destinations – and the subsequent move by the British government to advise against Britons traveling to coastal areas near the Somali border. [Another narrative, which Mark Kaigwa pointed out, was that some Kenyans believed this was a move by the government to prevent spending cuts to the military and that, in an election year in Kenya, the military wanted to prove itself.]
  3. There were threats of retaliation by al Shabaab – many sympathizers of whom were living inside Kenya. I remember sitting in a bar with friends and remarking how quiet it was. My friends answered that everyone had been urged not to go out – and especially not to bars because of the threat of attacks at which point I wondered aloud why we were there. Al Shabaab acted on those threats at a bar in the city center only a few miles away from us that night.

I used to think that these kinds of deletions were just an example of ignorance, of cultural imperialism and even of racism. Although some of the responses could definitely be viewed that way, the editor who nominated the article for deletion, Middayexpress, was engaged in the AfD (Articles for Deletion) discussion and had contributed the highest number of edits. His/her actions could not be explained by ignorance and bad faith alone.

What I realized when I was interviewing Wikipedians about these and other articles that were threatened with deletion for so-called “lack of notability” was that editors in countries outside of Kenya didn’t have access to these narratives that would make it obvious that this event was notable enough to deserve its own page. People outside of Kenya would have seen the single narrative about the incursion/invasion without any of these supporting narratives that made this stand out in Kenya as obviously important in the history of the country.

The Facebook page for Operation Linda Nchi has 1,825 Likes and contains news with a significant nationalistic bent about the campaign.

These narratives don’t travel well for three reasons:

a) The volume of international news being covered by traditional media in the West is declining. The story that Western editors were getting was a single story about a military offensive, one they thought must fit within a broader narrative about the Somali war;

b) Much of the local media that people in Kenya were exposed to (not to mention the buzz in the streets and in bars, or the threat of bodily harm by terrorists) did not go online in traditional formats but was available on platforms like Facebook and Twitter; and

c) Even where it did, the front pages of news websites are especially ineffective at showing readers when a single story is really important. In newspapers, when there is a war or a huge story, we fill up the entire front page with it, make the headline shorter, run it along the entire page, and run a massive photograph. The front page of the Kenyan Daily Nation is always going to be busy, with a lot of competing stories, making it really difficult to tell just by looking at the site whether one story is relatively more important than the others.

This story made me realize how important it is for Wikipedians to expose themselves to social media sources so that they can get access to some of these supporting narratives that you just don’t get by looking online, and that despite Wikipedia’s general aversion to social media, this kind of contextual understanding is essential to gaining a more nuanced understanding of local notability. This finding influenced the eventual report for Ushahidi on how Wikipedians manage and debate sources and citations, and lent legitimacy to Ushahidi’s plans to develop news filtering tools for use during rapidly evolving news events such as disasters, elections and political violence.

Featured pic by NS Newsflash (CC-BY) on Flickr

A new chapter: hFord in oxFord

After four months of travelling to visit friends in amazing places and to see some wild places on my own, I have at last settled down in Oxford for my next adventure: three or four years doing my DPhil here at Oxford University. Sometimes I have to pinch myself to believe it!

This was my itinerary from June to October:

San Francisco – Johannesburg (with family) – Cape Town (with Liv) – Johannesburg – Rome (with Steph)- Falerone (with Steph and James and Jon) – Naples – Ravello – Vescovado di Murlo (with Sarah and Eric and Ellie and Helena) – Rome – Washington D.C. (for Wikimania) – Rome – Tel Aviv (with Elad) – Jerusalem – Tiberias – Ashdod – Tel Aviv – Berlin (with Vicky and Alex) – Münster (with Judy and Meinfred) – Baden-Baden – Berlin – Linz (for WikiSym) – Johannesburg – Exeter (with mom) – Padstow – Penzance – Torquay – Oxford – Painswick – Oxford (me, just me)

So many adventures were had. It wasn’t easy (it’s no surprise that the word ‘travel’ comes from the word ‘travail’, to toil, or labor) but I was surprised at how I felt like I could do this forever – wander from one place to the next, visiting friends and peeking in on their lives. Because of the visa insanity and the fact that I need a lobotomy, I didn’t have a camera (not even my iPhone!) for most of the trip. I really wanted to capture everything and so I drew a lot. This, below, was one of my favorite moments:

Image


Can Ushahidi Rely on Crowdsourced Verifications?

First published on PBS Idea Lab

During the aftermath of the Chilean earthquake last year, the Ushahidi-Chile team received two reports — one through the platform, the other via Twitter — that indicated an English-speaking foreigner was trapped under a building in Santiago.

“Please send help,” the report read. “i am buried under rubble in my home at Lautaro 1712 Estación Central, Santiago, Chile. My phone doesnt work.”

A few hours later, a second, similar report was sent to the platform via Twitter: “RT @biodome10: plz send help to 1712 estacion central, santiago chile. im stuck under a building with my child. #hitsunami #chile we have no supplies.”


An investigation a few days later revealed that both reports were false and that the Twitter user was impersonating a journalist working for the Dallas Morning News. But this revelation was not in time to stop two police deployments in Santiago that leaped to the rescue before they realized that the area had not been affected by the quake and that the couple living there was alive and well.

Is false information like this just a necessary by-product of “crowdsourced” environments like Ushahidi? Or do we need to do more to help deployment teams, emergency personnel and users better assess the accuracy of reports hosted on our platform?

Ushahidi is a non-profit tech company that develops free and open-source software for information collection, visualization and interactive mapping. We’ve just published an initial study of how Ushahidi deployment teams manage and understand verification on the platform. Doing this research has surfaced a couple of key challenges about the way that verification currently works, as well as a few easy wins that might add some flexibility into the system. It’s also revealed some questions as we look to improve the platform’s ability to do verification on large quantities of data in the future.

What We’ve Learned

We’ve learned that we need to add more flexibility into the system, enabling deployment teams to choose whether they want to use the “verified” and “unverified” tagging functionality or not. We’ve learned that the binary terms we’re currently using don’t capture other attributes of reports that are necessary to establish both trust and “actionability” (i.e., the ability to act on the information). For example, the “unverified” tag does not capture whether a report is considered to be an act of “misinformation” or just incomplete, lacking the contextual clues necessary to determine whether it is accurate or not.
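
To make the point concrete, here is a sketch of what moving beyond a verified/unverified binary might look like (my own illustration, not Ushahidi’s actual data model):

```python
from dataclasses import dataclass, field
from enum import Enum

class Status(Enum):
    VERIFIED = "verified"
    UNVERIFIED = "unverified"                      # no determination yet
    INCOMPLETE = "incomplete"                      # lacks context needed to judge accuracy
    SUSPECTED_MISINFO = "suspected misinformation"

@dataclass
class Report:
    text: str
    status: Status = Status.UNVERIFIED
    context_notes: list[str] = field(default_factory=list)  # who/what/where/when/how/why

# A report that isn't necessarily false, just missing the context needed to act on it.
r = Report(text="Person reported trapped under rubble")
r.status = Status.INCOMPLETE
r.context_notes.append("Source account created recently; no corroborating reports yet")
```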

We need to develop more flexibility to accommodate these different attributes, but we also need to think beyond these final determinations and understand that users might want contextual information (rather than a final determination on its verification status) to determine for themselves whether a report is trustworthy or not. After all, verification tags mean nothing unless those who must make decisions based on that information trust the team doing the verification.

The fact that many deployments are set up by teams of concerned citizens who may have never worked together before, and who are therefore unknown to the user organizations, makes this an important requirement. Here, we’re thinking of the administering deployment team providing information about the context of a report (answering the who, what, where, when, how and why of traditional journalism, perhaps) and inviting others to help flesh out this information, rather than acting as a “black box” in which the process for determining whether something is verified or not is opaque to users.

As an organization that is all about “crowdsourcing,” we’re taking a step back and thinking about how the crowd (i.e., people who are not known to the system) might assist in either providing more context for reports or verifying unverified reports. When I talk about the “crowd” here I’m referring to a system that’s permeable to interactions by those we don’t yet know. It’s important to note here that, although Ushahidi is talked about as an example of crowdsourcing, this doesn’t mean that the entire process of submission, publishing, tagging and commenting is open for all. Although anyone can start a map and send a report to the map, only administrators can approve and publish reports or tag a report as “verified.”

How Will Crowdsourcing Verification Work?

If we were to open up this process to “the crowd”, we’d have to think really carefully about the options for facilitating verification by the crowd — many of which won’t work in every deployment. Variables like scale, location and persistence differ in each deployment and can affect where and when crowdsourcing of verification will work and where it will do more harm than good.

Crowdsourcing verification can mean many different things. It could mean flagging reports that need more context and asking for more information from the crowd. But who makes the final decision that enough information has been provided to change the status of that information?

We could think of using the crowd to determine when a statistically significant portion of a community agrees with changing the status of a report to “verified.” But is this option limited to cases where a large volume of people are interested (and informed) about an issue, and could a volume-based indicator like this be gamed especially in political contexts?

Crowdsourcing verification could also mean providing users with the opportunity of using free-form tags to highlight the context of the data and then surfacing tags that are popular. But again, might this only be accurate when large numbers of users are involved and where the numbers of reports are low? Do we employ an algorithm to rank the quality of reports based on the history of their authors? It’s tempting to imagine that an algorithm alone will solve the data volume challenges, but algorithms do not work in many cases (especially when reports may be sent by people who don’t have a history of using these tools) and if they’re untrusted, they might force users to hack the system to enable their own processes.
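
As a sketch of why an author-history algorithm is no cure-all (an illustration of the idea only, not a proposed Ushahidi feature): a naive reputation-weighted ranking has nothing to say about first-time reporters, who are often exactly the people sending reports during a crisis.

```python
def rank_reports(reports, author_history):
    """Rank reports by a crude author-reputation score.
    author_history maps author -> fraction of their past reports later verified.
    First-time authors default to 0.0, so their reports sink to the bottom
    even when they are accurate -- the failure mode discussed above."""
    return sorted(
        reports,
        key=lambda r: author_history.get(r["author"], 0.0),
        reverse=True,
    )

reports = [
    {"author": "veteran_mapper", "text": "Road cleared near the stadium"},
    {"author": "first_time_reporter", "text": "Clinic collapsed, people trapped"},
]
history = {"veteran_mapper": 0.9}  # illustrative past-accuracy score
for r in rank_reports(reports, history):
    print(r["author"], "->", r["text"])
```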

An Enduring Question

Verification by the crowd is indeed a large and enduring question for all crowdsourced platforms, not just Ushahidi. The question is how we can facilitate better quality information in a way that reduces harms. One thing is certain: The verification challenge is both technical and social, and no algorithm, however clever, will entirely solve the problem of inaccurate or falsified information.

Thinking about the ecosystem of deployment teams, emergency personnel, users and concerned citizens and how they interact — rather than merely about a monolithic crowd — is the first place to look in understanding what verification strategy makes the most sense. After all, verification is not the ultimate goal here. Getting the right information to the right people at the right time is.


Image of the Basílica del Salvador in the aftermath of the Chilean earthquake courtesy of flickr user b1mbo.