What’s Wikipedia’s role in deciding who gets honoured?

by Heather Ford, Tamson Pietsch & Kelly Tall

On 26 January, the 2021 Australian Honours were announced. They are intended to recognise the outstanding service and contributions of Australians from all walks of life. But questions remain about the extent to which the system represents the diversity of the Australian nation. As this debate continues, it is important to examine the role that the Honours system plays in shaping who and what are judged as important, and its influence on other systems of recognition.

Who gets recognised in the Australian Honours?

Nominations for the Order of Australia are received from members of the public and assessed by a panel of representatives who judge a person’s merit across four levels of award: Companion (AC), Officer (AO), Member (AM), and Medal (OAM). However, since its establishment in 1975, the Order has attracted criticism. A 1995 review highlighted several problems that continue to endure: from political partisanship and the under-representation of migrant and Indigenous groups, to a poor gender balance and a geographical distribution weighted towards urban recipients.

The Honour a Woman project, founded in 2017, has worked to improve gender equity by supporting nominations and highlighting structural barriers to inclusion, but imbalances remain. The absence of Indigenous design and the continued presentation of the awards on 26 January, both issues identified in 1995 as likely to “contribute to the alienation of indigenous Australians”, remain unchanged. For historians Karen Fox and Samuel Furphy, questions about “the politics of national recognition” and what it may mean for honours to be “truly Australian” are inherent in the system itself.

These ongoing questions highlight the importance of thinking more broadly about how notability and distinction are produced. The work of the Honour a Woman project points clearly to the dependence of the Australian Honours on other systems of distinction. As their nomination guide points out, building a case for an Honours recipient requires mobilising other forms of recognition, such as coverage in the media or previous awards.

Wikipedia as a recognition platform 

Wikipedia is one of these recognition platforms. Created and maintained by a community of volunteer editors (Wikipedians) using a wiki-based editing system, it is the eighth most popular website in Australia, attracting 200 million page views every month. Although Wikipedia claims to be neutral, it is not free from issues of unequal representation. Research has revealed systemic asymmetries that prioritise men, the Global North (particularly the United States and Western Europe), and those who were born in the last century. Women, minorities and Indigenous knowledges face significant barriers to entry. Between 84 and 92% of Wikipedia’s editors are male, topics with a predominantly female audience are weakly represented, and female editors have to endure a high degree of emotional labour when they encounter Wikipedia’s macho editing culture. 

Wikipedia judges a person’s notability according to external signals. According to policy, people are notable when they “have gained sufficiently significant attention by the world at large and over a period of time”, and when that attention can be verified according to what editors regard as “reliable sources”. Rather than deliberating in a panel about who should be recognised, Wikipedia editors determine notability individually and try to reach consensus with others when there are disputes. While the Order of Australia focuses on those who have rendered service to the nation, Wikipedia has a wider latitude to also include people who are notable for other reasons.

Creating recognition 

Given the problems of unequal representation faced by both the Order of Australia and Wikipedia, understanding the processes through which notability is created and projected is crucial. Given the power and influence of Wikipedia as a source of information, understanding its relationship to the Order of Australia is critical.

Analysis shows that most Honours recipients are not represented on Wikipedia: 89% of those with an Order of Australia do not have a Wikipedia page. But the higher the level of award, the more likely a recipient is to also have a page: 85% of AC recipients have one, compared with only 4% of those with an OAM (Fig. 1). The overwhelming majority of those judged notable by the Australian Honours system cannot be found on the world’s most used encyclopedia.

Despite this disparity, the Australian Honours system influences Wikipedia content creation in a powerful way. The announcements of the awards on 26 January and in June each year act as direct triggers for page creation. The heat map (Fig. 2) shows page creation activity in the weeks leading up to and after the announcement week for each level of the honours. Notably, this effect is most evident at the AO and AM levels. Recognition by the Order of Australia acts as a stimulus for recognition on Wikipedia.
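For readers curious how this kind of matching can be done in practice, the sketch below is a minimal illustration (not the study’s actual pipeline) of checking whether a recipient has an English Wikipedia article, and when that article was created relative to an announcement date, using the public MediaWiki API. The names, the announcement date and the naive exact-title match are placeholder assumptions; a real analysis would also need to handle disambiguation and identity matching.

```python
# Illustrative sketch only: match names against English Wikipedia and compute
# page-creation timing relative to an Honours announcement date.
from datetime import datetime, timezone
import requests

API = "https://en.wikipedia.org/w/api.php"

def first_revision(title):
    """Return the timestamp of a page's earliest revision, or None if no page exists."""
    params = {
        "action": "query", "format": "json", "redirects": 1,
        "titles": title, "prop": "revisions",
        "rvlimit": 1, "rvdir": "newer", "rvprop": "timestamp",
    }
    pages = requests.get(API, params=params, timeout=30).json()["query"]["pages"]
    page = next(iter(pages.values()))
    if "missing" in page or "revisions" not in page:
        return None  # no English Wikipedia article under this exact title
    return datetime.fromisoformat(page["revisions"][0]["timestamp"].replace("Z", "+00:00"))

# Hypothetical recipients and announcement date, for illustration only.
announcement = datetime(2021, 1, 26, tzinfo=timezone.utc)
for name in ["Jane Citizen", "John Example"]:
    created = first_revision(name)
    if created is None:
        print(f"{name}: no Wikipedia page found")
    else:
        weeks = (created - announcement).days // 7
        print(f"{name}: page created {weeks:+d} weeks relative to the announcement")
```

Binning week offsets like these across all recipients at each award level is one way a heat map such as Fig. 2 could be assembled.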

Finally, it is clear that the Wikipedia pages created as a consequence of the Order announcements are for a very particular kind of recipient. Analysis of citation text shows that women who have received an Order of Australia for service in politics, the military, the media, academia and the law are likely to have pages on Wikipedia before the announcement of their award. However, those women who receive an honour for services to disability support, aged care, nursing and Aboriginal affairs have Wikipedia pages created after the announcement of their award. For these forms of work, which are disproportionately accorded less societal recognition in terms of pay and status, the Order of Australia is crucial in acting as a stimulus for wider recognition on Wikipedia. 

As we examine the asymmetries that characterise both the Order of Australia and Wikipedia, it is important to recognise that they are both systems that produce rather than merely reflect notability. Recognition on one platform can produce recognition on another. Whilst this can often work to reinforce existing inequalities, it is an insight that can potentially be employed to address imbalances and under-representation. The Honours system can be used as a tool that draws attention to those individuals that Wikipedia has not hitherto recognised as notable, just as Wikipedia can be a tool to accord recognition to those the Honours system leaves out.

Ultimately these tools lie in the hands of Wikipedia editors and members of the Australian public, any of whom can submit a nomination to the Order of Australia or add an article to Wikipedia. However, as the Honour a Woman initiative points out, and as Wikipedia researchers have highlighted, the ability to effectively use these tools requires understanding the processes, procedures and policies of these recognition platforms. 

Two courses of action might follow from this. Campaigns similar to Honour a Woman that encourage nominations for under-represented groups in the Order of Australia might also consider authoring Wikipedia pages for nominees. At the same time, Wikipedia editors wishing to increase the diversity of representation might look to those already recognised within the Australian Honours system (particularly at the AO, AM and OAM levels) and create Wikipedia pages for them.

__

Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International

This post first appeared on UTS’s website. A summary is also on Wikimedia’s blog, written by Pru Mitchell. This article highlights selected results from a larger study on the relationship between Wikipedia and the Order of Australia, undertaken by UTS researchers, Heather Ford and Tamson Pietsch, data analyst, Kelly Tall, and Wikimedia Australia volunteers, Alex Lum, Toby Hudson and Pru Mitchell. We acknowledge the generous support of the UTS School of Communication and Wikimedia Australia in funding this research. It was undertaken on Gadigal lands of the Eora Nation. 

Towards a “knowledge gap index” for Wikipedia

In January 2019, prompted by the Wikimedia Movement’s 2030 strategic direction, the Research team at the Wikimedia Foundation identified the need to develop a knowledge gaps index—a composite index to support decision makers across the Wikimedia movement by providing a framework to encourage structured and targeted brainstorming discussions, and data on the state of knowledge gaps across the Wikimedia projects that can inform decision making and assist with measuring the long-term impact of large-scale initiatives in the Movement.

Below is the text of my response to the first draft of the release. We were encouraged to submit examples of our own research to support our critique. The Research team responded to many of my comments in their final version, as the first step towards building the knowledge gap index.

This is a really great effort and it’s no small feat to gather all this research together in one frame. My comments all concern one missing piece, and that is the issue of power. For me, it is power that is missing from the document as a whole – the recognition that editing Wikipedia is as much about power as it is about internet access or demographic features. Framing Wikipedia’s problems as gaps that need to be filled is a mistake because it doesn’t enable us to see how Wikipedia is a system governed by unequal power dynamics that determine who is able to be an effective contributor. More specific comments below:

  • In Section 3, you leave out technical contributors from your definition of contributor. I understand why you might do this, but I think it is a mistake, since, as you note, “software and choices made in its design certainly are highly impactful on what types of contributors feel supported and what content is created.” As argued in my paper with Judy Wajcman (Ford, H., & Wajcman, J. (2017). ‘Anyone can edit’, not everyone does: Wikipedia’s infrastructure and the gender gap. Social Studies of Science, 47(4), 511–527. https://doi.org/10.1177/0306312717692172), gendering on Wikipedia happens at the level of infrastructure and code, and it matters who is developing software tools.
  • In Section 3.1.4, you frame language fluency as less important given that “lower fluency individuals can be important for effective patrolling in small wikis [114], increase the diversity of contributors, and allow for the cross-pollination of content that might otherwise remain locked up in other languages [74]”. But it is important to recognise that there are potential problems when editors from powerful language groups (Europe and North America) contribute to small language encyclopedias (e.g. see Cebuano Wikipedia). https://www.quora.com/Why-are-there-so-many-articles-in-the-Cebuano-language-on-Wikipedia
  • In Section 2.1.7 you write about “ethnicity and race” in the context of “Sociodemographic gaps”. I worry that we have virtually no critical race scholarship of Wikipedia and that the sentence you begin with “Ethnicity and race are very contextual as to what it means about an individual’s status and access to resources” downplays the extent to which Wikipedia is a project that prioritises knowledges from white, European cultures. It seems to be a significant gap in our research, one which this strategy will not solve given the emphasis on metrics as an evaluation tool. I urge the group to discuss *this* severe gap with critical race scholars and to start a conversation about race and Wikipedia.
  • In Section 3.3.3, you write about the “tech skills gap” and the research that has found that “high internet skills are associated with an increase in awareness that Wikipedia can be edited and having edited Wikipedia” so that “edit-a-thons in particular can help to bridge this gap”. In some early work with Stuart Geiger, we noted that it isn’t just tech skills that are required to become an empowered member of the Wikimedia community. Rather, it is about “trace literacy” – “the organisational literacy that is necessary to be an empowered, literate member of the Wikimedia community”. We wrote that “Literacy is a means of exercising power in Wikipedia. Keeping traces obscure helps the powerful to remain in power and to keep new editors from being able to argue effectively or even to know that there is a space to argue or who to argue with in order to have their edits endure.” Our recommendation was that “Wikipedia literacy needs to engage with the social and cultural aspects of article editing, with training materials and workshops providing the space to work through particularly challenging scenarios that new editors might find themselves in and to work out how this fits within the larger organizational structure.” Again, this is about power, not skills. (There is a slideshow of the paper from the OpenSym conference, and the paper is at https://www.opensym.org/ws2012/p21wikisym2012.pdf and https://dl.acm.org/doi/10.1145/2462932.2462954)
  • Section 4.1 looks at “Policy Gaps”, although I’m not sure it is appropriate to talk about policies as gaps. What’s missing here are notability policies, and it is in the notability guidelines where the most power to keep Other voices out is exercised. More work needs to be done to investigate this, but the paper above is a start (and perhaps there are others).
  • Section 4.2.2 talks about “Structured data” as a way of improving knowledge diversity and initiatives such as Abstract Wikipedia aiming to “close (the) gap”. The authors should recognise that structured data is not a panacea and that there have been critiques of these programmes within Wikipedia and by social scientists (see, for example, Ford, H., & Graham, M. (2016). Provenance, power and place: Linked data and opaque digital geographies. Environment and Planning D: Society and Space, 34(6), 957–970. https://doi.org/10.1177/0263775816668857; open access at https://ora.ox.ac.uk/catalog/uuid:b5756cd4-6d1e-4da1-971e-37b384cd18ca/download_file?file_format=pdf&safe_filename=EPD_final.pdf&type_of_work=Journal+article)
  • The authors point to metrics and studies of the underlying causes of Wikipedia’s gaps in order to evaluate where the gaps are and where they come from. It is very important to recognise that metrics alone will not solve the problem, and I’m dismayed to see how little has been cited in terms of causes and interventions, and that the only two papers cited are a literature review and a quantitative study. Quantitative research alone will not enable us to understand the causes of Wikipedia’s inequality problems; qualitative and mixed-methods research are, indeed, more appropriate for asking why questions here. For example, a study that I conducted with Wikipedians helped us to understand that the usual interventions, such as editathons and training, would not help to fill targeted gaps in articles relating to the South African education curriculum. Instead, the focus needed to be on bringing outsiders in – not by forcing them to edit directly on wiki (this simply wouldn’t happen), but by finding the ways of negotiation required for engaging new editor groups in the long-term project of filling Wikipedia’s gaps. Again, the focus is on the social and cultural aspects of Wikipedia and an emphasis on power. (See https://journals.sagepub.com/doi/full/10.1177/1461444818760870 and https://osf.io/preprints/socarxiv/qn5xd)
  • In terms of the methodology of this review, I noticed that the focus is on the field of “computational social science” which “tries to characterise and quantify different aspects of Wikimedia communities using a computational approach”. I strongly urge the authors to look beyond computational social science to the social science and humanities venues (including STS journals like Science, Technology and Human Values and the Social Studies of Science as well as media studies venues such as New Media and Society).

Also, I’m unsure what this document means in terms of research strategy, but I recommend addressing three main gaps in a future version of this:

  1. A closer engagement with social science literature including critical data studies, media studies, STS to think about causes of Wikipedia inequality.
  2. A dialogue with critical race scholars in order to chart a research agenda to investigate this significant gap in Wikipedia research.
  3. A moment to think about whether framing the problem in terms of “gaps” is the most effective way of understanding the system-wide inequalities within Wikipedia and Wikimedia.

Finally, I believe that regular demographic surveys of Wikimedia users would be incredibly helpful for research and would move us beyond the data that we can regularly access (i.e. metrics) which, as you point out, does not reveal the diversity of our communities. I wish that I had more time to point to other research here, but I work for a university that is under severe strain at the moment and this was all I managed to find time for. I hope it is useful and I look forward to the next version of this!–Hfordsa (talk) 00:47, 30 September 2020 (UTC)

How can content moderation systems be improved?

I was invited to provide a submission to the UK’s House of Lords Communications and Digital Committee who are running an inquiry into freedom of expression online. In the submission below, I answer the Committee’s questions about content moderation systems: How can content moderation systems be improved? Are users of online platforms sufficiently able to appeal moderation decisions with which they disagree? What role should regulators play? Thanks to Lone Sorensen, Stephen Coleman and Giles Moss for their feedback and advice.

From Wikipedia, CC BY SA

Today, a handful of platforms have become keepers of the public discourse, not only for online communities but for whole nations. Platforms adjudicate the truth by forbidding some types of speech, highlighting some and ignoring others. Moderation decisions are applied undemocratically: without transparency, consistency or justification. As a result, inaccurate claims fester and grow across platforms, fuelling polarisation, hate and distrust of public institutions.

Two solutions are generally proffered in response to the growing threat of misinformation and disinformation. The first is education. Users (or more appropriately, citizen users) need to be aware of how to spot fictitious claims. The second is improved moderation. Platforms need to become better at accurately targeting harmful content while enabling content that furthers public debate and enables artistic and intellectual freedom to flourish.

The problem is that these solutions reinforce platforms as the arbiters of truth, far removed from public deliberation. According to these views, platforms make the decisions (perhaps governments step in to provide an independent appeal process) but citizens are left on the other side of the debate, having to educate themselves about how platforms operate without the power to do anything to resolve inaccuracies and lacunae. If citizens notice misinformation on a platform, they can do little to affect its removal other than to participate in obscure reporting mechanisms with no feedback about what happened to their request. If their own content is removed for going against platform rules, they aren’t provided adequate justification or the opportunity for appeal.  

Instead of seeing users only as recipients of education, we need to recognise citizen users as important agents in the moderation process. Improving moderation is not about enhancing platforms’ ability to accurately classify the quality of information. Moderation is not an end in itself – it needs to be seen as a vehicle for greater accountability. Public accountability offers the opportunity for platforms to think more creatively about how to develop moderation practices in the public interest.

Two key strategies are emerging among innovative organisations, companies and platforms involved in the development of the Internet as a public sphere. The first is enabling greater verifiability, by citizen users, of claims made across digital platforms. The second is ensuring adequate justification for decisions made by platforms. Both of these strategies place the emphasis on the shared problem of information quality and on dampening the power of platforms to adjudicate the truth for billions of citizens around the world.

Verifiability is a quality of information that points to its ability to be verified, confirmed or substantiated. Claims that are linked to a source that readers can check to confirm that they were authored by that source are more verifiable than those that are not. Merely existing in an external source, however, is not enough for readers to know that the claim is accurate. The claim may have been accurately cited but the original claim might be erroneous. Without being able to do the original research themselves, readers must be provided with information on which to judge whether the source is reliable.

Verifiability sets up a productive dynamic between readers and authors. On Wikipedia, verifiability is a core content policy and crucial to its relative resilience against misinformation. It is defined as the ability for “readers [to be] able to check that any of the information within Wikipedia articles is not just made up.” For editors, verifiability means that “all material must be attributable to reliable, published sources.” [i]

Verifiability is under threat as the Web becomes increasingly automated. Wikipedia researchers have found that factual data from Wikipedia that surfaces as answers to user queries in digital assistants and smart search, notably by Google, does not cite Wikipedia as the source of that data [ii]. Search engines and digital assistants are becoming authoritarian gatekeepers of factual knowledge as their answers are adjudicated by algorithms that often remove the source and provide no mechanism for people to appeal decisions made. The only way that changes seem to be made is via articles written by powerful media companies, and then predominantly in the US. And even then, in some cases journalists have been told to try to get others to help them train the algorithm to remove erroneous content [iii]. Platforms seem to have little control over the truths their algorithms are discharging.

At the least, verifiability should be ensured by citizen users’ ability to check the source of information being proffered. But verifiability can go much further and evade some of the problems in defining universal rules for what constitutes a “reliable source”. Verifiability should ultimately be about the ability of citizen users to make determinations (individually and collectively) about the trustworthiness of information.

Some work has started on verifying the authorship of images, for example, by surfacing metadata about their provenance. The Content Authenticity Initiative [iv] (CAI), for example, is a partnership between Adobe, publishers such as the BBC and the New York Times, and platforms like Twitter, that enables citizen users to click through images they see in order to find out how those images were edited and the context of their source. In time, the CAI believes that people will be trained to look for data that helps them verify a source whenever they see startling information online, rather than merely accepting it at face value.

Tarleton Gillespie, in his book Custodians of the Internet [v] (2018), suggests that “(p)latforms should make a radical commitment to turning the data they already have back to me in a legible and actionable form, everything they could tell me contextually about why a post is there and how I should assess it” (p. 199). Examples include flagging to users when their posts are getting a lot of responses from possible troll accounts (with no profile image and few posts), or labelling heavily flagged content or putting it behind a clickthrough warning. Gillespie writes that these could be taken even further, to what he calls “collective lenses”. Users could categorise videos on YouTube as “sexual, violent, spammy, false, or obscene” and these tags would produce aggregate data by which users could filter their viewing experience (pp. 199–200).

In my upcoming book about Wikipedia, I talk about platforms surfacing data that indicates the stability or instability of factual claims. Pandemics, protests, natural disasters and armed conflict are unexpected catalysts, followed by a steep spike in information seeking while very little reliable information is available and consensus has not yet been built. This rift between the demand and supply of reliable information has created the perfect storm for misinformation [vi]. But rather than labelling facts as either true or false in the context of catalytic events, platforms can flag claims as stable or unstable. Instability is a quality of facts and their relation to breaking news events. Platforms have access to significant amounts of data that can signal instability: edit wars on Wikipedia, or traffic spikes in hashtags, search queries and keywords. Rather than marking claims as either true or false, platforms can educate citizen users about the instability of claims (what Professor Noortje Marres from the University of Warwick calls “experimental facts [vii]”) that are still subject to social contestation.
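As a toy illustration of the kind of signal platforms could compute (my own sketch, not any platform’s actual method), the snippet below uses the public Wikimedia pageviews API to flag when an article’s daily traffic spikes well above its recent baseline. The example article, date window and spike threshold are assumptions made purely for illustration.

```python
# Rough instability sketch: compare the latest day's pageviews to the recent baseline.
from statistics import median
import requests

PAGEVIEWS = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
             "en.wikipedia/all-access/all-agents/{article}/daily/{start}/{end}")

def daily_views(article, start, end):
    """Daily view counts for an English Wikipedia article (timestamps as YYYYMMDDHH)."""
    url = PAGEVIEWS.format(article=article, start=start, end=end)
    resp = requests.get(url, headers={"User-Agent": "instability-sketch/0.1"}, timeout=30)
    return [item["views"] for item in resp.json().get("items", [])]

def looks_unstable(views, spike_factor=5):
    """Flag a topic as 'unstable' when the latest day's traffic dwarfs the recent baseline."""
    if len(views) < 2:
        return False  # not enough data to judge
    baseline, latest = views[:-1], views[-1]
    return latest > spike_factor * median(baseline)

# Illustrative article and window; any topic caught up in a breaking event would do.
views = daily_views("COVID-19_pandemic", "2021010100", "2021013100")
print("unstable" if looks_unstable(views) else "stable")
```

A real signal would, of course, combine several such indicators – reverts, page protection, search-query spikes – rather than relying on pageviews alone.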

A handful of publishers and platforms are experimenting with flagging posts according to their stability. Wikipedia uses human moderators to flag articles subject to breaking news, warning users that information is subject to rapid change and alteration. But automated data indicating peaks in reading and editing would be a more accurate tool for indicating instability, and one that isn’t subject to editorial politics. Platforms like Instagram automatically append tags to posts about vaccines with information from government health services [viii]. Publishers like the Guardian flag articles that are more than a year old so that users recognise that the information is possibly out of date [ix]. Factual claims in question and answer systems such as Google Knowledge Graph or Amazon’s Alexa could indicate the instability of the answers that they select, and urge citizen users to find more information in reliable, institutional resources.  

Platforms should be mandated by government to enable meaningful verifiability of content they host, giving users more control to make their own determinations. Verifiability is a critical principle for balancing the power of platforms to adjudicate the truth, but this doesn’t solve the problem of platform accountability. Even if platforms surface information to help users better adjudicate content, they still make moderation decisions to block, highlight, filter or frame. The problem is not only that they make decisions that affect the health of nations, but that they do so obscurely, inconsistently and without having to justify their decisions adequately to users.

Critical to the principle of accountability is the right to justification, as I’ve argued [x] with my colleague, Dr Giles Moss from the University of Leeds. Decisions made by platforms need to be adequately justified to those affected by those decisions. The problem is that decisions made by platforms to moderate are obscure and not adequately justified. Facebook, for example, bans users without explaining the reasons [xi]. Twitter flags tweets as “misleading” without explanation [xii]. In addition to enhancing the verifiability of content, platforms must also adequately explain their decisions beyond merely flagging content or notifying users that they have been banned.

Platforms will have to experiment with how to provide adequate justifications at scale. They will have to uncover the principles underlying the algorithms that automatically make many of these decisions. And they will have to reveal that information in meaningful ways. Governments can help by providing principles for platform justifications and enabling the independent review of a selection of decisions – not only for those who have been successful in drawing popular support or media attention [xiii], but randomly selected decisions.

Platforms moderate and will continue to moderate. We can’t prevent them making those decisions but we can improve the accountability by which they make those decisions.


[i] Wikipedia, s.v. “Wikipedia: Verifiability,” last modified January 13, 2010, https://en.wikipedia.org/wiki/Wikipedia:Verifiability.

[ii] McMahon, C., Johnson, I., & Hecht, B. (2017). The Substantial Interdependence of Wikipedia and Google: A Case Study on the Relationship Between Peer Production Communities and Information Technologies. Proceedings of the International AAAI Conference on Web and Social Media, 11(1). Retrieved from https://ojs.aaai.org/index.php/ICWSM/article/view/14883

[iii] https://www.nytimes.com/2017/12/16/business/google-thinks-im-dead.html

[iv] https://contentauthenticity.org/

[v] Gillespie, Tarleton. Custodians of the Internet: Platforms, content moderation, and the hidden decisions that shape social media. Yale University Press, 2018.

[vi] https://datasociety.net/library/data-voids/

[vii] Marres, Noortje. “Why we can’t have our facts back.” Engaging Science, Technology, and Society 4 (2018): 423-443.

[viii] https://about.instagram.com/blog/announcements/continuing-to-keep-people-safe-and-informed-about-covid-19

[ix] https://www.bbc.co.uk/news/technology-47799878?intlink_from_url=&

[x] https://www.elgaronline.com/view/edcoll/9781789903089/9781789903089.00019.xml

[xi] See https://www.facebook.com/help/381336705253343

[xii] E.g. see https://hotair.com/allahpundit/2021/04/18/why-did-twitter-flag-my-pro-vaccine-tweet-as-misinformation-n384095

[xiii] https://oversightboard.com/

Australian Media Literacy Research Symposium

Chris Cooper (Reset Australia), Deliana Iacoban (All Together Now), Ariel Bogle (The Australian Strategic Policy Institute’s Cyber Center), myself and James Arvanitakis at the Australian Media Literacy Research Symposium, 13 April 2021, Western Sydney University Parramatta Campus

Last week, the Australian Media Literacy Research Symposium was held simultaneously in Sydney, Canberra and Brisbane. Organised by Tanya Notley, Michael Dezuanni and Sora Park, the symposium brought together representatives from civil society, government, the major platforms and research institutions interested in media literacy in Australia.

I spoke on a panel titled “Using media literacy to confront the impact of disinformation on our democracy” with Ariel Bogle (The Australian Strategic Policy Institute’s Cyber Center), Deliana Iacoban (All Together Now) and Chris Cooper (Reset Australia). We had a great discussion about the problems of disinformation and what a national media literacy programme might look like in order to respond to those threats. I talked mostly about my work with Wikipedia and how I’ve been thinking not of systems detecting the truth or falsity of claims but rather their stability or instability in the wake of catalytic events. Below is the video of the panel.

The Intimate Encyclopedia

The Intimate Encyclopedia is an experiment that makes explicit the subjectivities of encyclopedic knowledge. Using Wikipedia as inspiration, it offers three core principles guiding the writing of articles. It asks authors to present the 1. Subjective Point of View (IE:SPOV), warns readers that content is 2. Unverifiable, and encourages 3. All Original Research (AOR). Although the Intimate Encyclopedia is no longer online, this record reminds us of alternative ways of representing knowledge, distinct from the logics that guide our current truthmaking practices.

Revision 245 of the Intimate Encyclopedia as at 11 December 2020

The following is from a talk I gave at the recent Digital Intimacies symposium organised by Paul Byron, Suneel Jethani, Amelia Johns and Natalie Krikowa, from my discipline group (Digital and Social Media) at the University of Technology Sydney.

I spend most of my time these days trying to understand what it means to know, whose knowledge is recognised and how knowledge should be governed. I do this in a world constituted materially by data, and epistemologically by a moment in which truth seems to be located either in machinic (as opposed to human) processes, or in the humans and crowds who seem to epitomise the rejection of a kind of politics that seems to muddy the truth. Seems, because even the algorithms that drive our truth machines are, we know, a very human craft and very much political artefacts. Seems, because the politicians who rise on the back of an idea that politics is corrupt, we learn, are themselves often politically corrupt. Seems, because crowds are not – as Surowiecki claimed – all wise. They do not always produce more truthful representations than individuals or groups, even if accuracy were the only thing we were in need of right now.

I’m interested in the governance of knowledge and my primary site of study is Wikipedia. When I tried to think about how I’d contribute to a conference dedicated to “Digital Intimacies”, I couldn’t imagine how. Wikipedia seems the opposite of intimate knowledge. Its policies are conservative and representative of Western enlightenment traditions. It asks editors to leave their knowledge at the door in favour of what it considers “reliable sources”, not to do original research, to represent the Neutral Point of View (NPOV).

And yet, in the decade of my research about the 2011 Egyptian Revolution article, constructed as protests descended in ever increasing waves on Egypt’s streets, I learned that intimate knowledge was everywhere. It was in the decisions about which facts to exclude, about who to contact on the ground for verification, in the knowledge about how Wikipedia really works and who to engage in order to make it work for them. As Donna Haraway wrote: “All knowledges are situated. There can be no ‘infinite vision’ – it is a ‘god trick’” (Haraway, 1988, p. 581).

And so, I started to imagine what an encyclopedia that opened itself up to this idea would look like and how it would be governed. This experiment makes knowledge’s subjectivity explicit. With the help of my colleagues in the Digital and Social Media discipline at the UTS School of Communication, we wrote seven encyclopedic articles for the inaugural and only version of the Intimate Encyclopedia. My instructions to authors were to write encyclopedia articles from a personal rather than objective point of view. The other rules came later, as they did with Wikipedia.

The Intimate Encyclopedia begins with three core content principles:

1. Subjective Point of View (IE:SPOV)

All Intimate encyclopedia articles and other encyclopedic content must be written from a subjective point of view, representing the authors’ views truthfully, momentarily and with as much bias as possible.

IE:SPOV

In the example below, Tisha Dejmanee defines the suitcase not only as a “form of luggage” but as a companion (accompanying Tisha to “grad school and new jobs, new houses and growing networks”) that is too big to hide in her new home. For Dejmanee, the suitcase (her suitcase) is symbolic of “the ruptures of 2020 while also serving as a reminder of the continued longing that carries people and hope across the world”. This statement is highly subjective (since when are suitcases symbolic?!) and thus perfectly suitable for the Intimate Encyclopedia.

“Suitcase” by Tisha Dejmanee

In another example, Paul Byron defines his chosen object, the “Portable Webcam” as “a video camera that feeds or streams an image or video in real time” but also as an instrument of oppression that represents constant surveillance and that is reflective of “a sad story of somebody who spends a lot of time at a desk.” This perspective on the webcam is reflective of a very particular moment in time and contains opinions rather than knowledge. Its place in the Intimate Encyclopedia is guaranteed!

“Portable Webcam” by Paul Byron

The second core content principle of the Intimate Encyclopedia is that it is:

2. Unverifiable (IE:U)

References provided are an indication but not evidence of the source for authors’ inspiration. Readers of the Intimate Encyclopedia must accept that authors have produced an accurate representation of their thoughts and feelings. The Intimate Encyclopedia was at one time open for challenge but is no longer*.

IE:U

In the example below, I write about the “Teapot”, “a vessel for steeping black tea leaves in boiling water”. “Only BLACK TEA?” you cry! This is an unverifiable statement (along with the method of making Proper Way tea). The citations here are a ruse – they do not support the statements made. Thankfully there is no need for verifiable knowledge on the Intimate Encyclopedia. Teapots, for this author, are “fragile things” whose “fragility reminded Ford of the tenuousness of our existence and the importance of celebrating small joys – even if they consisted only in a sip of a properly made cup of tea in a real tea cup and from a pot of freshly brewed tea made, importantly, in a teapot.”

“Teapot” by Heather Ford

“Kangaroo Paw” by Amelia Johns is equally unverifiable. To the best of Johns’ knowledge, the kangaroo paw was sourced from a warehouse in Melbourne, but we must rely on Johns’ account because no original receipt was included. Kangaroo Paw, according to Johns, is the companion and toilet to Ella and a reminder of “the delicate balance of nature-animal-human cohabitations that have thrived during the pandemic.”

“Kangaroo paw” by Amelia Johns

3. All Original Research (IE: AOR)

The Intimate Encyclopedia only publishes original, untarnished thought. Although some facts may be attributed to a reliable source, authors must intersperse these with definitions of their own design so that the rendering is completely original.

IE:AOR

Bhuva Narayan’s article on the X-Ray is a very personal account of the object. Instead of an image of a human hand, she reveals that this image is, indeed, of her own hands, her own feet. These reflections are interspersed with factual statements, such as that X-rays were preceded by “pre-historic hunting cultures [that] depicted animals by drawing or painting the skeletal frame and internal organs (Chaloupka, 1993)”.

X-Ray by Bhuva Narayan

In the next article about the “Dummy”, Natalie Krikowa classifies dummies as both “nipple substitute(s)” and objects “located in the cracks between couch cushions”. This original rendering is of a very particular set of dummies belonging to a very particular human.

In the final article, about the “Book”, Alan McKee presents a truly original portrait of this common object, making it very strange in this original rendering. Books, according to McKee, are not only “primitive forms of computers” but also objects that enable anxious people to “avoid staring straight into the face of the terrifying world around them”. The image is not an image of “a book” one might regularly see in an encyclopedic article about books but “a book nibbled by a parrot”. Parrots featuring in articles about books! Original indeed.

“Book” by Alan McKee

Coda

This tiny experiment demonstrates, among other things, that there are multiple ways of representing knowledges and that the rules that govern the dominant representations (from Wikipedia, for example) are not natural or obvious but shaped by particular ways of understanding what it means to know.

Through the experiment, I learned a few facts about books, plants, webcams, suitcases, teapots, x-rays and dummies. I also learned about what is possibly more important: the hopes, longings, anxieties and dreams of the people I spend many of my days with. Intimate knowledges are, indeed, a worthy pursuit… alongside the Other (objective) forms we are so obsessed with at this moment in time.

* The Intimate Encyclopedia was technically available to the public for only a few weeks, even though we didn’t share it with anyone other than the participants of the conference. This is the only record of its existence.

Thanks to Francesco Bailo for installing our Intimate Encyclopedia and helping its authors with their contributions.

Fact Factories: Wikipedia and Writing History as it Happens

I will be speaking at the Digital Histories Research Seminar on Thursday 8 October 2020, 6.00pm (AEST).

On the 24th of January, 2011, an Egyptian-born Wikipedia editor, “The Egyptian Liberal”, published the first draft of an article titled “2011 Egyptian protests” on English Wikipedia. Working with hundreds of other editors over the next two weeks, “The Egyptian Liberal” documented the events that catalysed the downfall of Hosni Mubarak as hundreds of thousands of people descended on Tahrir Square and on cities throughout the country to demand change. In this talk, I’ll discuss my forthcoming book, Fact Factories. I’ll introduce the concept of travelling facts and the mirroring (and sometimes refracting) of material realities on Wikipedia and in the streets of Egypt in ways that framed and eventually helped determine the result of the protests. The talk is about the writing of history as it happens, about the role of automated technology in our collaborative narration of events and about how Wikipedia’s narration will always be a partial one.

Join via Zoom: https://utsmeet.zoom.us/j/99750414645 

Data analyst/visualisation expert needed

Tamson Pietsch, Head of the Centre for Public History at UTS, and I are leading a small pilot project at UTS to analyse Wikipedia’s scope and progress over the past twenty years in Australia, together with collaborators at Wikimedia Australia <https://wikimedia.org.au/wiki/Wikimedia_Australia> (including Pru Mitchell and 99of9|Toby Hudson). We are looking for someone to help us develop a series of visualisations for a pilot project. This will involve extracting data about en.wp.org articles (either from Wikipedia or via Wikidata) and comparing it to another dataset (possibly the Australian Honours List), cleaning and coding data and, importantly, visualising the data using mapping and other visualisation tools. This is a pilot project with resources for a few days’ work, which we would ideally like to happen over the next month. Experience with Wikimedia data analysis is a plus.

Please contact me for more info!

BBC Click on Wikipedia interventions

BBC Click interviewed me for a segment on possible manipulation of Wikipedia by the Chinese state (below). Manipulation of Wikipedia by states is not new. What does seem to be new here, though, is the way in which strategies for intervening in Wikipedia (both through the election of administrators and at individual articles) are so explicitly outlined.

Remember, though, that we can never know who is editing these articles. Even wikiedits bots only pick up edits from within government IP address ranges; we have no way of knowing whether the person behind a given IP address in a given session is employed by the government. The point is that there is a lot to be gained from influencing Wikipedia’s representation of people, places, events and things, given Wikipedia’s prioritised role as a data source for digital assistants and search engines.

It makes sense, then, that institutions (including governments, corporations and other organisations) will try to give weight to their version of the truth by taking advantage of the weak points of the peer produced encyclopedia. Guarding against that kind of manipulation is critical but not a problem that can be easily solved. More thoughts on that soon…

PhD Scholarships on “Data Justice” and “Living with Pervasive Media Technologies from Drones to Smart Homes”

I’m excited to announce that I will be co-supervising up to four very generous and well-supported PhD scholarships at the University of New South Wales (Sydney) on the themes of “Living with Pervasive Media Technologies from Drones to Smart Homes” and “Data Justice: Technology, policy and community impact”. Please contact me directly if you have any questions. Expressions of Interest are due before 20 July 2017 via the links below. Please note that you have to be eligible for postgraduate study at UNSW in order to apply – the requirements are slightly different for the Scientia programme but include a first class honours degree or a Master’s by research. There may be some flexibility here, but those qualifications would be ideal.

Living with Pervasive Media Technologies from Drones to Smart Homes

Digital assistants, smart devices, drones and other autonomous and artificial intelligence technologies are rapidly changing work, culture, cities and even the intimate spaces of the home. They are 21st century media forms: recording, representing and acting, often in real-time. This project investigates the impact of living with autonomous and intelligent media technologies. It explores the changing situation of media and communication studies in this expanded field. How do these media technologies refigure relations between people and the world? What policy challenges do they present? How do they include and exclude marginalized peoples? How are they transforming media and communications themselves? (Supervisory team: Michael Richardson, Andrew Murphie, Heather Ford)

Data Justice: Technology, policy and community impact

With growing concerns that data mining, ubiquitous surveillance and automated decision making can unfairly disadvantage already marginalised groups, this research aims to identify policy areas where injustices are caused by data- or algorithm-driven decisions, examine the assumptions underlying these technologies, document the lived experiences of those who are affected, and explore innovative ways to prevent such injustices. Innovative qualitative and digital methods will be used to identify connections across community, policy and technology perspectives on ‘big data’. The project is expected to deepen social engagement with disadvantaged communities, and strengthen global impact in promoting social justice in a datafied world. (Supervisory team: Tanja Dreher, Heather Ford, Janet Chan)

Further details on the UNSW Scientia Scholarship scheme are available on the titles above and here:
https://www.2025.unsw.edu.au/apply/?interest=scholarships 

Wikipedia’s relationship to academia and academics

I was recently quoted in an article for Science News, by Bethany Brookshire, about the relationship between academia and Wikipedia. I was asked to comment on a recent paper by MIT Sloan’s Neil Thompson and Douglas Hanley, who investigated the relationship between Wikipedia articles and scientific papers using examples from chemistry and econometrics. There are a bunch of studies on a similar topic (if you’re interested, here is a good place to start) and I’ve been working on this topic – but from a very different angle – for a qualitative study to be published soon. I thought I would share my answers to the interview questions here, since many of them are questions that friends and colleagues ask regularly about citing Wikipedia articles and about quality issues on Wikipedia.

Have you ever edited Wikipedia articles?  What do you think of the process?

Some, yes. Being a successful editor on English Wikipedia is a complicated process, particularly if you’re writing about topics that are either controversial or outside the purview of the majority of Western editors. Editing is complicated not only because it is technical (even with the excellent new tools that have been developed to support editing without having to learn wiki markup) – most of the complications come with knowing the norms, the rules and the power dynamics at play.

You’ve worked previously with Wikipedia on things like verification practices. What are the verification practices currently?

That’s a big question 🙂 Verification practices involve a complicated set of norms, rules and technologies. Editors may (or may not) verify their statements by checking sources, but the power of Wikipedia’s claim-making practice lies in the norms of questioning unsourced claims using the “citation needed” tag, and in any other editor being able to remove claims that they believe to be incorrect. This, of course, does not guarantee that every claim on Wikipedia is factually correct, but it does enable the dynamic labelling of unverified claims and the ability to set verification tasks in an iterative fashion.

Many people in academia view Wikipedia as an unreliable source and do not encourage students to use it. What do you think of this?

Academic use of sources is a very contextual practice. We refer to sources in our own papers and publications not only when we are supporting the claims they contain, but also when we dispute them. That’s the first point: even if Wikipedia was generally unreliable, that is not a good reason for denying its use. The second point is that Wikipedia can be a very reliable source for particular types of information. Affirming the claims made in a particular article, if that was our goal in using it, would require verifying the information that we are reinforcing through citation and in citing the particular version (the “oldid” in Wikipedia terms) that we are referring to. Wikipedia can be used very soundly by academics and students – we just need to do so carefully and with an understanding of the context of citation – something we should be doing generally, not only on Wikipedia.

You work in a highly social media savvy field, what is the general attitude of your colleagues toward Wikipedia as a research resource? Do you think it differs from the attitudes of other academics?

I would say that Wikipedia is widely recognized by academics, including those of my colleagues who don’t specifically conduct Wikipedia research, as a source that is fine to visit but not to cite.

What did you think of this particular paper overall?

I thought that it was a really good paper. Excellent research design and very solid analysis. The only weakness, I would argue, would be that there are quite different results for chemistry and econometrics and that those differences aren’t adequately accounted for. More on that below.

The authors were attempting a causational study by adding Wikipedia articles (while leaving some written but unadded) and looking at how the phrases translated to the scientific literature six months later. Is this a long enough period of time?

This seems to be an appropriate amount of time to study, but there are probably quite important differences between fields of study that might influence results. The volume of publication (social scientists and humanities scholars tend to produce far fewer publications, and publication tends to be spread over a longer period, than in natural science and engineering subjects, for example), the volume of explanatory or definitional material in publications (requiring greater use of the literature), the extent to which academics in the particular field consult and contribute to Wikipedia – all might affect how different fields of study influence and are influenced by Wikipedia articles.

Do you think the authors achieved evidence of causation here?

Yes. But again, causation in a single field, i.e. chemistry.

Is it important to know whether Wikipedia is influencing the scientific literature? Why or why not?

Yes. It is important to know whether Wikipedia is influencing scientific literature – particularly because we need to know where power to influence knowledge is located (in order to ensure that it is being fairly governed and maintained for the development of accurate and unbiased public knowledge).

Do you think papers like this will impact how scientists view and use Wikipedia?

As far as I know, this is the first paper that attributes a strong link between what is on Wikipedia and the development of science. I am sure that it will influence how scientists and other academics view and use Wikipedia – particularly in driving initiatives where scientists contribute to Wikipedia either directly or via initiatives such as PLoS’s Topic Pages.

Is there anything especially important to emphasize?

The most important thing to emphasize is the differences between fields, which I think need to be better explained. I definitely think that certain types of academic research are more in line with Wikipedia’s way of working, forms and styles of publication, and epistemology, and that it will not have the same influence on other fields.