
Trans Metadata Collective

An intervention into harmful, traditional, and archaic metadata practices.

Published on Feb 14, 2024
Trans Metadata Collective


Metadata is often created about marginalized communities by professionals unfamiliar with the issues and preferences of the communities involved, and so work practices which appear to be neutral continue to enact harm. To counteract these problems, metadata best practices should be developed by representative communities, that is, by those directly affected by exclusionary work practices. Example projects which take up this challenge include the Archives for Black Lives in Philadelphia, Protocols for Native American Archival Materials, the Chicano Studies Collection Thesaurus, and the Trans Metadata Collective (TMDC). TMDC, the focus of this contribution, was created in 2021 with the goal of developing a series of best practices addressing the sizable gap in information resources for trans and gender diverse materials and the communities that data is derived from.

In Inclusive Data: Metadata and Descriptive Language, Sharon Webb highlights the uses of and limitations to metadata in the organization and interpretation of cultural heritage. Drawing on her deep knowledge as a scholar, and her experience as a digital archivist with queer and feminist communities, Webb underscores the profound implications and dynamics inherent in the “power to name” (Olson 2002) when classifying, cataloging, and describing marginalized people, communities, and their artifacts or records. Reflecting on the political nature of metadata creation, Webb urges practitioners to engage with communities in the development of metadata to ensure it accurately reflects diverse identities and experiences and considers the perspectives and terminologies of the communities represented. Speaking directly to an audience likely made up of cultural heritage scholars and metadata professionals, she challenges readers to "think critically and deliberately about interventions which disrupt or challenge the normative and sometimes archaic modes of object and subject descriptions in catalogues" in order to develop "more inclusive knowledge base[s]...[and] reparative justice across intersectional lines of identity" (Webb 2023).

This contribution undertakes Webb’s challenge by describing one such intervention into the traditional, ‘normative and archaic’ metadata practices that Webb flags. Rather than focus on how GLAMS (Galleries, Libraries, Archives, Museums, and Special Collections) can collaborate (and always should have collaborated) with the people and communities whose heritage they hold, whether through purchase, donation, or theft, I extend and build upon the challenge. I focus on how metadata workers and professionals belonging to an oft-described yet rarely-consulted community directly intervened in the labelling, cataloging, classification, and description of themselves and their communities.

Metadata is often created about marginalized communities by professionals unfamiliar with the issues and preferences of the communities involved, and so work practices which appear to be neutral continue to enact harm. Revisions which attempt to rectify present-day harms through redescription and remediation are well-intended, but corrections often originate from dominant positions and can sometimes be more harmful than helpful. For example, the change from the term “Handicapped people” to “People with disabilities” in Library of Congress Subject Headings is strongly rejected by disabled people and activists as it ‘separates’ the person from their disability (Watson and Schaefer 2023; Ladau 2015). In order to counteract these problems, metadata best practices should be developed by those who will be affected by them, as seen in efforts like the Archives for Black Lives in Philadelphia, Protocols for Native American Archival Materials, and the Chicano Studies Collection Thesaurus.

To meet these challenges, the Trans Metadata Collective (TMDC) was created in 2021, with the goal of developing a series of best practices addressing the sizable gap in information resources for trans and gender diverse materials and the communities that data is derived from. Consistent with our final report Metadata Best Practices for Trans and Gender Diverse Resources, I will use 'trans and gender diverse' as an umbrella phrase for people who identify as transgender, transsexual, nonbinary, genderfluid, agender, other related terms and/or have non-Western gender identities. Similarly, I will use the term "resources" to refer to anything for which metadata is assigned (e.g. books, movies, photographs, paintings, archival collections, artifacts, etc.) and the term “metadata creators,” as an umbrella covering cataloguers, librarians, archivists, scholars, and other information professionals involved in the creation or maintenance of metadata.

The group originally formed following a conference panel that discussed the possibility of a trans-centered resource like the Archives for Black Lives (Antracoli et al. 2019) and the Cataloging Code of Ethics documents (Cataloging Ethics Steering Committee 2021). Its members felt that the lack of attention paid to trans and gender diverse issues in GLAMS, especially among creators of metadata standards, warranted a structured intervention. Unlike cisgender people, trans and gender diverse people face unique challenges and are especially vulnerable to metadata which misnames or misgenders them, putting them at risk of harm or violence. Many experience or have experienced forms of personal trauma or violence connected to their current or pre-transition identities.

These harms can be especially threatening in cultural heritage institutions, where records may be poorly maintained, persist for decades, and be aggregated across platforms and services which use GLAMS data, including Google, Wikipedia, ChatGPT and others. Nor are these hypothetical harms: Thompson (2016) examined the authority records for 60 authors who self-identify as trans to ascertain if their authority records ‘outed’ them as trans. By examining fields for recording name and gender, Thompson was able to identify that nearly two thirds (39/60) of the records effectively outed the authors. Only 21 of those 39 records cited the author as the source for the information about their gender identity. Thirty-four of the 39 provided more than one name for the author, either as a name set (alternate versions of a name used simultaneously) or name sequence (names that the author has used in the past but does not currently use). That so many authority records contained sensitive information that did not come from the authors themselves confirmed Thompson’s concern that authority records may unwittingly, and unacceptably, expose the authors in ways the authors never intended. In recent years many authors have raised concerns around the use of names as identifiers and access points (Seeman 2012; Shiraishi 2019; Whittaker 2019; Kazmer 2019; Martin 2019; Arastoopoor and Ahmadinasab 2019; Diao 2015; Antracoli and Rawdon 2019).
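To make the scale of these figures easier to compare, the counts reported above can be converted into percentages. The short sketch below uses only the numbers given in the text; the variable names and the `share` helper are my own illustrative choices and do not come from Thompson's study.

```python
# Counts as reported in the text above (Thompson 2016).
total_records = 60    # authority records examined
outing_records = 39   # records that effectively outed the author
author_sourced = 21   # outing records citing the author as the source
multiple_names = 34   # outing records listing more than one name


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


summary = {
    "outed_of_all": share(outing_records, total_records),
    "author_sourced_of_outed": share(author_sourced, outing_records),
    # Records whose gender information did not cite the author.
    "not_author_sourced_of_outed": share(outing_records - author_sourced, outing_records),
    "multiple_names_of_outed": share(multiple_names, outing_records),
}

for label, percent in summary.items():
    print(f"{label}: {percent}%")
```

Read this way, 18 of the 39 outing records, a little under half, recorded gender information without citing the author as its source.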

The use of authorial or creator names in metadata records is one of the fundamental principles of Western librarianship, and the only remaining feature of Ancient Greek librarianship present in modern-day information science (Strout 1956). Indeed, the pervasiveness of this principle and the importance placed on it are exemplified by Melvil Dewey, who spent more time and ink on the proper naming of authors than on the placement of books in the system which bears his name (Dewey 1876). Commonly used controlled vocabularies, classification systems, or metadata schemas under- and misrepresent trans and gender diverse people and subjects. For example, the commonly used library standard Resource Description and Access (RDA) requires the listing of a creator’s “gender” from a list of three options ("male, female, unknown") (Library of Congress 2024), and the Getty Vocabularies collect gender information for artist records. These represent two of many methods by which sensitive information can be improperly shared.

While there are many publications in Library and Information Science (LIS) concerning issues faced by trans and gender diverse patrons, staff, and users of Galleries, Libraries, Archives, Museums, and Special Collections (GLAMS), little attention has been paid to issues faced in the description, cataloging, and classification of information resources related to these individuals, communities and/or their works, and even fewer publications have consulted members of those communities. With this in mind, the TMDC’s primary goal was the development of a set of best practices for the description and classification of trans and gender diverse individuals, communities, and items which could be widely shared and implemented throughout cultural and informational contexts. Traditionally, metadata workers within GLAMS have been instructed by LIS research to include all iterations of an author’s name, and authors have been directed to self-cite earlier publications.

Through consensus, TMDC created four working groups: Descriptive Practices; Subject Headings & Authorities; Name Authorities & Access; and Ethical Recommendations. These groups were created to allow individual metadata workers to contribute based on their own areas of expertise or interest. Initially consisting of 10-15 members each, the working groups began meeting to develop individual documents. These documents took a variety of forms, including lists, bibliographies, and formal reports.

After discussing the possible formats of the final document, members envisioned something that would be usable across institution types and experience levels; something that could be applied in both public and academic libraries, archives, museums, and special collections at R1 institutions, independent institutions, or even local historical societies. The Collective also aimed for guidelines that could be understood by volunteers or new professionals, but also desired the inclusion of more detailed recommendations for readers with extensive experience and expertise in technical services, metadata, or other fields. Consequently, the Collective agreed on a two-part structure for its final document. The first part, entitled “General Guidelines & Principles,” was high-level and conceptual, an articulation of principles or achievable goals. The second part, entitled “Domain-Specific & Technical Details,” would contain granular technical guidelines to support implementation across a variety of systems. This balance was important to the Collective, as its goal was not to hand down instructions from above but instead to provide support to institutions and individuals exercising their professional expertise and discretion in an informed way.

Following introductory and contextual information, the Collective settled on five high-level guidelines. Each is listed below, opening with the guideline itself and followed by some additional summarization/contextualization by me not present in the original document:

  • Make the process of metadata creation transparent, by making descriptive standards, rationale, and context publicly available, soliciting active (compensated) collaboration with communities, and other steps.

  • Use culturally and contextually appropriate labels for trans and gender diverse communities and subjects by using culturally and contextually appropriate terms—including reclaimed/self-ascribed slurs or otherwise sensitive language—in the original language (original script and transliterated) alongside added translations and descriptions, along with collaborating with affected communities.

  • Correctly name and identify trans individuals by relying on self-identification and self-description where possible, including direct consultation with individuals or communities, and recognizing that it is not necessary and not recommended to record information about someone’s gender identity or previous names when resources have nothing to do with gender identity.

  • Be explicit about transphobia in collections, items, and metadata by identifying perpetrators and victims with the usage of active voice in description and subject headings in cataloging to embed responsibility; work to identify and correct sensitive language, coded language, offensive or inaccurate terminology (including by other metadata creators).

  • Identify trans-related content and metadata through regular assessment and prioritize for remediation, and avoid using automation for batch replacement, instead using it as an assessment tool alongside qualitative analysis of the impact of existing description.
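The last guideline, using automation to assess rather than to batch-replace, can be illustrated with a minimal sketch. Everything in it is hypothetical: the record layout, the `assess` function, the `Flag` type, and the placeholder watch term are my own assumptions for illustration and are not taken from the Collective's report or from any particular catalog system. The script only reports where a term occurs, with surrounding context, so that a person can weigh each case (a reclaimed or self-ascribed term, for instance, may be exactly the right description); it never edits a record.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Flag:
    """One occurrence of a watched term, queued for human review."""
    record_id: str
    field: str
    term: str
    context: str


def assess(records, watch_terms, context_chars=30):
    """Return flags for human review; the records are never modified."""
    found = []
    for record in records:
        for field, value in record.items():
            if field == "id" or not isinstance(value, str):
                continue
            for term in watch_terms:
                # Case-insensitive literal search; keep nearby text for the reviewer.
                for match in re.finditer(re.escape(term), value, flags=re.IGNORECASE):
                    lo = max(0, match.start() - context_chars)
                    hi = match.end() + context_chars
                    found.append(Flag(record["id"], field, term, value[lo:hi]))
    return found


# Hypothetical records and a placeholder term, for illustration only.
records = [
    {"id": "rec-001", "title": "Oral history interview",
     "description": "Uses legacy-term in the donor's own words."},
    {"id": "rec-002", "title": "Community newsletter",
     "description": "No flagged language."},
]

report = assess(records, watch_terms=["legacy-term"])
for flag in report:
    print(flag.record_id, flag.field, repr(flag.context))
```

Returning a list of flags instead of rewritten records keeps the decision about remediation with metadata creators and the communities they consult, which is the point of the guideline.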

Since its release in June of 2022 as an Open Access, freely-licensed document, Metadata Best Practices for Trans and Gender Diverse Resources has been downloaded thousands of times and has been adopted by many institutions as part of their internal practices. Since 2022, several minor updates to the original document have been released to reflect changing technical recommendations or revised terminology (such as subject headings), but the work of the TMDC continues today in a broader way through the Queer Metadata Collective.

A Reply to this Pub
Inclusive Data: Metadata and Descriptive Language

A resource to help develop more inclusive metadata descriptions. Particularly focused on controlled vocabularies. Linked to Stack 1, and Stack 2. Reviewed by Dr Kevin Guyan (July 2023) and Bri Watson (July 2023) - thank you both for your insights, recommendations and suggestions.
