Language Revitalization: Artificial Intelligence as Restorative Technology
By Chanelle Fagbemi
Legacies of genocide have accelerated language loss, threatening the transmission of the ideologies embedded in those languages worldwide. Notions of development and advancement have long characterized imperialism, which established the contemporary world order upon the eradication of cultural and linguistic diversity. That same notion of development now pervades the sphere of technological advancement. In other words, where technology is monopolized and weaponized to exploit long-subjugated populations, technological advancement mirrors the violence of imperial concepts of development. Technology has thus been enlisted in destabilizing socio-cultural systems of knowledge; however, a growing collective of researchers is working to counteract these very harms. Amid this shift toward restorative technology sits research developing artificial intelligence (AI) as the centerpiece of language revitalization and, with it, the restoration of threatened knowledge systems.
In speaking about the significance of language and the emergence of restorative technologies, it is imperative to reflect upon the inequities that currently pervade technology. Discussing how technology is shared requires discourse on how the hegemonic West sustains its exclusivity over technology while exploiting the periphery for advancements within the field. Speaking on the connectivity dilemma across Africa, Kutoma Wakunuma, professor of Information Systems at De Montfort University, insists that “technologies are value-laden (...) whoever is behind the development of these technologies will have their values inculcated in these particular technologies” (Wakunuma, 00:03:37-00:03:50). Wakunuma implicitly reveals technology as an instrument of the global asymmetries created by imperial histories, which were built on the brutish extension of ideological dominance. Her statement also speaks to the term “digital colonialism”: a reality in which Western nations maintain hegemony over digital platforms and preserve their exclusivity, even though those platforms are built largely upon the raw materials and knowledge systems of the Global South. Consider the production of smartphones, where large tech firms such as Apple and Google were named in a 2019 lawsuit filed on behalf of Congolese families over deaths in cobalt mining. The Democratic Republic of Congo supplies 60% of the world’s cobalt, used to produce the batteries in phones that are not only distributed globally but sold back to the very Africans who are simultaneously denied the connectivity needed to engage fully with such technology. Imperial legacies and foreign accumulation of capital remain the source of the ceaseless exploitation of raw materials in Africa, a pattern that lives on in the advancement of modern technology.
Wakunuma’s statement emphasizes that monopolizing the sphere of technology contradicts its entire premise: to connect people and ideas, both of which are embedded in language. The United Nations Educational, Scientific and Cultural Organization (UNESCO) estimates that more than 43% of the roughly 6,000 languages spoken in the world are endangered. Language endangerment is no coincidence. From the European conquests of the Americas to the Berlin Conference, the colonial era is often romanticized as a period of mass exploration undertaken with claimed intentions to develop the peoples of the lands under conquest. In reality, colonization strategically dismantled pre-existing Indigenous ideologies and institutions; nonetheless, the dominant study of history teaches European histories as preeminent, with much of the colonial era framed through the lens of terra nullius, the concept that justifies colonial expansion on the pretense that the appropriated territories were “land belonging to no one.” Given settlers’ determination to rid the land of its significance to Indigenous peoples, terra nullius was plainly a delusion. The capacity of language to communicate the ideologies that inform institutions is among the reasons language became one of many domains attacked under colonialism. Criminalizing the use of Indigenous languages in North American residential schools, for example, served the larger objective of dismantling Indigenous institutions by indoctrinating Indigenous youth in ways compatible with the settler state that sought to subdue them. This history not only explains how language loss and global inequities took shape, but also serves as the very motivation behind the emergence of restorative technologies.
With this context established, though it can address technology’s inaccessibility and language loss only to a limited extent, it is appropriate to discuss AI as restorative technology. AI is the science and engineering of building computer systems that perform tasks associated with human intelligence, relying on algorithms to solve problems (McCarthy). While the age of AI has seemingly neglected the issue of inaccessibility in technology, AI has in recent years become the site of language revitalization projects around the world. Among the projects to revive language through AI is a team of researchers at the Rochester Institute of Technology (RIT) dedicated to preserving the language of the Seneca Nation. Seneca, one of the six Iroquois nations of North America, is estimated to have fewer than fifty fluent speakers, with another few hundred learning Seneca as a second language (Cometa, 2018). In language revitalization, AI works by relating words to one another within a massive text database: through statistical analysis, programmers relate words according to how frequently they appear together in the same context and use those patterns to deduce the words’ meanings (Sagnes, 2020). The issue, however, is that endangered languages tend to lack data in these databases, as they typically have little digitized text and only a small number of living native speakers. While RIT researchers have encountered challenges in acquiring language data, they remain persistent in developing an automatic speech recognition application to document and transcribe the Seneca language (Cometa, 2018). In a similar vein, the Masakhane project is confronting technology’s inaccessibility by strengthening Natural Language Processing (NLP) for African languages. NLP is a field within AI in which computational algorithms are designed to automatically understand, manipulate, and generate human language; some of the most recognized NLP-based systems include speech recognition, auto-correction, and sentiment analysis, among other applications (Cashman, 2020). In this field, Masakhane is developing an open-source AI project for translating African languages, ultimately sustaining those languages by making them accessible as primary modes of communication. African languages tend to occupy far less space within translation systems; on a continent that has conserved much of its immense linguistic diversity despite the impact of colonization, it becomes critical to create conditions for dialogue among vast communities without demanding the use of official languages. The objective of Masakhane is thus to create a culture of understanding by providing space for different languages to be used as intended: as tools for circulating knowledge.
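To make the statistical mechanism described above concrete, the short sketch below builds a word co-occurrence table from a tiny toy corpus and compares words by the contexts they share. It is a minimal illustration of the distributional idea only; the corpus, window size, and function names are assumptions chosen for demonstration and are not drawn from the RIT or Masakhane projects.

```python
# Minimal sketch of the co-occurrence idea described above: words that appear
# in similar contexts receive similar statistical profiles. The toy corpus,
# window size, and function names are illustrative assumptions, not taken
# from any actual revitalization project.
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_counts(sentences, window=2):
    """Count how often each word appears near each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            # Neighbours within `window` positions form the word's context.
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if i != j:
                    counts[word][tokens[j]] += 1
    return counts


def cosine_similarity(vec_a, vec_b):
    """Compare two context-count vectors; 1.0 means identical contexts."""
    shared = set(vec_a) & set(vec_b)
    dot = sum(vec_a[w] * vec_b[w] for w in shared)
    norm_a = sqrt(sum(v * v for v in vec_a.values()))
    norm_b = sqrt(sum(v * v for v in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    # A deliberately tiny English stand-in corpus; a real project would need
    # a large digitized corpus, which is exactly what endangered languages lack.
    corpus = [
        "the river flows through the valley",
        "the stream flows through the forest",
        "the elder tells the story in seneca",
        "the teacher tells the story in english",
    ]
    counts = cooccurrence_counts(corpus)
    print("river ~ stream:", round(cosine_similarity(counts["river"], counts["stream"]), 3))
    print("river ~ story: ", round(cosine_similarity(counts["river"], counts["story"]), 3))
```

Run as a script, the sketch rates “river” as far closer to “stream” than to “story” because the two words occur in nearly identical contexts; scaling this idea up requires exactly the kind of large digitized corpus that endangered languages lack.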
While projects such as those at RIT and Masakhane are hindered by inaccessibility issues of their own, their research shifts the scope of technological development. Technology does not exist as an isolated entity; it is an interactive byproduct of differing knowledge systems. In this transition, technology becomes less exploitative and more focused on the objective of global connectivity. Collectively, these restorative projects foster hope that collaboration between differing knowledge systems will grow out of equitable engagement rather than the legacies of genocidal practices that made language revitalization a contemporary dilemma in the first place. It is ultimately the motivations behind restorative technology that will shape a future of innovation free of the crippling notions of development that preceded it.
Works Cited
Cashman, Katie. “Masakhane: Using AI to Bring African Languages into the Global Conversation.” Reset, 2 June 2020, en.reset.org/blog/masakhane-using-ai-bring-african-languages-global-conversation-02062020.
Cometa, Michelle. “Researchers Use AI to Preserve Seneca Language.” RIT, 15 Oct. 2018, www.rit.edu/news/researchers-use-ai-preserve-seneca-language.
Digital Colonialism: How Big Tech Exploits Africa, Redfish, 25 May 2021, www.youtube.com/watch?v=vfEbpuZkhT0.
Kontzer, Tony. “The Oral of This Story? AI Can Help Keep Rare Languages Alive.” The Official NVIDIA Blog, 3 Jan. 2019, blogs.nvidia.com/blog/2019/01/02/deep-learning-preserves-seneca-language/.
“Masakhane: A Grassroots NLP Community for Africa, by Africans.” Masakhane, www.masakhane.io/.
Sagnes, Charles. “How Can Artificial Intelligence Benefit Language Preservation?” Alcimed, 6 Apr. 2020, www.alcimed.com/en/alcim-articles/how-can-artificial-intelligence-benefit-language-preservation/.
Simmons, Anjuan. “Technology Colonialism.” Model View Culture, 18 Sept. 2015, modelviewculture.com/pieces/technology-colonialism.
“Top Tech Firms Sued over DR Congo Cobalt Mining Deaths.” BBC News, BBC, 16 Dec. 2019, www.bbc.com/news/world-africa-50812616.
“UNESCO Atlas of the World's Languages in Danger.” UNESCO, UNESCO, www.unesco.org/languages-atlas/index.php?hl=en&page=statistics.
“What Is Artificial Intelligence (AI)?” Oracle, www.oracle.com/artificial-intelligence/what-is-ai/.
Why Indigenous Languages Matter and What We Can Do to Save Them, TEDx Talks, 23 Apr. 2019, www.youtube.com/watch?v=g2HiPW_qSrs.