Proceedings of the International Conference on Smart Systems and Social Management (ICSSSM-2 2025)

Historical Genealogies of Data Colonialism: From Colonial Censuses to Digital AI Systems

Authors
Susmita Banerjee¹,*
¹ The Assam Royal Global University, Guwahati, Assam, India
*Corresponding author. Email: sbanerjee@rgu.ac
Available Online 31 December 2025.
DOI
10.2991/978-2-38476-533-1_59
Keywords
Linguistic Imperialism; AI Ethics; Indigenisation
Abstract

Artificial intelligence is widely celebrated as a transformative force in the twenty-first century, yet its foundations reveal profound continuities with older systems of domination. Far from representing a clean rupture with the past, AI reproduces logics of appropriation and erasure that can be traced back to colonial regimes of knowledge extraction. This paper interrogates these continuities through the lens of data colonialism, situating contemporary AI practices within a longer genealogy of exploitation. During the British colonial period, the census, ethnographic surveys, and cartographic projects served as technologies of governance, converting social and cultural complexity into rigid categories designed for control. These archives, far from neutral, functioned as instruments of epistemic violence, shaping political hierarchies and silencing indigenous knowledge systems. In the present, AI operates through parallel mechanisms: the harvesting of massive datasets from online platforms, social media, and user-generated content without consent; the unauthorised use of personal images and creative works for training models; and the commodification of cultural production for algorithmic reproduction. Such practices echo colonial disregard for agency, positioning human life and creativity as resources to be mined. The persistence of linguistic imperialism compounds these dynamics, with English entrenched as the dominant medium of AI. This linguistic asymmetry marginalises indigenous epistemologies, distorts representation in underrepresented languages, and privileges Eurocentric perspectives. AI’s dependence on frequently searched and widely circulated content further amplifies mainstream narratives, producing outputs that reinforce silences around marginalised communities. The paper argues that addressing these challenges requires the indigenisation of AI: embedding linguistic diversity, centring cultural autonomy, and enforcing consent-driven data practices.
By recovering the historical roots of extraction and domination, the study demonstrates that decolonial perspectives are indispensable for reimagining AI ethics and ensuring that intelligent systems foster plural, inclusive, and socially just futures, rather than reproducing the epistemic violence of colonial archives.

Copyright
© 2025 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the International Conference on Smart Systems and Social Management (ICSSSM-2 2025)
Series
Advances in Social Science, Education and Humanities Research
Publication Date
31 December 2025
ISBN
978-2-38476-533-1
ISSN
2352-5398

Cite this article

TY  - CONF
AU  - Susmita Banerjee
PY  - 2025
DA  - 2025/12/31
TI  - Historical Genealogies of Data Colonialism: From Colonial Censuses to Digital AI Systems
BT  - Proceedings of the International Conference on Smart Systems and Social Management (ICSSSM-2 2025)
PB  - Atlantis Press
SP  - 991
EP  - 1004
SN  - 2352-5398
UR  - https://doi.org/10.2991/978-2-38476-533-1_59
DO  - 10.2991/978-2-38476-533-1_59
ID  - Banerjee2025
ER  -