Reading & resources

Materials that accompany the playbook.

Statements, papers, and primary sources that informed the lessons and that you can read alongside them. We start with the piece referenced most often in the course; further materials will be listed below as they are collected.

Featured reading

Supplementary Statement on Indigenous Data Sovereignty and AI.

Audrey Tang · Written in an individual capacity, in response to a Canadian Senate committee question · ~1,100 words · 6–8 min

Reproduced below in full so quotes used in the lessons can be checked against their source.

Chair, deputy chair and honourable senators, thank you for the opportunity to provide this written response to Senator McCallum’s question on how governments can support indigenous data sovereignty in the age of AI.

I offer this response from my experience in Taiwan, not as advice on Canada’s legislative choices. I also do not speak on behalf of Taiwan’s Council of Indigenous Peoples, the Government of Taiwan or any indigenous peoples or community. A formal statement concerning indigenous policy in Taiwan should be reviewed, shaped and led by the appropriate indigenous institutions.

My contribution is limited to digital-governance experiences centring on how public institutions can make room for community agency, cultural sovereignty, language stewardship and accountable AI.

Taiwan has 16 officially recognized indigenous groups speaking more than 42 indigenous languages. In this context, indigenous data sovereignty cannot be reduced to data protection or privacy compliance. It is about who has the authority to decide how a people, a language, a place or a body of knowledge is represented, interpreted, shared, translated or withheld.

That question becomes especially important with artificial intelligence. AI systems do not only store information. They infer, classify, summarize, translate, predict and generate. They can turn language, stories, images, cultural practices, ecological knowledge and social relationships into training material. Without community governance, this can become a new form of extraction: knowledge taken from its living context and made useful to others without consent, reciprocity or accountability.

In Taiwan, when we speak of “sovereign AI,” this is not a national model trained from the centre. A model is not sovereign simply because it is national or speaks many languages. Sovereignty depends on the process. The more important question is whether each language community can shape the social and cultural composition of the data it represents.

This involves treating data less like oil and more like soil. “Data is oil” assumes extraction, aggregation and refinement elsewhere. “Data is soil” assumes cultivation. It asks who tends the data, understands its seasons and meanings, can correct when it becomes harmful and decides what should be planted.

For indigenous language AI, this distinction is crucial. A language model can support revitalisation, translation, education and intergenerational learning. But it can also flatten cultural differences, expose sensitive materials and make sacred or community-specific knowledge appear freely available. A better approach is not to scrape first and ask questions later. It is to begin with community-defined boundaries: what may be used, for what purpose, under whose review, with what benefits and with what right to revise, reject or withdraw.

Another way to frame this is that attribution and agency should remain attached to knowledge as it moves through digital systems. Privacy alone is not enough. A community may keep data technically protected and still lose authority if its language, stories or knowledge are absorbed into systems that it cannot see, shape, correct or reject. Indigenous data sovereignty asks not only whether data is secure, but whether the people represented by it remain in a living relationship with how it is interpreted and employed.

This is why I distinguish between cultural sovereignty and transcultural sovereignty. Cultural sovereignty means that a community holds authority over its own language, knowledge and self-representation. Transcultural sovereignty means that communities can also choose how to translate across cultures, on their own terms. AI can assist this translation, but this should not be forced. Agency remains with the people and communities whose knowledge is represented.

A practical lesson from Taiwan is that deliberative processes can help communities define how AI should behave before systems become infrastructure. These processes should not be treated as symbolic consultation after a design has already been decided. The purpose is to let affected people draw boundaries early: what an AI system may say, what it must not infer, when it should defer to a human and when it should stop.

In an indigenous context, such a process could support a community’s own code of conduct for AI agents. That code might address language use, cultural protocols, sensitive knowledge, data access, correction pathways and conditions for withdrawal. The point is not that every community should use the same process. The point is that the process should be legible, accountable and answerable to the people whose lives and knowledge are represented.

Another lesson is that smaller, local and domain-specific systems can sometimes better support sovereignty than large general-purpose models. A community-governed model for a specific language, service or knowledge domain can be corrected more quickly, audited more meaningfully and aligned with local context. It can also reduce dependence on distant vendors or platforms that treat community knowledge as raw material for global systems.

This does not mean every system must be local or that all centralized infrastructure is harmful. It means that decisions about scale should be made with sovereignty in mind. Bigger is not always more legitimate. More data is not always more representative. More capability is not always more care. For many communities, the most important features of an AI system may be the ability to inspect, correct, reject, move away from or retire.

The role of government, in this experience, is not to replace community authority with a central expert body. It is to create the conditions in which communities can exercise authority in practice. This can include support for community-controlled infrastructure, language stewardship, local technical capacity, open and interoperable tools and deliberative processes that make participation possible without requiring everyone to become a technologist.

This support must also respect limits. Some knowledge should not be digitized. Some data should not be shared. Some questions should not be answered by an AI system. Indigenous data sovereignty includes the right to participate, but also the right to withhold, refuse and remain unmodeled.

The core principle is simple: indigenous data sovereignty extends beyond preventing harm. It is about preserving the ability of a people to determine its own future in digital form. In AI, that means the right to decide how data is gathered, how language is modeled, how knowledge is interpreted, how benefits are shared and when a system should be corrected, paused or rejected. No community’s language, knowledge or cultural memory should become raw material for AI without consent and reciprocity. And no AI system should be called sovereign unless the people represented have real power to shape what happens next.

— Audrey Tang. Written in an individual capacity; not on behalf of Taiwan’s Council of Indigenous Peoples, the Government of Taiwan, or any Indigenous peoples or community.

Further reading

Companion materials, read alongside the lessons.

The lessons draw on a range of statements, papers, and field accounts. We are collecting them into a shareable reading list and will add them here as we go. Suggestions are welcome: write to hello@gainforest.earth.

Bring this back into practice.

The lessons take these ideas — cultivation, refusal, transcultural sovereignty — and walk you through how to apply them to a real dataset, a real model, a real room.
