📄 License: CC-BY 4.0 — You can use, remix, and build on this project with attribution. See LICENSE.txt for details.
KhaM is an open-source project preserving the emotional nuance, dialectal memory, and cultural context of South Asian languages for use in AI systems.
It’s not a translation tool.
It’s a memory engine—designed to make AI speak like your people, not like a textbook.
Most AI speaks in generic, Westernized tones.
KhaM gives AI a soul—so Bhooter Raja sounds like your grandfather, not like Google Translate.
We’re capturing:
- Dialectal expressions (Chatgaya, Sylheti, Bhojpuri, Tamil, etc.)
- Emotionally-tagged prompts
- Voice memory (how people actually sound)
- Cultural context for AI agents (tone, rhythm, intent)
KhaM is modular. Each dialect/persona includes:
- Prompt examples
- Emotion tags
- Idiomatic phrases
- Suggested usage for AI agents
Developers can plug these into LLMs (GPT, Claude, Mistral) to create more human-like assistants.
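As a rough illustration of how one dialect/persona module might be plugged into an LLM, here is a minimal Python sketch. The `DialectModule` class, its field names, and the `build_system_prompt` helper are all assumptions made for this example, not KhaM's actual schema or API, and the angle-bracket values are placeholders rather than real dialect data.

```python
from dataclasses import dataclass, field


@dataclass
class DialectModule:
    """Hypothetical container for one dialect/persona (not KhaM's real schema)."""
    dialect: str
    prompt_examples: list[str] = field(default_factory=list)
    emotion_tags: list[str] = field(default_factory=list)
    idioms: list[str] = field(default_factory=list)
    usage_notes: str = ""


def build_system_prompt(module: DialectModule) -> str:
    """Flatten a module into plain text that can be sent as a system prompt."""
    lines = [f"Speak in the {module.dialect} dialect."]
    if module.emotion_tags:
        lines.append("Emotional tone: " + ", ".join(module.emotion_tags))
    if module.idioms:
        lines.append("Idioms to draw on: " + "; ".join(module.idioms))
    if module.prompt_examples:
        lines.append("Example prompts:")
        lines.extend(f"- {p}" for p in module.prompt_examples)
    if module.usage_notes:
        lines.append("Usage: " + module.usage_notes)
    return "\n".join(lines)


# Placeholder content only; real entries would come from contributors.
sylheti = DialectModule(
    dialect="Sylheti",
    prompt_examples=["<sample prompt>"],
    emotion_tags=["warm", "teasing"],
    idioms=["<idiom 1>", "<idiom 2>"],
    usage_notes="<when an agent should use this persona>",
)
system_prompt = build_system_prompt(sylheti)
print(system_prompt)
```

The resulting string is plain text, so it can be passed as the system message to whichever model client you use (GPT, Claude, Mistral) without tying the module format to a single vendor.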
We’re just getting started. You can help by:
- Submitting voice samples (coming soon)
- Adding idioms and tone rules for your dialect
- Translating sample prompts with emotional nuance
- Reviewing prompt structure
Start by opening an issue or contributing to /prompts/.
On the roadmap:
- Voice-to-emotion tagging layer (/whispered/)
- Dialect contribution spec
- KhaM API playground
- Public data collection portal
In Bangla, Khaam means "envelope."
This is our envelope for the forgotten voices of South Asia—unopened, unspoken, unpreserved until now.
📬 Have a story, saying, or dialect to preserve?
Drop it at: ripon@khamlabs.org