Open research community · Building AI for 40+ Kenyan languages
Language technology for every Kenyan language
From Swahili to Ogiek - building AI that speaks your language
We're building AI that understands Kenyan languages - all 40+ of them. Through open research, transparent evaluation, and inclusive collaboration, we ensure no language is left behind.
40+
Kenyan languages targeted
23+
Community contributors
10
Languages in KenBench v1
15+
Open datasets on Hugging Face
Backed by researchers from
Masakhane | Microsoft Research | Guild Code | Maseno University | Lelapa AI | CWI Amsterdam | University of Minnesota
How it works
Join the community
Fill out a short form and get added to our WhatsApp group and project channels.
Pick a project
Choose from active projects - data sourcing, model evaluation, writing, or outreach.
Contribute & publish
Collaborate on research, earn co-authorship on papers, and present at top AI venues.
Why this work matters
What you'll find here
Over 70% of Kenyan languages lack usable NLP datasets. We're changing that through coordinated open research so every language gets a seat at the AI table.
Benchmarks & leaderboards
KenBench provides the first comprehensive evaluation suite for Kenyan language models - with transparent leaderboards and plug-and-play evaluation via lm-harness.
Open datasets & tools
Curated datasets on Hugging Face, evaluation pipelines, and open-source tools built for Kenya's linguistic diversity.
A community to grow with
Mentorship, reading groups, workshops, and co-authorship opportunities. From first contribution to conference presentation.
Latest (2025): KenBench evaluation is underway across 10 Kenyan languages. A synthesis workshop is planned for April 2026, targeting publication at AfricaNLP 2026, co-located with EACL.
What our team says
"We envision a future where Kenyan languages and voices are fully represented, respected, and resourced in global AI - ensuring that technology development is for, by, and with the people of Kenya."
Dr. Clemencia Siro
Co-Lead · Postdoctoral Researcher, CWI Amsterdam
"Uniting Kenya's language efforts through open research, transparent evaluation, and inclusive cross-sector collaboration so that no language is left behind."
Dr. Everlyn Asiko
Co-Lead · Research Scientist, Lelapa AI
"To foster a future where language is never a barrier to economic involvement and preparation towards the future of work."
Kevin Obote
Dataset Evaluation Lead · Lead AI Systems Engineer, Guild Code
"Harmonise NLP efforts in Kenya and create localised tools to further language technologies research for Kenyan languages."
Cynthia Amol
Dataset Cleaning Lead · PhD Student, Maseno University
"To build an equitable digital future, Africa must train AI in the languages its people speak, think, and dream in."
Naome Etori
Model Research & Human Eval Lead · University of Minnesota-Twin Cities
"Bridging African NLP to the world. Handling African NLP for societal transformation."
Amos Mwendwa
Website & Outreach Lead · Founder, Meraki Digital Solutions