Open research community · Building AI for 40+ Kenyan languages

Language technology for every Kenyan language

From Swahili to Ogiek - building AI that speaks your language

We're building AI that understands all 40+ Kenyan languages. Through open research, transparent evaluation, and inclusive collaboration, we ensure no language is left behind.

40+ Kenyan languages targeted
23+ community contributors
10 languages in KenBench v1
15+ open datasets on HuggingFace

Backed by researchers from

Masakhane | Microsoft Research | Guild Code | Maseno University | Lelapa AI | CWI Amsterdam | University of Minnesota

How it works

Join the community

Fill out a short form and get added to our WhatsApp group and project channels.

Pick a project

Choose from active projects - data sourcing, model evaluation, writing, or outreach.

Contribute & publish

Collaborate on research, get co-authorship on papers, and present at top AI venues.

What you'll find here

Why this work matters

Over 70% of Kenyan languages lack usable NLP datasets. We're changing that through coordinated open research so every language gets a seat at the AI table.

Benchmarks & leaderboards

KenBench provides the first comprehensive evaluation suite for Kenyan language models, with transparent leaderboards and plug-and-play evaluation via lm-evaluation-harness.

Open datasets & tools

Curated datasets on HuggingFace, evaluation pipelines, and open-source tools built for Kenya's linguistic diversity.

A community to grow with

Mentorship, reading groups, workshops, and co-authorship opportunities. From first contribution to conference presentation.

Latest (2025): KenBench evaluation is underway across 10 Kenyan languages. A synthesis workshop is planned for April 2026, with publication targeted at AfricaNLP 2026, co-located with EACL.

What our team says

"We envision a future where Kenyan languages and voices are fully represented, respected, and resourced in global AI - ensuring that technology development is for, by, and with the people of Kenya."
Dr. Clemencia Siro

Co-Lead · Postdoctoral Researcher, CWI Amsterdam

"Uniting Kenya's language efforts through open research, transparent evaluation, and inclusive cross-sector collaboration so that no language is left behind."
Dr. Everlyn Asiko

Co-Lead · Research Scientist, Lelapa AI

"To foster a future where language is never a barrier to economic involvement and preparation towards the future of work."
Kevin Obote

Dataset Evaluation Lead · Lead AI Systems Engineer, Guild Code

"Harmonise NLP efforts in Kenya and create localised tools to further language technologies research for Kenyan languages."
Cynthia Amol

Dataset Cleaning Lead · PhD Student, Maseno University

"To build an equitable digital future, Africa must train AI in the languages its people speak, think, and dream in."
Naome Etori

Model Research & Human Eval Lead · University of Minnesota-Twin Cities

"Bridging African NLP to the world. Handling African NLP for societal transformation."
Amos Mwendwa

Website & Outreach Lead · Founder, Meraki Digital Solutions