About KenyaNLP

Every Kenyan language deserves world-class AI tools.

We aim to enable inclusive, locally relevant, and globally impactful AI that represents the linguistic and cultural diversity of Kenya.

Our story

KenyaNLP is a pilot initiative under the Masakhane umbrella, uniting Kenya's language efforts through open research, transparent evaluation, and inclusive cross-sector collaboration.

Our mission is straightforward: conduct impactful research for Kenyan languages, advance the technologies that serve them, and ensure that every language, from widely spoken Swahili and Kikuyu to endangered languages like Dahalo and Ogiek, has a place in the future of AI.

Founded in 2025, we've rapidly grown into a collaboration of 23+ contributors spanning universities, industry labs, and the broader Masakhane community.

01

Research

Systematic benchmarking, model evaluation, and original research publications targeting Kenyan language NLP tasks.

02

Community

Knowledge-sharing sessions, reading groups, mentorship programs, and collaborative projects open to all contributors.

03

Data & infrastructure

Building, curating, and open-sourcing datasets and evaluation pipelines for Kenyan languages.

04

Language revival

Collecting speech and text data for endangered and under-resourced Kenyan languages such as Kipsigis, Nandi, Ogiek, and Dahalo.

05

Partnerships

Collaborating with Masakhane, Guild Code, University of Nairobi, Maseno, Microsoft Research, KenCorpus, and Mozilla Common Voice.

Journey so far

2025

KenyaNLP founded under the Masakhane umbrella

2025

Grew to 23+ contributors across 6 institutions

2025

KenBench Phase 1 launched - 10 languages, 15+ datasets

2026

Synthesis workshop & AfricaNLP publication (target)