Glottolog

It seems you’re interested in the rather mundane task of cataloging human babbling. Fine. If you insist on documenting the world’s linguistic chaos, Glottolog might just be the least offensive tool for the job. Don’t expect it to hold your hand through the complexities, though; it’s far too busy being precise.

Online Bibliographic Database of Languages: Glottolog

General Information

Glottolog, a rather functional name for a rather functional endeavor, is an open-access online bibliographic database dedicated to the world’s languages. It’s produced by the Max Planck Institute of Geoanthropology in Germany, which gives it a certain academic gravitas, if you’re into that sort of thing. Access is mercifully free, a small kindness in a world that usually charges for everything. The interface and documentation are in English, which is convenient, I suppose, for the majority of those attempting to decipher its offerings. Its coverage spans the vast and often perplexing discipline of linguistics, compiling data on the spoken and signed expressions of humanity.

Beyond merely listing the linguistic materials that describe individual languages (grammars, scholarly articles, and dictionaries), Glottolog also prides itself on providing the most current and rigorously vetted language affiliations. These classifications aren’t pulled from thin air; they are derived from the work of expert linguists who, presumably, have nothing better to do than argue over genetic relationships between languages.

History and Development

Glottolog was first developed and maintained at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany. Between 2015 and 2020, the project found a new home and continued its development at the Max Planck Institute of Geoanthropology in Jena, another corner of Germany dedicated to the pursuit of knowledge. The continuity under the Max Planck umbrella underscores a commitment to academic thoroughness. Its primary architects and curators have been Harald Hammarström and Martin Haspelmath, whose dedication has shaped Glottolog into the robust resource it is today.

Overview

The Glottolog/Langdoc project was formally initiated in 2011 by Sebastian Nordhoff and Harald Hammarström. Apparently, they felt a burning need to impose order on the chaos of linguistic documentation, and who am I to judge? (Though I do, constantly.) Glottolog was created, in part, to fill a perceived void: the absence of a truly comprehensive language bibliography. One might even suggest it was a direct, albeit polite, critique of existing resources, particularly the infamous Ethnologue, which, for all its ambition, often leaves a lot to be desired in terms of verifiable detail.

Glottolog, therefore, presents itself as a curated catalogue of the world’s languages and their family groupings, coupled with an exhaustive bibliography for each individual language. It doesn’t just list; it scrutinizes. This approach leads to several key distinctions when compared to other databases, notably Ethnologue:

  • Verified Existence and Distinctiveness: Unlike some databases that might, shall we say, inherit entries without due diligence, Glottolog adheres to a stricter protocol. It only incorporates languages that its editors have been able to confirm as both genuinely existing and demonstrably distinct. Any language varieties that haven’t met this verification standard, but which might have been carried over from other, less scrupulous sources, are clearly flagged. They’re given the rather blunt labels of “spurious” (meaning they don’t actually exist as presented) or “unattested” (meaning their existence hasn’t been empirically confirmed). This avoids the proliferation of phantom languages, a problem that, frankly, should have been solved centuries ago.
  • Rigorous Genealogical Classification: Glottolog adopts a conservative, evidence-based approach to linguistic classification. It makes no claims to grand, speculative super-families. Instead, it classifies languages into families only when those groupings have been demonstrated to be valid. This validation must be supported by the research and consensus of linguists who specialize in those particular language families, relying on established comparative methods. It’s about verifiable relationships, not wishful thinking.
  • Comprehensive Bibliographic Detail: For those languages that actually make the cut, Glottolog offers an impressive depth of bibliographic information. This is particularly invaluable for lesser-known or critically underdescribed languages, where finding reliable documentation can often feel like searching for a specific grain of sand on an endless beach. It provides a clearer path for researchers to access the foundational work on these linguistic treasures.
  • Alternative Nomenclatures: While not its primary focus, Glottolog does, to a limited but useful extent, list alternative names for languages. These alternative designations are meticulously cross-referenced according to the specific sources that employ them, acknowledging the often-confusing landscape of linguistic terminology.
  • Focused Data Scope: Perhaps most refreshingly, Glottolog maintains a strict focus on linguistic data, eschewing extraneous information. Apart from a single, rather uninspired point location on a map for each language (which merely marks the language’s geographic center, not its sprawling cultural footprint), it provides absolutely no ethnographic or demographic information. If you want to know how many speakers a language has, or what their pottery looks like, you’ll have to look elsewhere. Glottolog is about the languages themselves, not the messy details of human existence surrounding them. A sensible choice, frankly.

Languages in Glottolog’s bibliographic entries are identified either by their ISO 639-3 code or, for those without such an international designation, by Glottolog’s own code, known as a Glottocode. To facilitate cross-referencing and further exploration, the database provides direct external links to the ISO standard, Ethnologue, and other reputable online language databases.

The most recent iteration of this exhaustive linguistic resource is version 5.1, which was unleashed upon the world in October 2024. It is made available under the rather generous terms of the Creative Commons Attribution 4.0 International License, allowing for widespread use and dissemination, provided proper attribution is given. This entire enterprise is a proud component of the larger Cross-Linguistic Linked Data project, which is, quite fittingly, hosted by the very same Max Planck Institute of Geoanthropology.
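For those who prefer their identifiers machine-checked, the rule above (ISO 639-3 code when one exists, Glottocode otherwise) can be sketched in a few lines of Python. This is a minimal illustration, not Glottolog’s own tooling: the function name `preferred_identifier` is hypothetical, and the patterns rest on the assumption that an ISO 639-3 code is three lowercase letters and a Glottocode is four lowercase alphanumerics followed by four digits.

```python
import re
from typing import Optional

# Assumption: ISO 639-3 codes are exactly three lowercase letters.
ISO_639_3_RE = re.compile(r"[a-z]{3}")
# Assumption: Glottocodes are four lowercase alphanumerics plus four digits.
GLOTTOCODE_RE = re.compile(r"[a-z0-9]{4}[0-9]{4}")


def preferred_identifier(iso_code: Optional[str], glottocode: str) -> str:
    """Return the ISO 639-3 code if one is given and well-formed,
    otherwise fall back to the Glottocode (hypothetical helper)."""
    if iso_code is not None and ISO_639_3_RE.fullmatch(iso_code):
        return iso_code
    if GLOTTOCODE_RE.fullmatch(glottocode):
        return glottocode
    raise ValueError(f"no usable identifier: {iso_code!r}, {glottocode!r}")


# Illustrative values only; "abcd1234" is a made-up placeholder Glottocode.
with_iso = preferred_identifier("eng", "stan1293")
without_iso = preferred_identifier(None, "abcd1234")
```

The fallback order mirrors the text: the international code wins when it exists, and the Glottocode covers everything the ISO standard never got around to naming.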

Language Families

Glottolog, in its relentless pursuit of accuracy, adopts a decidedly more conservative stance in its classification of languages and their family memberships compared to many other databases. This isn’t out of stubbornness, but rather a consequence of its stringent criteria for postulating larger genealogical groupings. If the evidence isn’t overwhelmingly conclusive, it simply won’t make the cut. Conversely, this rigorous approach also makes the database more willing to treat otherwise unclassified languages as isolates, that is, languages with no demonstrable genetic relationship to any other known language. If a language doesn’t fit neatly into a well-established family, Glottolog is less inclined to shoehorn it into some speculative proto-language tree.

As of Edition 4.8, Glottolog meticulously accounts for 421 distinct spoken language families and individual isolates. The following list, a testament to the sheer diversity of human communication, details the major genealogical families and isolates, along with their geographical distribution and the number of languages encompassed within each:

List of Glottolog Genealogical Families

Name | Region | Languages
Atlantic-Congo | Africa | 1,410
Austronesian | Africa, Eurasia, Oceania, South America | 1,272
Indo-European | Africa, Australia, Eurasia, North America, Oceania, South America | 585
Sino-Tibetan | Eurasia | 506
Afro-Asiatic | Africa, Eurasia | 382
Nuclear Trans New Guinea | Oceania | 317
Pama-Nyungan | Australia, Oceania | 250
Otomanguean | North America | 181
Austroasiatic | Eurasia | 158
Tai-Kadai | Eurasia | 96
Dravidian | Eurasia | 82
Arawakan | North America, South America | 77
Mande | Africa | 75
Tupian | South America | 70
Uto-Aztecan | North America | 68
Central Sudanic | Africa | 63
Nilotic | Africa | 56
Nuclear Torricelli | Oceania | 55
Uralic | Eurasia | 49
Algic | North America | 47
Athabaskan-Eyak-Tlingit | North America | 46
Pano-Tacanan | South America | 45
Quechuan | South America | 43
Turkic | Eurasia | 43
Cariban | South America | 42
Hmong-Mien | Eurasia | 42
Kru | Africa | 38
Nakh-Daghestanian | Eurasia | 36
Sepik | Oceania | 36
Mayan | North America | 34
Lower Sepik-Ramu | Oceania | 30
Nuclear-Macro-Je | South America | 29
Chibchan | North America, South America | 27
Tucanoan | South America | 26
Salishan | North America | 25
Timor-Alor-Pantar | Oceania | 23
Dogon | Africa | 20
Lakes Plain | Oceania | 20
Mixe-Zoque | North America | 19
Ta-Ne-Omotic | Africa | 19
Yam | Oceania | 19
Siouan | North America | 18
Anim | Oceania | 17
Japonic | Eurasia, Oceania | 17
Mongolic-Khitan | Eurasia | 17
Border | Oceania | 15
North Halmahera | Oceania | 15
Tungusic | Eurasia | 15
Khoe-Kwadi | Africa | 14
Angan | Oceania | 13
Eskimo-Aleut | Eurasia, North America | 13
Miwok-Costanoan | North America | 13
Ndu | Oceania | 13
Nubian | Africa | 13
Tor-Orya | Oceania | 13
Totonacan | North America | 13
Chapacuran | South America | 12
Gunwinyguan | Australia | 12
Cochimi-Yuman | North America | 11
Iroquoian | North America | 11
Sko | Oceania | 11
Surmic | Africa | 11
Western Daly | Australia | 11
Geelvink Bay | Oceania | 10
Great Andamanese | Eurasia, Oceania | 10
Heibanic | Africa | 10
Ijoid | Africa | 10
Maban | Africa | 10
Nyulnyulan | Australia | 10
Saharan | Africa | 10
Songhay | Africa | 10
South Bougainville | Oceania | 10
Worrorran | Australia | 10
Chocoan | South America | 9
Dagan | Oceania | 9
Tuu | Africa | 9
Greater Kwerba | Oceania | 8
Kiowa-Tanoan | North America | 8
Koiarian | Oceania | 8
Mailuan | Oceania | 8
Narrow Talodi | Africa | 8
Bosavi | Oceania | 7
Chukotko-Kamchatkan | Eurasia | 7
Dajuic | Africa | 7
Huitotoan | South America | 7
Matacoan | South America | 7
Muskogean | North America | 7
Pomoan | North America | 7
Arawan | South America | 6
Baining | Oceania | 6
Barbacoan | South America | 6
Chumashan | North America | 6
East Strickland | Oceania | 6
Kadugli-Krongo | Africa | 6
Kiwaian | Oceania | 6
Left May | Oceania | 6
Lengua-Mascoy | South America | 6
Nambiquaran | South America | 6
South Bird’s Head Family | Oceania | 6
Wakashan | North America | 6
Yanomamic | South America | 6
Zaparoan | South America | 6
Abkhaz-Adyge | Eurasia | 5
Arafundi | Oceania | 5
Caddoan | North America | 5
Eleman | Oceania | 5
Guahiboan | South America | 5
Guaicuruan | South America | 5
Kartvelian | Eurasia | 5
Keram | Oceania | 5
Koman | Africa | 5
Kxa | Africa | 5
Mirndi | Australia | 5
Misumalpan | North America | 5
Nimboranic | Oceania | 5
Pauwasi | Oceania | 5
Sahaptian | North America | 5
South Omotic | Africa | 5
West Bird’s Head | Oceania | 5
Xincan | North America | 5
Yareban | Oceania | 5
Yeniseian | Eurasia | 5
Yuat | Oceania | 5
Aymaran | South America | 4
Blue Nile Mao | Africa | 4
Chicham | South America | 4
Chinookan | North America | 4
Chonan | South America | 4
Eastern Jebel | Africa | 4
Eastern Trans-Fly | Oceania | 4
Huavean | North America | 4
Iwaidjan Proper | Australia | 4
Kamakanan | South America | 4
Kunimaipan | Oceania | 4
Maiduan | North America | 4
Mangarrayi-Maran | Australia | 4
Maningrida | Australia | 4
Naduhup | South America | 4
North Bougainville | Oceania | 4
Sentanic | Oceania | 4
Shastan | North America | 4
Suki-Gogodala | Oceania | 4
Tamaic | Africa | 4
Tangkic | Australia | 4
Turama-Kikori | Oceania | 4
Walioic | Oceania | 4
Yokutsan | North America | 4
Yukaghir | Eurasia | 4
Ainu | Eurasia | 3
Bororoan | South America | 3
Bulaka River | Oceania | 3
Charruan | South America | 3
Dizoid | Africa | 3
East Bird’s Head | Oceania | 3
Giimbiyu | Australia | 3
Gumuz | Africa | 3
Jarrakan | Australia | 3
Kalapuyan | North America | 3
Kamula-Elevala | Oceania | 3
Katla-Tima | Africa | 3
Kawesqar | South America | 3
Kayagaric | Oceania | 3
Kolopom | Oceania | 3
Kresh-Aja | Africa | 3
Kuliak | Africa | 3
Kwalean | Oceania | 3
Lepki-Murkim-Kembra | Oceania | 3
Mairasic | Oceania | 3
Peba-Yagua | South America | 3
Saliban | South America | 3
Tequistlatecan | North America | 3
Tsimshian | North America | 3
West Bomberai | Oceania | 3
Western Tasmanian | Australia | 3
Yangmanic | Australia | 3
Zamucoan | South America | 3
Amto-Musan | Oceania | 2
Araucanian | South America | 2
Baibai-Fas | Oceania | 2
Bayono-Awbono | Oceania | 2
Bogia | Oceania | 2
Boran | South America | 2
Bunaban | Australia | 2
Cahuapanan | South America | 2
Chimakuan | North America | 2
Chiquitano | South America | 2
Coosan | North America | 2
Doso-Turumsa | Oceania | 2
East Kutubu | Oceania | 2
Eastern Daly | Australia | 2
Furan | Africa | 2
Garrwan | Australia | 2
Haida | North America | 2
Harakmbut | South America | 2
Hatam-Mansim | Oceania | 2
Hibito-Cholon | South America | 2
Huarpean | South America | 2
Hurro-Urartian | Eurasia | 2
Inanwatan | Oceania | 2
Jarawa-Onge | Eurasia | 2
Jicaquean | North America | 2
Kakua-Nukak | South America | 2
Katukinan | South America | 2
Kaure-Kosare | Oceania | 2
Keresan | North America | 2
Konda-Yahadian | Oceania | 2
Koreanic | Eurasia | 2
Kwomtari-Nai | Oceania | 2
Lencan | North America | 2
Limilngan-Wulna | Australia | 2
Manubaran | Oceania | 2
Marrku-Wurrugu | Australia | 2
Mombum-Koneraw | Oceania | 2
Namla-Tofanma | Oceania | 2
Nivkh | Eurasia | 2
North-Eastern Tasmanian | Australia | 2
Northern Daly | Australia | 2
Nyimang | Africa | 2
Otomaco-Taparita | South America | 2
Pahoturi | Oceania | 2
Palaihnihan | North America | 2
Piawi | Oceania | 2
Puri-Coroado | South America | 2
Rashad | Africa | 2
Senagi | Oceania | 2
Somahai | Oceania | 2
South-Eastern Tasmanian | Australia | 2
Southern Daly | Australia | 2
Tarascan | North America | 2
Taulil-Butam | Oceania | 2
Teberan | Oceania | 2
Temeinic | Africa | 2
Ticuna-Yuri | South America | 2
Uru-Chipaya | South America | 2
Wintuan | North America | 2
Yawa-Saweru | Oceania | 2
Yuki-Wappo | North America | 2
Abinomn | Oceania | 1
Abun | Oceania | 1
Adai | North America | 1
Aewa | South America | 1
Aikanã | South America | 1
Alsea-Yaquina | North America | 1
Andaqui | South America | 1
Andoque | South America | 1
Anem | Oceania | 1
Arutani | South America | 1
Asabano | Oceania | 1
Atacame | South America | 1
Atakapa | North America | 1
Bangime | Africa | 1
Basque | Eurasia | 1
Beothuk | North America | 1
Berta | Africa | 1
Betoi-Jirara | South America | 1
Bilua | Oceania | 1
Bogaya | Oceania | 1
Burmeso | Oceania | 1
Burushaski | Eurasia | 1
Camsá | South America | 1
Candoshi-Shapra | South America | 1
Canichana | South America | 1
Cayubaba | South America | 1
Cayuse | North America | 1
Chimariko | North America | 1
Chitimacha | North America | 1
Chono | South America | 1
Coahuilteco | North America | 1
Cofán | South America | 1
Comecrudan | North America | 1
Cotoname | North America | 1
Cuitlatec | North America | 1
Culli | South America | 1
Damal | Oceania | 1
Dem | Oceania | 1
Dibiyaso | Oceania | 1
Duna | Oceania | 1
Elamite | Eurasia | 1
Elseng | Oceania | 1
Esselen | North America | 1
Etruscan | Eurasia | 1
Fasu | Oceania | 1
Fulniô | South America | 1
Fuyug | Oceania | 1
Gaagudju | Australia | 1
Guachi | South America | 1
Guaicurian | North America | 1
Guamo | South America | 1
Guató | South America | 1
Gule | Africa | 1
Guriaso | Oceania | 1
Hadza | Africa | 1
Hattic | Eurasia | 1
Hoti | South America | 1
Hruso | Eurasia | 1
Iberian | Eurasia | 1
Irántxe-Münkü | South America | 1
Itonama | South America | 1
Jalaa | Africa | 1
Jirajaran | South America | 1
Kaki Ae | Oceania | 1
Kanoê | South America | 1
Kapori | Oceania | 1
Karami | Oceania | 1
Karankawa | North America | 1
Kariri | South America | 1
Karok | North America | 1
Kehu | Oceania | 1
Kenaboi | Eurasia | 1
Kibiri | Oceania | 1
Kimki | Oceania | 1
Klamath-Modoc | North America | 1
Kol (Papua New Guinea) | Oceania | 1
Kujarge | Africa | 1
Kunama | Africa | 1
Kungarakany | Australia | 1
Kunza | South America | 1
Kuot | Oceania | 1
Kusunda | Eurasia | 1
Kutenai | North America | 1
Kwaza | South America | 1
Laal | Africa | 1
Lafofa | Africa | 1
Laragia | Australia | 1
Lavukaleve | Oceania | 1
Leco | South America | 1
Lule | South America | 1
Máku | South America | 1
Maratino | North America | 1
Marori | Oceania | 1
Massep | Oceania | 1
Matanawí | South America | 1
Mato Grosso Arára | South America | 1
Mawes | Oceania | 1
Maybrat-Karon | Oceania | 1
Meroitic | Africa | 1
Mimi-Gaudefroy | Africa | 1
Minkin | Australia | 1
Mochica | South America | 1
Molale | North America | 1
Molof | Oceania | 1
Mor (Bomberai Peninsula) | Oceania | 1
Mosetén-Chimané | South America | 1
Movima | South America | 1
Mpur | Oceania | 1
Muniche | South America | 1
Mure | South America | 1
Nara | Africa | 1
Natchez | North America | 1
Nihali | Eurasia | 1
Odiai | Oceania | 1
Omurano | South America | 1
Ongota | Africa | 1
Oti | South America | 1
Oyster Bay-Big River-Little Swanport | Australia | 1
Páez | South America | 1
Pankararú | South America | 1
Papi | Oceania | 1
Pawaia | Oceania | 1
Payagua | South America | 1
Pele-Ata | Oceania | 1
Pirahã | South America | 1
Puelche | South America | 1
Puinave | South America | 1
Pumé | South America | 1
Puquina | South America | 1
Purari | Oceania | 1
Pyu | Oceania | 1
Ramanos | South America | 1
Salinan | North America | 1
Sandawe | Africa | 1
Sapé | South America | 1
Sause | Oceania | 1
Savosavo | Oceania | 1
Sechuran | South America | 1
Seri | North America | 1
Shabo | Africa | 1
Shom Peng | Eurasia | 1
Siamou | Africa | 1
Siuslaw | North America | 1
Sulka | Oceania | 1
Sumerian | Eurasia | 1
Tabo | Oceania | 1
Taiap | Oceania | 1
Takelma | North America | 1
Tallán | South America | 1
Tambora | Oceania | 1
Tanahmerah | Oceania | 1
Taruma | South America | 1
Tauade | Oceania | 1
Taushiro | South America | 1
Timote-Cuica | South America | 1
Timucua | North America | 1
Tinigua | South America | 1
Tiwi | Australia | 1
Tonkawa | North America | 1
Touo | Oceania | 1
Trumai | South America | 1
Tunica | North America | 1
Tuxá | South America | 1
Umbugarla | Australia | 1
Urarina | South America | 1
Usku | Oceania | 1
Vilela | South America | 1
Wadjiginy | Australia | 1
Wageman | Australia | 1
Waorani | South America | 1
Warao | South America | 1
Washo | North America | 1
Wiru | Oceania | 1
Xukurú | South America | 1
Yale | Oceania | 1
Yámana | South America | 1
Yana | North America | 1
Yele | Oceania | 1
Yerakai | Oceania | 1
Yetfa | Oceania | 1
Yuchi | North America | 1
Yuracaré | South America | 1
Yurumanguí | South America | 1
Zuni | North America | 1
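
If you’d rather not count single-language entries by hand, the distinction between families and isolates in the list above can be sketched in Python. This is only an illustration under one stated assumption: an entry with exactly one language is treated as an isolate, and anything larger as a family. The helper name `split_families_and_isolates` is hypothetical, and the sample rows are a small hand-picked subset of the list, not the full data.

```python
from typing import List, Tuple

Row = Tuple[str, str, int]  # (name, region, number of languages)

# A hand-picked subset of the list above, for illustration only.
SAMPLE_ROWS: List[Row] = [
    ("Atlantic-Congo", "Africa", 1410),
    ("Austronesian", "Africa, Eurasia, Oceania, South America", 1272),
    ("Koreanic", "Eurasia", 2),
    ("Basque", "Eurasia", 1),
    ("Zuni", "North America", 1),
]


def split_families_and_isolates(rows: List[Row]) -> Tuple[List[str], List[str]]:
    """Assumption: one language means isolate, more than one means family."""
    families = [name for name, _region, count in rows if count > 1]
    isolates = [name for name, _region, count in rows if count == 1]
    return families, isolates


families, isolates = split_families_and_isolates(SAMPLE_ROWS)
```

Feeding in the full list instead of the sample would give the complete split, provided the one-language assumption holds for every entry.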

It’s worth noting that creoles are classified within Glottolog according to the specific language that provided their fundamental lexicon. This pragmatic approach ensures that even these fascinating linguistic hybrids find a logical, if sometimes complex, place within the grand scheme.

In addition to these strictly genealogical families and the numerous isolates that stubbornly refuse to be categorized, Glottolog also employs several non-genealogical classifications for a variety of languages that simply don’t fit the standard mould. It’s a pragmatic necessity, I suppose, to account for all the oddities that humans invent. These categories include:

  • Pidgins: A collection of 84 languages. These are simplified means of communication that emerge when speakers of different languages need to interact but lack a common tongue. They are, by definition, not true native languages, but rather functional stopgaps.
  • Mixed languages: A modest nine languages fall into this category. These are distinct from pidgins and creoles, often arising from intense bilingualism and borrowing from two parent languages in a more fundamental way, sometimes leading to complex grammatical structures.
  • Artificial languages: A total of 31 languages. These are languages deliberately constructed by humans, rather than evolving naturally. Think Esperanto or Klingon. A testament to humanity’s need to control everything, even its own communication.
  • Speech registers: Fifteen entries here. These are variations in language use that depend on the social context or purpose. Not separate languages, but distinct styles or forms within a language.
  • Sign languages: A substantial 223 languages. These visual-gestural languages are complete and complex linguistic systems in their own right, and their inclusion is a crucial acknowledgment of linguistic diversity beyond spoken forms.
  • Unclassifiable attested languages: 121 languages. These are languages that are known to exist and have been documented, but for which no convincing genealogical relationship to any other known family has been established. They linger in a state of tantalizing isolation.
  • Unattested languages: 68 entries. These are languages whose existence is inferred or recorded in historical sources, but for which no direct linguistic data (like texts or native speakers) has been preserved. They are ghosts of languages past.
  • Bookkeeping: spurious languages: A rather large category, encompassing 390 entries, including 6 sign languages. These are primarily placeholders for languages that were once listed in other databases (like retired ISO entries) but which Glottolog has, after its rigorous verification process, determined to be either non-existent, duplicates, or misidentifications. They are kept for internal bookkeeping, a necessary evil to maintain a clean and accurate record.
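
For anyone who wants the arithmetic done for them, the category sizes listed above can be tallied with a short Python sketch. The counts are copied from the list; the variable names are illustrative, and the total assumes the categories do not overlap, which the text does not actually confirm (the 6 sign languages under bookkeeping, for instance, may or may not also be counted among the 223).

```python
# Counts copied from the list above; the names are illustrative labels only.
NON_GENEALOGICAL_COUNTS = {
    "Pidgins": 84,
    "Mixed languages": 9,
    "Artificial languages": 31,
    "Speech registers": 15,
    "Sign languages": 223,
    "Unclassifiable attested languages": 121,
    "Unattested languages": 68,
    "Bookkeeping: spurious languages": 390,
}

# Assumption: categories are disjoint, so a plain sum is meaningful.
total_entries = sum(NON_GENEALOGICAL_COUNTS.values())

# The category with the most entries.
largest_category = max(NON_GENEALOGICAL_COUNTS, key=NON_GENEALOGICAL_COUNTS.get)

print(f"{total_entries} entries; largest category: {largest_category}")
```

Under that assumption the eight categories add up to 941 entries, with the bookkeeping pile comfortably the largest.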

Notes

  1. For a rather illuminating demonstration of Glottolog’s discerning nature, one need only consult the dedicated bookkeeping section. There, you’ll find a clear accounting of various ISO languages that Glottolog, in its infinite wisdom, has deemed to represent “spurious” distinctions. However, it is important to note that this level of meticulous discrimination, while admirable for distinct languages, does not, regrettably, extend to mere dialects. Many of these sub-varieties have been inherited from sources like MultiTree or other less-vetted origins without undergoing the same rigorous verification process. A small oversight, perhaps, in the grand scheme of things, but an oversight nonetheless.
  2. The enumeration of spoken language families and isolates does not include sign languages, pidgins, or other non-genealogical categories in this specific count. While sign languages are listed collectively, often grouped typologically as village sign languages, and pidgins along with other unclassified languages are also presented, it’s critical to understand that this organizational structure does not imply any inherent genealogical relationship among them. It’s merely a pragmatic grouping for ease of reference, not a statement on shared ancestry.
  3. The original source for this information sometimes uses the term “Papunesia,” a rather clumsy portmanteau combining “Papua (New Guinea)” and “Austronesia.” This term specifically refers to the multitude of islands found within Insular Southeast Asia and the broader region of Oceania, with the notable exclusion of Australia. For clarity and to avoid such linguistic contortions, it has been straightforwardly replaced with “Oceania” in the main table. You’re welcome.