numeralbank/sand

CLDF dataset derived from Mamta's "South Asian Numerals Database" from 2024

How to cite

If you use these data, please cite

  • the original source

    Mamta, K. (2024): South Asian Numerals Database (SAND). Leipzig: Max Planck Institute for Evolutionary Anthropology.

  • the derived dataset, using the DOI of the specific released version you used

Description

This dataset is licensed under CC-BY-4.0.

Available online at https://github.com/numeralbank/sand

Statistics

Glottolog: 100%, Concepticon: 96%, Source: 100%, BIPA: 100%, CLTS SoundClass: 100%

  • Varieties: 131 (linked to 129 different Glottocodes)
  • Concepts: 130 (linked to 119 different Concepticon concept sets)
  • Lexemes: 15,364
  • Sources: 9
  • Synonymy: 1.02
  • Invalid lexemes: 0
  • Tokens: 140,583
  • Segments: 127 (0 BIPA errors, 0 CLTS sound class errors, 127 CLTS modified)
  • Inventory size (avg): 27.77
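
As a rough illustration of how a figure like the synonymy value above can be read, the sketch below computes an average number of lexemes per concept within each variety, then averages over varieties. This definition is an assumption made for illustration; the tooling that generated these statistics may compute it differently. The function name `synonymy_index` and the toy identifiers are hypothetical and are not taken from the dataset.

```python
from collections import Counter, defaultdict


def synonymy_index(forms):
    """Average lexemes-per-concept ratio across varieties.

    `forms` is an iterable of (variety_id, concept_id) pairs, one pair
    per lexeme. Assumed definition, not necessarily the one used to
    produce the statistics in this README.
    """
    # Count how many lexemes each variety has for each concept.
    per_variety = defaultdict(Counter)
    for variety_id, concept_id in forms:
        per_variety[variety_id][concept_id] += 1

    if not per_variety:
        return 0.0

    # For each variety: total lexemes divided by distinct concepts.
    ratios = [sum(counts.values()) / len(counts) for counts in per_variety.values()]
    return sum(ratios) / len(ratios)


# Toy data (made up): variety_a has two lexemes for ONE, variety_b has none doubled.
toy_forms = [
    ("variety_a", "ONE"),
    ("variety_a", "ONE"),
    ("variety_a", "TWO"),
    ("variety_b", "ONE"),
    ("variety_b", "TWO"),
]
print(synonymy_index(toy_forms))  # 1.25
```

A value close to 1, such as the 1.02 reported here, would under this reading mean that most concepts have exactly one lexeme per variety.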

Possible Improvements:

Contributors

Name | GitHub user | Description | Role
--- | --- | --- | ---
Mamta Kumari | @Mamta-Kum | Data Collection | Author
Johann-Mattis List | @LinguList | Prepared initial version of the CLDF data | Other
Christoph Rzymski | @chrzyki | Maintainer | Other

CLDF Datasets

The following CLDF datasets are available in cldf: