Qix-/german-noun-rule-adherence

German Noun Common Ending Rules Adherence

Learning German nouns typically means learning each noun's gender along with it. Some "rules" based on noun endings do exist, but they always come with the disclaimer that you can't rely on them.

I wanted to test how often you can actually rely on them. Taking about 90k German nouns with known genders, I checked each noun against the common ending rules to see what percentage of the nouns with a given ending have the gender the rule predicts.
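As a rough illustration of that check, here is a minimal sketch. It is not the code in adherence.py: the `SAMPLE` list, the `RULES` subset, and the `adherence` function are hypothetical names made up for this example, and the case-insensitive suffix match is an assumption. The real run uses the full dataset rather than a hand-written list.

```python
from collections import Counter

# Hypothetical hand-written sample of (noun, gender) pairs.
SAMPLE = [
    ("Freundschaft", "f"),
    ("Mannschaft", "f"),
    ("Schaft", "m"),
    ("Zeitung", "f"),
    ("Sprung", "m"),
    ("Mädchen", "n"),
    ("Kuchen", "m"),
]

# A small subset of the ending rules: ending -> gender the rule predicts.
RULES = {"schaft": "f", "ung": "f", "chen": "n"}


def adherence(nouns, rules):
    """Return {ending: (percent with the predicted gender, gender counts)}."""
    report = {}
    for ending, expected in rules.items():
        # Count the genders of every noun that ends with this suffix.
        counts = Counter(
            gender for noun, gender in nouns if noun.lower().endswith(ending)
        )
        total = sum(counts.values())
        if total:  # skip endings that match no noun at all
            report[ending] = (100 * counts[expected] / total, counts)
    return report


for ending, (pct, counts) in adherence(SAMPLE, RULES).items():
    print(f"{ending}: {pct:.2f}% - {dict(counts)}")
```

Even this tiny sample shows why the rules carry a disclaimer: "Sprung" and "Kuchen" end in -ung and -chen only by coincidence, not because of the suffix the rule is about.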

Just a fun morning exercise while I learn some German.

To run:

pip3 install -r requirements.txt
python3 adherence.py

Results (as of 02-Jan-25):

Total: 102444, Skipped: 12405, Processed 90039

Feminine endings:
  schaft: 98.71% (m: 1.11%, n: 0.18%) - f=534, m=6, n=1, total=541
  heit: 98.46% (m: 0.00%, n: 1.54%) - f=510, m=0, n=8, total=518
  keit: 100.00% (m: 0.00%, n: 0.00%) - f=999, m=0, n=0, total=999
  ung: 99.16% (m: 0.82%, n: 0.02%) - f=5531, m=46, n=1, total=5578
  enz: 97.99% (m: 0.80%, n: 1.20%) - f=244, m=2, n=3, total=249
  ion: 96.42% (m: 0.94%, n: 2.64%) - f=1645, m=16, n=45, total=1706
  tät: 100.00% (m: 0.00%, n: 0.00%) - f=780, m=0, n=0, total=780
  ei: 84.30% (m: 7.04%, n: 8.66%) - f=467, m=39, n=48, total=554
  ie: 97.20% (m: 1.73%, n: 1.07%) - f=2185, m=39, n=24, total=2248
  ik: 95.53% (m: 4.11%, n: 0.36%) - f=790, m=34, n=3, total=827
  in: 82.46% (m: 9.52%, n: 8.02%) - f=4771, m=551, n=464, total=5786

Masculine endings:
  ismus: 100.00% (f: 0.00%, n: 0.00%) - f=0, m=559, n=0, total=559
  ling: 91.81% (f: 1.42%, n: 6.76%) - f=4, m=258, n=19, total=281
  and: 57.75% (f: 15.23%, n: 27.02%) - f=115, m=436, n=204, total=755
  ant: 84.72% (f: 2.78%, n: 12.50%) - f=6, m=183, n=27, total=216
  ent: 47.75% (f: 0.24%, n: 52.01%) - f=1, m=202, n=220, total=423
  ist: 94.40% (f: 4.98%, n: 0.62%) - f=24, m=455, n=3, total=482
  er: 79.30% (f: 6.47%, n: 14.23%) - f=640, m=7847, n=1408, total=9895
  ig: 90.79% (f: 4.61%, n: 4.61%) - f=7, m=138, n=7, total=152
  or: 90.13% (f: 0.17%, n: 9.70%) - f=1, m=539, n=58, total=598

Neuter endings:
  chen: 88.53% (f: 0.86%, m: 10.61%) - f=8, m=99, n=826, total=933
  lein: 98.00% (f: 0.00%, m: 2.00%) - f=0, m=2, n=98, total=100
  ment: 95.69% (f: 0.00%, m: 4.31%) - f=0, m=9, n=200, total=209
  trum: 100.00% (f: 0.00%, m: 0.00%) - f=0, m=0, n=52, total=52
  nis: 77.88% (f: 19.55%, m: 2.56%) - f=61, m=8, n=243, total=312
  tum: 91.98% (f: 0.00%, m: 8.02%) - f=0, m=13, n=149, total=162
  um: 73.47% (f: 0.29%, m: 26.23%) - f=4, m=356, n=997, total=1357
  o: 54.27% (f: 10.58%, m: 35.14%) - f=78, m=259, n=400, total=737

Total adherence:
  Feminine: 93.28%
  Masculine: 79.46%
  Neuter: 76.77%
  Total: 86.57%
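
The total adherence figures appear to be pooled over counts (matching nouns divided by all nouns checked for that gender's endings) rather than an average of the per-ending percentages. The sketch below recomputes them from the counts listed above; the `pooled` helper and the constant names are made up for this example and are not taken from adherence.py.

```python
# (nouns with the predicted gender, total nouns) per ending,
# copied from the per-ending results above.
FEMININE = [(534, 541), (510, 518), (999, 999), (5531, 5578), (244, 249),
            (1645, 1706), (780, 780), (467, 554), (2185, 2248), (790, 827),
            (4771, 5786)]
MASCULINE = [(559, 559), (258, 281), (436, 755), (183, 216), (202, 423),
             (455, 482), (7847, 9895), (138, 152), (539, 598)]
NEUTER = [(826, 933), (98, 100), (200, 209), (52, 52), (243, 312),
          (149, 162), (997, 1357), (400, 737)]


def pooled(rows):
    """Percent adherence over summed counts, not a mean of percentages."""
    hits = sum(hit for hit, _ in rows)
    total = sum(total for _, total in rows)
    return 100 * hits / total


for label, rows in [
    ("Feminine", FEMININE),
    ("Masculine", MASCULINE),
    ("Neuter", NEUTER),
    ("Total", FEMININE + MASCULINE + NEUTER),
]:
    print(f"{label}: {pooled(rows):.2f}%")
```

This prints 93.28%, 79.46%, 76.77% and 86.57%, matching the figures above. Because the counts are pooled, large groups such as "er" and "in" weigh far more heavily on the totals than small ones such as "trum".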

License

I submit this work (specifically, my code) into the public domain or CC0, whichever works best for you.

If you use the result data, I would advise adhering to the license of the dataset itself instead, which is available at gambolputty/german-nouns.
