The goal is to replicate the model described in "Adaptive input representations for neural language modeling" [Baevski and Auli, 2019] and train it on wikitext-103-v1.
- implement dataset loader and tokenizer (sketch below)
- implement adaptive embedding (sketch below)
- implement decoder-only transformer based on torch.nn.MultiheadAttention (sketch below)
- use adaptive softmax from torch.nn.AdaptiveLogSoftmaxWithLoss (sketch below, together with weight tying)
- match adaptive embedding API with adaptive softmax
- tie the embedding and softmax weights
- implement training loop, loss monitoring, learning rate schedule (sketch below)
- implement model save/load, continue training where it stopped (sketch below)
- accumulate the gradient over multiple batches (covered in the training loop sketch below)
- check for underfitting by verifying the model can fit a tiny training set
- implement a function to compute perplexity on the validation set (sketch below)
- fix save/load model, fix weight tying
- improve the adaptive softmax so it can be used with float16 mixed-precision (sketch below)
- try full size (n_tokens=3072)
- Large model (python run.py --big): n_blocks=16, n_heads=16, n_tokens=1024, n_embeddings=1024 (config sketch below)
- Adaptive embedding: cutoffs=[20000, 60000]
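
Dataset loader and tokenizer: a minimal sketch, assuming the Hugging Face datasets package is used to fetch wikitext-103-v1 and that a simple word-level vocabulary with <unk>/<eos> markers is enough; none of the names below come from the repository.

```python
from collections import Counter

import torch
from datasets import load_dataset


def build_vocab(train_lines, min_freq=1):
    # wikitext-103 is already word tokenized, so whitespace splitting is enough
    counter = Counter(tok for line in train_lines for tok in line.split())
    itos = ["<unk>", "<eos>"] + [w for w, c in counter.most_common() if c >= min_freq]
    return {w: i for i, w in enumerate(itos)}, itos


def encode(lines, stoi):
    unk, eos = stoi["<unk>"], stoi["<eos>"]
    ids = []
    for line in lines:
        ids.extend(stoi.get(tok, unk) for tok in line.split())
        ids.append(eos)  # mark line boundaries
    return torch.tensor(ids, dtype=torch.long)


raw = load_dataset("wikitext", "wikitext-103-v1")
stoi, itos = build_vocab(raw["train"]["text"])
train_ids = encode(raw["train"]["text"], stoi)
valid_ids = encode(raw["validation"]["text"], stoi)
```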
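
Adaptive embedding: a minimal sketch whose constructor mirrors torch.nn.AdaptiveLogSoftmaxWithLoss (in_features, n_classes, cutoffs, div_value), which is what matching the two APIs amounts to here; class and attribute names are assumptions.

```python
import torch
import torch.nn as nn


class AdaptiveEmbedding(nn.Module):
    """Adaptive input embedding with the same constructor arguments as
    torch.nn.AdaptiveLogSoftmaxWithLoss, so the two modules stay in sync."""

    def __init__(self, in_features, n_classes, cutoffs, div_value=4.0):
        super().__init__()
        self.cutoffs = list(cutoffs) + [n_classes]
        self.embeddings = nn.ModuleList()
        self.projections = nn.ModuleList()
        for i, (lo, hi) in enumerate(zip([0] + self.cutoffs[:-1], self.cutoffs)):
            dim = int(in_features // (div_value ** i))   # smaller dim for rarer clusters
            self.embeddings.append(nn.Embedding(hi - lo, dim))
            # project every cluster back up to the model dimension
            self.projections.append(nn.Linear(dim, in_features, bias=False))

    def forward(self, tokens):
        out = tokens.new_zeros(*tokens.shape, self.projections[0].out_features,
                               dtype=torch.float)
        for i, (lo, hi) in enumerate(zip([0] + self.cutoffs[:-1], self.cutoffs)):
            mask = (tokens >= lo) & (tokens < hi)
            if mask.any():
                out[mask] = self.projections[i](self.embeddings[i](tokens[mask] - lo))
        return out
```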
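
Decoder-only transformer: a minimal sketch of one block built on torch.nn.MultiheadAttention with a causal mask. The pre-norm layout and the 4x feed-forward width are assumptions; positional encodings and the block stack are omitted.

```python
import torch
import torch.nn as nn


class DecoderBlock(nn.Module):
    def __init__(self, n_embeddings, n_heads, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(n_embeddings, n_heads,
                                          dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(n_embeddings, 4 * n_embeddings),
            nn.ReLU(),
            nn.Linear(4 * n_embeddings, n_embeddings),
        )
        self.norm1 = nn.LayerNorm(n_embeddings)
        self.norm2 = nn.LayerNorm(n_embeddings)

    def forward(self, x):
        # causal mask: position i may only attend to positions <= i
        seq_len = x.size(1)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                     device=x.device), diagonal=1)
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out
        x = x + self.ff(self.norm2(x))
        return x
```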
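
Adaptive softmax and weight tying: a minimal sketch that pairs the AdaptiveEmbedding above with torch.nn.AdaptiveLogSoftmaxWithLoss. Each tail cluster of the softmax ends in a Linear whose weight has the same shape as the matching embedding table, so that parameter can be shared directly; the head of AdaptiveLogSoftmaxWithLoss mixes word and cluster rows in one Linear, so tying the first cutoff would need a custom head and is not shown here.

```python
import torch.nn as nn


def build_embedding_and_softmax(n_classes, n_embeddings=1024, cutoffs=(20000, 60000)):
    # same (in_features, n_classes, cutoffs) arguments for both modules
    embedding = AdaptiveEmbedding(n_embeddings, n_classes, list(cutoffs))
    softmax = nn.AdaptiveLogSoftmaxWithLoss(n_embeddings, n_classes, list(cutoffs),
                                            div_value=4.0)
    # tie each tail cluster's output matrix to the matching embedding table
    for i, tail in enumerate(softmax.tail):
        tail[1].weight = embedding.embeddings[i + 1].weight
    return embedding, softmax
```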
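
Training loop, learning rate schedule and gradient accumulation: a minimal sketch. The Adam optimizer, the inverse-square-root warmup schedule, the accumulation factor and the assumption that the model returns (output, loss) are mine, not values from the repository.

```python
import torch


def lr_factor(step, warmup=4000):
    # linear warmup followed by inverse square-root decay
    step = max(step, 1)
    return min(step / warmup, (warmup / step) ** 0.5)


def train(model, batches, accum_steps=4, lr=1e-3, device="cuda"):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_factor)
    model.train()
    optimizer.zero_grad()
    for i, (inputs, targets) in enumerate(batches):
        inputs, targets = inputs.to(device), targets.to(device)
        _, loss = model(inputs, targets)        # assumed to return (output, loss)
        (loss / accum_steps).backward()         # accumulate scaled gradients
        if (i + 1) % accum_steps == 0:          # optimizer step every accum_steps batches
            optimizer.step()
            scheduler.step()
            optimizer.zero_grad()
            print(f"update {(i + 1) // accum_steps}  loss {loss.item():.3f}")
```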
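
Model save/load and resuming: a minimal sketch; the file name and the exact set of saved objects are assumptions. load_state_dict copies values in place, so weights tied by parameter assignment before loading stay tied.

```python
import os

import torch

CHECKPOINT = "checkpoint.pt"


def save_checkpoint(model, optimizer, scheduler, step):
    torch.save({
        "model": model.state_dict(),
        "optimizer": optimizer.state_dict(),
        "scheduler": scheduler.state_dict(),
        "step": step,
    }, CHECKPOINT)


def load_checkpoint(model, optimizer, scheduler):
    if not os.path.exists(CHECKPOINT):
        return 0                                # nothing saved yet, start from step 0
    state = torch.load(CHECKPOINT, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    scheduler.load_state_dict(state["scheduler"])
    return state["step"]                        # continue training from this step
```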
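
Validation perplexity: a minimal sketch, assuming the model returns the mean negative log-likelihood per target token as its loss.

```python
import math

import torch


@torch.no_grad()
def validation_perplexity(model, batches, device="cuda"):
    model.eval()
    total_nll, total_tokens = 0.0, 0
    for inputs, targets in batches:
        inputs, targets = inputs.to(device), targets.to(device)
        _, loss = model(inputs, targets)        # mean NLL over the batch
        total_nll += loss.item() * targets.numel()
        total_tokens += targets.numel()
    model.train()
    return math.exp(total_nll / total_tokens)
```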
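
float16 mixed-precision: a minimal sketch with torch.cuda.amp that runs the transformer in float16 but feeds the adaptive softmax float32 activations for numerical stability; whether that is the fix actually used here is an assumption.

```python
import torch
from torch.cuda.amp import GradScaler, autocast

scaler = GradScaler()


def training_step(transformer, adaptive_softmax, optimizer, inputs, targets):
    optimizer.zero_grad()
    with autocast():                              # transformer runs in float16
        hidden = transformer(inputs)
    # cast activations back to float32 before the adaptive softmax / loss
    output = adaptive_softmax(hidden.float().reshape(-1, hidden.size(-1)),
                              targets.reshape(-1))
    scaler.scale(output.loss).backward()          # scale to avoid gradient underflow
    scaler.step(optimizer)
    scaler.update()
    return output.loss.item()
```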
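
Configuration for python run.py --big: a sketch of how the flag could map onto the hyper-parameters listed above; the argparse wiring and the smaller default configuration are assumptions.

```python
import argparse


def get_config():
    parser = argparse.ArgumentParser()
    parser.add_argument("--big", action="store_true",
                        help="use the large model configuration")
    args = parser.parse_args()
    if args.big:
        # large model from the notes above
        return dict(n_blocks=16, n_heads=16, n_tokens=1024,
                    n_embeddings=1024, cutoffs=[20000, 60000])
    # assumed smaller default for quick experiments (not from the repository)
    return dict(n_blocks=6, n_heads=8, n_tokens=512,
                n_embeddings=512, cutoffs=[20000, 60000])
```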