ejmichaud/narrow

Explorations in making general models narrow.

This repository accompanies the paper "On the creation of narrow AI: hierarchy and nonlocality of neural network skills" by Eric Michaud, Asher Parker-Sartori, and Max Tegmark. We provide instructions for replicating each of our figures below:

Figure 2 (compositional-parity.pdf): This figure is created in notebooks/parity-compositions5 (figure).ipynb. This notebook loads data created by the runs in experiments/compositions4. Each run executed the experiments/compositions4/train.py script, launched by one of the four slurm job array .sh scripts in experiments/compositions4/.
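
To regenerate this data yourself, a minimal sketch (assuming a slurm cluster and that resource requests are set inside the job array scripts) is:

```bash
# Sketch: regenerate the data behind Figures 2 and 7, then plot.
# Submit each of the four slurm job array scripts in experiments/compositions4/.
for job in experiments/compositions4/*.sh; do
    sbatch "$job"
done

# Once the array jobs finish, open the plotting notebook.
jupyter notebook "notebooks/parity-compositions5 (figure).ipynb"
```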

Figure 7 (compositional-parity-depth-comparisons.pdf): This figure is also created in notebooks/parity-compositions5 (figure).ipynb, using data from experiments/compositions4.

Figure 3 (cmspnetworkstructureandpruningresultsmain-labeled.png): This figure is created in notebooks/parity-nonlocality-pruning.ipynb. This notebook loads data created by the runs in experiments/compositions3. The script for these runs is experiments/compositions3/trainprunesave.py, and the grid search over widths and seeds is defined in experiments/compositions3/run.sh.
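
A hedged sketch for reproducing the grid search (whether run.sh launches jobs locally or submits them via sbatch depends on its contents, so adjust to your setup):

```bash
# Sketch: reproduce the width/seed grid search behind Figures 3 and 10.
# run.sh defines the grid over widths and seeds for trainprunesave.py.
bash experiments/compositions3/run.sh

# Then open the analysis/plotting notebook.
jupyter notebook notebooks/parity-nonlocality-pruning.ipynb
```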

Figure 10 (pruningresultspretrainedacrosswidthandseed.pdf): This figure is also created in notebooks/parity-nonlocality-pruning.ipynb, using data from experiments/compositions3.

Figure 8 (ablationscoresk3m3composite2bothlayers.pdf) and Figure 9 (ablationscoresk3m3composite2bothlayersimgshow.pdf) are both created in notebooks/parity-nonlocality-compositional.ipynb.

Figure 4 (mnist.pdf): This figure is created in mnist/plots.ipynb using data generated by mnist/distillation.py (distillation), mnist/prune_then_recover.py (attribution-based pruning followed by recovery training), mnist/pruning.py (group-lasso-regularized pruning), and mnist/train_from_scratch.py (training networks from scratch). The original teacher model was trained with mnist/train_teacher.py, and mnist/weight_vis.ipynb creates animations of the pruning process from weight snapshots recorded during training.
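
A plausible order of operations for the MNIST pipeline, inferred from the script descriptions above (any command-line arguments the scripts require are not shown):

```bash
# Sketch of the MNIST pipeline order, inferred from the script descriptions;
# each script may require arguments that are not shown here.
python mnist/train_teacher.py        # train the original teacher model
python mnist/distillation.py         # distillation
python mnist/prune_then_recover.py   # attribution-based pruning + recovery training
python mnist/pruning.py              # group-lasso-regularized pruning
python mnist/train_from_scratch.py   # train networks from scratch

# Then produce Figure 4 and the pruning animations.
jupyter notebook mnist/plots.ipynb
jupyter notebook mnist/weight_vis.ipynb
```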

Figure 5 (pruning-and-recovery-curves.pdf) is created in notebooks/group-lasso-recovery.ipynb from the experiment experiments/pruneandtrain00, which in turn uses data from experiments/tuneprune15.

Figure 6 (llmfrontierslosssmall.pdf) is created in notebooks/llm-frontiers2.ipynb. This notebook loads data from experiments/trainscratch01, experiments/distillscratch00, and experiments/pruneandtrain01.

Figure 12 (training-run-tripanel.png) is also created in notebooks/llm-frontiers2.ipynb, using data from the same three experiments.

Figure 11 (attribution-vs-ablation-residual.pdf) is created in notebooks/attribution-vs-ablation.ipynb.

Figure 13 (random_vs_attribution_recovery_curves.pdf) is created in notebooks/random_vs_attribution.ipynb using data from experiments/pruneandtrain01 and experiments/pruneandtrainrandom00.

If anything is missing or unclear, contact Eric at ericjm [at] mit.edu or eric.michaud99 [at] gmail.com.
