Hi,
I am hoping this project will let me run inference split across a Windows PC with an NVIDIA 4070 Super GPU and a Mac mini (M2 Pro).
I downloaded the model with:
huggingface-cli download meta-llama/Llama-3.2-1B --local-dir ./Llama-3.2-1B
and then ran:
bash-3.2$ target/release/cake-cli --model ~/Llama-3.2-1B/ --api 0.0.0.0:8080
[2024-12-07T21:23:11Z INFO ] [Master] dtype=F16 device=Metal(MetalDevice(DeviceId(1))) mem=10.2 MiB
[2024-12-07T21:23:11Z WARN ] no topology file specified, the entire model will be loaded
[2024-12-07T21:23:11Z INFO ] loading configuration from /Users/stevef/Llama-3.2-1B/config.json
[2024-12-07T21:23:11Z INFO ] loading tensors from model.safetensors ...
[2024-12-07T21:23:11Z INFO ] loading embeddings ...
[2024-12-07T21:23:12Z INFO ] loading lm_head ...
Error: cannot find tensor lm_head.weight
There is no lm_head.weight tensor anywhere in the Hugging Face model repo's files, so what do I need to do?
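For reference, I confirmed the tensor really is absent by reading the safetensors header directly (a minimal stdlib-only sketch; the paths assume my local download directory above). As far as I can tell, Llama 3.2 checkpoints set tie_word_embeddings in config.json, meaning lm_head is expected to share weights with model.embed_tokens rather than being stored as a separate tensor:

```python
import json
import os
import struct

def safetensors_tensor_names(path):
    """Return the tensor names stored in a .safetensors file.

    The safetensors format begins with an 8-byte little-endian header
    length, followed by that many bytes of JSON metadata keyed by
    tensor name.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional non-tensor entry in the header.
    return [k for k in header if k != "__metadata__"]

# Paths assume the download directory from the command above; adjust as needed.
model_path = os.path.expanduser("~/Llama-3.2-1B/model.safetensors")
if os.path.exists(model_path):
    names = safetensors_tensor_names(model_path)
    print("lm_head.weight present:", "lm_head.weight" in names)

config_path = os.path.expanduser("~/Llama-3.2-1B/config.json")
if os.path.exists(config_path):
    with open(config_path) as f:
        print("tie_word_embeddings:", json.load(f).get("tie_word_embeddings"))
```

On my copy of the checkpoint, lm_head.weight is not listed and tie_word_embeddings is true, so it looks like the loader would need to reuse the embedding weights for the output head.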
Regards