The AWS Kubernetes Ansible Provisioner launches an AWS instance and then installs Kubernetes and llm-d on it, all with a single command.
Create a file at ~/.aws/credentials with the following content:
[default]
aws_access_key_id = <aws_access_key_id>
aws_secret_access_key = <aws_secret_access_key>

Save router-team-us-east2.pem in your ~/.ssh/ directory and run:
chmod 400 ~/.ssh/router-team-us-east2.pem

- Get your token from: https://huggingface.co/docs/hub/en/security-tokens
- Save it to ~/.cache/huggingface/token:
mkdir -p ~/.cache/huggingface
echo "<your_token>" > ~/.cache/huggingface/token

From the project directory, run:
./deploy-k8s-cluster.sh deploy

Look for a log like the following:
Instance launched successfully!
Instance ID: i-xxxxxxxxxxxxxxxxx
Public IP: xxx.xxx.xxx.xxx
Alternatively, read the same details from the instance-*-details.txt file that the script creates.
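If you go the details-file route, a small shell helper can pull out the public IP. This is only a sketch: it assumes the file contains a `Public IP: <address>` line in the same form as the log above (the actual file layout is not documented here), and `get_public_ip` is a hypothetical helper name, not part of the project.

```shell
# Hypothetical helper: print the first "Public IP:" value found in a details file.
# Assumption: the file uses the same "Public IP: <address>" line as the deploy log.
get_public_ip() {
  # Drop the label and surrounding whitespace, keep the address,
  # and stop at the first matching line.
  sed -n 's/^[[:space:]]*Public IP:[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$1" | head -n 1
}
```

For example, `get_public_ip instance-*-details.txt` prints the address from the first matching file. If several details files exist, pass the one you want explicitly.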
Connect over SSH using the public IP:
ssh -i ~/.ssh/router-team-us-east2.pem ubuntu@<instance-ip>

When you are done, remember to delete the instance, because it incurs charges for as long as it runs:
./deploy-k8s-cluster.sh cleanup

- Region: us-east-2
- Instance Type: g6.4xlarge (1 L4 GPU)
- AMI: Ubuntu 22.04 with NVIDIA drivers
- Storage: 500GB GP3 EBS volume
- SSH Key: router-team-us-east2.pem
- Security Group: Pre-existing security group with ports 22, 6443, 10250-10259, and 2379-2380 open.
- Runtime: CRI-O 1.33
- Version: Kubernetes 1.33
- CNI: Flannel
- Storage: Local Path Provisioner
- Model: Qwen/Qwen3-0.6B
- Storage: Local Path Provisioner
- HuggingFace Token: Add to ~/.cache/huggingface/token
ssh -i ~/.ssh/router-team-us-east2.pem ubuntu@<instance-ip>
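
Before deploying, it can help to confirm that the three local files described earlier (AWS credentials, SSH key, and Hugging Face token) are in place. The following is a minimal sketch, not part of the project: `check_prereqs` is a hypothetical helper, and its optional argument (an alternate home directory) exists only to make the check easy to try out.

```shell
# Hypothetical pre-flight check for the local files this README asks for.
# The function body runs in a subshell (note the parentheses), so its
# variables do not leak into the calling shell.
check_prereqs() (
  home="${1:-$HOME}"   # optional override of the home directory
  status=0
  for f in "$home/.aws/credentials" \
           "$home/.ssh/router-team-us-east2.pem" \
           "$home/.cache/huggingface/token"; do
    # -s is true only if the file exists and is not empty.
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f"
      status=1
    fi
  done
  exit "$status"       # exits the subshell only; becomes the function's status
)
```

A possible use is `check_prereqs && ./deploy-k8s-cluster.sh deploy`, which runs the deployment only when all three files are present. The check does not validate the contents of the files or the key's permissions.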