python3 -m grader state_agent -v
python3 bundle.py state_agent group33
python3 -m grader group33.zip -vTraining Command:
python3 -m imitation_agent.train -e 1 -v AI_L2x256_blue --time_steps 500000 --time_steps_infer 10 --nenv 2 --use_opponent --expert jurgen_agent --batch_size 512 --device cuda --md 90 --net_arch "256,256"Command for creating jit compatible file:
python3 -m imitation_agent.canvas_jit -e 1 -v AI_L2x256_blue_error --time_steps 1 --time_steps_infer 10 --nenv 1 --use_opponent --expert jurgen_agent --batch_size 512 --device cuda --md 90 --net_arch "256,256" --resume_training "AI_L2x256_blue/AI_L2x256_blue.pt"- Minimize "kart_to_puck_dist"
- Rewarding the player (it increases exponentially): np.exp(-x)
- TODO: Check if normalizing the values helps or not
 
- Aligning the player-puck-opponent goal post
- 1st vector: (puck - player)
- 2nd vector: (opponent goal post - player)
- Use cosine similarity b/w the two vectors
 
- Reward for scoring the goal
- Minimize the distance b/w ball and goal post
 
- Reward for not allowing the ball in a region
- Reward based on current match state
player_state
- 
camera (https://pystk.readthedocs.io/en/latest/state.html#pystk.Camera) 
- 
Not needed???? 
- 
Kart (https://pystk.readthedocs.io/en/latest/state.html#pystk.Kart) 
- 
attachment - types of attachment 
- 
front - Front direction of kart 1/2 kart length forward from location - float3 
- 
id - Kart id compatible with instance labels - int 
- 
jumping - Is the kart jumping? - bool (Not needed I think) 
- 
location - 3D world location of the kart - float3 
- 
max_steer_angle - Maximum steering angle - float 
- 
name - Player name - str 
- 
overall_distance - Overall distance traveled - float (Not needed I think) 
- 
player_id - Player id - int 
- 
powerup - Powerup collected - powerup 
- 
rotation - Quaternion rotation of the kart - Quaternion 
- 
velocity - Velocity of kart - float3 
game_state
- 
ball 
- 
id - Object id of the soccer ball - int 
- 
location - 3D world location of the item - float3 ( ) 
- 
size - Size of the ball - float 
- 
goal_line (static) - Start and end of the goal line for each team - List[List[float3[2]][2]] 
- 
kart_direction - float2 
- 
kart_angle - float 
- 
kart_to_puck_direction - float2 
- 
kart_to_puck_angle - float 
- 
kart_to_puck_angle_difference - float 
- 
kart_to_opponent0 - float2 
- 
kart_to_opponent0_angle - float 
- 
kart_to_opponent0_angle_difference - float 
- 
goal_line_center - float2 
- 
puck_to_goal_line - float2 
- 
puck_to_goal_line_angle - float 
- 
kart_to_goal_line_angle_difference - float 
Somethings that can help
- ball velocity
- ball acceleration?
Rewards (https://github.com/Rolv-Arild/Necto)
- agents velocity towards ball
- balls velocity towards goal
- +ve reward on goal
- -ve reward on opponent goal
- +ve reward if shot on target
- +ve on making a save
- +ve on impeding opponent
psuedo code for necto rewards
- game state reward = ball_pos closer to goal (continuos)
- How does it know to reverse?
- continous space
- how can i train with some base agent
- circiculum learning - ppo
- action space
- rewarding shaping