Files

Example implementations of the key algorithms from the paper Projections for Approximate Policy Iteration Algorithms [1].

The file kl_projection.py implements Alg. 2 of the paper. It takes a linear-Gaussian policy as input and projects it onto a policy whose KL divergence with respect to a target policy is below a threshold.
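To make the idea of a KL projection concrete, here is a minimal sketch in plain Python. It is not Alg. 2 of the paper and does not mirror kl_projection.py: it assumes state-independent univariate Gaussians instead of linear-Gaussian policies, assumes the constraint is on KL(projected || target), and uses a simple bisection over an interpolation coefficient. The names kl_gauss and project_to_kl_ball are hypothetical.

```python
import math


def kl_gauss(mean_p, std_p, mean_q, std_q):
    """KL(p || q) between two univariate Gaussians."""
    return (math.log(std_q / std_p)
            + (std_p ** 2 + (mean_p - mean_q) ** 2) / (2.0 * std_q ** 2)
            - 0.5)


def project_to_kl_ball(mean_p, std_p, mean_q, std_q, epsilon, n_iter=60):
    """Return (mean, std) of a Gaussian with KL(. || q) <= epsilon.

    Illustrative assumption: the projected policy is searched along the
    linear interpolation between the target q (eta = 0) and the input
    policy p (eta = 1). The KL is non-decreasing in eta along this path,
    so bisection finds the largest feasible eta.
    """
    if kl_gauss(mean_p, std_p, mean_q, std_q) <= epsilon:
        return mean_p, std_p  # already inside the KL ball, nothing to do

    lo, hi = 0.0, 1.0  # eta = lo is always feasible, eta = hi is not
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        mean = mid * mean_p + (1.0 - mid) * mean_q
        std = mid * std_p + (1.0 - mid) * std_q
        if kl_gauss(mean, std, mean_q, std_q) <= epsilon:
            lo = mid
        else:
            hi = mid
    return lo * mean_p + (1.0 - lo) * mean_q, lo * std_p + (1.0 - lo) * std_q


if __name__ == "__main__":
    mean, std = project_to_kl_ball(3.0, 2.0, 0.0, 1.0, epsilon=0.1)
    print(mean, std, kl_gauss(mean, std, 0.0, 1.0))
```

The returned policy always satisfies the KL bound because the bisection only ever moves its lower end to a feasible interpolation point.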

The file policy_with_entropy_cst.py implements a policy with an embedded strict entropy inequality constraint, which ensures that the entropy of the policy never falls below a threshold. The code can be extended to enforce a strict entropy equality constraint by replacing self.chol = tf.cond(ent < tent, lambda: self.chol * tf.exp((tent - ent) / act_dim), lambda: self.chol) with self.chol = self.chol * tf.exp((tent - ent) / act_dim).
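The rescaling works because the entropy of a Gaussian with Cholesky factor chol is 0.5 * act_dim * (1 + log(2 * pi)) plus the sum of the logs of the diagonal of chol, so multiplying chol by exp((tent - ent) / act_dim) shifts the entropy by exactly tent - ent. The sketch below checks this in plain Python rather than TensorFlow. It assumes that ent is the current entropy and tent the target entropy, and the function names are hypothetical, not taken from policy_with_entropy_cst.py.

```python
import math


def gaussian_entropy(chol):
    """Entropy of a Gaussian whose covariance is chol @ chol.T.

    chol is a lower-triangular matrix given as a list of rows.
    """
    act_dim = len(chol)
    log_det_chol = sum(math.log(chol[i][i]) for i in range(act_dim))
    return 0.5 * act_dim * (1.0 + math.log(2.0 * math.pi)) + log_det_chol


def scale_chol(chol, ent, tent):
    """Rescale chol so that the entropy moves from ent to tent."""
    act_dim = len(chol)
    scale = math.exp((tent - ent) / act_dim)
    return [[value * scale for value in row] for row in chol]


def enforce_min_entropy(chol, tent):
    """Inequality constraint: rescale only if the entropy is below tent."""
    ent = gaussian_entropy(chol)
    return scale_chol(chol, ent, tent) if ent < tent else chol


def enforce_exact_entropy(chol, tent):
    """Equality constraint: always rescale to entropy tent."""
    return scale_chol(chol, gaussian_entropy(chol), tent)


if __name__ == "__main__":
    chol = [[0.1, 0.0], [0.05, 0.2]]
    print(gaussian_entropy(chol))
    print(gaussian_entropy(enforce_min_entropy(chol, tent=0.0)))
    print(gaussian_entropy(enforce_exact_entropy(chol, tent=-5.0)))
```

With the inequality variant, a policy whose entropy is already above the threshold is left untouched, whereas the equality variant also shrinks the covariance when the entropy is too high.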

References

[1] Akrour, R.; Pajarinen, J.; Neumann, G.; Peters, J. (2019). Projections for Approximate Policy Iteration Algorithms. Proceedings of the International Conference on Machine Learning (ICML).

