Motivating example from egg documentation:
- We'd like to perform rewrite
x * 2 / 2 ==> x
- A similar rewrite-rule might prevent (1) from happening:
x * 2 / 2 ==> (x << 1) / 2
- Essentially, we must search all rewrite paths simultaneously, which can be quite slow if implemented naively.
E-Graph is a fast data structure for equivalence saturation.
I'm not a proof engineering expert, but I think this has a lot of potentials in compiler optimizations. For instance, Halide's simplification pass essentially applies a series of rewrite rules in a greedy manner, which might cause it to miss certain optimization opportunities (e.g., see motivating example).
As far as I can see, ideally the implementation should
- use union-find to speed up merging different equivalence classes (especially if they are large)
- use cached structural hash to speed up matching subgraphs based on structure
- know when to stop (avoid term explosion during rewrites)
This implementation kinda does (2) with a naive value-numbering.