Long ago we did something similar at university (second year, maybe) ... IIRC it was called the minimal word (or term) problem.
In short, you transform the term into a representation that admits a weighting function with a well-defined maximum. The rearrangement (via associativity, commutativity, ...) that yields the maximum is your normal form.
Two terms leading to the same normal form are equivalent.
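To make that a bit more concrete, here is a minimal sketch in Python. It is not the textbook algorithm, just an illustration under a few assumptions of mine: terms are nested tuples `(op, arg, ...)`, `+` and `*` are the only associative and commutative operators, and the "weight" is simply a fixed sort order (picking the smallest arrangement instead of the largest, which works the same way). The names `AC_OPS`, `sort_key`, `normal_form` and `equivalent` are made up for the example. Distributivity is not handled.

```python
# Terms are either strings (variables/constants) or tuples (op, arg1, arg2, ...).
AC_OPS = {"+", "*"}  # assumption: these operators are associative and commutative

def sort_key(term):
    # Leaves sort before compound terms; repr() gives an arbitrary but fixed order.
    return (isinstance(term, tuple), repr(term))

def normal_form(term):
    if not isinstance(term, tuple):
        return term
    op, *args = term
    args = [normal_form(a) for a in args]
    if op in AC_OPS:
        flat = []
        for a in args:
            # Associativity: a + (b + c) is flattened to +(a, b, c).
            if isinstance(a, tuple) and a[0] == op:
                flat.extend(a[1:])
            else:
                flat.append(a)
        # Commutativity: pick one fixed ordering of the operands.
        args = sorted(flat, key=sort_key)
    return (op, *args)

def equivalent(s, t):
    # Two terms are treated as equal if they share a normal form.
    return normal_form(s) == normal_form(t)
```

With this, `equivalent(("+", "a", ("+", "b", "c")), ("+", ("+", "c", "a"), "b"))` compares a + (b + c) against (c + a) + b by normal form rather than by structure.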
Sounds abstract, but it shouldn't be too difficult to achieve. E.g. normal forms of polynomials are easy: if you have several variables, as in 5x³y² + 2xy² *, give the variables a (e.g. lexicographic) precedence (i.e. x > y) and order the monomials by exponent.
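A rough sketch of the polynomial case in Python, under assumptions of my own: a polynomial is a dict mapping exponent tuples (x exponent, y exponent) to integer coefficients, and the helper names `poly`, `add`, `mul` and `normal_form` are hypothetical. Expanding everything into monomials and sorting them with x taking precedence over y gives the normal form.

```python
from collections import defaultdict

VARS = ("x", "y")  # precedence x > y, as in the example

def poly(coeff, **exps):
    """A single monomial, e.g. poly(5, x=2) for 5x^2."""
    return {tuple(exps.get(v, 0) for v in VARS): coeff}

def add(p, q):
    out = defaultdict(int)
    for src in (p, q):
        for exps, c in src.items():
            out[exps] += c
    # Drop monomials whose coefficients cancelled out.
    return {e: c for e, c in out.items() if c != 0}

def mul(p, q):
    out = defaultdict(int)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            # Multiplying monomials adds their exponents.
            out[tuple(a + b for a, b in zip(e1, e2))] += c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def normal_form(p):
    # Highest exponent tuple first: the x exponent decides, then the y exponent.
    return sorted(p.items(), reverse=True)

# xy * (5x^2 + 2) * y, built up factor by factor
factored = mul(mul(poly(1, x=1, y=1), add(poly(5, x=2), poly(2))), poly(1, y=1))
# 5x^3y^2 + 2xy^2, written out directly
expanded = add(poly(5, x=3, y=2), poly(2, x=1, y=2))
```

Comparing `normal_form(factored)` with `normal_form(expanded)` then decides whether the two spellings denote the same polynomial.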
If your terms are not transformable to polynomials it gets a bit more difficult...
( addicted to the Perl Programming Language)
*) which is a normal form of xy * (5x² + 2) * y