I don't think this would be a suitable solution for every problem of this type; how useful it is will depend on the precise problem being addressed. This is not my field, but here are some thoughts on your reply.
One situation where you would want to compare multiple long strings is when you have a set of sequenced functional mutants that you need to compare to the wildtype, and to one another, to see whether you can cluster them and gain insight into function. If there were enough mutants, that might favor the vec method. If you are looking at non-standard bases (perhaps from damage, or from an unusual archaeon) this solution might not be suitable, but my understanding is that non-standard bases usually replace the common ones rather than add to the list, so you would still keep the total at four (one example that comes to mind is RNA, where uracil, U, replaces thymine, T). Unusual bases would also only appear in a fraction of problems, so it might well not be a problem here.
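To make the "still only four" point concrete, here is a minimal Python sketch. It assumes the vec method means packing each base into two bits; the code table and the function names `encode` and `mismatches` are my own, hypothetical choices, not anything from the earlier posts. U is given the same code as T, so an RNA sequence fits the same four-symbol scheme, and any other base is rejected rather than silently miscoded.

```python
# Hypothetical 2-bit codes (an assumption for illustration).
# U shares T's code because it takes T's place in RNA.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3, "U": 3}


def encode(seq):
    """Pack a sequence into one integer, two bits per base."""
    value = 0
    for base in seq.upper():
        # Raises KeyError for anything outside the table (e.g. N or a damaged base).
        value = (value << 2) | CODES[base]
    return value


def mismatches(a, b, length):
    """Count differing positions between two encoded sequences of `length` bases."""
    diff = a ^ b
    # 0b0101...01: one marker bit per 2-bit base slot.
    low_bits = (4 ** length - 1) // 3
    # A position differs if either of its two bits differs.
    collapsed = (diff | (diff >> 1)) & low_bits
    return bin(collapsed).count("1")


wildtype = encode("ACGTACGT")
mutant = encode("ACGAACGU")
print(mismatches(wildtype, mutant, 8))
```

Here the mutant differs from the wildtype at one position only, because the trailing U is treated as equivalent to T. Whether that equivalence is what you want depends on the problem.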
What I don't know is how large the problem set would have to be to offset the encoding time. If somebody knows how to find that out, it would help determine whether the solution is appropriate for a given problem. Unfortunately, I don't.
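One rough way to approach the question, offered only as a sketch: time the three operations separately (encoding one sequence, comparing two raw strings, comparing two encoded sequences) and solve for the number of sequences N at which encoding everything once and comparing all pairs in encoded form becomes cheaper. With N sequences there are N encodings and N(N-1)/2 pairwise comparisons, so the encoded route wins when N > 1 + 2*t_encode/(t_str - t_packed). Everything below is an assumption for illustration: the function names, the 2-bit packing, the all-pairs workload, and the random test sequences. The timings are also specific to Python, so the actual number would have to be measured in whatever language the real solution uses.

```python
import math
import random
import timeit

CODES = {"A": 0, "C": 1, "G": 2, "T": 3}  # hypothetical 2-bit codes


def encode(seq):
    """Pack a sequence into one integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODES[base]
    return value


def mismatches_str(a, b):
    """Count differing positions by walking the raw strings."""
    return sum(x != y for x, y in zip(a, b))


def mismatches_packed(a, b, length):
    """Count differing positions between two encoded sequences."""
    diff = a ^ b
    low_bits = (4 ** length - 1) // 3  # 0b0101...01
    return bin((diff | (diff >> 1)) & low_bits).count("1")


def break_even(t_encode, t_str, t_packed):
    """Smallest N where N encodings plus all-pairs packed comparison beats
    all-pairs string comparison; None if packed comparison is not faster."""
    if t_packed >= t_str:
        return None
    # N*t_encode + C(N,2)*t_packed < C(N,2)*t_str
    #   =>  N > 1 + 2*t_encode / (t_str - t_packed)
    return math.floor(1 + 2 * t_encode / (t_str - t_packed)) + 1


def measure(length=2000, repeats=5):
    """Return average seconds per call for (encode, string compare, packed compare)."""
    rng = random.Random(0)  # fixed seed so runs use the same sequences
    a = "".join(rng.choice("ACGT") for _ in range(length))
    b = "".join(rng.choice("ACGT") for _ in range(length))
    ea, eb = encode(a), encode(b)
    t_encode = timeit.timeit(lambda: encode(a), number=repeats) / repeats
    t_str = timeit.timeit(lambda: mismatches_str(a, b), number=repeats) / repeats
    t_packed = timeit.timeit(
        lambda: mismatches_packed(ea, eb, length), number=repeats
    ) / repeats
    return t_encode, t_str, t_packed


if __name__ == "__main__":
    print("break-even number of sequences:", break_even(*measure()))
```

If each mutant is only compared against the wildtype, rather than against every other mutant, the number of comparisons grows with N instead of N squared, and encoding is much harder to pay back. So the answer depends heavily on which comparisons the problem actually needs.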