It has been a number of years since I have done anything of significance in C, so some best practices may have changed, but from a quick glance, I see a couple of things:
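- If you are looping through a list of items for, say, a 0-based copy, you may get better results from a loop that counts down, because the exit test becomes a simple comparison against zero. Note that the body below sees j run from maxval down to 1, so index with j-1 if the array is 0-based. It is worth measuring, since an optimizing compiler may already do this for you:
for (j=maxval; j; j--) {...}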
- The main function is a bit long for my tastes, although I understand that you are trying to do this for performance, so removing the function calls might make sense.
- Some of the repeated board manipulations and constants should probably be put into macros. The loop for copying the board is used at least twice and could become a macro, and many of the constants read as magic numbers that would be clearer with names.
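As a rough sketch of the first and third points together, here is how named constants and a copy macro with a countdown loop might look. I don't know your actual board layout, so BOARD_ROWS, BOARD_COLS, BOARD_CELLS, COPY_BOARD and copy_board_demo are all hypothetical names, and the flat int array is an assumption:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical dimensions: replace with the real board size. */
#define BOARD_ROWS  8
#define BOARD_COLS  8
#define BOARD_CELLS (BOARD_ROWS * BOARD_COLS)

/* Copy src into dst, counting down to zero.
   k_ is decremented before it is used as an index, so cells
   BOARD_CELLS-1 down to 0 are all copied. */
#define COPY_BOARD(dst, src)                    \
    do {                                        \
        int k_;                                 \
        for (k_ = BOARD_CELLS; k_-- > 0; )      \
            (dst)[k_] = (src)[k_];              \
    } while (0)

/* Small self-check: fill a board, copy it, and confirm the copy matches.
   Returns the value stored in the last cell of the copy. */
static int copy_board_demo(void)
{
    int board[BOARD_CELLS];
    int saved[BOARD_CELLS];
    int i;

    for (i = 0; i < BOARD_CELLS; i++)
        board[i] = i * 3;

    COPY_BOARD(saved, board);

    assert(memcmp(saved, board, sizeof board) == 0);
    return saved[BOARD_CELLS - 1];
}
```

The do { ... } while (0) wrapper lets the macro be used like a single statement, including inside an unbraced if. If the board really is one contiguous array, memcpy(dst, src, sizeof board) is also worth timing against the hand-written loop.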
Other than that, do you have a profiler that shows any hotspots?