http://www.perlmonks.org?node_id=231591

Elgon has asked for the wisdom of the Perl Monks concerning the following question:

Hi folks,

I've got a quick query: I'm writing an assembler for the venerable Zilog Z80 microprocessor (nostalgia caught up with me) and I've been doing it in my typically brute-force manner. The assembler will take a file of assembly code, with comments and so forth, and convert it into a file of hex object code. This isn't really much of a problem per se, but I'd like to try to do it well, or even stylishly, rather than by the BF&I approach.

The first pass looks up instructions that never vary (i.e. ones which don't include any user data or addresses), such as im 0 or ex (sp),ix, in a two-dimensional array to find the hex.
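For concreteness, here is a minimal sketch of the kind of first-pass lookup described above. The table is a hypothetical three-row excerpt and the names @fixed and lookup_fixed are made up for illustration; the real table would hold every fixed instruction.

```perl
use strict;
use warnings;

# Hypothetical excerpt of the fixed-instruction table: [ mnemonic, hex ].
my @fixed = (
    [ 'nop',  '00'   ],
    [ 'halt', '76'   ],
    [ 'im 0', 'ED46' ],
);

# Linear scan of the two-dimensional array.
# Returns the hex string, or undef if the line is not a fixed instruction.
sub lookup_fixed {
    my ($line) = @_;
    for my $row (@fixed) {
        return $row->[1] if $row->[0] eq $line;
    }
    return;
}

my $hex = lookup_fixed('im 0');
print defined $hex ? "$hex\n" : "not a fixed instruction\n";
```

Every lookup walks the array from the top, which is part of why the size of the table is a concern.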

The second pass will, when I get around to writing it, convert the remaining instructions to their hex equivalents using a series of nested if(){} constructions containing regexps (mostly).
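One possible alternative to nesting the if(){} blocks, sketched here under stated assumptions rather than offered as the definitive design, is a flat table of regexp/encoder pairs tried in order. The names @patterns, num and encode are hypothetical, the operand syntax (decimal or 0x-prefixed hex) is an assumption, and only two instructions are covered.

```perl
use strict;
use warnings;

# Convert a decimal or 0x-prefixed hex operand to a number (assumed syntax).
sub num {
    my ($s) = @_;
    return $s =~ /^0x/i ? hex($s) : $s + 0;
}

my $NUM = qr/(0x[0-9a-f]+|\d+)/i;

# Hypothetical pattern table: each entry is [ regexp, encoder sub ].
my @patterns = (
    # ld a,n  => 3E n
    [ qr/^ld\s+a\s*,\s*$NUM$/i,
      sub { sprintf '3E%02X', num($_[0]) & 0xFF } ],
    # jp nn   => C3 lo hi (the Z80 stores 16-bit operands low byte first)
    [ qr/^jp\s+$NUM$/i,
      sub { my $n = num($_[0]);
            sprintf 'C3%02X%02X', $n & 0xFF, ($n >> 8) & 0xFF } ],
);

# Try each pattern in turn; return the hex string or undef if nothing matches.
sub encode {
    my ($line) = @_;
    for my $p (@patterns) {
        my ($re, $enc) = @$p;
        if (my @ops = $line =~ $re) {
            return $enc->(@ops);
        }
    }
    return;
}

print encode('ld a,0x1F'), "\n";
print encode('jp 0x1234'), "\n";
```

Adding an instruction then means adding a row to the table rather than another branch.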

Now that I've done the explanation, here's the query: Is there a better way of doing this? Although the Z80 is an 8-bit CPU, it has rather a lot of instructions in its set, and the array holding the data for the first pass is going to be pretty big (something like 500 * 2 elements).

Secondly, I would prefer to use something like a hash, but many of the instructions contain spaces, commas, etc. Should I just use a regexp to turn these into underscores or similar and turn the array into a hash? Is this frowned upon?
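For what it's worth, Perl hash keys are arbitrary strings, so spaces and commas are allowed as long as the key is quoted. The sketch below is one way this might look: %opcode is a hypothetical three-entry excerpt, normalize is a made-up helper, and the semicolon comment syntax is an assumption about the input format.

```perl
use strict;
use warnings;

# Keys with spaces and commas are fine when quoted.
my %opcode = (
    'nop'      => '00',
    'im 0'     => 'ED46',
    'ex de,hl' => 'EB',
);

# Tidy a source line so that "EX  DE, HL" and "ex de,hl" hit the same key.
sub normalize {
    my ($line) = @_;
    $line = lc $line;
    $line =~ s/;.*//;          # drop a trailing comment (assumed ';' syntax)
    $line =~ s/^\s+|\s+$//g;   # trim both ends
    $line =~ s/\s*,\s*/,/g;    # no spaces around commas
    $line =~ s/\s+/ /g;        # collapse runs of whitespace
    return $line;
}

my $key = normalize("  EX  DE, HL ; swap");
print exists $opcode{$key} ? "$opcode{$key}\n" : "unknown: $key\n";
```

The regexps here only tidy whitespace and case; nothing needs to be rewritten into underscores for the lookup to work.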

I suppose that what I'm really asking is this - can anyone point me to a tutorial or short primer on assembler writing?

Thanks, Elgon.

"What this book tells me is that goose-stepping morons, such as yourself, should read books instead of burning them."
       - Dr. Jones Snr, Indiana Jones and the Last Crusade