http://www.perlmonks.org?node_id=289250

waxmop has asked for the wisdom of the Perl Monks concerning the following question:

I need to parse this page and create a hash using the codes as keys and the descriptions as values.

This is a section showing what the page looks like:

    Total index B50001
    Crude processing (capacity) B5610C
    Primary & semifinished processing (capacity) B562A3C
    Finished processing (capacity) B5640C
    Manufacturing ("SIC") B00004
    Manufacturing (NAICS) GMF
    Durable manufacturing (NAICS) GMFD
    Wood product G321 + 321
    Nonmetallic mineral product G327 + 327
    Primary metal G331 + 331
    Iron and steel products G3311A2 + 3311,2
    Fabricated metal product G332 + 332
    Machinery G333 + 333

I want to build a hash that would work like this:

my $code = "GMF";
print "$code: $description_hash{$code}.\n";

That should print:
GMF: Manufacturing (NAICS).

All leading and trailing whitespace needs to be removed from the description.

I've never been an expert with regular expressions, so I'd love to see how the really smart people who hang out on this site would build that hash. Thanks in advance!
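
For reference, here is a minimal sketch of one way the hash described above might be built. It rests on assumptions the post does not confirm: that each record sits on its own line, that the code is a run of uppercase letters and digits following the description, and that an optional "+ number" suffix can be discarded. The `$sample` heredoc is hypothetical stand-in data copied from the excerpt, not the real page.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the page contents: one record per line,
# description first, then the code, then an optional "+ <number>" suffix.
my $sample = <<'END';
  Total index                          B50001
  Manufacturing ("SIC")                B00004
  Manufacturing (NAICS)                GMF
    Durable manufacturing (NAICS)      GMFD
      Wood product                     G321     + 321
      Iron and steel products          G3311A2  + 3311,2
END

my %description_hash;
for my $line (split /\n/, $sample) {
    # Assumed record shape:
    #   ^\s*             leading whitespace (dropped)
    #   (.+?)            description, matched lazily so it ends before the gap
    #   \s+              whitespace between description and code
    #   ([A-Z][A-Z0-9]+) code: uppercase letter, then uppercase letters/digits
    #   (?:\s+\+\s+\S+)? optional "+ 321" style suffix (ignored)
    next unless $line =~ /^\s*(.+?)\s+([A-Z][A-Z0-9]+)(?:\s+\+\s+\S+)?\s*$/;
    my ($description, $code) = ($1, $2);
    $description_hash{$code} = $description;
}

my $code = "GMF";
print "$code: $description_hash{$code}.\n";
```

Because the pattern is anchored at both ends and the description is matched lazily, the captured description carries no leading or trailing whitespace, so no separate trimming step is needed. If the real page is HTML rather than plain text, the markup would have to be stripped or parsed first, and lines that do not fit the assumed shape are silently skipped.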