perlquestion brengo
Hey monks, I'd like to fill a database with values that I grab from 50000 HTML documents. There is no API available, and I can't decide which method to use to parse the HTML structure. Right now I have saved all the files locally (later on, direct access via the web would be great), and they look like this:
<code>
... (the usual html, head, body tags, a table, some text)
<table width=75%><tr><td width=50%><table width=95%><tr><td width=45% valign=top>
<table width=100% cellspacing=0 cellpadding=0><tr bgcolor=#DFDFDF><td colspan=2 height=30><center>tool1_name</center></td></tr>
<tr bgcolor=#999999><td width=70%> heading_1 </td><td width=30%></td></tr>
<tr bgcolor=#DFDFDF><td>drill diameter:</td> <td>936</td></tr>
<tr bgcolor=#CCCCCC><td>drill depth:</td> <td>20</td></tr>
<tr bgcolor=#DFDFDF><td>drill speed:</td> <td>4</td></tr>
<tr bgcolor=#CCCCCC><td>drill material:</td> <td>506</td></tr>
<tr bgcolor=#DFDFDF><td>height:</td> <td>502</td></tr>
<tr bgcolor=#CCCCCC><td>width:</td> <td>6</td></tr>
<tr bgcolor=#DFDFDF><td>angle:</td> <td>2.76</td></tr>
<tr bgcolor=#CCCCCC><td>cooling liquid:</td> <td>14</td></tr>
<tr bgcolor=#DFDFDF><td>manufactured in:</td> <td>27</td></tr>
<tr bgcolor=#CCCCCC><td>lane code:</td> <td>76</td></tr>
<tr bgcolor=#DFDFDF><td>quality test 1:</td> <td>581 (11.4%)</td></tr>
<tr bgcolor=#CCCCCC><td>quality procedure:</td> <td>19,021</td></tr>
<tr bgcolor=#DFDFDF><td>quality test 2:</td> <td>843 (90.1%)</td></tr>
<tr bgcolor=#CCCCCC><td>package worth:</td> <td>$257,524</td></tr>
<tr bgcolor=#DFDFDF><td>single unit worth:</td> <td>$90,945</td></tr>
<tr bgcolor=#CCCCCC><td>colour:</td> <td>48</td></tr>
<tr bgcolor=#DFDFDF><td>coating:</td> <td>2,602</td></tr>
</table>
<table width=100% cellspacing=0 cellpadding=0><tr bgcolor=#999999><td width=70%> sells </td><td width=30%></td></tr>
<tr bgcolor=#DFDFDF><td>sold this month:</td> <td>118</td></tr>
<tr bgcolor=#CCCCCC><td>sold in plant A:</td>
(...)
</code> There are about 110 unique values in 12 tables that I have to grab. Each page always contains two sets of these values: first the values (110 values in 12 tables) of a reference drill, then the values that I am actually interested in. So how do I parse these files and read all these values (stripped of dollar signs, commas, and percentages) as quickly as possible? I guess I'd use File::Slurp to read a file into a scalar and then HTML::TableExtract (but how do I get the second occurrence?). Or should I use a regex (again, how do I get the second occurrence?)? Or a template (how?)? I'd be very grateful for your ideas, and I would really appreciate code snippets, as I am really new to Perl (I'm replacing a bash script (yep) right now). Thanks!
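To make the "second occurrence" question concrete, here is a minimal sketch of the regex route, using core Perl only. It is not a definitive solution. It assumes every data row looks exactly like the `<td>label:</td> <td>value</td>` rows in the sample, and that each label appears once in the reference set and once in the set of interest, so the wanted value is always the second match for its label. The helper names `clean_value` and `extract_pairs` are made up for illustration, the second set of values in the demo input is invented sample data, and dropping the parenthesised percentage entirely is a guess at what "stripped of percentages" should mean.

```perl
use strict;
use warnings;

# Hypothetical helper: normalise one cell value.
# Assumption: "581 (11.4%)" should become "581", i.e. the percentage is dropped.
sub clean_value {
    my ($raw) = @_;
    my $v = $raw;
    $v =~ s/\s*\([\d.]+%\)//;    # drop a parenthesised percentage such as "(11.4%)"
    $v =~ s/[\$,%]//g;           # drop dollar signs, thousands commas, stray "%"
    $v =~ s/^\s+|\s+$//g;        # trim surrounding whitespace
    return $v;
}

# Hypothetical helper: map each label to the list of its values in page order.
# Only plain <td>label:</td> <td>value</td> pairs are matched; header cells
# such as <td width=70%> carry attributes and are therefore skipped.
sub extract_pairs {
    my ($html) = @_;
    my %values_for;
    while ($html =~ m{<td>\s*([^<:]+?)\s*:\s*</td>\s*<td>\s*([^<]*?)\s*</td>}g) {
        my ($label, $raw) = ($1, $2);    # copy captures before running other regexes
        push @{ $values_for{$label} }, clean_value($raw);
    }
    return \%values_for;
}

# Demo input: the first three rows come from the sample page (reference drill),
# the last three are invented stand-ins for the drill of interest.
my $html = <<'HTML';
<tr bgcolor=#DFDFDF><td>drill diameter:</td> <td>936</td></tr>
<tr bgcolor=#CCCCCC><td>package worth:</td> <td>$257,524</td></tr>
<tr bgcolor=#DFDFDF><td>quality test 1:</td> <td>581 (11.4%)</td></tr>
<tr bgcolor=#DFDFDF><td>drill diameter:</td> <td>940</td></tr>
<tr bgcolor=#CCCCCC><td>package worth:</td> <td>$1,300</td></tr>
<tr bgcolor=#DFDFDF><td>quality test 1:</td> <td>600 (12.5%)</td></tr>
HTML

my $pairs = extract_pairs($html);
for my $label (sort keys %$pairs) {
    my $value = $pairs->{$label}[1];    # index 1 is the second occurrence
    next unless defined $value;         # label was seen only once on this page
    print "$label = $value\n";
}
```

For the real files, the demo string would be replaced by the contents of each saved page, read either with File::Slurp as suggested above or with a plain `open` plus `local $/`. If the markup turns out to be less regular than the sample, or a label repeats within one set, this per-label indexing breaks down, and a real parser such as HTML::TableExtract is probably the safer choice.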