To toyyink and kennethk: it is actually a dataset that I need to prepare for the next stage of analysis. It comes in as several HTML files, each of which contains a fairly stable pattern like:
id xxx
borrower xxx
date xxx
...
I want to recode them into a standard format that can be read by commercial statistical software such as Stata, e.g.:
id borrower date ...
xxx xxxx xxxx
Doing this in Excel is a little too time-consuming, so I switched to Perl, which I would really like to learn anyway. Learning by doing is more fun. You could call it a one-off project, because I will (I hope) not be parsing HTML frequently, but thank you anyway for the suggestion; I totally agree with you.
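For illustration, the reshaping described above (one "name value" line per field, turned into one row per record) could be sketched in Perl roughly as follows. This is only a sketch under several assumptions: the text has already been extracted from the HTML, every record starts with an "id" line, and the output is tab-separated. The sub name records_to_table and the sample values are made up for the example; only the field names id, borrower and date come from the pattern above.

```perl
use strict;
use warnings;

# Assumed column order; only id, borrower and date are named in the post.
my @fields = qw(id borrower date);

# Turn "name value" lines into a header row plus one tab-separated row
# per record. Assumption: every record begins with an "id" line.
sub records_to_table {
    my ($text) = @_;
    my (@records, $current);
    for my $line (split /\n/, $text) {
        # Split the line into a field name and the rest; skip other lines.
        my ($name, $value) = $line =~ /^\s*(\w+)\s+(.*?)\s*$/ or next;
        if ($name eq 'id') {
            $current = {};              # an "id" line opens a new record
            push @records, $current;
        }
        $current->{$name} = $value if $current;
    }
    my @rows = (join "\t", @fields);    # header row
    for my $rec (@records) {
        # Missing fields become empty cells (the // operator needs Perl 5.10+).
        push @rows, join "\t", map { $rec->{$_} // '' } @fields;
    }
    return join("\n", @rows) . "\n";
}

# Made-up sample input, not real data.
my $sample = <<'END';
id 101
borrower Jane Doe
date 2010-03-01
id 102
borrower John Roe
date 2010-03-05
END

print records_to_table($sample);
```

Tab-separated output is used here because values such as names may contain spaces; the real files may need a different separator or extra fields.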
Best regards, sh