PerlMonks  

Re: parsing a bibliography

by bprew (Monk)
on Dec 01, 2004 at 23:03 UTC (#411610)


in reply to parsing a bibliography

When I have had to do things similar to this (read: parse unordered data), I've found that passing the data through several pre-filters and aiming for the 90% (or sometimes only 80%) solution works well.

It's sometimes faster to do the unthinkable (enter the data by hand) than it is to write a highly complex regex/parsing engine for some unordered data format you're unlikely to see again. Or write a partial solution that gets most of the data, spot-check it, and then enter the rest by hand.
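To make the pre-filter idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the two pre-filters, the "Author (Year). Title." layout, and the names `PREFILTERS`, `ENTRY`, and `parse_entries` are hypothetical and would need adjusting to whatever your bibliography actually looks like. The point is only the shape: clean each line, try one simple pattern for the common case, and put everything else in a pile to enter by hand.

```python
import re

# Hypothetical pre-filters: each takes a string and returns a cleaned string.
def collapse_whitespace(line):
    # Turn runs of spaces/tabs into a single space and trim the ends.
    return re.sub(r"\s+", " ", line).strip()

def strip_numbering(line):
    # Drop a leading "12." or "[12]" style entry number, if present.
    return re.sub(r"^(?:\[\d+\]|\d+\.)\s*", "", line)

PREFILTERS = [collapse_whitespace, strip_numbering]

# Assumed "common case" layout: Author(s) (Year). Title.
ENTRY = re.compile(
    r"^(?P<authors>.+?)\s*\((?P<year>\d{4})\)\.?\s*(?P<title>.+?)\.?$"
)

def parse_entries(lines):
    """Return (parsed, by_hand): dicts for lines that matched, raw text for the rest."""
    parsed, by_hand = [], []
    for raw in lines:
        line = raw
        for prefilter in PREFILTERS:
            line = prefilter(line)
        if not line:
            continue  # skip blank lines
        match = ENTRY.match(line)
        if match:
            parsed.append(match.groupdict())
        else:
            by_hand.append(raw)  # keep the untouched original for manual entry
    return parsed, by_hand
```

Calling `parse_entries` on the lines of the file gives you two lists. Spot-check a sample of the first, and if the second is short enough, typing those entries in by hand is probably cheaper than adding another pattern.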

Ultimately, you're dealing with the problem that a human can "understand" the data they're reading and make correct assumptions about it, but the computer has a much harder time and is not well suited to the task.

If the data is in Word, perhaps you could try formatting it *more* and then getting it into Excel, or, as mentioned above, using italics, bolding, etc. to mark the fields.


--
Ben
"Naked I came from my mother's womb, and naked I will depart."
