| PerlMonks |
Thanks for your clarification. I did understand Grandfather's code; I think I just used the wrong terminology in my question. As you said, what I wanted was the proper regex to search for that four-digit year. Your additions (as well as your modification of the $bibData field) did that beautifully.

I do expect to hit a number of rough spots, especially since I plan to keep all of my research notes in a single file that is equally human- and machine-readable. Quite a dream, isn't it?

One immediate problem I see is that the script only recognizes bibliographic data between quotation marks. So a journal article title, which sits between quotes, will get picked up, while a book title, which conventionally doesn't have quotes, will not. This effectively excludes about a third of my data from the XML output.

I think I might go back and edit the raw text file so that the bibliographic info on each line sits between | characters. My question is: what regex could I use to replace ^([^"]* "[^"]+".*?) so that $bibData captures all of the text between the | characters?

Thanks again. I'll be sure to show everyone the final product once I'm finished.

In reply to Re^4: Converting a Text file to XML
by strobodyne
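
To make the question concrete, here is a minimal sketch of the kind of pattern I have in mind. It assumes each line contains exactly one pair of | characters around the bibliographic info; the sample line and the $year variable are made up for illustration and are not taken from Grandfather's script.

```perl
use strict;
use warnings;

# Hypothetical sample line; the real notes file may be laid out differently.
my $line = 'Smith, John. |The History of Things. Oxford UP, 1987| my notes here';

# Capture everything between the first pair of | characters.
# \| is a literal pipe; [^|]+ matches one or more non-pipe characters.
my ($bibData) = $line =~ /\|([^|]+)\|/;

# Pull a four-digit year out of the captured text, if there is one.
my ($year) = defined $bibData ? $bibData =~ /\b(\d{4})\b/ : ();

print "bibData: $bibData\n" if defined $bibData;
print "year:    $year\n"    if defined $year;
```

Because the | is escaped outside the character class and excluded inside it, the capture stops at the first closing |, so any notes after it stay out of $bibData. A line with no pipes at all leaves $bibData undefined, which is why both prints are guarded.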