http://www.perlmonks.org?node_id=411598

patrickrock has asked for the wisdom of the Perl Monks concerning the following question:

OK, I have been handed an annotated bibliography created in MS Word. I have extracted each entry onto its own line, like this:

==========begin biblio===========
-Lightfoot, J. B. St. Paul’s Epistle to the Philippians. Grand Rapids: Zondervan, 1953 (= 1913). Classic commentary by one of the greatest English-speaking NT scholars of all time. 2
-Martin, Ralph P. Philippians. Rev. ed.; NCB. Grand Rapids: Eerdmans, 1980. Clear and informed. 2
O'Brien, Peter, T. Commentary on Philippians. NIGTC. Grand Rapids: Eerdmans, 1991. Thorough and insightful comments on the Greek text. 1
-Silva, Moisés. Philippians. Baker Exegetical Commentary. Grand Rapids: Baker, 1993. Sound comments on the Greek text. 2
-Barth, Markus and Helmut Blanke. The Letter to Philemon: A New Translation with Notes and Commentary. Grand Rapids: Eerdmans, 2000. With over 500 pages devoted to a letter that was probably written on a single sheet of papyrus, this work will be consulted by all who want the most thorough treatment of Philemon and avoided by the rest of us. 3
-Bruce, F. F. The Epistles to the Colossians, to Philemon, and to the Ephesians. NIC. Grand Rapids: Eerdmans, 1984. See comments under “Commentaries on Ephesians.” 2
==========end biblio===========


Any ideas how you would parse this into its constituent parts for insertion into a database? For example: author(s), title, publisher, comments, and so on.

There isn't anything obvious to split() on, nor does any regex wizardry occur to me.

Thought I'd run it by you guys before hiring a temp to type it all in by hand. Thanks in advance, Pat Rock
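
One possible starting point, offered as a rough sketch rather than a finished solution: every sample entry above contains a "City: Publisher, Year." run and ends in a single digit, so a regex can anchor on those and treat whatever comes before as author plus title. The helper name parse_entry and the field names are made up for illustration. The sketch assumes publisher names contain no periods, colons or commas, and it guesses that the author part ends at the first period followed by a capitalised word, which will misfire on some entries. Anything that does not match is reported for hand correction rather than silently dropped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one bibliography line into fields.
# Returns a hash reference, or nothing if the line does not fit the pattern.
sub parse_entry {
    my ($line) = @_;
    my ($head, $city, $publisher, $year, $comments, $number) = $line =~ m{
        ^ -? \s*                       # optional leading dash
        (.+?) \. \s+                   # author(s), title and series, still together
        ([^.:]+) : \s+                 # city, such as "Grand Rapids"
        ([^.:,]+) , \s+                # publisher (assumed free of . : and ,)
        (\d{4})                        # publication year
        (?: \s* \( [^)]* \) )? \. \s+  # optional note such as "(= 1913)"
        (.+?) \s+                      # free-text comments
        (\d) \s* \z                    # trailing one-digit number
    }x or return;

    # Heuristic guess: the author part ends at the first period that is
    # followed by a capitalised word rather than by another initial.
    my ($author, $title) = $head =~ /^(.+?)\.\s+(?=[A-Z][a-z])(.+)$/
        ? ($1, $2) : ($head, '');
    # Restore the period the split removed from a final initial.
    $author .= '.' if $author =~ /\b[A-Z]$/;

    return {
        author          => $author,
        title           => $title,     # may still include series, e.g. "NIGTC"
        city            => $city,
        publisher       => $publisher,
        year            => $year,
        comments        => $comments,
        trailing_number => $number,
    };
}

# Two entries copied from the sample above.
my @sample = (
    q{-Martin, Ralph P. Philippians. Rev. ed.; NCB. Grand Rapids: Eerdmans, 1980. Clear and informed. 2},
    q{O'Brien, Peter, T. Commentary on Philippians. NIGTC. Grand Rapids: Eerdmans, 1991. Thorough and insightful comments on the Greek text. 1},
);

for my $line (@sample) {
    my $rec = parse_entry($line)
        or do { warn "unparsed: $line\n"; next };
    print join(' | ', @{$rec}{qw(author title city publisher year)}), "\n";
}
```

The title field keeps edition and series notes ("Rev. ed.; NCB") attached, since nothing in the text reliably separates them from the title. Lines that produce an "unparsed" warning could go to a short list for manual entry, which should be far smaller than the whole bibliography.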