Re: split function using multiple delimiters

by jhourcle (Prior)
on Dec 29, 2011 at 18:56 UTC (#945544)

in reply to split function using multiple delimiters

As someone who's had to parse author lists before, let me just say that unless they're clean coming in, you can run into a *lot* of problems if you just try to split. You have the 'Susan L.' example of a given name, but you might also run into a 'Frank de Leo', where the last name would be 'de Leo', not 'Leo'.
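To make the 'de Leo' problem concrete, here is a minimal sketch, not a real parser. It assumes names arrive in "Given Family" order, separated by commas, semicolons, or the word "and"; the sample string, the `%particle` list, and the `split_name` helper are all made up for illustration, and the particle list is far from complete.

```perl
use strict;
use warnings;

# Hypothetical input; the author strings in the original question may look different.
my $authors = 'Susan L. Smith, Frank de Leo and Ann van der Berg';

# Split on commas, semicolons, or the word "and" (several delimiters in one regex).
my @names = split /\s*[,;]\s*(?:and\s+)?|\s+and\s+/, $authors;

# Illustrative, incomplete list of lowercase surname particles (an assumption).
my %particle = map { $_ => 1 } qw(de del della der den van von la le);

# Hypothetical helper: returns (given, family) for one "Given Family" name.
sub split_name {
    my ($name) = @_;
    my @words = split ' ', $name;
    my @last  = (pop @words);    # the final word always belongs to the surname

    # Pull preceding particles into the surname, but leave at least one given-name word.
    while (@words > 1 && $particle{ $words[-1] }) {
        unshift @last, pop @words;
    }
    return (join(' ', @words), join(' ', @last));
}

for my $name (@names) {
    my ($given, $family) = split_name($name);
    print "given='$given' family='$family'\n";
}
```

Even this breaks quickly on real data: capitalised particles ('De Leo'), "Family, Given" ordering, suffixes such as 'Jr.', and multi-word surnames without particles all need extra handling, which is why an existing parser is worth a look first.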

It looks like Biblio::Citation::Parser hasn't seen an update in 7 years, but it's likely that this is a solved problem and the module isn't in need of updates (unless someone wants to add DOI or other ID handling). It's intended to take a full citation, so you might have to look at how it parses the author string: look for sub find_authors in Biblio::Citation::Parser::Citebase.

As there are quite a few people in libraries using Perl, you could ask whether there are any better parsers out there on the code4lib mailing list, which has lots of Perl folks on it, or on perl4lib, which is lower volume (but more focused in scope).
