PerlMonks  

How to remove roman numbers

by Priti24 (Novice)
on Jun 25, 2012 at 10:20 UTC ( #978177=perlquestion )
Priti24 has asked for the wisdom of the Perl Monks concerning the following question:

I have author names like 1. William H. Schneider, IV 2. William Vassilakis ,II 3. Alessandro Calvi , I

Other authors have other Roman numerals after their names. I have to remove the Roman numeral from each author name. I applied the substitute method (s///) to remove the Roman numerals. Is there any other way to remove them?

Re: How to remove roman numbers
by moritz (Cardinal) on Jun 25, 2012 at 10:29 UTC

      Thank you, Mr. moritz, for your reply. But don't you think that if an author name starts with I, V, M, etc., then it will remove that letter from the name, and the author name will be changed?

      As \w matches a "word" character and \d matches a decimal digit character, I want to know whether there is a special character class for Roman numerals as well.

        As long as the match has to sit between word boundaries, it won't remove the "I" at the start of an author name; it will, however, remove the personal pronoun "I". I'd do something like this:
        #!/usr/bin/perl -l
        use strict;
        use warnings;

        my(@data) = q(
        1. Iilliam H. Schneider, IV
        2. William Vassilakis, II
        3. Alessandro Calvi, I
        );

        foreach my $data (@data) {
            $data =~ s/\b[IVXLCDM]+\b//g;
            chomp $data;
            print "$data\n";
        }
        I used "Iilliam" instead of "William" for demonstration purposes.
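        A small sketch to check the word-boundary behaviour described above. The helper name strip_standalone and the sample name "John D. Smith, II" are made up for illustration; only the pattern comes from the reply. It suggests that a leading "I" inside a word survives, while a stand-alone initial such as "D." does not.

```perl
use strict;
use warnings;

# Pattern from the reply above: any stand-alone run of Roman-numeral letters.
my $standalone = qr/\b[IVXLCDM]+\b/;

# Hypothetical helper: strip every stand-alone run from a name.
sub strip_standalone {
    my ($name) = @_;
    $name =~ s/$standalone//g;
    return $name;
}

# The leading "I" of "Iilliam" is followed by a word character, so it stays.
print strip_standalone('Iilliam H. Schneider, IV'), "\n";

# The middle initial "D" is a stand-alone word, so it is removed as well.
print strip_standalone('John D. Smith, II'), "\n";
```

        So the pattern is safe for letters inside a name, but initials drawn from I, V, X, L, C, D, M need separate handling, for example by only looking at the text after the last comma.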
        But don't you think that if an author name starts with I, V, M, etc., then it will remove that letter from the name, and the author name will be changed?

        Why wonder about that if you can simply try?

        As \w matches a "word" character and \d matches a decimal digit character, I want to know whether there is a special character class for Roman numerals as well.

        Even if it existed, it would only help you if the Roman numerals were written with the dedicated characters, for example Ⅰ U+2160 ROMAN NUMERAL ONE instead of I U+0049 LATIN CAPITAL LETTER I.
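        To illustrate the distinction, here is a minimal sketch that matches only the dedicated Roman-numeral code points (U+2160 through U+2188 in the Number Forms block). The helper name has_unicode_roman and the sample strings are assumptions made for the example.

```perl
use strict;
use warnings;

# Dedicated Roman-numeral characters, U+2160 .. U+2188.
my $unicode_roman = qr/[\x{2160}-\x{2188}]/;

# Hypothetical helper: true if the text contains a dedicated numeral character.
sub has_unicode_roman {
    my ($text) = @_;
    return $text =~ $unicode_roman ? 1 : 0;
}

my $dedicated = "Schneider, \x{2163}";   # U+2163 ROMAN NUMERAL FOUR
my $ascii     = "Schneider, IV";         # two plain Latin letters

print "dedicated: ", has_unicode_roman($dedicated), "\n";
print "ascii:     ", has_unicode_roman($ascii), "\n";
```

        Names typed as plain "IV" or "II" never contain these code points, so a class like this would not help with the data in the question.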

Re: How to remove roman numbers
by roboticus (Canon) on Jun 25, 2012 at 11:25 UTC

    Priti24:

    If it's just people's names, then you don't need to do anything particularly heroic. You could have a small hash table covering a reasonable range of Roman numerals and look for a match in the correct location. Or, if the names are all formatted as in your examples, you could look for a comma followed by a regex. A simplified version would be something like: s/, I?V?I*//;. Extending to a larger range is left as an exercise for the reader.

    Note: The regex will match some strings that aren't standard Roman numerals, and there's at least one other string it will match that it shouldn't. Generate *plenty* of test cases (especially degenerate cases) to tune your code against.
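    One way the hash-table idea might look, as a sketch only: the range I through X, the helper name strip_suffix, and the extra test names are assumptions, not something from the thread. The suffix after the last comma is removed only when it is a known numeral.

```perl
use strict;
use warnings;

# Assumed range: I through X. Extend the list for larger suffixes.
my %is_roman = map { $_ => 1 } qw(I II III IV V VI VII VIII IX X);

# Hypothetical helper: drop a trailing ", <numeral>" only when the part
# after the last comma is a known numeral; otherwise return the name as-is.
sub strip_suffix {
    my ($name) = @_;
    if ($name =~ /\A(.*?)\s*,\s*([A-Z]+)\s*\z/ && $is_roman{$2}) {
        return $1;
    }
    return $name;
}

for my $name ('William H. Schneider, IV', 'William Vassilakis ,II',
              'Alessandro Calvi , I', 'Jane Doe, MD') {
    print strip_suffix($name), "\n";
}
```

    Because the lookup is an exact match, a suffix such as "MD" is left alone even though both letters are Roman-numeral letters, and the uneven spacing around the comma in the sample data is absorbed by the \s* parts.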

    Have fun with it!

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

Re: How to remove roman numbers
by zentara (Archbishop) on Jun 25, 2012 at 13:26 UTC
    Golf: Magic Formula for Roman Numerals

    and from an old Questions-Answered:

    #> How do I write a pattern for removing roman numerals? The first 10 is
    #> enough.
    #Well, the first ten roman numerals are:
    # I, II, III, IV, V, VI, VII, VIII, IX, X
    # Just put those in a regex.
    s/\b(I|II|...)\b//g;
    # would remove roman numerals, provided they aren't touching any word
    # characters.
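    A sketch of how that alternation could be built from the list of the first ten numerals rather than typed by hand. The names @first_ten, $first_ten_re and strip_first_ten are invented for the example, and "Louis XIV" is a made-up test string.

```perl
use strict;
use warnings;

my @first_ten = qw(I II III IV V VI VII VIII IX X);

# Longest alternatives first. With \b on both sides the regex engine would
# backtrack to the right alternative anyway, so this ordering only matters
# if the trailing boundary is ever relaxed.
my $alternation  = join '|', sort { length $b <=> length $a } @first_ten;
my $first_ten_re = qr/\b(?:$alternation)\b/;

# Hypothetical helper: remove stand-alone numerals from I to X.
sub strip_first_ten {
    my ($text) = @_;
    $text =~ s/$first_ten_re//g;
    return $text;
}

print strip_first_ten('William Vassilakis, II'), "\n";
print strip_first_ten('Louis XIV'), "\n";   # XIV is outside the list, so it stays
```

    Note that this leaves the comma and the space behind; trimming them is a separate step.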

    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku ................... flash japh
Re: How to remove roman numbers
by CountZero (Bishop) on Jun 25, 2012 at 13:45 UTC
    Regexp::Common::number has all you need to identify Roman numbers.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
