PerlMonks  

Re: Finding specific alphanumeric IDs from the string

by Grimy (Pilgrim)
on May 21, 2012 at 08:53 UTC ( #971575=note )


in reply to Finding specific alphanumeric IDs from the string

Removing meta-characters isn't a good idea. If I understood your problem correctly, as soon as there's a non-comma, non-alphanumeric character, the ID is incorrect and you should stop there. Example code:

for (split ',', $ids) {
    die "$_: Invalid ID" unless /^(CWG)?(\d{3,})$/;
    print "ID $2";
}
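If stopping at the first bad token is too strict, the same pattern can be used to collect every reject instead of dying. The following is only a sketch: the helper name `check_ids` and the sample input are made up for illustration, and it still assumes comma-separated input.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): split on commas and
# sort the tokens into valid IDs and rejects instead of dying.
sub check_ids {
    my ($ids) = @_;
    my (@valid, @invalid);
    for my $token (split /,/, $ids) {
        if ($token =~ /^(?:CWG)?(\d{3,})$/) {
            push @valid, $1;        # keep only the numeric part, as the loop above does
        }
        else {
            push @invalid, $token;  # anything else is reported, not fatal
        }
    }
    return (\@valid, \@invalid);
}

my ($valid, $invalid) = check_ids('12345,CWG123456,12,abc 999');
print "ID $_\n" for @$valid;
print "Invalid: '$_'\n" for @$invalid;
```

Returning both lists lets the caller decide whether a single reject should abort the run or just be logged.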


Re^2: Finding specific alphanumeric IDs from the string
by Anonymous Monk on May 21, 2012 at 16:34 UTC

    hi,
    Thanks for replying, but the comma is not a constant separator that I can use in the split.
    IDs might be embedded in free text or surrounded by spaces, as shown in one of the examples in the error classes I observed. These are the syntax errors I need to catch and flag.
    Can you help me in this scenario?

      local $/; # Slurp
      for (<DATA> =~ /(?:CWG)?\d{3,}/g) {
          print "ID: $_\n";
      }
      __DATA__
      123
      123456,34567889
      12345,CWG123456,1234
      "Blah Blah 10-20 m can be taken into consideration 1234556"
      123,30,40
      http://www.takeithere.com/123456789/fig1,987643,34467889
      Outputs:
      ID: 123
      ID: 123456
      ID: 34567889
      ID: 12345
      ID: CWG123456
      ID: 1234
      ID: 1234556
      ID: 123
      ID: 123456789
      ID: 987643
      ID: 34467889

      But is there a precise criterion that you can use to distinguish IDs from other numbers? What if it said "Blah Blah 100m can be taken into consideration 1234556"? 100 isn't an ID, but it would still be matched.
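One possible criterion, offered purely as an assumption and not something the thread confirms, is that a real ID is never glued to a letter or another digit. Under that assumption, lookaround assertions skip "100m" while still accepting the trailing number. The names `$id_re` and `find_ids` are hypothetical.

```perl
use strict;
use warnings;

# Assumption (not from the thread): an ID is never directly preceded or
# followed by a letter or digit.
my $id_re = qr/(?<![A-Za-z0-9])(?:CWG)?\d{3,}(?![A-Za-z0-9])/;

# Hypothetical helper: return every delimited ID found in the text.
sub find_ids {
    my ($text) = @_;
    my @ids = $text =~ /$id_re/g;
    return @ids;
}

my $text = '"Blah Blah 100m can be taken into consideration 1234556"';
print "ID: $_\n" for find_ids($text);   # prints "ID: 1234556" only
```

This is still a heuristic: a stand-alone number such as "100 m" (with a space) would be matched again, so it narrows the false positives rather than eliminating them.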
        Poor Grimy, you're actually working
        Hey,
        Thanks for the reply. It works in all the cases above, but my data is badly mismanaged and I cannot find any specific criterion to match on. Since this is a regression tool, users could have made any kind of syntax error when entering these IDs, and I need to catch them :'(. You are right about the "Blah Blah 100m can be taken into consideration 1234556" case, where 100 isn't an ID but would still be matched. I am doubtful about that one too. It seems like a never-ending problem.
        Thanks again!
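When no firm criterion exists, one way to at least surface the doubtful lines is to extract the IDs and then report whatever text is left over for manual review. This is a rough sketch under that assumption; `triage_line` is a hypothetical name and the pattern is the one used earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the IDs found in a line plus any leftover
# text that is neither an ID, a comma, nor whitespace.
sub triage_line {
    my ($line) = @_;
    my $id_re = qr/(?:CWG)?\d{3,}/;
    my @ids = $line =~ /$id_re/g;
    (my $rest = $line) =~ s/$id_re//g;   # drop the IDs themselves
    $rest =~ s/[,\s]+/ /g;               # collapse separators to single spaces
    $rest =~ s/^ | $//g;                 # trim the ends
    return (\@ids, $rest);
}

for my $line ('12345,CWG123456,1234', '123,30,40') {
    my ($ids, $rest) = triage_line($line);
    print "IDs: @$ids\n";
    print "  FLAG leftover: '$rest'\n" if length $rest;
}
```

A line with an empty leftover is clean under this rule. Anything else, such as the stray "30 40" in the second sample, is flagged rather than silently dropped.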
