PerlMonks  

Re: Finding specific alphanumeric IDs from the string

by Grimy (Pilgrim)
on May 21, 2012 at 08:53 UTC ( [id://971575]=note )


in reply to Finding specific alphanumeric IDs from the string

Removing meta-characters isn't a good idea. If I understood your problem correctly, as soon as there's a non-comma, non-alphanumeric character, the ID is incorrect and you should stop there. Example code:
for (split ',', $ids) {
    # Abort on the first field that is not an optional "CWG" prefix plus 3+ digits
    die "$_: Invalid ID" unless /^(CWG)?(\d{3,})$/;
    print "ID $2\n";
}

Replies are listed 'Best First'.
Re^2: Finding specific alphanumeric IDs from the string
by Anonymous Monk on May 21, 2012 at 16:34 UTC

    hi,
    Thanks for replying, but a comma is not a constant separator that I can rely on inside the split.
    IDs might be embedded in other text or surrounded by spaces, as shown in one of the examples of the error classes I observed. These are the syntax errors I need to catch and flag.
    Can you help me in this scenario?

      local $/; # Slurp
      for (<DATA> =~ /(?:CWG)?\d{3,}/g) {
          print "ID: $_\n";
      }
      __DATA__
      123
      123456,34567889
      12345,CWG123456,1234
      "Blah Blah 10-20 m can be taken into consideration 1234556"
      123,30,40
      http://www.takeithere.com/123456789/fig1,987643,34467889
      Outputs:
      ID: 123
      ID: 123456
      ID: 34567889
      ID: 12345
      ID: CWG123456
      ID: 1234
      ID: 1234556
      ID: 123
      ID: 123456789
      ID: 987643
      ID: 34467889

      But are there precise criteria that you can use to distinguish IDs from other numbers? What if it said "Blah Blah 100m can be taken into consideration 1234556"? 100 isn't an ID, but it would still be matched.
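      One way to narrow this down, offered as a rough sketch and not a definitive rule: require that a candidate is not glued to a letter or digit on either side. The helper name extract_ids and the lookaround boundaries are assumptions made for illustration; nothing in the question confirms that this is the right criterion.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return candidate IDs that are
# not directly preceded or followed by a letter or digit.
sub extract_ids {
    my ($text) = @_;
    # In list context with /g, the match returns every captured ID.
    return $text =~ /(?<![A-Za-z0-9])((?:CWG)?\d{3,})(?![A-Za-z0-9])/g;
}

my @ids = extract_ids('"Blah Blah 100m can be taken into consideration 1234556"');
print "ID: $_\n" for @ids;   # prints "ID: 1234556" only
```

      With this, the 100 in "100m" is skipped because it is followed by a letter. A number delimited by slashes inside a URL would still match, so this reduces the false positives without settling the question of what counts as an ID.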
        Hey,
        Thanks for the reply. It works in all of the above cases, but my data is badly mismanaged and I can't find any specific criteria to match on. Since this is a regression tool, users could make any kind of error in the syntax when entering these IDs, and I need to catch them :'(. You are right about the "Blah Blah 100m can be taken into consideration 1234556" case, where 100 isn't an ID but would still be matched; I have my doubts about it too. It seems like an unending task.
        Thanks again!
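        If the goal is to flag bad entries instead of silently skipping them, one possible approach is to treat each comma-separated field as either a well-formed ID or something to review. This is only a sketch: the helper name classify_fields is hypothetical, and it assumes that a comma is the intended separator and that anything else in a field (spaces, text, short numbers) counts as an error.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split a line on commas and
# sort the fields into well-formed IDs and fields that need manual review.
sub classify_fields {
    my ($line) = @_;
    my (@ok, @suspect);
    for my $field (split /,/, $line) {
        # Well-formed: optional "CWG" prefix followed by three or more digits
        if ($field =~ /^(?:CWG)?\d{3,}$/) {
            push @ok, $field;
        }
        else {
            push @suspect, $field;
        }
    }
    return (\@ok, \@suspect);
}

my ($ok, $suspect) = classify_fields('12345,CWG123456, 1234,30');
print "OK: $_\n"     for @$ok;
print "FLAG: '$_'\n" for @$suspect;
```

        Here 12345 and CWG123456 are accepted, while ' 1234' (leading space) and '30' (too short) are flagged for a human to look at.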
        Poor Grimy, you're actually working
