PerlMonks  

Advance Regular expression questions

by rajeshatbuzz (Novice)
on Jan 04, 2012 at 17:06 UTC (#946257)
rajeshatbuzz has asked for the wisdom of the Perl Monks concerning the following question:

I am just a beginner in Perl and want to do the following task with a regular expression.

I have a huge SQL file and am trying to replace the first four comma-separated numbers on each line.

FILE 1:

(7, 0, 7, 55, NULL, 0, '', 'Wise Package Studio Overview and its advantage', 1260350867, '203.145.176.177', 0, 0, 0, 0, 197, 0, 0, 0, ''),
(8, 0, 8, 55, NULL, 0, '', 'what is Tarma QuickInstall 2 and its usage', 1260351089, '203.145.176.177', 0, 0, 0, 0, 61, 0, 0, 0, ''),
(9, 0, 9, 55, NULL, 0, '', 'what is Tarma Installer 5 and its usage?', 1260351321, '203.145.176.177', 0, 0, 0, 0, 253, 0, 0, 0, ''),
(10, 0, 10, 56, NULL, 0, '', 'what is Tarma ExpertInstall 3 and its usage?', 1260351466, '203.145.176.177', 0, 0, 0, 0, 69, 0, 0, 0, ''),
(11, 0, 11, 55, NULL, 0, '', 'What is Smart Install Maker and its usage?', 1260351588, '203.145.176.177', 0, 0, 0, 0, 241, 0, 0, 0, ''),
(14, 0, 14, 55, NULL, 0, '', 'what is MSI Studio and its use?', 1260352667, '203.145.176.177', 0, 0, 0, 0, 174, 0, 0, 0, ''),

FILE 2: It should be produced from the first file according to the following specification:

First number - should be incremented by 1

Second number - if it is 0, leave it unchanged; otherwise add 3

Third number - add 3

Fourth number - if 55, make it 65; if 56, make it 66

The remaining values should stay unchanged.

Any help on this? I am basically stuck on the regular expression part.

Re: Advance Regular expression questions
by Skeeve (Vicar) on Jan 04, 2012 at 17:17 UTC

    So if you're stuck, could you show what you've already got?

    Or are you searching for someone you are willing to pay to do your work?


    s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
    +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e
Re: Advance Regular expression questions
by kennethk (Monsignor) on Jan 04, 2012 at 17:19 UTC
    If you have a "huge sql file", then perhaps it would be more prudent to use SQL, specifically UPDATE, in order to correct these lines. The chances you will corrupt your file are very high, at which point you will no longer have a "huge sql file" and will rather just have a huge junk file.

    To go back to a more traditional answer, what have you tried? What didn't work? It sounds like the trick you need is the e modifier, which evaluates the replacement as Perl code so you can do arithmetic after a match. The examples in Search and replace should give plenty of guidance. Post your code, and I'll post mine.
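As an illustration of the e modifier mentioned above, here is a minimal sketch. It assumes each tuple starts at the beginning of a line and that the first four values are plain non-negative integers. The helper name `bump_tuple` is hypothetical and not from the thread, and the sketch normalizes the separators between the first four values to a comma plus one space.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the four rules from the question to one line.
sub bump_tuple {
    my ($line) = @_;
    # With /e the replacement side is Perl code; the value of its last
    # statement is what gets substituted for the matched text.
    $line =~ s{^\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)}{
        my ($first, $second, $third, $fourth) = ($1, $2, $3, $4);
        $first++;                                          # first: +1
        $second += 3 if $second != 0;                      # second: +3 unless 0
        $third  += 3;                                      # third: +3
        $fourth += 10 if $fourth == 55 || $fourth == 56;   # 55 -> 65, 56 -> 66
        "($first, $second, $third, $fourth";
    }e;
    return $line;
}

print bump_tuple("(7, 0, 7, 55, NULL, 0, '', 'example'),"), "\n";
```

Only the matched prefix is rewritten, so everything after the fourth number passes through untouched. Lines that do not start with a tuple are returned unchanged.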

Re: Advance Regular expression questions
by jethro (Monsignor) on Jan 04, 2012 at 17:33 UTC

    This could be done with regular expressions, but it would be much easier to use split to separate the fields into separate variables. If you insist on using regular expressions, just use the e modifier, so that the replacement is evaluated as an expression.

Re: Advance Regular expression questions
by roboticus (Canon) on Jan 04, 2012 at 18:04 UTC

    rajeshatbuzz:

    That's really not an advanced question. It's a beginner-level question, so labeling it advanced may prevent many people from viewing it who aren't at an "advanced" level in regular expressions.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

Re: Advance Regular expression questions
by JavaFan (Canon) on Jan 04, 2012 at 18:10 UTC
    Untested:
    s{^\(\s*\K([0-9]+)(\s*,\s*)([0-9]+)(\s*,\s*)([0-9]+)(\s*,\s*)([0-9]+)}
     {@{[$1+1]}$2@{[$3&&($3+3)]}$4@{[$5+3]}$6@{[$7==55||$7==56?$7+10:$7]}};
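Since the substitution above is marked untested, one way to try it is to wrap it in a small subroutine and feed it a few sample lines. This is only a sketch: the wrapper name `apply_substitution` and the shortened sample lines are made up for illustration. Note that `\K` needs Perl 5.10 or later, and that the `@{[ ... ]}` construct interpolates the result of an expression into the replacement, so no e modifier is needed.

```perl
use strict;
use warnings;
use 5.010;   # \K requires Perl 5.10+; this also enables say

# Hypothetical wrapper around the substitution from the post above.
sub apply_substitution {
    my ($line) = @_;
    $line =~ s{^\(\s*\K([0-9]+)(\s*,\s*)([0-9]+)(\s*,\s*)([0-9]+)(\s*,\s*)([0-9]+)}
              {@{[$1+1]}$2@{[$3&&($3+3)]}$4@{[$5+3]}$6@{[$7==55||$7==56?$7+10:$7]}};
    return $line;
}

# Shortened sample tuples (not the full rows from the question).
for my $line ("(7, 0, 7, 55, NULL),", "(10, 4, 10, 56, NULL),") {
    say apply_substitution($line);
}
```

Because the separators are captured in `$2`, `$4` and `$6` and put back as they were, the original spacing between the values is preserved.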
Re: Advance Regular expression questions
by CountZero (Bishop) on Jan 04, 2012 at 19:24 UTC
    Not with a regular expression, but easier to understand, maintain and debug:
    use Modern::Perl;

    while (<DATA>) {
        chomp;
        # Split on "(" or "," into at most 6 pieces; the piece before "(" is empty.
        my (undef, $first, $second, $third, $fourth, $balance) = split /[(,]/, $_, 6;
        # Drop the space that followed each separator.
        s/^\s+// for $first, $second, $third, $fourth, $balance;
        $first++;
        $second += 3 unless $second == 0;
        $third  += 3;
        $fourth += 10 if $fourth == 55 or $fourth == 56;   # 55 -> 65, 56 -> 66
        say '(', join ', ', $first, $second, $third, $fourth, $balance;
    }

    __DATA__
    (7, 0, 7, 55, NULL, 0, '', 'Wise Package Studio Overview and its advantage', 1260350867, '203.145.176.177', 0, 0, 0, 0, 197, 0, 0, 0, ''),
    (8, 0, 8, 55, NULL, 0, '', 'what is Tarma QuickInstall 2 and its usage', 1260351089, '203.145.176.177', 0, 0, 0, 0, 61, 0, 0, 0, ''),
    (9, 0, 9, 55, NULL, 0, '', 'what is Tarma Installer 5 and its usage?', 1260351321, '203.145.176.177', 0, 0, 0, 0, 253, 0, 0, 0, ''),
    (10, 0, 10, 56, NULL, 0, '', 'what is Tarma ExpertInstall 3 and its usage?', 1260351466, '203.145.176.177', 0, 0, 0, 0, 69, 0, 0, 0, ''),
    (11, 0, 11, 55, NULL, 0, '', 'What is Smart Install Maker and its usage?', 1260351588, '203.145.176.177', 0, 0, 0, 0, 241, 0, 0, 0, ''),
    (14, 0, 14, 55, NULL, 0, '', 'what is MSI Studio and its use?', 1260352667, '203.145.176.177', 0, 0, 0, 0, 174, 0, 0, 0, ''),

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      /[(,]/ is a regular expression.

        You are right, of course!

        CountZero


Re: Advance Regular expression questions
by sundialsvc4 (Abbot) on Jan 06, 2012 at 14:33 UTC

    For tasks like these I often pull out bigger guns, like Parse::RecDescent, which not only does a lot of the regular-expression work for you but also sets the whole thing up in a hierarchical way, and the parser includes a certain amount of built-in “find a way to get there” capability.

    To illustrate this point of view, the file appears to consist, at the highest level, of a list of zero-or-more groups, each enclosed by parentheses and each group separated from the next by one comma.   That is the outermost-level description of the file.   The first “inner” description is to say that each group consists of a list of one or more “tokens” separated by commas, where a “token” is either a number or a quoted-string.

    Where parsers really start to shine, though, is when you might want to express some sort of rule about, say, the structure of (or, the meaning of...) one of those groups, especially if the structure of a group may vary in some way.   A parser takes as its basic input a formal description of (a grammar for...) what a valid file may consist of, not just physically but structurally, and it seeks to match what it is given to whatever the grammar says that it is to expect.   Exactly like finding your way through an unfamiliar city using a map.

    In my experience, messy file-parsing tasks (done without a parser/grammar) can very quickly devolve into “write-only code.”   You might establish that it works correctly now, but you dare not touch it again.   In many shops, exactly such programs can be found in abundance:   mission-critical, and utterly fossilized.
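The hierarchical description in the post above (a list of parenthesized groups, each a comma-separated list of tokens) can be sketched without any CPAN module. The following is not Parse::RecDescent. It is a small hand-written recursive-descent parser in core Perl, and the subroutine names are invented for illustration. It assumes a token is an integer, the bare word NULL (which appears in the sample data), or a single-quoted string in which a doubled quote stands for a literal quote. Backslash escapes are not handled.

```perl
use strict;
use warnings;
use 5.010;

# token: an integer, NULL, or a single-quoted string ('' = escaped quote)
sub parse_token {
    my ($ref) = @_;
    return $1 if $$ref =~ /\G\s*(-?\d+|NULL|'(?:[^']|'')*')/gc;
    return;
}

# group: "(" token ("," token)* ")"
sub parse_group {
    my ($ref) = @_;
    $$ref =~ /\G\s*\(/gc or return;   # no "(" here: not a group
    my @tokens;
    do {
        my $tok = parse_token($ref);
        defined $tok
            or die "expected a token at offset " . (pos($$ref) // 0) . "\n";
        push @tokens, $tok;
    } while ($$ref =~ /\G\s*,/gc);
    $$ref =~ /\G\s*\)/gc
        or die "expected ')' at offset " . (pos($$ref) // 0) . "\n";
    return \@tokens;
}

# file: group ("," group)*, with an optional trailing comma
sub parse_groups {
    my ($text) = @_;
    my @groups;
    while (my $group = parse_group(\$text)) {
        push @groups, $group;
        $text =~ /\G\s*,/gc or last;
    }
    return \@groups;
}

my $groups = parse_groups("(7, 0, 7, 55, NULL, 0, '', 'a, b'), (8, 0, 8, 55, NULL),");
say scalar(@$groups), " groups; first group has ", scalar(@{ $groups->[0] }), " tokens";
```

Each subroutine corresponds to one level of the description, and the `\G` anchor with the `/gc` flags lets every level continue matching where the previous one stopped. Unlike a plain split on commas, a comma inside a quoted string stays part of its token.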
