PerlMonks  

Re^2: help with regular expression required

by PitifulProgrammer (Acolyte)
on Aug 13, 2014 at 13:12 UTC


in reply to Re: help with regular expression required
in thread help with regular expression required

Thank you choroba and also many thanks to AppleFritter

I do not have to change the structure of the file; it is just a matter of replacing some strings. Chances are this will have to be put into a 'proper' script at some stage (I will surely try out the XPath variant).

As for the file, I can post a small snippet, which I had to modify: I changed the content, but the structure is the same. Sorry for not posting the snippet initially.

<tu creationdate="12345Z" changedate="12345" creationid="John Doe" changeid="John DOE" srclang="en-US">
  <prop type="user-defined">server:xyz</prop>
  <prop type="user-defined">quality:100</prop>
  <prop type="user-defined">storing mode:1</prop>
  <prop type="user-defined">uncontrolled:0</prop>
  <prop type="user-defined">lastuser:unknown</prop>
  <prop type="user-defined">context left:1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20</prop>
  <prop type="user-defined">context right:21;22;23;24;25;26;27;28;29;30;31;32;33;34;35;36;37;38;39;40</prop>
  <prop type="user-defined">a lot of junk with letters and &quot;90365AF9&quot</prop>
  <prop type="Guid::xxx">123456789</prop>
  <prop type="Att::xxx">name_of_someone</prop>^M
  <tuv xml:lang="en-US"><seg>a lot of junk with letters and &quot;90365AF9&quot<bpt i="1"></bpt>text<ept i="1">&lt;/cf&gt;</ept></seg></tuv>
  <tuv xml:lang="en-US"><seg>a lot of junk with letters and &quot;90365AF9&quot<bpt i="1"></ept></seg></tuv>
</tu>

Thanks a mil for helping me out. Really helps beginners to move on. Looking forward to your replies.

Kind regards C.

Replies are listed 'Best First'.
Re^3: help with regular expression required
by AppleFritter (Vicar) on Aug 13, 2014 at 16:36 UTC

    Thanks for sharing this piece of data -- however, your oneliner is still working just fine there for me. I understand that you may not be in a position to share confidential data, but on the other hand, I'm sure you'll understand it's a bit difficult to diagnose a problem without a test case showing the problem.

    I second what other monks have suggested: if regular expressions aren't working here for whatever reason, a proper XML wrangling module may be the way to go. And XML::Twig is specifically intended for working with very large XML files that don't fit into memory, so that's my recommendation as well.
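For anyone who wants to try the XML::Twig route, here is a minimal sketch of what a streaming string replacement might look like. It is not code from this thread: the helper name replace_in_props, the choice to touch only <prop> elements, and the sample strings are all assumptions made for illustration. It also uses its own tiny, well-formed input, because a strict parser would likely reject the snippet exactly as posted (a bare &quot without a semicolon, and a <bpt> closed by </ept>).

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: replace every literal occurrence of $from with $to
# in the text of each <prop> element, leaving all other elements untouched.
sub replace_in_props {
    my ($xml, $from, $to) = @_;

    # Collect the output in a string via an in-memory filehandle.
    my $out = '';
    open my $fh, '>', \$out or die "Cannot open in-memory handle: $!";

    my $twig = XML::Twig->new(
        twig_handlers => {
            # Called each time a <prop> element has been fully parsed.
            prop => sub {
                my ($t, $prop) = @_;
                my $text = $prop->text;
                # \Q...\E makes $from match literally, not as a regex.
                if ($text =~ s/\Q$from\E/$to/g) {
                    $prop->set_text($text);
                }
                # Print what has been parsed so far and release its memory;
                # this is what keeps large files from being held in full.
                $t->flush($fh);
            },
        },
    );

    $twig->parse($xml);
    $twig->flush($fh);    # print whatever remains after the last <prop>
    close $fh;
    return $out;
}

# Usage with made-up data:
my $sample = '<tu><prop type="user-defined">server:xyz</prop>'
           . '<seg>server:xyz</seg></tu>';
print replace_in_props($sample, 'server:xyz', 'server:abc'), "\n";
```

For a real file, parsefile with an output filehandle opened on a new file would be the natural substitute for the in-memory string used here.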
