PerlMonks  

Re: Faster file read, text search and replace

by NetWallah (Canon)
on Feb 14, 2018 at 04:56 UTC


in reply to Faster file read, text search and replace

An XML file larger than about 500 MB usually indicates a poorly designed application system.

The reason is that XML files are typically deserialized and processed only after being read into memory whole, and at over 500 MB the memory demand reaches the point where resource allocation needs special treatment.

Consider loading the XML file into a database, which can manage memory much better while also providing structured access.

Something like this SQLite UI with an XML plug-in could help.
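
For illustration, here is a minimal sketch of the "load it into a database" idea. It is not the SQLite UI or plug-in mentioned above; it uses only Python's standard library (`xml.etree.ElementTree.iterparse` and `sqlite3`). The `<record>` element, its `id` attribute, its `<body>` child, and the `records` table are all hypothetical names chosen for the example, so adjust them to the real file's structure.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def load_records(xml_source, conn, tag="record"):
    """Stream elements named `tag` from xml_source into a SQLite table.

    The element and column names are assumptions made for this sketch.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS records (id TEXT, body TEXT)")
    count = 0
    # iterparse yields each element as its closing tag is seen, so the
    # document is never held in memory as one fully populated tree.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag != tag:
            continue
        conn.execute(
            "INSERT INTO records (id, body) VALUES (?, ?)",
            (elem.get("id"), elem.findtext("body", default="")),
        )
        count += 1
        # Free the record's contents. The emptied element stays attached
        # to the root, so a truly huge file may also need root pruning.
        elem.clear()
    conn.commit()
    return count


# Tiny in-memory sample standing in for the large file.
SAMPLE = (
    b"<records>"
    b"<record id='1'><body>alpha</body></record>"
    b"<record id='2'><body>beta</body></record>"
    b"</records>"
)

conn = sqlite3.connect(":memory:")
loaded = load_records(io.BytesIO(SAMPLE), conn)
rows = conn.execute("SELECT id, body FROM records ORDER BY id").fetchall()
```

Once the data is in a table, searching and updating become ordinary SQL statements, and the database decides how much to keep in memory.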

                Python is a racist language what with its dependence on white space!


Replies are listed 'Best First'.
Re^2: Faster file read, text search and replace
by Jenda (Abbot) on Feb 14, 2018 at 11:58 UTC

    While I agree about the poorly designed system, reading whole XML files into memory is, more often than not, poor design as well. Whether or not the file is already huge, do not load all of it into a huge maze of interconnected objects unless you have to; process it in chunks instead. XML::Twig or XML::Rules make that fairly easy to do.
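
To make "process it in chunks" concrete, here is a rough sketch of a streaming search and replace. It does not use XML::Twig or XML::Rules; it is a standard-library Python analogue of the same technique, built on `xml.etree.ElementTree.iterparse`. The `<items>`/`<item>` layout and the strings `foo` and `bar` are hypothetical, and the sketch assumes the elements of interest are direct children of the root and that root attributes and other content need not be preserved.

```python
import io
import xml.etree.ElementTree as ET


def replace_in_chunks(source, sink, tag="item", old="foo", new="bar"):
    """Rewrite the text of each `tag` element one element at a time."""
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # the first event is the root's start tag
    sink.write(f"<{root.tag}>".encode())
    changed = 0
    for event, elem in context:
        if event != "end" or elem.tag != tag:
            continue
        if elem.text and old in elem.text:
            elem.text = elem.text.replace(old, new)
            changed += 1
        elem.tail = None  # drop inter-element whitespace for stable output
        sink.write(ET.tostring(elem))
        # Discard children already written so the tree stays small.
        root.clear()
    sink.write(f"</{root.tag}>".encode())
    return changed


# Tiny in-memory sample standing in for the large file.
SAMPLE = b"<items><item>foo one</item><item>two</item><item>foo foo</item></items>"

out = io.BytesIO()
changed = replace_in_chunks(io.BytesIO(SAMPLE), out)
result = out.getvalue()
```

Only one element's worth of data is modified and written at a time, so memory use depends on the size of a single chunk, not on the size of the file.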

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.
