http://www.perlmonks.org?node_id=1209102


in reply to Faster file read, text search and replace

An XML file larger than ~500 MB is indicative of a poorly designed application.

The reason is that XML files are typically parsed after being read entirely into memory, and at over 500 MB the memory demands of the resulting object tree start to enter the region where they need special treatment for resource allocation.

Consider loading the XML file into a database that can manage memory much better, while providing structured access.

Something like this SQLite UI with an XML plug-in could help.
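
As a rough sketch of the idea (the <record> element, its id attribute, and the table layout are assumptions for illustration, not anything from the original post), you could stream the file into SQLite with XML::Twig and DBI:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;
    use XML::Twig;

    # Assumed shape of the data: repeated <record id="..."> elements.
    my $dbh = DBI->connect( 'dbi:SQLite:dbname=records.db', '', '',
        { RaiseError => 1, AutoCommit => 0 } );
    $dbh->do('CREATE TABLE IF NOT EXISTS records (id TEXT, body TEXT)');
    my $ins = $dbh->prepare('INSERT INTO records (id, body) VALUES (?, ?)');

    XML::Twig->new(
        twig_handlers => {
            record => sub {
                my ( $t, $elt ) = @_;
                $ins->execute( $elt->att('id'), $elt->text );
                $t->purge;    # free each record once it is stored
            },
        },
    )->parsefile('huge.xml');

    $dbh->commit;
    $dbh->disconnect;

After that, the search-and-replace becomes an UPDATE with a WHERE clause instead of a scan of the whole file.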

                Python is a racist language what with its dependence on white space!


Re^2: Faster file read, text search and replace
by Jenda (Abbot) on Feb 14, 2018 at 11:58 UTC

    While I agree about the poorly designed system, reading a whole XML file into memory is more often than not poor design as well. Whether the file is huge (already) or not, if you do not have to, do not load the whole file into a huge maze of interconnected objects; process it in chunks instead. XML::Twig or XML::Rules make that fairly easy to do, as in the sketch below.
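    As a minimal sketch (the <item> element name and the search/replace strings are made up for illustration), XML::Twig's twig_roots mode lets you rewrite one element at a time while copying the rest of the file through untouched:

        use strict;
        use warnings;
        use XML::Twig;

        # Only one <item> subtree is ever held in memory at a time.
        XML::Twig->new(
            twig_roots => {
                item => sub {
                    my ( $t, $elt ) = @_;
                    ( my $text = $elt->text ) =~ s/oldstring/newstring/g;
                    $elt->set_text($text);
                    $elt->print;    # emit the rewritten element
                    $t->purge;      # then release it
                },
            },
            twig_print_outside_roots => 1,    # pass everything else through
        )->parsefile('huge.xml');

        # run as: perl replace.pl > fixed.xml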

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.