PerlMonks  

When can the character length of a Perl Scalar become an issue?

by misterperl (Pilgrim)
on Sep 20, 2023 at 12:53 UTC (#11154543)

misterperl has asked for the wisdom of the Perl Monks concerning the following question:

I'm creating an XML array with Perl. One line of the XML can potentially contain many characters. Generally it's not an issue. But sometimes the line gets truncated at some arbitrary point, like:

<tag>many characters ends abruptly here in the middl</tag>

I save the line in a scalar array element, then print the entire array. The result has the begin and end tag, but the line in the middle is truncated. In the last fail, it truncated at 23,052 characters which didn't seem like a particularly significant number. In other cases the line greatly exceeds that length, without truncation.

Ideas about why this happens on some lines and not others, or better ways to save these long lines, are appreciated. Maybe more than one tag? I studied maximum XML tag-content restrictions, as well as Perl scalar length limits, and I don't see any specific limits.

Replies are listed 'Best First'.
Re: When can the character length of a Perl Scalar become an issue?
by philipbailey (Curate) on Sep 20, 2023 at 13:09 UTC

    According to perldata, "a scalar is a single string (of any size, limited only by the available memory), number, or a reference to something". So the length of your XML element is not constrained by any arbitrary limit in Perl's scalar variables.

    The string truncation must therefore be due to some other cause. As suggested by marto, you will likely need to show us code to get further help.
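    As a quick sanity check of the perldata statement, here is a minimal sketch that builds a line far longer than the 23,052 characters mentioned in the question, stores it in an array element, and joins the array. The names `$body`, `@xml`, and `$doc`, and the `<root>` wrapper, are made up for illustration; the OP's real code has not been shown.

```perl
use strict;
use warnings;

# Build a line well past the 23,052 characters reported in the question.
my $body = 'x' x 1_000_000;
my $line = "<tag>$body</tag>";

# Store it in an array element, as the question describes, then join the array.
# (The <root> wrapper is an assumption, only there to make the array non-trivial.)
my @xml = ('<root>', $line, '</root>');
my $doc = join "\n", @xml;

printf "element: %d chars, document: %d chars\n", length($xml[1]), length($doc);
```

    If both lengths come out as expected, the scalar and the array are not where the characters go missing.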

      Dear Philip Bailey

      Better be prepared ... soon eyepops will be showing up and taking bets on your sanctification.

      Only 30 points left to go, but you should post more often than every 2 months on average! ;)

      Cheers Rolf
      (addicted to the 𐍀𐌴𐍂𐌻 Programming Language :)
      Wikisyntax for the Monastery

      PS: I like your work, especially "Easy Lover" with Phil Collins... ;-P

        Yes, it's been a slow climb to (nearly) sainthood, and I probably should post more!

        Thanks for remembering the singing career of my namesake!

Re: When can the character length of a Perl Scalar become an issue?
by marto (Cardinal) on Sep 20, 2023 at 13:02 UTC

    No SSCCE?

    Update: example:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Mojo::DOM;
    use feature 'say';

    my $derp = 'x' x 30000;
    my $dom = Mojo::DOM->new('<?xml version="1.0"?><herp><derp>Test</derp></herp>');
    say length ( $dom->at( 'derp' )->text );
    $dom->at( 'derp' )->content( $derp );
    say length ( $dom->at( 'derp' )->text );
    #say $dom;

      I'm not sure what this tells me? It all ran fine. As I said, sometimes it truncates, sometimes not.

      Enter h or 'h h' for help, or 'man perldebug' for more help.

      main::(./y.pl:6): my $derp = 'x' x 30000;
        DB<1> n
      main::(./y.pl:7): my $dom = Mojo::DOM->new('<?xml version="1.0"?><herp><derp>Test</derp>
      main::(./y.pl:8): +</herp>');
        DB<1>
      main::(./y.pl:9): say length ( $dom->at( 'derp' )->text );
        DB<1>
      4
      main::(./y.pl:10): $dom->at( 'derp' )->content( $derp );
        DB<1>
      main::(./y.pl:11): say length ( $dom->at( 'derp' )->text );
        DB<1>
      30000
      main::(./y.pl:12): say $dom;
        DB<1>
      <?xml version="1.0"?><herp><derp>xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx... (30,000 printed total)</derp>
      +</herp>
      Debugged program terminated. Use q to quit or R to restart,
      use o inhibit_exit to avoid stopping after program termination,
      h q, h R or h o to get additional info.
        DB<1>

        Purely an SSCCE to illustrate adding 30K characters to an XML node. You're likely doing something in the code you're not showing us that causes the problem.

Re: When can the character length of a Perl Scalar become an issue?
by Polyglot (Chaplain) on Sep 21, 2023 at 04:58 UTC
    There are a few times in my past experience where this sort of seemingly random thing has happened to me. Usually it was that I had not properly closed the file I had been writing to, or that the write buffer had not, for whatever reason, entirely been emptied--the latter problem of which was alleviated by turning the print handle "hot":

    $| = 1;

    (Place that somewhere prior to your print task--typically near the top of the script for me.)

    Blessings,

    ~Polyglot~

      Usually it was that I had not properly closed the file I had been writing to, or that the write buffer had not, for whatever reason, entirely been emptied

      Perl flushes buffers when closing a file and closes files when exiting, so this is almost certainly not the problem, especially since the OP says that the closing tag still gets written after the truncation.
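      To illustrate the point about close, here is a minimal sketch that writes an opening tag, a long body, and a closing tag to a file, closes it, and reads the result back. The scratch path and the variable names are assumptions for illustration, since the OP's output code is not shown. Everything printed to one handle goes through the same buffer in order, so buffering alone would not drop the middle of a line while still writing the closing tag.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch location; the real script's output path is not shown.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/out.xml";

my $body = 'x' x 30_000;

open my $out, '>', $path or die "Cannot open $path: $!";
print {$out} '<tag>', $body, '</tag>';        # all three pieces pass through one buffer, in order
close $out or die "Cannot close $path: $!";   # close flushes whatever is still buffered

# Read the file back and compare it with what was written.
open my $in, '<', $path or die "Cannot open $path: $!";
my $read = do { local $/; <$in> };
close $in;

print length($read) == length("<tag>$body</tag>") ? "complete\n" : "truncated\n";
```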

      the latter problem of which was alleviated by turning the print handle "hot": $| = 1;

      This won't help not only due to what I said above, but also because $| refers to the currently selected filehandle, so STDOUT by default; it won't do anything for code like OP showed. Even though it won't help here, for completeness, $filehandle->autoflush(1) is one way of controlling it.
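      A minimal sketch of the difference, assuming a lexical filehandle opened on a scratch file (the path and variable names are made up for illustration): setting $| with STDOUT selected leaves the other handle buffered, while autoflush on the handle itself pushes the data out.

```perl
use strict;
use warnings;
use IO::Handle;                 # provides the autoflush method on older perls
use File::Temp qw(tempdir);

# Hypothetical scratch file standing in for the real XML output file.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/out.xml";
open my $fh, '>', $path or die "Cannot open $path: $!";

$| = 1;                         # applies to the selected handle (STDOUT here), not to $fh
print {$fh} 'x' x 100;
my $size_before = (stat $path)[7];   # normally 0: the data is still in $fh's buffer

$fh->autoflush(1);              # enables autoflush on $fh and flushes pending data
print {$fh} 'y' x 100;
my $size_after = (stat $path)[7];    # both writes have now reached the file

close $fh or die "Cannot close $path: $!";
print "before autoflush: $size_before bytes, after: $size_after bytes\n";
```

      Either way the file ends up complete after close, which is why neither setting explains a line truncated in the middle.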

        This won't help not only due to what I said above, but also because $| refers to the currently selected filehandle, so STDOUT by default; it won't do anything for code like OP showed. Even though it won't help here, for completeness, $filehandle->autoflush(1) is one way of controlling it.

        Yes. I'm always saddened to see unfortunate old Perl globals, such as $|, still being used today.

        For more detail on this topic see: Re: what is the meaning of $| in perl? (Buffering/autoflush/Unicode/UTF-8 References)

Node Type: perlquestion [id://11154543]
Approved by philipbailey
Front-paged by Corion