PerlMonks  


Realistically, how could that be “a bug report”? Think about it: in a multi-byte character encoding, the bytes of one character only make sense as a unit, and in the variable-width schemes the leading bytes determine how the bytes that follow them are to be interpreted. If you are reading the file from stern to stem, well, “either you read them or you didn’t.”

It stands to reason, therefore, that you must be the one to read “a few more bytes than you need” and, having read those bytes, to figure out whether (unlucky you ...) you started reading smack-dab in the middle of a multi-byte (MBCS) sequence or not. There is no bright-line rule for this. The only reliable strategy that I can think of is to rely upon some contextual knowledge about the data stream itself. Find some (non-MBCS) sequence that you know will occur somewhere within the last n characters of the data. Then read n+x bytes (for some x ...) from the tail of the file, and use a regex to search within that data for that reliable sequence. Advance cautiously forward from there.
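For what it's worth, here is a minimal sketch of that tail-read strategy, under some loud assumptions: the file is UTF-32LE (per the thread title), the "reliable sequence" is a newline (U+000A, bytes 0A 00 00 00), and there is no BOM handling. The sub name `tail_lines_utf32le` and the sample data are hypothetical, made up for illustration. Because UTF-32 is fixed-width, the sketch also rejects any regex hit that does not sit on a 4-byte boundary; that check is my addition and would not carry over to a variable-width encoding. If no trustworthy starting point turns up, it dies rather than guessing.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use File::Temp qw(tempfile);

# Hypothetical helper: return the complete lines that follow the first
# trustworthy newline within the last $want bytes of a UTF-32LE file.
# Dies if no trustworthy starting point can be found.
sub tail_lines_utf32le {
    my ($path, $want) = @_;

    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $size = -s $fh;
    die "Size $size is not a multiple of 4; not UTF-32\n" if $size % 4;

    # Read the last $want bytes (or the whole file if it is shorter).
    my $start = $size > $want ? $size - $want : 0;
    my $len   = $size - $start;
    seek $fh, $start, 0 or die "seek failed: $!";
    my $got = read $fh, my $buf, $len;
    die "Short read on $path\n" unless defined $got && $got == $len;
    close $fh;

    my $from;
    if ($start == 0) {
        $from = 0;    # we hold the whole file, so byte 0 is a safe start
    }
    else {
        # Look for U+000A as UTF-32LE bytes. A hit only counts if its
        # absolute file offset is a multiple of 4; otherwise it straddles
        # two characters (e.g. U+0A00 followed by U+0100).
        while ($buf =~ /\x0A\x00\x00\x00/g) {
            my $hit = $-[0];
            next if ($start + $hit) % 4;
            $from = $hit + 4;    # first byte after the newline
            last;
        }
    }
    die "No aligned newline in the last $want bytes; refusing to guess\n"
        unless defined $from;

    # FB_CROAK makes decode() die on malformed data instead of patching it.
    my $tail = substr $buf, $from;
    my $text = decode('UTF-32LE', $tail, Encode::FB_CROAK);
    return split /\n/, $text;
}

# Demo data (made up): the second line contains U+0A00 U+0100, whose bytes
# include a misaligned 0A 00 00 00 that a naive search would trip over.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} encode('UTF-32LE',
    "first line\nsecond \x{0A00}\x{0100} line\nlast line\n");
close $tmp_fh;

# 75 is deliberately not a multiple of 4, so the read starts mid-character.
my @lines = tail_lines_utf32le($tmp_path, 75);
print "$_\n" for @lines;
```

Reading 75 bytes from this 144-byte file starts one byte into a character, so the first regex hit is the bogus one inside the U+0A00/U+0100 pair; the alignment test throws it away and the sub settles on the real newline further along.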

Bear in mind that the onus is upon your application not merely to come up with the right answers if it can, but to fail reliably if it cannot. Your application is the only player with the capability to do this. The fact that the algorithm “produces answers at all” must, itself, be a positive indication that those answers are in fact worthy of trust.


In reply to Re: How do I use the "File::ReadBackwards" and open in "Unicode text, UTF-32, little-endian" mode by sundialsvc4
in thread How do I use the "File::ReadBackwards" and open in "Unicode text, UTF-32, little-endian" mode by hashperl
