Re^2: Native newline encoding

by salva (Canon)
on May 23, 2012 at 08:28 UTC


in reply to Re: Native newline encoding
in thread Native newline encoding

I am extending Net::SFTP::Server to implement version 4 of the SFTP protocol. That version supports opening files in TEXT mode (similar to FTP), and there are two ways to do it: the first is to convert the native newlines to CRLF before sending them over the network, and the second is to tell the client what the native newline sequence is and let it bear the burden of the conversion.

At this point, it seems to me that the simple solution is the first one: letting Perl read the file in text mode and then applying s/\n/\r\n/. This may be slightly incorrect in some edge cases (for instance, files on Windows with \n line endings), but nobody would care about those, so I don't either!
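
A minimal sketch of that first approach, assuming the whole file fits in memory. The helper name read_as_crlf is hypothetical and not part of Net::SFTP::Server. Because the file is slurped in one go, the substitution needs the /g flag:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Net::SFTP::Server): read a file in
# text mode and return its contents with every "\n" turned into CRLF.
sub read_as_crlf {
    my ($path) = @_;

    # No binmode here: on Windows the default :crlf layer already maps
    # CRLF to "\n"; on Unix the bytes pass through unchanged.
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $data = do { local $/; <$fh> };    # slurp the whole file
    close $fh;

    $data = '' unless defined $data;
    $data =~ s/\n/\r\n/g;                 # native "\n" -> canonical CRLF
    return $data;
}
```

On Unix, a file that already contains CRLF pairs would come out as CR CR LF, which is the mirror image of the Windows edge case mentioned above.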

Re^3: Native newline encoding
by BrowserUk (Patriarch) on May 23, 2012 at 09:16 UTC
    That version supports opening files in TEXT mode (similar to FTP), and there are two ways to do it: the first is to convert the native newlines to CRLF before sending them over the network, and the second is to tell the client what the native newline sequence is and let it bear the burden of the conversion.

    Hm. My reading of the appropriate RFC is slightly different, in that the server can choose whether to send CRLF or a single-character line ending of its choice.

    And it is down to the clients to convert whatever the server sends to their required local form.

    At this point, it seems to me that the simple solution is the first one: letting Perl read the file in text mode and then applying s/\n/\r\n/. This may be slightly incorrect in some edge cases (for instance, files on Windows with \n line endings), but nobody would care about those, so I don't either!

    I whole-heartedly agree, though I would approach that solution in a slightly different manner.

    When TEXT mode is requested:

    1. Open the file in text mode;
    2. Read the file line-by-line using the default input record separator ($/);
    3. chomp each line read;
    4. Write to the socket line-by-line, having set the output record separator ($\) to CRLF.

    This way, whatever the local line separator is, it gets taken care of by Perl (or the CRT if you're using XS), and the data is transmitted with the required 'canonical newlines'.
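
The four steps above might look roughly like this. It is a sketch only: send_text_file is a hypothetical name, and $out stands for whatever already-open handle (socket or otherwise) the server writes to.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $path to the already-open handle $out,
# one line at a time, ending every line with CRLF.
sub send_text_file {
    my ($path, $out) = @_;

    open my $in, '<', $path or die "Cannot open $path: $!";   # text mode
    binmode $out;               # stop any :crlf layer from altering our CRLF
    local $\ = "\r\n";          # output record separator, appended by print

    while (my $line = <$in>) {  # $/ stays at its default, "\n"
        chomp $line;
        print {$out} $line;
    }
    close $in;
    return;
}
```

One consequence of this design: a final line with no trailing newline is still sent with a CRLF appended.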

    Clients then do the same in reverse: read from the socket line-by-line with the input record separator ($/) set to CRLF; chomp; and write line-by-line in text mode so that each line gets the default line ending for their local platform.
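
A sketch of the client side under the same assumptions. receive_text_file is a hypothetical name and $in is an already-open handle carrying CRLF-terminated lines. Since Perl's $\ is undefined by default, this version prints an explicit "\n" and leaves the native translation to the text-mode output layer:

```perl
use strict;
use warnings;

# Hypothetical helper: read CRLF-terminated lines from $in and write
# them to $path with the local platform's line ending.
sub receive_text_file {
    my ($in, $path) = @_;

    open my $out, '>', $path or die "Cannot open $path: $!";  # text mode
    binmode $in;                # see the raw CR and LF bytes
    local $/ = "\r\n";          # input record separator; chomp strips it

    while (my $line = <$in>) {
        chomp $line;
        print {$out} $line, "\n";   # text mode maps "\n" to the native newline
    }
    close $out or die "Cannot close $path: $!";
    return;
}
```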

    This way, the conversions are taken care of at both ends by Perl or the CRT. At least, for ASCII/ANSI/ISO-whatever-that-number-is files that have the 'correct' newlines on the originating platforms.

    Things (will) get far more messy once the RFCs start dealing with Unicrap.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      Things (will) get far more messy once the RFCs start dealing with Unicrap.

      The RFCs have handled binary data for years, and no one batted an eye.

      -sauoq
      "My two cents aren't worth a dime.";

        So, you consider this "binary":

        C:\test>od -t x1 Huawei.xml | head
        0000000 0d 0a 0d 0a 0d 0a 0d 0a 3c 21 44 4f 43 54 59 50
        0000020 45 20 48 54 4d 4c 20 50 55 42 4c 49 43 20 22 2d
        0000040 2f 2f 57 33 43 2f 2f 44 54 44 20 48 54 4d 4c 20
        0000060 34 2e 30 31 20 54 72 61 6e 73 69 74 69 6f 6e 61
        0000100 6c 2f 2f 45 4e 22 20 22 68 74 74 70 3a 2f 2f 77
        0000120 77 77 2e 77 33 2e 6f 72 67 2f 54 52 2f 68 74 6d
        0000140 6c 34 2f 6c 6f 6f 73 65 2e 64 74 64 22 3e 0d 0a
        0000160 3c 68 74 6d 6c 3e 0d 0a 3c 68 65 61 64 3e 0d 0a
        0000200 3c 74 69 74 6c 65 3e e8 8f af e7 82 ba ef bc 8c
        0000220 e8 8f af e7 82 ba e5 85 ac e5 8f b8 ef bc 8c e8
        ...

        Source


