That version supports opening files in TEXT mode (similar to FTP), and there are two ways to do it: the first is to convert the native newlines to CRLF before sending them over the network; the second is to tell the client what the native newline sequence is and let it carry the burden of the conversion.
Hm. My reading of the appropriate RFC is slightly different, in that the server can choose whether to send CRLF or a single-character line ending of its choice:
4.3 Determining Server Newline Convention
In order to correctly process text files in a cross platform
compatible way, the newline convention must be converted from that of
the server to that of the client, or, during an upload, from that of
the client to that of the server.
Versions 3 and prior of this protocol made no provisions for
processing text files. Many clients implemented some sort of
conversion algorithm, but without either a 'canonical' on the wire
format or knowledge of the servers newline convention, correct
conversion was not always possible.
Starting with Version 4, the SSH_FXF_TEXT file open flag (Section
6.3) makes it possible to request that the server translate a file to
a 'canonical' on the wire format. This format uses \r\n as the line
separator.
Servers for systems using multiple newline characters (for example,
Mac OS X or VMS) or systems using counted records, MUST translate to
the canonical form.
However, to ease the burden of implementation on servers that use a
single, simple separator sequence, the following extension allows the
canonical format to be changed.
string "newline"
string new-canonical-separator (usually "\r" or "\n" or "\r\n")
All clients MUST support this extension.
When processing text files, clients SHOULD NOT translate any
character or sequence that is not an exact match of the servers
newline separator.
In particular, if the newline sequence being used is the canonical
"\r\n" sequence, a lone \r or a lone \n SHOULD be written through
without change.
And it is down to the clients to convert whatever the server sends to their required local form.
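To make the quoted rule concrete, here is a minimal sketch of what a client might do with the separator announced by the "newline" extension. The helper name to_local_newlines and its arguments are my own invention for illustration, not part of any SFTP module, and it assumes the whole text is already in one buffer (a separator split across two reads would need extra handling).

```perl
use strict;
use warnings;

# Hypothetical helper: convert a complete buffer from the server's
# announced newline sequence to the client's local one. Only exact
# matches of $server_nl are translated; anything else passes through.
sub to_local_newlines {
    my ($data, $server_nl, $local_nl) = @_;
    $data =~ s/\Q$server_nl\E/$local_nl/g;   # \Q..\E: treat the separator literally
    return $data;
}

# Server uses the canonical "\r\n"; the lone \r and lone \n are left alone.
my $converted = to_local_newlines("one\r\ntwo\rthree\nfour\r\n", "\r\n", "\n");
# $converted is now "one\ntwo\rthree\nfour\n"
```

Note that the lone \r after "two" and the lone \n after "three" survive untouched, which is what the "SHOULD be written through without change" sentence asks for.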
At this point, it seems to me that the simple solution is the first one: letting Perl read the file in text mode and then applying s/\n/\r\n/. This may be slightly incorrect in some edge cases (for instance, files on Windows with \n line endings), but nobody would care about those, so I don't either!
I whole-heartedly agree, though I would approach that solution in a slightly different manner.
When TEXT mode is requested:
- Open the file in text mode;
- Read the file line-by-line using the system default INPUT_SEPARATOR;
- chomp each line read;
- Write to the socket line-by-line; having set the OUTPUT_SEPARATOR to CRLF;
This way, whatever the local line separator is, it gets taken care of by Perl (or the CRT if you're using XS). And the data is transmitted with the required 'canonical newlines'.
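The steps above might look something like the following. This is only a sketch: send_text_canonical is a made-up name, and in-memory filehandles stand in for the local file and the socket so it can be run without a network. In Perl the separators are $/ (input) and $\ (output).

```perl
use strict;
use warnings;

# Hypothetical helper: read lines from a text-mode handle and write them
# to $out (standing in for the socket) terminated by CRLF.
sub send_text_canonical {
    my ($in, $out) = @_;
    binmode $out;              # no newline translation on the wire side
    local $\ = "\r\n";         # output record separator, appended by every print
    while (my $line = <$in>) { # $/ is still the default "\n"
        chomp $line;
        print {$out} $line;
    }
    return;
}

my $local_text = "alpha\nbeta\n";
open my $src, '<', \$local_text or die "open: $!";
my $wire = '';
open my $dst, '>', \$wire or die "open: $!";
send_text_canonical($src, $dst);
close $dst;
# $wire is now "alpha\r\nbeta\r\n"
```

One thing to be aware of: a final line with no trailing newline still gets a CRLF appended, so the transmitted data ends up one line terminator longer than the file.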
Clients then do the same in reverse. Read from the socket line-by-line having set their INPUT_SEPARATOR to CRLF; chomp; and write line-by-line using the default OUTPUT_SEPARATOR for their local platform.
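And a matching sketch for the receiving end, under the same assumptions (receive_text_canonical is a hypothetical name; in-memory handles replace the socket and the local file). Perl's $\ is actually undefined by default, so the sketch sets it to "\n" and relies on a text-mode output handle to map that to the platform's own line ending.

```perl
use strict;
use warnings;

# Hypothetical helper: read CRLF-terminated lines from $in (standing in
# for the socket) and write them to $out with logical "\n" endings.
sub receive_text_canonical {
    my ($in, $out) = @_;
    binmode $in;               # see the raw bytes from the wire
    local $/ = "\r\n";         # input record separator: canonical CRLF
    local $\ = "\n";           # a text-mode handle maps this to the local form
    while (my $line = <$in>) {
        chomp $line;           # chomp removes a trailing $/, i.e. "\r\n"
        print {$out} $line;
    }
    return;
}

my $from_wire = "alpha\r\nbeta\r\n";
open my $sock, '<', \$from_wire or die "open: $!";
my $saved = '';
open my $file, '>', \$saved or die "open: $!";
receive_text_canonical($sock, $file);
close $file;
# $saved is now "alpha\nbeta\n"
```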
This way, the conversions are taken care of at both ends by Perl or the CRT. At least, for ASCII/ANSI/ISO-whatever-that-number-is files that have the 'correct' newlines on the originating platforms.
Things (will) get far more messy once the RFCs start dealing with Unicrap.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
C:\test>od -t x1 Huawei.xml | head
0000000 0d 0a 0d 0a 0d 0a 0d 0a 3c 21 44 4f 43 54 59 50
0000020 45 20 48 54 4d 4c 20 50 55 42 4c 49 43 20 22 2d
0000040 2f 2f 57 33 43 2f 2f 44 54 44 20 48 54 4d 4c 20
0000060 34 2e 30 31 20 54 72 61 6e 73 69 74 69 6f 6e 61
0000100 6c 2f 2f 45 4e 22 20 22 68 74 74 70 3a 2f 2f 77
0000120 77 77 2e 77 33 2e 6f 72 67 2f 54 52 2f 68 74 6d
0000140 6c 34 2f 6c 6f 6f 73 65 2e 64 74 64 22 3e 0d 0a
0000160 3c 68 74 6d 6c 3e 0d 0a 3c 68 65 61 64 3e 0d 0a
0000200 3c 74 69 74 6c 65 3e e8 8f af e7 82 ba ef bc 8c
0000220 e8 8f af e7 82 ba e5 85 ac e5 8f b8 ef bc 8c e8
...