PerlMonks  

Re: Why are there no errors when opening a filename that contains colons on Win10

by rjt (Curate)
on Oct 09, 2019 at 20:25 UTC (#11107266)


in reply to Why are there no errors when opening a filename that contains colons on Win10

You asked Windows to create a file, and gave it a name. Windows said, "OK, no problem!", and you got a successful return from open, but behind the scenes, Windows munged the filename. That's why you don't get an error, and Perl probably can't be reasonably expected to catch this sort of thing, as it runs on over a hundred different platforms, and does not guarantee filename in = filename out.

The colon is of course the volume separator on Win32, so while it's not a valid Win32 filename character, it is a valid path character supplied to open, (just as you might open C:\Windows\Crash). It of course isn't doing what you expect, and then I don't see how a "volume" of testlog_8- makes any sense. Chopping off the : and anything that follows, and saving a file with what would be the volume name makes even less sense, but it is what it is, I guess. DOS is, well, a bit different. :-)

Win32 naming is, well, a bit convoluted. This MSDN naming guide article illustrates the rules.

I'm not sure why you'd end up with random characters in your output logfile when you are generating the filename, but the solution is probably one of either a) making sure your filename generator function doesn't use those characters, or b) filtering out invalid characters before the call to open (or erroring out on invalid characters, perhaps).
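To make option (b) concrete, here is a minimal sketch of the "error out on invalid characters" variant. The helper name checked_filename and the whitelist (ASCII letters, digits, underscore, dot, dash) are my own assumptions for illustration, not anything Perl or Windows prescribes; tighten or loosen the character class to suit your platforms.

```perl
use strict;
use warnings;

# Hypothetical helper: return the name unchanged if it only contains
# whitelisted characters, otherwise die before open() ever sees it.
sub checked_filename {
    my ($name) = @_;
    die "Empty filename\n" unless defined $name && length $name;
    if ($name =~ /([^A-Za-z0-9_.-])/) {
        die sprintf "Invalid character '%s' in filename '%s'\n", $1, $name;
    }
    return $name;
}

for my $candidate ('testlog_8.log', 'foo:bar.txt') {
    my $ok = eval { checked_filename($candidate) };
    print defined $ok ? "accepted: $ok\n" : "rejected: $@";
}
```

Because the dot is allowed here, names such as '..' would still pass; a real validator would probably want to reject those separately.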

If you need cross-platform support, it's a little trickier, but you can also just be extra-picky and allow only alphanumerics, underscores, and dashes, for example. I'm not aware of a module that portably processes filenames; otherwise that's what I'd recommend. In practice, a simple regex usually does the trick: $filename =~ s/[^\w-]//g; (note that this also strips dots, so apply it before appending an extension), but see File::Spec for some help with volume and path components, if need be. Mind the encoding.
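As a rough illustration of how the two pieces might fit together, the sketch below uses File::Spec::Win32 (which can be loaded on any platform) to split a path into volume, directory, and file parts, then applies the regex above to the file part only. The helper name strip_unsafe is made up for this example.

```perl
use strict;
use warnings;
use File::Spec::Win32;    # Win32 path rules, usable on any platform

# Hypothetical helper wrapping the regex from the post.
sub strip_unsafe {
    my ($component) = @_;
    (my $clean = $component) =~ s/[^\w-]//g;
    return $clean;
}

for my $path ('C:\Windows\Crash', 'foo:bar.txt') {
    my ($volume, $dirs, $file) = File::Spec::Win32->splitpath($path);
    printf "path=%s volume=[%s] dirs=[%s] file=[%s] cleaned=[%s]\n",
        $path, $volume, $dirs, $file, strip_unsafe($file);
}
```

Two things to watch for when you run it: splitpath only reports a volume for a drive-letter or UNC prefix, so it will not flag a stray colon for you, and the regex removes the dot before the extension along with everything else it dislikes.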

use strict; use warnings; omitted for brevity.

Replies are listed 'Best First'.
Re^2: Why are there no errors when opening a filename that contains colons on Win10
by Lotus1 (Vicar) on Oct 09, 2019 at 20:44 UTC

    rjt, thanks for responding.

    The colon is of course the volume separator on Win32, so while it's not a valid Win32 filename character, it is a valid path character supplied to open, (just as you might open C:\Windows\Crash). It of course isn't doing what you expect, and then I don't see how a "volume" of testlog_8- makes any sense. Chopping off the : and anything that follows, and saving a file with what would be the volume name makes even less sense, but it is what it is, I guess. DOS is, well, a bit different. :-)

    Have a look at the output from test characters 4 and 5. Those were '/' and '\'. For those two the error given was: No such file or directory : The system cannot find the path specified. It was trying to create a file called '---.log' in a folder that did not exist. If the colon is a valid path character on Windows I would expect the same thing to happen.

    I'm not sure why you'd end up with random characters in your output logfile when you are generating the filename,[...],

    I don't follow what you mean about random characters in the output files. The print statement wrote to the files that were created successfully except for test character 8, the colon. In that case the file is empty. It seems to be a valid filehandle but the print didn't write to the file and the print seemed to succeed since it returned a true value.

      Exactly, that goes back to what I'm saying: every operating system has its own rules for valid pathnames, and its own unique semantics for what happens when you step outside of those rules. Perl, for the most part, respects those semantics, leaving it to the programmer to decide how best to handle them. If you ask to open a file, Perl passes that request along to the OS, and the return value you get is a function of what the OS itself returns. The errors for your other cases are coming from Windows, not Perl.
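      A small, portable way to see this for yourself is to trigger a failure on purpose and look at what comes back. This is only a sketch: the directory name no_such_subdir is invented, and the temp directory is there just so the script cleans up after itself. The error number and message in $! are whatever the OS reported.

```perl
use strict;
use warnings;
use Errno qw(ENOENT);
use File::Spec;
use File::Temp qw(tempdir);

# A path whose parent directory does not exist (name is illustrative).
my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, 'no_such_subdir', 'out.log');

my ($opened, $errno, $errstr) = (0, 0, '');
if (open(my $fh, '>', $path)) {
    $opened = 1;
    close $fh;
} else {
    # Capture $! right away, before any other call can overwrite it.
    ($errno, $errstr) = ($! + 0, "$!");
}

print $opened ? "open succeeded\n" : "open failed: $errstr (errno $errno)\n";
print "The OS reported ENOENT\n" if $errno == ENOENT;
```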

      I completely agree this Win32 behavior is complex and weird in spots, but it's Win32, not Perl, that is giving you this behavior.

      I don't follow what you mean about random characters in the output files. The print statement wrote to the files that were created successfully except for test character 8, the colon. In that case the file is empty. It seems to be a valid filehandle but the print didn't write to the file and the print seemed to succeed since it returned a true value.

      See the Alternate Data Streams discussion in the MSDN article I linked. And add that to the list of surprises in support of validating your filenames! If you read back the file with the same script, you will actually get the contents back, even though foo appears to exist but is empty (and type foo outputs nothing):

      use autodie;
      use feature 'say';

      my $filename = 'foo:bar.txt';

      if ($ARGV[0]) {
          say "Skipping write.";
      } else {
          say "Writing $filename. Run $0 -skip to skip writing.";
          open my $fh, '>', $filename;
          say $fh 'Test text';
          close $fh;
      }

      open my $read, '<', $filename;
      print "$filename: " . <$read>;
      close $read;

      __END__
      Test text

      This "works" because we're opening the Alternate Data Stream "bar.txt" of the file "foo". If you delete "foo", then "foo:bar.txt" will no longer be readable either.

      And this nicely highlights the general issue: Perl doesn't know or care what you are opening, only whether it succeeds. Perl trusts you to know what you are asking the OS to do. Similarly, if the Windows system call confirms 12 bytes were written, Perl will take Windows' word for it. It's up to you to verify the bytes written if desired, which isn't always possible (e.g., when writing to sockets). Knowing the semantics of your target platform(s) is incumbent on you whenever you are dealing with system-level code.
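      If you do want to verify a write to a regular file, one possible approach is to check the return value of every call and then read the data back. The sub name write_and_verify is my own invention for this sketch, and the temp directory is only there to keep the example tidy.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: write $text to $path, then read it back and compare.
sub write_and_verify {
    my ($path, $text) = @_;

    open my $out, '>:raw', $path or die "open '$path' for write: $!";
    print {$out} $text           or die "print to '$path': $!";
    close $out                   or die "close '$path': $!";

    open my $in, '<:raw', $path  or die "open '$path' for read: $!";
    my $got = do { local $/; <$in> };    # slurp the whole file
    close $in;

    return defined $got && $got eq $text;
}

my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, 'verify.log');
print write_and_verify($path, "Test text\n") ? "verified\n" : "mismatch\n";
```

      Keep in mind that this only checks the write through the same name you opened. As the foo:bar.txt example shows, a read-back like this would also succeed for an Alternate Data Stream, so it is no substitute for validating the filename first.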

      Lastly, I do understand that this is experimental for you, and you have uncovered some interesting Win32 behavior. I hope these insights are helpful in answering your question.

      use strict; use warnings; omitted for brevity.

        rjt, thanks for the follow-up. I misunderstood a few things you said in your posts. The random characters you were referring to were characters in the filename. I only put a random character in my filename to test my logic in case there was a problem with opening a file. Also, on rereading your post, I noticed you were talking about the colon being the volume separator, not a path separator.

        Thanks for the link to File::Spec. It looks very useful for the kind of backend server work I do.
