
Re: RFC - File::Util 4.x Series Pre-Release

by vsespb (Chaplain)
on Jan 30, 2013 at 23:43 UTC


in reply to RFC - File::Util 4.x Series Pre-Release

Some comments:


1. Regarding filenames like :a:b:c:d:e:f:g.txt, I think Mac
OS Classic support was dropped from Perl!



2. It's very sad that you target 5.006 and seem
not to support Unicode filename processing. Or do you?



3. A bigger problem with cross-platform filenames
is case insensitivity and Unicode normalization (on Mac OS X). You need Unicode for this.
It would be great to implement something that converts filenames to a canonical form with this in mind.
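
A minimal sketch of what such a canonical form could look like, assuming the filename has already been decoded from bytes into characters. `canonical_name_key` is a hypothetical helper, not part of File::Util; `Unicode::Normalize` ships with Perl 5.8 and later, and `lc` stands in for full case folding here (`fc` is only available from Perl 5.16).

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

# Hypothetical helper: build a comparison key for a decoded filename.
sub canonical_name_key {
    my ($name) = @_;
    # lc is a simple stand-in for case folding.
    # NFC composes sequences such as "e" + U+0301 into U+00E9.
    return NFC(lc $name);
}

my $composed   = "Caf\x{E9}.txt";     # precomposed e-acute
my $decomposed = "CAFE\x{301}.TXT";   # "E" followed by a combining acute accent

print canonical_name_key($composed) eq canonical_name_key($decomposed)
    ? "same name on a case-insensitive, normalizing filesystem\n"
    : "different names\n";
```

The key is only meant for comparing names; the original spelling should still be used when actually opening the file.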

4.
> escape_filename
> Illegal characters (i.e.- any type of newline character, tab, vtab, and the following / | * " ? < : > \),

It's not really clear to me what kind of escaping you mean and which characters are illegal. Escaping for a shell command line? Which shell, which OS?



5. file_type, existent, can_write, etc.
I don't see any good reason to create wrappers around Perl's -X operators.


6. Returns alphabetically sorted all file names in the directory specified if it exists
Do you mean ASCII-sorted? To sort alphabetically you need Unicode AND you need to know which locale to use.


7.
> list_dir
> Recurse subdirectories

Hm. There is no option to follow or not follow symlinks to directories? What about recursive symlinks?

8.
> needs_binmode
I think any OS needs binmode:

> http://search.cpan.org/~dom/perl-5.12.5/pod/perlfunc.pod#binmode
> . Note that, despite what may be implied in "Programming Perl" (the Camel, 3rd edition) or elsewhere, :raw is not simply the inverse of :crlf. Other layers that would affect the binary nature of the stream are also disabled. See PerlIO, perlrun, and the discussion about the PERLIO environment variable.
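
To illustrate the point, here is a small sketch that calls `binmode` unconditionally instead of asking whether the platform needs it. The payload and the temporary file are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);

# With no layer argument, binmode makes the handle suitable for binary data,
# whatever the OS or the PERLIO environment variable would otherwise apply.
binmode $fh;

my $payload = "line\r\n\x00\xFF";   # 8 bytes, including CR, LF, NUL and a high byte
print {$fh} $payload;
close $fh or die "close failed: $!";

my $size = -s $path;
print "wrote $size bytes\n";
```

Calling `binmode` everywhere costs nothing on platforms where it changes nothing, which is why a `needs_binmode` check seems unnecessary.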

9.
> $ILLEGAL_CHR = qr/\\$NL\r\n\t\013\*\"\?\<\:\>/;
a) On Linux, any character except NUL and '/' is legal.
b) NUL is illegal.
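
As a rough sketch of the Linux rule only (other filesystems differ), a check for a single path component could be as small as this. `is_legal_linux_name` is a hypothetical name, not a File::Util method.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $name contains only characters that Linux
# accepts in a single path component (anything except NUL and '/').
sub is_legal_linux_name {
    my ($name) = @_;
    return 0 if !defined $name || $name eq '';
    return $name =~ m{[\0/]} ? 0 : 1;
}

for my $name ("a:b*c?.txt", "tab\there", "dir/file", "nul\0byte") {
    # Show control characters as \xNN so the output stays readable.
    (my $shown = $name) =~ s/([^\x20-\x7E])/sprintf '\\x%02X', ord $1/ge;
    printf "%-14s %s\n", $shown, is_legal_linux_name($name) ? 'legal' : 'illegal';
}
```

This only covers characters; length limits and the reserved entries "." and ".." are left out of the sketch.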

Replies are listed 'Best First'.
Re^2: RFC - File::Util 4.x Series Pre-Release
by Tommy (Chaplain) on Jan 31, 2013 at 00:37 UTC

    Brilliant, vsespb. Thank you for all your points! I'll respond below...

    point #1 - I've never encountered a situation where someone needed classic macos support, but I tried to support it anyway.

    point #2 - I've been considering lifting the minimum Perl to 5.6. Thoughts?

    point #3 - still thinking about that one

    point #4 - in addition to claiming to be cross platform, File::Util guides you to use filenames and characters that can port between FAT32, EXT2 and upwards. Is it bad to enforce that? Hmmm. Nobody ever brought it up before. This could become much more complicated if I get unicode involved --- or --- I could just not attempt to trap nasty characters. The entire point of trying to do so was to make sure nobody tried to name a file with an embedded directory separator in it. It grew out from there (by request, from people who wanted me to trap *potential* dangers and provide diagnostic and "helpful" error messages.) Perhaps it's time to leave that behind...

    point #5 - Agree to disagree there. While I personally can relate to what you're saying, I've had a lot of people ask for methods that are "easier to remember" than -X. For the sake of those people, those methods will remain.

    point #6 - Yes, they are sorted a la sort { $a cmp $b } OR sort { uc $a cmp uc $b } depending on what was requested by the caller. That's "asciibetical" sorting for the most part. I should either advertise that up front, or use a unicode sorting mechanism. I wonder if the latter is overkill.
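
For comparison, a short sketch of the two orderings, assuming the names are already decoded character strings. `Unicode::Collate` is a core module; the file names are invented for the example.

```perl
use strict;
use warnings;
use Unicode::Collate;

my @names = ("zebra.txt", "\x{E9}clair.txt", "Apple.txt", "banana.txt");

# Code point order, as in sort { $a cmp $b }: "\x{E9}" (233) lands after "z" (122).
my @by_codepoint = sort { $a cmp $b } @names;

# Default Unicode Collation Algorithm order: e-acute sorts next to plain "e".
my @by_uca = Unicode::Collate->new->sort(@names);

binmode STDOUT, ':encoding(UTF-8)';
print "cmp: @by_codepoint\n";
print "UCA: @by_uca\n";
```

The default collation is locale-neutral; locale-specific ordering would need `Unicode::Collate::Locale`, which may or may not be worth the extra weight here.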

    point #7 - I haven't written a way to detect looping symlinks, so I don't follow them in the code. It could be an option, but I'd have to keep track of actual inodes I think (lstat). Is there a preferred way to do this without memory bloat and performance degradation while constantly adding to and comparing entries in the %inodes_seen lookup table?
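
One possible approach, sketched under the assumption of a POSIX-style filesystem where `stat` returns meaningful device and inode numbers (it does not on every platform). `walk_following_symlinks` is a hypothetical function, not File::Util's `list_dir`. Only directories are recorded in the lookup table, so it grows with the number of directories rather than the number of files.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical walker: follows directory symlinks but visits each real
# directory only once, keyed on "device:inode" from stat().
sub walk_following_symlinks {
    my ($dir, $seen, $found) = @_;
    $seen  ||= {};
    $found ||= [];

    my @st = stat $dir or return $found;          # stat() follows symlinks
    return $found if $seen->{"$st[0]:$st[1]"}++;  # already visited: loop or duplicate link

    opendir my $dh, $dir or return $found;
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    for my $entry (sort @entries) {
        my $path = File::Spec->catfile($dir, $entry);
        if (-d $path) { walk_following_symlinks($path, $seen, $found) }
        else          { push @$found, $path }
    }
    return $found;
}

# Usage: pass a directory on the command line to list its files.
if (@ARGV) {
    print "$_\n" for @{ walk_following_symlinks($ARGV[0]) };
}
```

A symlink that points back at an ancestor resolves to a device:inode pair that is already in the table, so the walk returns instead of hanging.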

    point #8 - That very well may be deprecated and silently removed from the documentation. On the backend, everything is done with syswrite anyway, for THAT EXACT reason.

    point #9 - same reasoning and response as point #4. Open to suggestions and criticisms on this.

    Tommy
    A mistake can be valuable or costly, depending on how faithfully you pursue correction
      > point #2 
      It was about Unicode. I think proper Unicode support will require 5.8. No other objections about the version.
      
      > point #4
      It's pretty uncommon to _escape_ illegal characters to make filenames portable. I think it's better to reject them. By the way, here is what Dropbox thinks about filename portability: https://www.dropbox.com/help/145/en
      
      > point #7
      Yep, that's the thing. When one provides a method to traverse a directory, it usually has an option to follow symlinks (with this option on, it takes much more memory), or at least
      the method should not hang: it should detect symlinks and stop crawling into them.
      
      Why 5.6 or even older versions? Are you supporting legacy systems? Unicode support seems far more important these days.

      Elda Taluta; Sarks Sark; Ark Arks
      My deviantART gallery

        Well. I'm getting a big indication that Unicode is a priority, more so than back compat. This has got my head buzzing with ideas... such as detecting old Perls and conditionally enabling Unicode if the Perl is new enough. I just don't know if it can be done at runtime -- haven't tried it yet. Seems like an unwise strategy.
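
For what it's worth, runtime detection of this kind can be sketched as below. This is an illustration only: `$HAVE_UNICODE` and `normalize_name` are hypothetical names, and the 5.008 threshold is taken from the discussion above.

```perl
use strict;
use warnings;

# Hypothetical flag: true only when the running Perl is 5.8 or newer and
# the Unicode modules actually load.
my $HAVE_UNICODE = $] >= 5.008
    && eval { require Encode; require Unicode::Normalize; 1 } ? 1 : 0;

# Hypothetical helper: decode and normalize a UTF-8 encoded filename when
# possible, otherwise pass the raw bytes through unchanged.
sub normalize_name {
    my ($bytes) = @_;
    return $bytes unless $HAVE_UNICODE;
    return Unicode::Normalize::NFC(Encode::decode('UTF-8', $bytes));
}

print "Unicode support: ", ($HAVE_UNICODE ? "enabled" : "disabled"), "\n";
```

The catch is that the same call then behaves differently depending on the Perl version, which may be exactly why it feels like an unwise strategy.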

        I may just throw down the gauntlet and support Unicode, requiring 5.8 from now on... after all, that's what Perl did. If you don't like it, you can always go get an older version, right?

        Tommy
        A mistake can be valuable or costly, depending on how faithfully you pursue correction
