
Tracking this stuff through the Perl sources is a nightmare.

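  • The -f test and its siblings are mapped to Perl_pp_ft* functions in opcode.h. Strictly, Perl_pp_ftis ("file test: is") is the -e test; -f is Perl_pp_ftfile, which follows the same path and then additionally checks for a regular file. Taking Perl_pp_ftis as the simplest case: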
  • Which is mapped to lib\core\embed.h:#define pp_ftis                        Perl_pp_ftis
  • Which is implemented in terms of my_stat() in pp_sys.c
    PP(pp_ftis)
    {
        I32 result = my_stat();
        dSP;
        if (result < 0)
            RETPUSHUNDEF;
        RETPUSHYES;
    }
  • Which is mapped to embed.h:3209:#define my_stat()          Perl_my_stat(aTHX)
  • Perl_my_stat() is implemented in doio.c in terms of PerlLIO_fstat()
  • Which is mapped to iperlsys.h:734:#define PerlLIO_fstat(fd, buf)           Fstat((fd), (buf))
  • Which gets mapped to lib\core\dosish.h:133:#define Fstat(fd,bufptr)   fstat((fd),(bufptr))
  • Which is a C runtime API.

    However, fstat() takes a file descriptor (fd), which implies an open file handle... Should have noticed that earlier. Track back to where the calls are not defined in terms of a file descriptor and we're back at doio.c. Sure enough, follow the other path in Perl_my_stat() and we find it calls PerlLIO_stat()

  • Which maps to lib\core\iperlsys.h:739:#define PerlLIO_stat(name, buf)         Stat((name), (buf))
  • Which is mapped to dosish.h:142:#  define Stat(fname,bufptr) stat((fname),(bufptr))
  • Then you have to move over to the C runtime headers, which I don't have for MSVC++ (as used by ActiveState), so I can't bore you with those details. Suffice it to say, stat() ends up getting mapped to _stat64(), because AS builds with large file support and so needs the version of stat that can report file sizes that do not fit in 32 bits.
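To make the two paths through Perl_my_stat() concrete, here is a minimal sketch of the same distinction in plain C. It assumes POSIX headers rather than the MSVC ones, and both helper names are hypothetical (they are not in the Perl sources). The point is only that the by-name lookup needs no open file, whereas the descriptor version does.

```c
#include <assert.h>
#include <fcntl.h>      /* open(), for callers of is_regular_by_fd() */
#include <sys/stat.h>   /* stat(), fstat(), S_ISREG */
#include <unistd.h>     /* close(), for callers of is_regular_by_fd() */

/* Both helpers are hypothetical. Each returns 1 for a regular file,
   0 for anything else that exists, and -1 if the lookup fails. */

/* Lookup by NAME: no open file required, as with -f 'name'. */
int is_regular_by_name(const char *path)
{
    struct stat sb;
    assert(path != NULL);
    if (stat(path, &sb) < 0)
        return -1;
    return S_ISREG(sb.st_mode) ? 1 : 0;
}

/* Lookup by DESCRIPTOR: the caller must already have opened the file. */
int is_regular_by_fd(int fd)
{
    struct stat sb;
    if (fstat(fd, &sb) < 0)
        return -1;
    return S_ISREG(sb.st_mode) ? 1 : 0;
}
```

Calling is_regular_by_name() on a name that does not exist simply returns -1; there is nothing to open or close.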

Disassembling the code in MSVCRT.dll:_stat64, we find this:

Disassembly of Function _stat64 (0x78018226)
;********************************************************************************
; *** _stat64 (456) ***
; SYM:_stat64
0x78018226: push ebp
0x78018227: mov ebp,esp
0x78018229: sub esp,0x24c
0x7801822F: push ebx
0x78018230: mov ebx,dword ptr [ebp+0x8] ; ARG:0x8
0x78018233: push esi
0x78018234: push edi
0x78018235: push 0x78034c00 ; DATA:?*
0x7801823A: push ebx
0x7801823B: call 0x7800edaa ; SYM:_mbspbrk
0x78018240: pop ecx
0x78018241: test eax,eax
0x78018243: pop ecx
0x78018244: jnz 0x78018398 ; (*+0x154)
0x7801824A: cmp byte ptr [ebx+0x1],0x3a
0x7801824E: jnz 0x7801838e ; (*+0x140)
0x78018254: mov eax,dword ptr [ebx]
0x78018256: test al,al
0x78018258: jz 0x78018264 ; (*+0xC)
0x7801825A: cmp byte ptr [ebx+0x2],0x0
0x7801825E: jz 0x78018398 ; (*+0x13A)
0x78018264: movsx al,al ; <==0x78018258(*-0xC)
0x78018267: push eax
0x78018268: call 0x7801eceb ; SYM:_mbctolower
0x7801826D: pop ecx
0x7801826E: sub eax,0x60
0x78018271: mov dword ptr [ebp-0x4],eax ; VAR:0x4; <==0x78018393(*+0x122)
0x78018274: lea eax,dword ptr [ebp-0x148] ; VAR:0x148
0x7801827A: push eax
0x7801827B: push ebx
0x7801827C: call dword ptr [0x78033080] ; EXT:KERNEL32.DLL!FindFirstFileA
0x78018282: cmp eax,0xff
0x78018285: mov dword ptr [ebp-0x8],eax ; VAR:0x8
0x78018288: jnz 0x780183b1 ; (*+0x129)
0x7801828E: push 0x78034bfc ; DATA:./\
0x78018293: push ebx
0x78018294: call 0x7800edaa ; SYM:_mbspbrk
0x78018299: pop ecx
0x7801829A: test eax,eax
0x7801829C: pop ecx
0x7801829D: jz 0x78018398 ; (*+0xFB)
0x780182A3: push 0x104
0x780182A8: lea eax,dword ptr [ebp-0x24c] ; VAR:0x24
0x780182AE: push ebx
0x780182AF: push eax
0x780182B0: call 0x78017bbf ; SYM:_fullpath
0x780182B5: mov esi,eax
0x780182B7: xor edi,edi
0x780182B9: add esp,0xc
0x780182BC: cmp esi,edi
0x780182BE: jz 0x78018398 ; (*+0xDA)
0x780182C4: push esi
0x780182C5: call 0x78003a9f ; SYM:strlen
0x780182CA: cmp eax,0x3
0x780182CD: pop ecx
0x780182CE: jz 0x780182df ; (*+0x11)
0x780182D0: push esi
0x780182D1: call 0x7801874b
0x780182D6: test eax,eax
0x780182D8: pop ecx
0x780182D9: jz 0x78018398 ; (*+0xBF)
0x780182DF: push esi ; <==0x780182CE(*-0x11)
0x780182E0: call dword ptr [0x78033170] ; EXT:KERNEL32.DLL!GetDriveTypeA
0x780182E6: cmp eax,0x1
0x780182E9: jbe 0x78018398 ; (*+0xAF)
0x780182EF: and byte ptr [ebp-0x11c],0x0 ; VAR:0x11
0x780182F6: mov esi,dword ptr [ebp+0xc] ; ARG:0x
0x780182F9: push 0xff
0x780182FB: push edi
0x780182FC: push edi
0x780182FD: push edi
0x780182FE: push 0x1
0x78018300: push 0x1
0x78018302: push 0x7bc
0x78018307: mov dword ptr [ebp-0x148],0x10 ; VAR:0x148
0x78018311: mov dword ptr [ebp-0x12c],edi ; VAR:0x12
0x78018317: mov dword ptr [ebp-0x128],edi ; VAR:0x128
0x7801831D: call 0x7802ab02
0x78018322: mov dword ptr [esi+0x28],eax
0x78018325: mov dword ptr [esi+0x2c],edx
0x78018328: mov eax,dword ptr [esi+0x28]
0x7801832B: mov ecx,edx
0x7801832D: mov dword ptr [esi+0x20],eax
0x78018330: mov dword ptr [esi+0x30],eax
0x78018333: add esp,0x1c
0x78018336: mov dword ptr [esi+0x24],ecx
0x78018339: mov dword ptr [esi+0x34],ecx
0x7801833C: push ebx ; <==0x7801842E(*+0xF2)
0x7801833D: push dword ptr [ebp-0x148] ; VAR:0x148
0x78018343: call 0x78017ea1
0x78018348: pop ecx
0x78018349: mov word ptr [esi+0x6],ax
0x7801834D: pop ecx
0x7801834E: mov word ptr [esi+0x8],0x1
0x78018354: push 0x1
0x78018356: push edi
0x78018357: push edi
0x78018358: push dword ptr [ebp-0x12c] ; VAR:0x12
0x7801835E: call 0x78003e1a
0x78018363: mov ecx,dword ptr [ebp-0x128] ; VAR:0x128
0x78018369: xor ebx,ebx
0x7801836B: add eax,ecx
0x7801836D: mov word ptr [esi+0x4],di
0x78018371: mov dword ptr [esi+0x18],eax
0x78018374: mov eax,dword ptr [ebp-0x4] ; VAR:0x4
0x78018377: adc edx,ebx
0x78018379: dec eax
0x7801837A: mov dword ptr [esi],eax
0x7801837C: mov dword ptr [esi+0x10],eax
0x7801837F: mov dword ptr [esi+0x1c],edx
0x78018382: mov word ptr [esi+0xc],di
0x78018386: mov word ptr [esi+0xa],di
0x7801838A: xor eax,eax
0x7801838C: jmp 0x780183ac ; (*+0x20)
0x7801838E: call 0x78017a8f ; SYM:_getdrive; <==0x7801824E(*-0x140)
0x78018393: jmp 0x78018271 ; (*-0x122)
0x78018398: call 0x7800c9ac ; SYM:_errno; <==0x78018244(*-0x154), 0x7801829D(*-0xFB), 0x780182BE(*-0xDA), 0x780182D9(*-0xBF), 0x7801825E(*-0x13A), 0x780182E9(*-0xAF)
0x7801839D: push 0x2
0x7801839F: pop esi
0x780183A0: mov dword ptr [eax],esi
0x780183A2: call 0x7800c9b5 ; SYM:__doserrno
0x780183A7: mov dword ptr [eax],esi
0x780183A9: or eax,0xff
0x780183AC: pop edi ; <==0x7801838C(*-0x20)
0x780183AD: pop esi
0x780183AE: pop ebx
0x780183AF: leave
0x780183B0: ret
;********************************************************************************

The salient point here is the call to EXT:KERNEL32.DLL!FindFirstFileA.
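It is also worth noticing what happens just before that call. The listing pushes the string "?*" and calls _mbspbrk, and a match jumps to the error exit, which stores 2 in errno and returns -1. Here is a rough C sketch of that one test, as I read the listing. The function name is hypothetical, strpbrk() stands in for _mbspbrk (so no multibyte handling), and treating 2 as ENOENT is my assumption about the CRT's errno values.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for the first test in the listing: reject any
   path containing '?' or '*' before the directory lookup happens.
   Returns 0 if the path may proceed, -1 with errno = ENOENT otherwise. */
int reject_wildcards(const char *path)
{
    assert(path != NULL);
    if (strpbrk(path, "?*") != NULL) {
        errno = ENOENT;   /* the listing stores 2 into errno */
        return -1;
    }
    return 0;
}
```

Only '?' and '*' are screened out at this point, so a filespec containing '<' or '>' passes this test and is handed to FindFirstFileA as it stands.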

You asked: "...why this is effectively performing a directory search in order to test one filename!..." And the answer goes something like this.

When you pass a filespec to stat() or one of its variants in order to ask the OS for information about the file, something has to locate the file's entry in the filesystem (its inode, in unix terms). You can get at that by various means, but if you aren't interested in opening the file, the OS gives you a call that does the lookup by name. In Win32 this is FindFirstFile(). On unix, stat() is itself a system call, and the kernel walks the directories in the path to find the inode.

Anyway, the point is that under normal circumstances, calling FindFirstFile() (FFF) with a non-wildcard filespec returns a structure containing (almost) all the information required to satisfy the stat() call.

typedef struct _WIN32_FIND_DATA {
    DWORD    dwFileAttributes;
    FILETIME ftCreationTime;
    FILETIME ftLastAccessTime;
    FILETIME ftLastWriteTime;
    DWORD    nFileSizeHigh;
    DWORD    nFileSizeLow;
    DWORD    dwReserved0;
    DWORD    dwReserved1;
    TCHAR    cFileName[MAX_PATH];
    TCHAR    cAlternateFileName[14];
} WIN32_FIND_DATA, *PWIN32_FIND_DATA;

Whilst you may not think of stat() as doing a "directory search", the filesystem has to search to find the information needed to fulfil the stat() call (on unix, that information lives in the inode). So a search is being done; it's just that the API name doesn't reflect this.

The fact that your filespec contains some adornments used for determining the filemode is a Perl thing and not the OS's problem.

That these conflict with an undocumented feature within the OS is ...erm... unfortunate! I guess that LW (Larry Wall) and MS came to the same conclusion: '<' and '>' are good candidates for metacharacters, as most sane people are unlikely to embed them in their filenames, where they would conflict with their use as redirection metacharacters by CLIs.

In the final analysis, you would have to strip Perl's two-arg open metacharacters from the filespec before you pass it to stat(), whichever OS you are on!
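For what that stripping might look like, here is a minimal sketch in C. The function name is hypothetical, and it only handles the leading mode prefixes of two-arg open ('<', '>', '>>', optionally preceded by '+') plus the surrounding whitespace. Pipe forms ('|' at either end) are deliberately left alone, so treat it as an illustration rather than a complete parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy spec into out without a leading two-arg
   open mode prefix or surrounding whitespace.
   Returns 0 on success, -1 if out is too small. */
int strip_open_mode(const char *spec, char *out, size_t outsz)
{
    size_t len;
    assert(spec != NULL && out != NULL);

    while (isspace((unsigned char)*spec))      /* leading whitespace */
        spec++;
    if (spec[0] == '+' && (spec[1] == '<' || spec[1] == '>'))
        spec++;                                /* "+<", "+>", "+>>" */
    if (*spec == '<') {
        spec++;                                /* read */
    } else if (*spec == '>') {
        spec++;                                /* write */
        if (*spec == '>')
            spec++;                            /* append */
    }
    while (isspace((unsigned char)*spec))      /* space after the mode */
        spec++;

    len = strlen(spec);
    while (len > 0 && isspace((unsigned char)spec[len - 1]))
        len--;                                 /* trailing whitespace */

    if (len + 1 > outsz)
        return -1;
    memcpy(out, spec, len);
    out[len] = '\0';
    return 0;
}
```

A name with no mode prefix comes back unchanged, so the helper is safe to apply to every filespec before the file test.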


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
If I understand your problem, I can solve it! Of course, the same can be said for you.


In reply to Re: Re: Re: Re: Re: Unexpected file test (-f) result on Windows by BrowserUk
in thread Unexpected file test (-f) result on Windows by DaveH
