PerlMonks  

Using Microsoft Visual C Link Time Code Generation (LTCG) and Profile Guided Optimization (PGO) with XS modules

by bulk88 (Priest)
on Jul 22, 2012 at 23:48 UTC ( #983092=CUFP )

If you have a modern Visual C, you can compile Perl XS modules with Profile Guided Optimization. I won't repeat what Google and MSDN say, but I'll describe PGO in terms of what it does to this code:
char *inString = (((inStringSV)->sv_flags & (0x00000400)) == 0x00000400 ? ((in_len = ((XPV *) (inStringSV)->sv_any)->xpv_cur), ((inStringSV)->sv_u.svu_pv)) : Perl_sv_2pv_flags (my_perl, inStringSV, &in_len, 2));
The Perl_sv_2pv_flags branch will be moved to the end of the executable region of the DLL.
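To make the hot/cold split in that expression easier to see, here is a minimal standalone sketch of the same shape: a cheap flag test with the cached value returned inline, and a function call only when the flag is not set. Everything in it (toy_sv, TOY_FLAG_POK, toy_sv_2pv, toy_sv_pv) is a hypothetical stand-in made up for illustration and is not the real Perl SV layout or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a Perl scalar; NOT the real SV layout. */
#define TOY_FLAG_POK 0x00000400u   /* same bit value as in the expression above */

typedef struct {
    unsigned flags;    /* bit set when buf already holds a valid string */
    size_t   cur;      /* cached string length */
    char     buf[32];  /* cached string bytes */
    long     ival;     /* numeric value, used when no string is cached */
} toy_sv;

/* Cold path: builds the string on demand, playing the role of the
 * Perl_sv_2pv_flags call in the expanded macro. */
static char *toy_sv_2pv(toy_sv *sv, size_t *len)
{
    sv->cur = (size_t)snprintf(sv->buf, sizeof sv->buf, "%ld", sv->ival);
    sv->flags |= TOY_FLAG_POK;
    *len = sv->cur;
    return sv->buf;
}

/* Hot path inline, cold path as a call: the same shape as the macro.
 * If profiling shows the flag is almost always set, the call in the
 * else arm is the code a profile guided build would want out of the way. */
static char *toy_sv_pv(toy_sv *sv, size_t *len)
{
    assert(sv != NULL && len != NULL);
    return (sv->flags & TOY_FLAG_POK) == TOY_FLAG_POK
        ? (*len = sv->cur, sv->buf)
        : toy_sv_2pv(sv, len);
}
```

Calling toy_sv_pv twice on the same toy_sv takes the cold path the first time and the hot path afterwards, which is roughly the distribution a test-suite workload would record for a string argument.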
if (items != 3) S_croak_xs_usage(my_perl, cv,"Destination, Source, Length");
The S_croak_xs_usage branch will be moved to the end of the executable region of the DLL. Those are the 2 most visible changes I saw in the disassembly. There are others that I didn't see or realize, or won't describe here. So, to get started, first check that your Visual C has PGO. The easiest way is to run "link /?". Here is the output from Visual Studio 2003.
Microsoft (R) Incremental Linker Version 7.10.6030
Copyright (C) Microsoft Corporation.  All rights reserved.

 usage: LINK [options] [files] [@commandfile]

   options:

      /ALIGN:#
      /ALLOWBIND[:NO]
      /ASSEMBLYDEBUG[:DISABLE]
      /ASSEMBLYLINKRESOURCE:filename
      /ASSEMBLYMODULE:filename
      /ASSEMBLYRESOURCE:filename
      /BASE:{address|@filename,key}
      /DEBUG
      /DEF:filename
      /DEFAULTLIB:library
      /DELAY:{NOBIND|UNLOAD}
      /DELAYLOAD:dll
      /DELAYSIGN[:NO]
      /DLL
      /DRIVER[:{UPONLY|WDM}]
      /ENTRY:symbol
      /EXETYPE:DYNAMIC
      /EXPORT:symbol
      /FIXED[:NO]
      /FORCE[:{MULTIPLE|UNRESOLVED}]
      /HEAP:reserve[,commit]
      /IDLOUT:filename
      /IGNOREIDL
      /IMPLIB:filename
      /INCLUDE:symbol
      /INCREMENTAL[:NO]
      /KEYFILE:filename
      /KEYCONTAINER:name
      /LARGEADDRESSAWARE[:NO]
      /LIBPATH:dir
      /LTCG[:{NOSTATUS|PGINSTRUMENT|PGOPTIMIZE|STATUS}]
             (PGINSTRUMENT and PGOPTIMIZE are only available for IA64)
      /MACHINE:{AM33|ARM|EBC|IA64|M32R|MIPS|MIPS16|MIPSFPU|MIPSFPU16|MIPSR41XX|
                SH3|SH3DSP|SH4|SH5|THUMB|X86}
      /MAP[:filename]
      /MAPINFO:{EXPORTS|LINES}
      /MERGE:from=to
      /MIDL:@commandfile
      /NOASSEMBLY
      /NODEFAULTLIB[:library]
      /NOENTRY
      /NOLOGO
      /OPT:{ICF[=iterations]|NOICF|NOREF|NOWIN98|REF|WIN98}
      /ORDER:@filename
      /OUT:filename
      /PDB:filename
      /PDBSTRIPPED:filename
      /PGD:filename
      /RELEASE
      /SAFESEH[:NO]
      /SECTION:name,[E][R][W][S][D][K][L][P][X][,ALIGN=#]
      /STACK:reserve[,commit]
      /STUB:filename
      /SUBSYSTEM:{CONSOLE|EFI_APPLICATION|EFI_BOOT_SERVICE_DRIVER|
                  EFI_ROM|EFI_RUNTIME_DRIVER|NATIVE|POSIX|WINDOWS|
                  WINDOWSCE}[,#[.##]]
      /SWAPRUN:{CD|NET}
      /TLBOUT:filename
      /TSAWARE[:NO]
      /TLBID:#
      /VERBOSE[:{LIB|SAFESEH}]
      /VERSION:#[.#]
      /VXD
      /WINDOWSCE:{CONVERT|EMULATION}
      /WS:AGGRESSIVE
Visual Studio 2003 isn't any good: its linker only offers PGINSTRUMENT and PGOPTIMIZE for IA64. Here is the output from Visual Studio 2008.
Microsoft (R) Incremental Linker Version 9.00.30729.01
Copyright (C) Microsoft Corporation.  All rights reserved.

 usage: LINK [options] [files] [@commandfile]

   options:

      /ALIGN:#
      /ALLOWBIND[:NO]
      /ALLOWISOLATION[:NO]
      /ASSEMBLYDEBUG[:DISABLE]
      /ASSEMBLYLINKRESOURCE:filename
      /ASSEMBLYMODULE:filename
      /ASSEMBLYRESOURCE:filename[,[name][,PRIVATE]]
      /BASE:{address[,size]|@filename,key}
      /CLRIMAGETYPE:{IJW|PURE|SAFE}
      /CLRSUPPORTLASTERROR[:{NO|SYSTEMDLL}]
      /CLRTHREADATTRIBUTE:{STA|MTA|NONE}
      /CLRUNMANAGEDCODECHECK[:NO]
      /DEBUG
      /DEF:filename
      /DEFAULTLIB:library
      /DELAY:{NOBIND|UNLOAD}
      /DELAYLOAD:dll
      /DELAYSIGN[:NO]
      /DLL
      /DRIVER[:{UPONLY|WDM}]
      /DYNAMICBASE[:NO]
      /ENTRY:symbol
      /ERRORREPORT:{NONE|PROMPT|QUEUE|SEND}
      /EXPORT:symbol
      /FIXED[:NO]
      /FORCE[:{MULTIPLE|UNRESOLVED}]
      /FUNCTIONPADMIN[:size]
      /HEAP:reserve[,commit]
      /IDLOUT:filename
      /IGNOREIDL
      /IMPLIB:filename
      /INCLUDE:symbol
      /INCREMENTAL[:NO]
      /KEYCONTAINER:name
      /KEYFILE:filename
      /LARGEADDRESSAWARE[:NO]
      /LIBPATH:dir
      /LTCG[:{NOSTATUS|PGINSTRUMENT|PGOPTIMIZE|PGUPDATE|STATUS}]
      /MACHINE:{ARM|EBC|IA64|MIPS|MIPS16|MIPSFPU|MIPSFPU16|
                SH4|THUMB|X64|X86}
      /MANIFEST[:NO]
      /MANIFESTDEPENDENCY:manifest dependency
      /MANIFESTFILE:filename
      /MANIFESTUAC[:{NO|UAC fragment}]
      /MAP[:filename]
      /MAPINFO:{EXPORTS}
      /MERGE:from=to
      /MIDL:@commandfile
      /NOASSEMBLY
      /NODEFAULTLIB[:library]
      /NOENTRY
      /NOLOGO
      /NXCOMPAT[:NO]
      /OPT:{ICF[=iterations]|NOICF|NOREF|REF}
      /ORDER:@filename
      /OUT:filename
      /PDB:filename
      /PDBSTRIPPED:filename
      /PGD:filename
      /PROFILE
      /RELEASE
      /SAFESEH[:NO]
      /SECTION:name,[[!]{DEKPRSW}][,ALIGN=#]
      /STACK:reserve[,commit]
      /STUB:filename
      /SUBSYSTEM:{BOOT_APPLICATION|CONSOLE|EFI_APPLICATION|
                  EFI_BOOT_SERVICE_DRIVER|EFI_ROM|EFI_RUNTIME_DRIVER|
                  NATIVE|POSIX|WINDOWS|WINDOWSCE}[,#[.##]]
      /SWAPRUN:{CD|NET}
      /TLBID:#
      /TLBOUT:filename
      /TSAWARE[:NO]
      /VERBOSE[:{ICF|LIB|REF|SAFESEH}]
      /VERSION:#[.#]
      /WX[:NO]
We are in business.

I will use Win32API::File as an example. First run perl makefile.pl.
C:\Documents and Settings\Administrator\Desktop\w32f>perl makefile.pl
Checking if your kit is complete...
Looks good
Writing Makefile for Win32API::File
Writing MYMETA.yml and MYMETA.json

C:\Documents and Settings\Administrator\Desktop\w32f>
Now to use PGO, we require obj files that were created with the -GL option (LTCG compiler side). Look in the makefile for the OPTIMIZE macro. Yours might be different than mine.
# --- MakeMaker cflags section:

CCFLAGS = -nologo -GF -W3 -MD -Zi -DNDEBUG -O1 -DWIN32 -D_CONSOLE -DNO_STRICT -D_CRT_SECURE_NO_DEPRECATE -D_CRT_NONSTDC_NO_DEPRECATE -DPERL_TEXTMODE_SCRIPTS -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO
OPTIMIZE = -MD -Zi -DNDEBUG -O1
Now the magic here is that you don't need to change CCFLAGS. OPTIMIZE comes after CCFLAGS on the cl command line, and Visual C uses the last occurrence of an option as the final one, so if CCFLAGS has -Od and OPTIMIZE has -O1, -O1 takes precedence (there will be warnings at the console, though). So we have to add -GL to the OPTIMIZE macro, not in the makefile but on the nmake command line; keep reading. Also look for the OTHERLDFLAGS macro. It is always empty by default unless your XS module has a very fancy Makefile.PL.
# --- MakeMaker dynamic_lib section:

# This section creates the dynamically loadable $(INST_DYNAMIC)
# from $(OBJECT) and possibly $(MYEXTLIB).
OTHERLDFLAGS =
INST_DYNAMIC_DEP =

$(INST_DYNAMIC): $(OBJECT) $(MYEXTLIB) $(BOOTSTRAP) $(INST_ARCHAUTODIR)$(DFSEP).exists $(EXPORT_LIST) $(PERL_ARCHIVE) $(INST_DYNAMIC_DEP)
	$(LD) -out:$@ $(LDDLFLAGS) $(LDFROM) $(OTHERLDFLAGS) $(MYEXTLIB) $(PERL_ARCHIVE) $(LDLOADLIBS) -def:$(EXPORT_LIST)
	if exist $@.manifest mt -nologo -manifest $@.manifest -outputresource:$@;2
	if exist $@.manifest del $@.manifest
	$(CHMOD) $(PERM_RWX) $@
OTHERLDFLAGS is where we will put the "-LTCG:PGINSTRUMENT" option and " pgort.lib ", which has the instrumenting code that must be statically linked into the 1st DLL. PGO works like this: first you link an instrumented DLL and give it a workload, which makes report files; then you link again, this time using the report files, to make a non-instrumented PGOed DLL. So to use PGO, we need a workload. I can't think of anything better than running the XS module's own test suite. Ideally it will go through all the XSUBs at least once, and on the most common path most of the time. Now for the 1st build.
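As a rough mental model of the instrument, run, relink cycle described above, here is a toy sketch in plain C. It is only an illustration of the idea and says nothing about how pgort.lib or link actually work internally; toy_profile, toy_check_items and toy_hot_branch are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_BRANCHES = 2 };   /* 0 = normal path, 1 = usage-error path */

/* Plays the role of the report files: one hit counter per branch. */
typedef struct {
    unsigned long hits[TOY_BRANCHES];
} toy_profile;

/* "Instrumented" function: does its job and also counts which branch ran.
 * The test mirrors the items != 3 usage check in an XSUB. */
static int toy_check_items(toy_profile *prof, int items)
{
    assert(prof != NULL);
    if (items != 3) {
        prof->hits[1]++;     /* usage-error branch */
        return -1;
    }
    prof->hits[0]++;         /* normal branch */
    return 0;
}

/* "Second link": read the counters back and report which branch the
 * workload exercised most, i.e. the one worth keeping on the hot path. */
static size_t toy_hot_branch(const toy_profile *prof)
{
    size_t best = 0;
    size_t i;
    assert(prof != NULL);
    for (i = 1; i < TOY_BRANCHES; i++) {
        if (prof->hits[i] > prof->hits[best])
            best = i;
    }
    return best;
}
```

The point of the sketch is that the quality of the result depends entirely on the workload: if the test suite mostly calls XSUBs with bad arguments, the error branch will look hot.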
C:\Documents and Settings\Administrator\Desktop\w32f>nmake OPTIMIZE="-MD -Zi -DNDEBUG -O1 -GL" OTHERLDFLAGS="-LTCG:PGINSTRUMENT pgort.lib " test

Microsoft (R) Program Maintenance Utility Version 9.00.30729.01
Copyright (C) Microsoft Corporation.  All rights reserved.

cp cFile.pc blib\arch\Win32API\File\cFile.pc
cp File.pm blib\lib\Win32API\File.pm
        C:\p517\bin\perl.exe C:\p517\lib\ExtUtils\xsubpp -typemap C:\p517\lib\ExtUtils\typemap -typemap typemap File.xs > File.xsc && C:\p517\bin\perl.exe -MExtUtils::Command -e mv -- File.xsc File.c
Warning: Found a 'CODE' section which seems to be using 'RETVAL' but no 'OUTPUT' section. in File.xs, line 159
Warning: Found a 'CODE' section which seems to be using 'RETVAL' but no 'OUTPUT' section. in File.xs, line 181
Warning: Found a 'CODE' section which seems to be using 'RETVAL' but no 'OUTPUT' section. in File.xs, line 511
        cl -c -nologo -GF -W3 -MD -Zi -DNDEBUG -O1 -GL -DWIN32 -D_CONSOLE -DNO_STRICT -D_CRT_SECURE_NO_DEPRECATE -D_CRT_NONSTDC_NO_DEPRECATE -DPERL_TEXTMODE_SCRIPTS -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -MD -Zi -DNDEBUG -O1 -GL -DVERSION=\"0.1200\" -DXS_VERSION=\"0.1200\" "-IC:\p517\lib\CORE" File.c
File.c
Running Mkbootstrap for Win32API::File ()
        C:\p517\bin\perl.exe -MExtUtils::Command -e chmod -- 644 File.bs
        C:\p517\bin\perl.exe -MExtUtils::Mksymlists -e "Mksymlists('NAME'=>\"Win32API::File\", 'DLBASE' => 'File', 'DL_FUNCS' => { }, 'FUNCLIST' => [], 'IMPORTS' => { }, 'DL_VARS' => []);"
        link -out:blib\arch\auto\Win32API\File\File.dll -dll -nologo -nodefaultlib -debug -opt:ref,icf -ltcg -libpath:"c:\p517\lib\CORE" -machine:x86 "/manifestdependency:type='Win32' name='Microsoft.Windows.Common-Controls' version='6.0.0.0' processorArchitecture='*' publicKeyToken='6595b64144ccf1df' language='*'" File.obj -LTCG:PGINSTRUMENT pgort.lib C:\p517\lib\CORE\perl517.lib oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib comctl32.lib msvcrt.lib -def:File.def
   Creating library blib\arch\auto\Win32API\File\File.lib and object blib\arch\auto\Win32API\File\File.exp
Generating code
Finished generating code
        if exist blib\arch\auto\Win32API\File\File.dll.manifest mt -nologo -manifest blib\arch\auto\Win32API\File\File.dll.manifest -outputresource:blib\arch\auto\Win32API\File\File.dll;2
        if exist blib\arch\auto\Win32API\File\File.dll.manifest del blib\arch\auto\Win32API\File\File.dll.manifest
        C:\p517\bin\perl.exe -MExtUtils::Command -e chmod -- 755 blib\arch\auto\Win32API\File\File.dll
        C:\p517\bin\perl.exe -MExtUtils::Command -e cp -- File.bs blib\arch\auto\Win32API\File\File.bs
        C:\p517\bin\perl.exe -MExtUtils::Command -e chmod -- 644 blib\arch\auto\Win32API\File\File.bs
        C:\p517\bin\perl.exe "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib\lib', 'blib\arch')" t/*.t
t/file.t .. ok
t/pod.t ... skipped: Test::Pod 1.14 required for testing POD
t/tie.t ... ok
All tests successful.
Files=3, Tests=277,  2 wallclock secs ( 0.14 usr +  0.08 sys =  0.22 CPU)
Result: PASS

C:\Documents and Settings\Administrator\Desktop\w32f>
Now if you look in the root directory of the XS module, you will see one .pgd file; in this example, "File.pgd". The .pgd is a "meta" file generated by link. It doesn't contain the benchmark info from the workload, but link needs the .pgd to make sense of the benchmark files when the DLL is linked the 2nd time.
C:\Documents and Settings\Administrator\Desktop\w32f>dir
 Volume in drive C has no label.
 Volume Serial Number is 0CFF-E7B6

 Directory of C:\Documents and Settings\Administrator\Desktop\w32f

07/22/2012  07:13 PM  <DIR>     .
07/22/2012  07:13 PM  <DIR>     ..
07/22/2012  07:13 PM  <DIR>     blib
07/01/2011  02:47 AM     19,006 buffers.h
09/16/2009  05:57 PM         77 cFile.h
09/16/2009  05:57 PM      6,151 cFile.pc
07/22/2012  07:13 PM          0 cFile_pc_to_blib
07/01/2011  10:34 AM      5,637 Changes
09/16/2009  05:57 PM      5,302 const2perl.h
07/22/2012  06:55 PM  <DIR>     ex
07/22/2012  06:55 PM  <DIR>     ExtUtils
07/22/2012  07:13 PM          0 File.bs
07/22/2012  07:13 PM     49,513 File.c
07/22/2012  07:13 PM         98 File.def
07/22/2012  07:13 PM    348,195 File.obj
07/22/2012  07:13 PM    281,600 File.pgd
07/01/2011  10:34 AM    100,643 File.pm
02/25/2011  06:22 PM     15,217 File.xs
07/22/2012  07:13 PM     35,108 Makefile
07/22/2012  06:58 PM     35,108 Makefile.old
07/01/2011  02:52 AM      6,342 Makefile.PL
07/01/2011  10:36 AM        412 MANIFEST
07/01/2011  10:36 AM      1,141 META.json
07/01/2011  10:36 AM        641 META.yml
07/22/2012  07:13 PM      1,191 MYMETA.json
07/22/2012  07:13 PM        683 MYMETA.yml
07/22/2012  07:13 PM          0 pm_to_blib
09/16/2009  05:57 PM      8,068 ppport.h
07/01/2011  10:35 AM      4,818 README
07/22/2012  06:55 PM  <DIR>     t
09/16/2009  05:57 PM      3,449 typemap
07/22/2012  07:13 PM    110,592 vc90.pdb
              26 File(s)      1,038,992 bytes
               6 Dir(s)  501,818,015,744 bytes free

C:\Documents and Settings\Administrator\Desktop\w32f>
Now let's look for the .pgc files, which contain the benchmark info. They will be right next to the instrumented DLL.
C:\Documents and Settings\Administrator\Desktop\w32f>cd C:\Documents and Settings\Administrator\Desktop\w32f\blib\arch\auto\Win32API\File

C:\Documents and Settings\Administrator\Desktop\w32f\blib\arch\auto\Win32API\File>dir
 Volume in drive C has no label.
 Volume Serial Number is 0CFF-E7B6

 Directory of C:\Documents and Settings\Administrator\Desktop\w32f\blib\arch\auto\Win32API\File

07/22/2012  07:13 PM  <DIR>     .
07/22/2012  07:13 PM  <DIR>     ..
07/22/2012  07:13 PM          0 .exists
07/22/2012  07:13 PM     42,680 File!1.pgc
07/22/2012  07:13 PM     42,240 File!2.pgc
07/22/2012  07:13 PM          0 File.bs
07/22/2012  07:13 PM    126,464 File.dll
07/22/2012  07:13 PM        958 File.exp
07/22/2012  07:13 PM      1,946 File.lib
07/22/2012  07:13 PM    371,712 File.pdb
               8 File(s)        586,000 bytes
               2 Dir(s)  501,818,171,392 bytes free

C:\Documents and Settings\Administrator\Desktop\w32f\blib\arch\auto\Win32API\File>
link has to see the .pgc, the .pgd, and the .obj files together, so copy the .pgc files to the root directory of the module:
C:\Documents and Settings\Administrator\Desktop\w32f\blib\arch\auto\Win32API\File>copy *.pgc "C:\Documents and Settings\Administrator\Desktop\w32f"
File!1.pgc
File!2.pgc
        2 file(s) copied.

C:\Documents and Settings\Administrator\Desktop\w32f\blib\arch\auto\Win32API\File>
Now let's look at the root directory again.
C:\Documents and Settings\Administrator\Desktop\w32f>dir
 Volume in drive C has no label.
 Volume Serial Number is 0CFF-E7B6

 Directory of C:\Documents and Settings\Administrator\Desktop\w32f

07/22/2012  07:24 PM  <DIR>     .
07/22/2012  07:24 PM  <DIR>     ..
07/22/2012  07:13 PM  <DIR>     blib
07/01/2011  02:47 AM     19,006 buffers.h
09/16/2009  05:57 PM         77 cFile.h
09/16/2009  05:57 PM      6,151 cFile.pc
07/22/2012  07:13 PM          0 cFile_pc_to_blib
07/01/2011  10:34 AM      5,637 Changes
09/16/2009  05:57 PM      5,302 const2perl.h
07/22/2012  06:55 PM  <DIR>     ex
07/22/2012  06:55 PM  <DIR>     ExtUtils
07/22/2012  07:13 PM     42,680 File!1.pgc
07/22/2012  07:13 PM     42,240 File!2.pgc
07/22/2012  07:13 PM          0 File.bs
07/22/2012  07:13 PM     49,513 File.c
07/22/2012  07:13 PM         98 File.def
07/22/2012  07:13 PM    348,195 File.obj
07/22/2012  07:13 PM    281,600 File.pgd
07/01/2011  10:34 AM    100,643 File.pm
02/25/2011  06:22 PM     15,217 File.xs
07/22/2012  07:13 PM     35,108 Makefile
07/22/2012  06:58 PM     35,108 Makefile.old
07/01/2011  02:52 AM      6,342 Makefile.PL
07/01/2011  10:36 AM        412 MANIFEST
07/01/2011  10:36 AM      1,141 META.json
07/01/2011  10:36 AM        641 META.yml
07/22/2012  07:13 PM      1,191 MYMETA.json
07/22/2012  07:13 PM        683 MYMETA.yml
07/22/2012  07:13 PM          0 pm_to_blib
09/16/2009  05:57 PM      8,068 ppport.h
07/01/2011  10:35 AM      4,818 README
07/22/2012  06:55 PM  <DIR>     t
09/16/2009  05:57 PM      3,449 typemap
07/22/2012  07:13 PM    110,592 vc90.pdb
              28 File(s)      1,123,912 bytes
               6 Dir(s)  501,818,015,744 bytes free

C:\Documents and Settings\Administrator\Desktop\w32f>
Looks good so far: we have the obj, the pgc, and the pgd files together. Now we need to recreate the DLL BUT NOT recreate any other files. The pdb, the obj, and the pgd must be exactly the same as the ones used for the instrumented DLL. Timestamps are checked, I believe. To do this, delete the DLL in the blib/arch/auto/***** folder.
C:\Documents and Settings\Administrator\Desktop\w32f>cd "C:\Documents and Settings\Administrator\Desktop\w32f\blib\arch\auto\Win32API\File"

C:\Documents and Settings\Administrator\Desktop\w32f\blib\arch\auto\Win32API\File>del File.dll

C:\Documents and Settings\Administrator\Desktop\w32f\blib\arch\auto\Win32API\File>dir
 Volume in drive C has no label.
 Volume Serial Number is 0CFF-E7B6

 Directory of C:\Documents and Settings\Administrator\Desktop\w32f\blib\arch\auto\Win32API\File

07/22/2012  07:28 PM  <DIR>     .
07/22/2012  07:28 PM  <DIR>     ..
07/22/2012  07:13 PM          0 .exists
07/22/2012  07:13 PM     42,680 File!1.pgc
07/22/2012  07:13 PM     42,240 File!2.pgc
07/22/2012  07:13 PM          0 File.bs
07/22/2012  07:13 PM        958 File.exp
07/22/2012  07:13 PM      1,946 File.lib
07/22/2012  07:13 PM    371,712 File.pdb
               7 File(s)        459,536 bytes
               2 Dir(s)  501,817,868,288 bytes free

C:\Documents and Settings\Administrator\Desktop\w32f\blib\arch\auto\Win32API\File>
Switch back to the root directory, and this time do
C:\Documents and Settings\Administrator\Desktop\w32f>nmake OPTIMIZE="-MD -Zi -DNDEBUG -O1 -GL" OTHERLDFLAGS="-LTCG:PGOPTIMIZE " install

Microsoft (R) Program Maintenance Utility Version 9.00.30729.01
Copyright (C) Microsoft Corporation.  All rights reserved.

        link -out:blib\arch\auto\Win32API\File\File.dll -dll -nologo -nodefaultlib -debug -opt:ref,icf -ltcg -libpath:"c:\p517\lib\CORE" -machine:x86 "/manifestdependency:type='Win32' name='Microsoft.Windows.Common-Controls' version='6.0.0.0' processorArchitecture='*' publicKeyToken='6595b64144ccf1df' language='*'" File.obj -LTCG:PGOPTIMIZE C:\p517\lib\CORE\perl517.lib oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib comctl32.lib msvcrt.lib -def:File.def
Merging File!1.pgc
Merging File!2.pgc
Generating code
43 of 43 (100.00) profiled functions will be compiled for speed
Finished generating code
        if exist blib\arch\auto\Win32API\File\File.dll.manifest mt -nologo -manifest blib\arch\auto\Win32API\File\File.dll.manifest -outputresource:blib\arch\auto\Win32API\File\File.dll;2
        if exist blib\arch\auto\Win32API\File\File.dll.manifest del blib\arch\auto\Win32API\File\File.dll.manifest
        C:\p517\bin\perl.exe -MExtUtils::Command -e chmod -- 755 blib\arch\auto\Win32API\File\File.dll
Files found in blib\arch: installing files in blib\lib into architecture dependent library tree
Installing C:\p517\lib\auto\Win32API\File\File!1.pgc
Installing C:\p517\lib\auto\Win32API\File\File!2.pgc
Installing C:\p517\lib\auto\Win32API\File\File.bs
Installing C:\p517\lib\auto\Win32API\File\File.dll
Installing C:\p517\lib\auto\Win32API\File\File.exp
Installing C:\p517\lib\auto\Win32API\File\File.lib
Installing C:\p517\lib\auto\Win32API\File\File.pdb
Appending installation info to c:\p517\lib/perllocal.pod

C:\Documents and Settings\Administrator\Desktop\w32f>
You now want to delete "C:\p517\lib\auto\Win32API\File\File!1.pgc" and "C:\p517\lib\auto\Win32API\File\File!2.pgc" and any other .pgc files that were accidentally copied.

Be very careful to check the build log of the 2nd DLL build. My 2nd build was successfully PGOed, but I think if PGO fails or warns, a non-PGOed DLL will be created anyway, which is bad. These are the lines to look for:
Merging File!1.pgc
Merging File!2.pgc
Generating code
43 of 43 (100.00) profiled functions will be compiled for speed
That's it.

If you're curious about PGO, try playing with the pgomgr tool. It can dump the branch by branch timings, path % distributions, and dead code for you, but I don't know of a way to match that data with your original C code.
C:\Documents and Settings\Administrator\Desktop\w32f>pgomgr
Microsoft (R) Profile Guided Optimization Manager 9.00.21022.08
Copyright (C) Microsoft Corporation.  All rights reserved.

Usage: PGOMGR [options] [Profile-Count-Paths ...] Profile-Database

    Options:
        /?          Display this help
        /help       Display this help
        /clear      Remove all merge data from this data base
        /detail     Display verbose program statistics
        /merge[:n]  Merge the given PGC file(s), with optional integer weight
        /summary    Display program statistics
        /unique     Display decorated function names

C:\Documents and Settings\Administrator\Desktop\w32f>
If you're curious about LTCG, BTW, I wouldn't use LTCG without PGO on 32 bit code. LTCG seems to offer no changes without PGO and adds a microscopic amount of code bloat. I'm not sure what the assembly code difference is.

The PGO DLL created above was 32 bit code. Its section sizes were
.text   C0CCh
.rdata  171Eh
.data   364h
.rsrc   3C0h
.reloc  92Ah
An LTCGed DLL with no PGO and no instrumentation was
.text   AD70h
.rdata  171Eh
.data   364h
.rsrc   3C0h
.reloc  8CCh
A plain -O1 DLL with no LTCG was
.text   AD6Ah
.rdata  171Eh
.data   364h
.rsrc   3C0h
.reloc  8CCh
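To put numbers on the three .text sizes listed above, here is a small arithmetic sketch. The constants are copied from the listings; the helper growth_basis_points and the constant names are made up for this example. With these values the PGO .text section comes out about 11.16% larger than the LTCG-only one, while LTCG alone adds just 6 bytes over plain -O1.

```c
#include <assert.h>

/* .text section sizes in bytes, copied from the three listings above. */
static const unsigned TEXT_PGO   = 0xC0CC;  /* LTCG + PGO       */
static const unsigned TEXT_LTCG  = 0xAD70;  /* LTCG, no PGO     */
static const unsigned TEXT_PLAIN = 0xAD6A;  /* plain -O1 build  */

/* Growth of `size` over `base` in hundredths of a percent (basis points),
 * using integer math only. Hypothetical helper, rounds down. */
static unsigned growth_basis_points(unsigned size, unsigned base)
{
    assert(base != 0 && size >= base);
    return (unsigned)(((unsigned long)(size - base) * 10000UL) / base);
}
```

For example, growth_basis_points(TEXT_PGO, TEXT_LTCG) gives 1116, i.e. 11.16%, and growth_basis_points(TEXT_LTCG, TEXT_PLAIN) gives 1, i.e. about 0.01%.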
Notice that LTCG with PGO results in a lot of code bloat. Assembly-wise, what would have been a 2 byte jump opcode (jump relative to IP, 1 byte operand) becomes a 5 byte jump opcode (jump relative to IP, 4 byte 32 bit operand) to the end of the executable region. Also, LTCG with PGO is very aggressive in inlining, which causes a lot of code bloat (11%, 0xC0CC/0xAD70 = 1.11...) even though -O1 was requested. In a different XS module than the one I used here, a 10 line Perl callback C function (pushmark, putback, call_pv, etc.) was inlined in all 6 places it was called.

I showed here how to use LTCG and PGO for XS modules. I didn't benchmark anything, and I can't advise you whether PGO is right for your project, just that it is possible. One final thought: using Win32API::File was a bad example for any assembly code comparison, because that module doesn't use PERL_NO_GET_CONTEXT.
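The 2 byte versus 5 byte jump mentioned above follows from the two x86 encodings of an unconditional relative JMP: EB with an 8 bit displacement, and E9 with a 32 bit displacement. The sketch below only models that size rule; the function names are made up for this example, and it ignores conditional jumps, whose 32 bit forms have a different length.

```c
#include <assert.h>
#include <stdint.h>

/* Encoded size in bytes of an unconditional x86 relative JMP:
 *   EB rel8  -> 2 bytes, displacement must fit in a signed 8 bit value
 *   E9 rel32 -> 5 bytes, displacement is a signed 32 bit value
 * The displacement is measured from the end of the jump instruction. */
static int jmp_encoded_size(int32_t displacement)
{
    return (displacement >= INT8_MIN && displacement <= INT8_MAX) ? 2 : 5;
}

/* Extra bytes when `count` short jumps turn into near jumps because
 * their targets were moved far away (hypothetical helper). */
static long extra_bytes_for_far_targets(long count)
{
    assert(count >= 0);
    return count * (5 - 2);
}
```

A branch target that is moved to the end of the executable region is almost certainly more than 127 bytes away, so each such jump costs 3 extra bytes in the hot code.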

Node Type: CUFP [id://983092]
Approved by GrandFather
Front-paged by ww