http://www.perlmonks.org?node_id=863081


in reply to Re^5: path-names [a very easy question of a true beginner]
in thread path-names [a very easy question of a true beginner]

Hello dear Khen1950fx,

I guess I have some bad luck today. Did I point this at the wrong directory? I cannot get it to run; I always get this response:

Results:
Can't stat /home/usr/perl/htmlfiles: No such file or directory at /usr/lib/perl5/site_perl/5.12.1/File/Find/Rule.pm line 594

I use:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use diagnostics;
    use File::Find::Rule;

    my @files = File::Find::Rule->file()
                                ->name('einzelergebnis*.html')
                                ->in('/home/usr/perl/htmlfiles');

    foreach my $file (@files) {
        print $file, "\n";
    }
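The "Can't stat" message means the directory handed to ->in() does not exist under that name. A minimal sketch of a check that could run before the search, using only Perl's built-in file tests; the helper name dir_problem is made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns a short description of the problem if $dir
# cannot be searched, or undef if $dir is an existing directory.
sub dir_problem {
    my ($dir) = @_;
    return "'$dir' does not exist"     unless -e $dir;
    return "'$dir' is not a directory" unless -d $dir;
    return;
}

my $dir = '/home/usr/perl/htmlfiles';   # the path from the script above

if (my $problem = dir_problem($dir)) {
    print "Cannot search: $problem\n";
}
else {
    print "OK, '$dir' can be searched\n";
}
```

If this reports that the path does not exist, the files are simply somewhere else, and the path passed to ->in() has to be corrected.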

Re^7: path-names [a very easy question of a true beginner]
by Perlbeginner1 (Scribe) on Oct 02, 2010 at 19:21 UTC
    Hello all,

    many thanks to you! I did as you advised, and now I was successful! I changed from

        #!/usr/bin/perl
        use strict;
        use warnings;
        use diagnostics;
        use File::Find::Rule;

        my @files = File::Find::Rule->file()
                                    ->name('einzelergebnis*.html')
                                    ->in('/home/usr/perl/htmlfiles');

        foreach my $file (@files) {
            print $file, "\n";
        }


    to this

        #!/usr/bin/perl
        use strict;
        use warnings;
        use diagnostics;
        use File::Find::Rule;

        my @files = File::Find::Rule->file()
                                    ->name('einzelergebnis*.html')
                                    ->in('.');

        foreach my $file (@files) {
            print $file, "\n";
        }

    and then I got the following output:

        htmlfiles/einzelergebnis80b5.html
        htmlfiles/einzelergebnisa0ef.html
        htmlfiles/einzelergebnis1b42.html
        htmlfiles/einzelergebnis5960.html
        htmlfiles/einzelergebnise523.html
        htmlfiles/einzelergebnis2c7e.html
        htmlfiles/einzelergebnisdf57.html
        htmlfiles/einzelergebnis2b53-2.html
        htmlfiles/einzelergebnisb1c0-2.html
        htmlfiles/einzelergebnis8e8b.html
        htmlfiles/einzelergebnisdcc1.html
        htmlfiles/einzelergebnis1dae-2.html
        htmlfiles/einzelergebnisa70d.html
        htmlfiles/einzelergebnis3cec.html
        htmlfiles/einzelergebnis3f1f.html
        htmlfiles/einzelergebnis1d2b.html
        htmlfiles/einzelergebnis396c.html
        htmlfiles/einzelergebnis2592.html
        htmlfiles/einzelergebnisdee0.html
        htmlfiles/einzelergebnis987b-2.html
        htmlfiles/einzelergebnise20b.html


    ...and 22 thousand more lines... ;-)
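The paths printed for ->in('.') are relative to the directory the script was started in. A small sketch, assuming only the core modules Cwd and File::Spec, that shows which absolute path the relative name htmlfiles stands for; the helper name to_absolute is made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;

# Hypothetical helper: turn a relative path (as printed by ->in('.'))
# into an absolute one, based on the current working directory.
sub to_absolute {
    my ($relative) = @_;
    return File::Spec->rel2abs($relative);
}

print "Current directory: ", getcwd(), "\n";
print "htmlfiles is at:   ", to_absolute('htmlfiles'), "\n";
```

The second line of output is the path that would work in ->in() no matter where the script is started from.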

    This seems to be the starting point! Now I can continue figuring out how to configure Keath's script; see this post in another thread: http://forums.devshed.com/showpost.php?p=2538358&postcount=12 . As the previous thread is very, very long, I think it is worth beginning a new one. Note: many, many thanks to Keath and Axldrweil for their great and generous help! So, after having nailed down the I/O handle issues and the path names in general, the parser script has to be configured.

    Well, this means I have to put the file name, including its path, into $file, and furthermore define a path in $html_dir.
    BTW, what does the array @html_files do?
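Keath's script is not shown in this post, so the following is only a guess at the usual pattern: $html_dir names the directory, @html_files holds the list of files found in it, and each element takes the place of the single hard-coded $file. A minimal sketch under those assumptions; the helper name list_html_files is made up, and the directory name htmlfiles comes from the output above:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Assumption: $html_dir is the directory holding the downloaded pages,
# relative to where the script is started.
my $html_dir = 'htmlfiles';

# Hypothetical helper: list every einzelergebnis*.html file directly in $dir.
sub list_html_files {
    my ($dir) = @_;
    my $pattern = File::Spec->catfile($dir, 'einzelergebnis*.html');
    my @found   = glob($pattern);
    return sort @found;
}

my @html_files = list_html_files($html_dir);

foreach my $file (@html_files) {
    # $file already includes the directory part, so it could be handed
    # to HTML::TokeParser->new($file) in place of 'school.html'.
    print "would parse $file\n";
}
```

So the array is just the to-do list: one entry per HTML file, processed one after the other in the loop.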


    Here is the full code of the HTML parser:
        #!/usr/bin/perl
        use strict;
        use warnings;
        use HTML::TokeParser;

        my $file = 'school.html';
        my $p = HTML::TokeParser->new($file) or die "Can't open: $!";

        my %school;

        while (my $tag = $p->get_tag('div', '/html')) {
            # first move to the right div that contains the information
            last if $tag->[0] eq '/html';
            next unless exists $tag->[1]{'id'} and $tag->[1]{'id'} eq 'inhalt_large';

            $p->get_tag('h1');
            $school{'location'} = $p->get_text('/h1');

            while (my $tag = $p->get_tag('div')) {
                last if exists $tag->[1]{'id'} and $tag->[1]{'id'} eq 'fusszeile';

                # get the school name from the heading
                next unless exists $tag->[1]{'class'} and $tag->[1]{'class'} eq 'fm_linkeSpalte';
                $p->get_tag('h2');
                $school{'name'} = $p->get_text('/h2');

                # verify format for school type
                $tag = $p->get_tag('span');
                unless (exists $tag->[1]{'class'} and $tag->[1]{'class'} eq 'schulart_text') {
                    warn "unexpected format: parsing stopped";
                    last;
                }
                $school{'type'} = $p->get_text('/span');

                # verify format for address
                $tag = $p->get_tag('p');
                unless (exists $tag->[1]{'class'} and $tag->[1]{'class'} eq 'einzel_text') {
                    warn "unexpected format: parsing stopped";
                    last;
                }
                $school{'address'} = clean_address($p->get_text('/p'));

                # find the description
                $tag = $p->get_tag('p');
                $school{'description'} = $p->get_text('/p');
            }
        }

        print qq/$school{'name'}\n/;
        print qq/$school{'location'}\n/;
        print qq/$school{'type'}\n/;

        foreach (@{$school{'address'}}) {
            print "$_\n";
        }

        print qq/\nDescription: $school{'description'}\n/;

        sub clean_address {
            my $text = shift;
            my @lines = split "\n", $text;
            foreach (@lines) {
                s/^\s+//;
                s/\s+$//;
            }
            return \@lines;
        }


    Note: I can provide much more information on what the script does!

    I look forward to any and all help! This is a very, very great place to share knowledge! Many, many thanks for this great place!
    perlbeginner1!
      This is probably beside the point, but instead of
      my @files = File::Find::Rule->file() ->name('einzelergebnis*.html') ->in( '.' );
      I would simply use
      my @files = <einzelergebnis*.html>;
      as all your files seem to be in one directory (which also seems to be your current working directory).

      The difference is that your code would also find files that reside in subdirectories, and that may or may not be what you want. (I would not want subdirectories searched, because then I can simply create a subdirectory and move the files I want to exclude from processing there. You may think differently about this; you just have to be aware of it.)
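To make the difference concrete, here is a rough sketch that runs both kinds of search over the same directory, using glob and the core module File::Find rather than File::Find::Rule so that it runs without extra modules. The helper names find_flat and find_recursive are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Non-recursive: only files directly inside $dir (what the glob does).
sub find_flat {
    my ($dir) = @_;
    my @found = glob("$dir/einzelergebnis*.html");
    return sort @found;
}

# Recursive: files in $dir and in every subdirectory below it
# (comparable to what File::Find::Rule->in() does by default).
sub find_recursive {
    my ($dir) = @_;
    my @found;
    find(sub {
        # Inside the callback, $_ is the bare file name and
        # $File::Find::name is the path including the directory.
        push @found, $File::Find::name
            if -f $_ && /^einzelergebnis.*\.html$/;
    }, $dir);
    return sort @found;
}

my @flat = find_flat('.');
my @deep = find_recursive('.');
printf "glob found %d file(s), recursive search found %d file(s)\n",
    scalar @flat, scalar @deep;
```

If the two counts differ, some matching files sit in subdirectories. With File::Find::Rule itself, adding ->maxdepth(1) to the chain should restrict the search to the starting directory as well.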