PerlMonks  

Re^5: path-names [a very easy question of a true beginner]

by Khen1950fx (Canon)
on Oct 02, 2010 at 13:27 UTC


in reply to Re^4: path-names [a very easy question of a true beginner]
in thread path-names [a very easy question of a true beginner]

Use  ->name('einzelergebnis*'). Like so:
#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
use File::Find::Rule;

my @files = File::Find::Rule->file()
                            ->name('einzelergebnis*')
                            ->in('/home/usr/htmlfiles');

foreach my $file (@files) {
    print $file, "\n";
}

Re^6: path-names [a very easy question of a true beginner]
by Perlbeginner1 (Scribe) on Oct 02, 2010 at 15:33 UTC
    Hello dear Khen1950fx,

    I guess I have some bad luck today. Am I pointing this at the wrong directory? I cannot get it to run; I always get this response:

    Results:
    Can't stat /home/usr/perl/htmlfiles: No such file or directory at /usr/lib/perl5/site_perl/5.12.1/File/Find/Rule.pm line 594

    I use:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use diagnostics;
    use File::Find::Rule;

    my @files = File::Find::Rule->file()
                                ->name('einzelergebnis*.html')
                                ->in('/home/usr/perl/htmlfiles');

    foreach my $file (@files) {
        print $file, "\n";
    }
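The "Can't stat" message above says that the directory handed to in() does not exist as written. A minimal sketch, using only core Perl, of checking the start directory before searching. The helper name check_search_dir is hypothetical and not part of File::Find::Rule:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Hypothetical helper: returns a description of the problem,
# or undef when $dir exists and is a directory.
sub check_search_dir {
    my ($dir) = @_;
    return "no such directory: $dir (current directory is " . getcwd() . ")"
        unless -e $dir;
    return "not a directory: $dir" unless -d $dir;
    return;
}

my $dir = '/home/usr/perl/htmlfiles';   # the path from the post above
if (my $problem = check_search_dir($dir)) {
    warn "$problem\n";
}
else {
    print "$dir can be searched\n";
}
```

Printing the current working directory alongside the failing path often makes it obvious whether a relative path such as '.' or 'htmlfiles' would be the better argument.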
      Hello all,

      Many thanks to you! I did as you advised, and now it works. I changed from

      #!/usr/bin/perl
      use strict;
      use warnings;
      use diagnostics;
      use File::Find::Rule;

      my @files = File::Find::Rule->file()
                                  ->name('einzelergebnis*.html')
                                  ->in('/home/usr/perl/htmlfiles');

      foreach my $file (@files) {
          print $file, "\n";
      }

      to this

      #!/usr/bin/perl
      use strict;
      use warnings;
      use diagnostics;
      use File::Find::Rule;

      my @files = File::Find::Rule->file()
                                  ->name('einzelergebnis*.html')
                                  ->in('.');

      foreach my $file (@files) {
          print $file, "\n";
      }

      and then i got the following output:
      htmlfiles/einzelergebnis80b5.html
      htmlfiles/einzelergebnisa0ef.html
      htmlfiles/einzelergebnis1b42.html
      htmlfiles/einzelergebnis5960.html
      htmlfiles/einzelergebnise523.html
      htmlfiles/einzelergebnis2c7e.html
      htmlfiles/einzelergebnisdf57.html
      htmlfiles/einzelergebnis2b53-2.html
      htmlfiles/einzelergebnisb1c0-2.html
      htmlfiles/einzelergebnis8e8b.html
      htmlfiles/einzelergebnisdcc1.html
      htmlfiles/einzelergebnis1dae-2.html
      htmlfiles/einzelergebnisa70d.html
      htmlfiles/einzelergebnis3cec.html
      htmlfiles/einzelergebnis3f1f.html
      htmlfiles/einzelergebnis1d2b.html
      htmlfiles/einzelergebnis396c.html
      htmlfiles/einzelergebnis2592.html
      htmlfiles/einzelergebnisdee0.html
      htmlfiles/einzelergebnis987b-2.html
      htmlfiles/einzelergebnise20b.html


      ...and 22 thousand more lines... ;-)
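The paths in this output are relative to the directory the script was started from, because the search began at '.'. If absolute paths are needed later, a small sketch with the core module File::Spec could look like this. The helper name absolute_paths and the base directory '/some/base' are placeholders, not taken from the thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: turns paths relative to $base into absolute paths.
sub absolute_paths {
    my ($base, @relative) = @_;
    return map { File::Spec->rel2abs($_, $base) } @relative;
}

# '/some/base' is a placeholder for the directory the search was started in.
my @absolute = absolute_paths(
    '/some/base',
    'htmlfiles/einzelergebnis80b5.html',
    'htmlfiles/einzelergebnisa0ef.html',
);
print "$_\n" for @absolute;
```

Calling File::Spec->rel2abs with a single argument uses the current working directory as the base, which matches what in('.') searched.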

      This seems to be the starting point! Now I can continue figuring out how to configure Keath's script (see this link to another thread with the little script: http://forums.devshed.com/showpost.php?p=2538358&postcount=12). As the previous thread is very long, I think it is worth beginning a new one. Note: many, many thanks to Keath and Axldrweil for their great and generous help! So, after having nailed down the I/O handle issues and the path names in general, the parser script has to be configured.

      Well, this means I have to define the path in $file (the file or directory, including its path) and furthermore define a path in $html_dir.
      BTW, what does the array @html_files do?

      Here is the full code of the HTML parser:

      #!/usr/bin/perl
      use strict;
      use warnings;
      use HTML::TokeParser;

      my $file = 'school.html';
      my $p = HTML::TokeParser->new($file) or die "Can't open: $!";

      my %school;
      while (my $tag = $p->get_tag('div', '/html')) {
          # first move to the right div that contains the information
          last if $tag->[0] eq '/html';
          next unless exists $tag->[1]{'id'} and $tag->[1]{'id'} eq 'inhalt_large';

          $p->get_tag('h1');
          $school{'location'} = $p->get_text('/h1');

          while (my $tag = $p->get_tag('div')) {
              last if exists $tag->[1]{'id'} and $tag->[1]{'id'} eq 'fusszeile';

              # get the school name from the heading
              next unless exists $tag->[1]{'class'} and $tag->[1]{'class'} eq 'fm_linkeSpalte';
              $p->get_tag('h2');
              $school{'name'} = $p->get_text('/h2');

              # verify format for school type
              $tag = $p->get_tag('span');
              unless (exists $tag->[1]{'class'} and $tag->[1]{'class'} eq 'schulart_text') {
                  warn "unexpected format: parsing stopped";
                  last;
              }
              $school{'type'} = $p->get_text('/span');

              # verify format for address
              $tag = $p->get_tag('p');
              unless (exists $tag->[1]{'class'} and $tag->[1]{'class'} eq 'einzel_text') {
                  warn "unexpected format: parsing stopped";
                  last;
              }
              $school{'address'} = clean_address($p->get_text('/p'));

              # find the description
              $tag = $p->get_tag('p');
              $school{'description'} = $p->get_text('/p');
          }
      }

      print qq/$school{'name'}\n/;
      print qq/$school{'location'}\n/;
      print qq/$school{'type'}\n/;
      foreach (@{$school{'address'}}) {
          print "$_\n";
      }
      print qq/\nDescription: $school{'description'}\n/;

      sub clean_address {
          my $text = shift;
          my @lines = split "\n", $text;
          foreach (@lines) {
              s/^\s+//;
              s/\s+$//;
          }
          return \@lines;
      }
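The script above handles a single file named in $file. One possible way to run it over many files is sketched below with core Perl only. It assumes, without having seen Keath's script, that $html_dir names the directory with the saved pages and that @html_files holds the list of files found there. The names list_html_files and parse_all are hypothetical, and the parser is replaced by a stand-in that only reports the file size:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Assumption: @html_files is simply the list of pages found in $html_dir.
sub list_html_files {
    my ($html_dir) = @_;
    opendir my $dh, $html_dir or die "Can't open $html_dir: $!";
    my @names = grep { /^einzelergebnis.*\.html\z/ } readdir $dh;
    closedir $dh;
    my @html_files = map { File::Spec->catfile($html_dir, $_) } sort @names;
    return @html_files;
}

# Calls $parse_one (a code reference) once per file and collects the results.
sub parse_all {
    my ($parse_one, @html_files) = @_;
    my %result_for;
    $result_for{$_} = $parse_one->($_) for @html_files;
    return \%result_for;
}

# Stand-in for the real parser; it only records the size of each file.
my $parse_one = sub {
    my ($file) = @_;
    return +{ size => -s $file };
};

my $html_dir = 'htmlfiles';   # relative to the current directory, as in the output above
if (-d $html_dir) {
    my $results = parse_all($parse_one, list_html_files($html_dir));
    print "$_: $results->{$_}{size} bytes\n" for sort keys %$results;
}
```

To use the real parser, the body of the TokeParser script would be moved into a subroutine that takes the file name as its argument and returns a reference to %school; that subroutine would then take the place of $parse_one.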


      Note: I can provide much more information on what the script does!

      I look forward to any and all help! This is a very great place to share knowledge! Many, many thanks for this great place!
      perlbeginner1
        This is probably beside the point, but instead of
        my @files = File::Find::Rule->file() ->name('einzelergebnis*.html') ->in( '.' );
        I would simply use
        my @files = <einzelergebnis*.html>;
        as all your files seem to be in one directory (which also seems to be your current working directory).

        The difference is that your code would also find files that reside in subdirectories, and that may or may not be what you want. (I would not want subdirectories searched, because I could then simply create a subdirectory and move the files I want to exclude from processing there. You may think differently about this; you just have to be aware of it.)
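The difference described here can be tried out on a throwaway directory tree. The sketch below uses only core modules: bsd_glob from File::Glob stands in for the <...> glob operator, and File::Find stands in for the recursive File::Find::Rule search. The function names flat_matches and recursive_matches and the subdirectory name 'excluded' are made up for this illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Glob qw(bsd_glob);
use File::Spec;
use File::Temp qw(tempdir);

# Non-recursive: only files directly inside $dir.
sub flat_matches {
    my ($dir) = @_;
    my @found = bsd_glob(File::Spec->catfile($dir, 'einzelergebnis*.html'));
    return sort @found;
}

# Recursive: files in $dir and in every directory below it.
sub recursive_matches {
    my ($dir) = @_;
    my @found;
    find(sub {
        return unless -f $_;
        return unless /^einzelergebnis.*\.html\z/;
        push @found, $File::Find::name;
    }, $dir);
    return sort @found;
}

# Throwaway tree: two matching files at the top, one in a subdirectory.
my $top = tempdir(CLEANUP => 1);
my $sub = File::Spec->catdir($top, 'excluded');
mkdir $sub or die "Can't create $sub: $!";
for my $path (
    File::Spec->catfile($top, 'einzelergebnis0001.html'),
    File::Spec->catfile($top, 'einzelergebnis0002.html'),
    File::Spec->catfile($sub, 'einzelergebnis0003.html'),
) {
    open my $fh, '>', $path or die "Can't create $path: $!";
    close $fh;
}

my @flat      = flat_matches($top);
my @recursive = recursive_matches($top);
printf "glob found %d file(s), recursive search found %d\n",
    scalar @flat, scalar @recursive;
```

With this tree the glob sees the two top-level files, while the recursive search also picks up the one in 'excluded'. If File::Find::Rule is preferred, its maxdepth method can limit how deep the search descends.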
