PerlMonks  

Re^4: path-names [a very easy question of a true beginner]

by Perlbeginner1 (Scribe)
on Oct 02, 2010 at 12:49 UTC (#863065)


in reply to Re^3: path-names [a very easy question of a true beginner]
in thread path-names [a very easy question of a true beginner]

hello dear Khen1950fx

Many thanks for the quick reply; I am very happy to hear from you! I changed the paths to absolute ones as well, but I still did not get good results.

I found out that I made some mistakes when describing the HTML files. Note: there are more than 20,000 HTML files in the directory, which is called htmlfiles (I renamed it from html.files to htmlfiles). The files themselves are all named according to the following scheme:

einzelergebnis1...
einzelergebnis2...
einzelergebnis3a...
einzelergebnis3b...
einzelergebnis3d...

and so forth.
So my question is: what is the convention for the pattern in this line?

->name('*.einzel')

Should it be

->name('*.einzel')

or

->name('einzel*.')

I ask because I guess that Perl does not find the files since I have the pattern wrong.

Here is the code. It might run for you, since you have put some files named html* into the folder:

#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
use File::Find::Rule;

my @files = File::Find::Rule->file()
    ->name('*.einzel')
    ->in('/home/usr/htmlfiles');

foreach my $file (@files) {
    print $file, "\n";
}

I would love to hear from you!
BTW, if I need to clarify my question and ask more precisely, then let me know!

regards
perlbeginner1


Re^5: path-names [a very easy question of a true beginner]
by Khen1950fx (Canon) on Oct 02, 2010 at 13:27 UTC
    Use  ->name('einzelergebnis*'). Like so:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use diagnostics;
    use File::Find::Rule;

    my @files = File::Find::Rule->file()
        ->name('einzelergebnis*')
        ->in('/home/usr/htmlfiles');

    foreach my $file (@files) {
        print $file, "\n";
    }

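To see why the pattern matters: name() takes shell-style globs, so '*.einzel' asks for names that end in ".einzel", while 'einzelergebnis*' asks for names that start with "einzelergebnis". Here is a minimal sketch that tries the patterns from this thread against a throwaway directory. The temporary directory and the three sample file names are assumptions made up for the demonstration; they are not the real htmlfiles contents.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Find::Rule;

# Hypothetical sample directory standing in for htmlfiles.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(einzelergebnis1.html einzelergebnis3a.html notes.txt)) {
    open my $fh, '>', "$dir/$name" or die "Can't create $dir/$name: $!";
    close $fh;
}

# Count how many plain files each glob pattern from the thread matches.
my %count;
for my $pattern ('*.einzel', 'einzel*.', 'einzelergebnis*', 'einzelergebnis*.html') {
    my @found = File::Find::Rule->file()->name($pattern)->in($dir);
    $count{$pattern} = scalar @found;
    printf "%-22s matches %d file(s)\n", $pattern, $count{$pattern};
}
```

With these sample names, the first two patterns should find nothing, because no file name ends in ".einzel" or in a bare dot, and the last two should find both einzelergebnis files.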
      hello dear Khen1950fx

      I guess I am having some bad luck today. Am I pointing it at the wrong directory? I cannot get it to run; I always get this response:

      Results:
      Can't stat /home/usr/perl/htmlfiles: No such file or directory at /usr/lib/perl5/site_perl/5.12.1/File/Find/Rule.pm line 594

      I use:

      #!/usr/bin/perl
      use strict;
      use warnings;
      use diagnostics;
      use File::Find::Rule;

      my @files = File::Find::Rule->file()
          ->name('einzelergebnis*.html')
          ->in('/home/usr/perl/htmlfiles');

      foreach my $file (@files) {
          print $file, "\n";
      }

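The "Can't stat ... No such file or directory" message says that the path handed to in() does not exist as written. A quick way to check, sketched here with core Perl only, is to test each candidate directory with -d before searching it. The two candidate paths are the ones mentioned in this thread; the %is_dir hash is just a name chosen for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Candidate start directories taken from the posts in this thread;
# adjust them to the real layout on your machine.
my @candidates = ('/home/usr/perl/htmlfiles', '.');

# Record and report whether each candidate is an existing directory.
my %is_dir;
for my $dir (@candidates) {
    $is_dir{$dir} = -d $dir ? 1 : 0;
    printf "%-28s %s\n", $dir, $is_dir{$dir} ? 'exists' : 'missing';
}

# Relative paths such as '.' are resolved against this directory.
print "Current working directory: ", getcwd(), "\n";
```

If the absolute path shows up as missing, the directory is probably somewhere else, and printing the current working directory usually makes it clear where the script is actually looking.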
        hello all

        Many thanks to you! I did as you advised, and now I was successful! I changed from
        #!/usr/bin/perl
        use strict;
        use warnings;
        use diagnostics;
        use File::Find::Rule;

        my @files = File::Find::Rule->file()
            ->name('einzelergebnis*.html')
            ->in('/home/usr/perl/htmlfiles');

        foreach my $file (@files) {
            print $file, "\n";
        }


        to this

        #!/usr/bin/perl
        use strict;
        use warnings;
        use diagnostics;
        use File::Find::Rule;

        my @files = File::Find::Rule->file()
            ->name('einzelergebnis*.html')
            ->in('.');

        foreach my $file (@files) {
            print $file, "\n";
        }

        and then I got the following output:

        htmlfiles/einzelergebnis80b5.html
        htmlfiles/einzelergebnisa0ef.html
        htmlfiles/einzelergebnis1b42.html
        htmlfiles/einzelergebnis5960.html
        htmlfiles/einzelergebnise523.html
        htmlfiles/einzelergebnis2c7e.html
        htmlfiles/einzelergebnisdf57.html
        htmlfiles/einzelergebnis2b53-2.html
        htmlfiles/einzelergebnisb1c0-2.html
        htmlfiles/einzelergebnis8e8b.html
        htmlfiles/einzelergebnisdcc1.html
        htmlfiles/einzelergebnis1dae-2.html
        htmlfiles/einzelergebnisa70d.html
        htmlfiles/einzelergebnis3cec.html
        htmlfiles/einzelergebnis3f1f.html
        htmlfiles/einzelergebnis1d2b.html
        htmlfiles/einzelergebnis396c.html
        htmlfiles/einzelergebnis2592.html
        htmlfiles/einzelergebnisdee0.html
        htmlfiles/einzelergebnis987b-2.html
        htmlfiles/einzelergebnise20b.html


        ...and 22 thousand lines further... ;-)
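Note that searching from '.' yields paths relative to the current working directory, which is why every line starts with htmlfiles/. If a later step needs absolute paths, one option is File::Spec->rel2abs from core Perl. This is only a sketch; the sample path is copied from the output above, and rel2abs does not check whether the file exists.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Sample relative path in the shape of the output above.
my $relative = 'htmlfiles/einzelergebnis80b5.html';

# rel2abs resolves the path against the current working directory
# by default; it works on the string only and never touches the disk.
my $absolute = File::Spec->rel2abs($relative);

print "$relative\n  -> $absolute\n";
```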

        This seems to be the starting point! Now I can continue figuring out how to configure Keath's script; see http://forums.devshed.com/showpost.php?p=2538358&postcount=12 for the post in another thread that contains the little script. As that previous thread is very long, I think it is worth beginning a new one. Note: many thanks to Keath and Axldrweil for their great and generous help! So, after having nailed down the I/O handle issues and the path names in general, the parser script has to be configured.

        This means I have to set $file to the file or directory, including its path, and furthermore define a path in $html_dir.
        BTW, what does the array @html_files do?

        Here is the full code of the HTML parser:
        #!/usr/bin/perl
        use strict;
        use warnings;
        use HTML::TokeParser;

        my $file = 'school.html';
        my $p = HTML::TokeParser->new($file) or die "Can't open: $!";

        my %school;
        while (my $tag = $p->get_tag('div', '/html')) {
            # first move to the right div that contains the information
            last if $tag->[0] eq '/html';
            next unless exists $tag->[1]{'id'} and $tag->[1]{'id'} eq 'inhalt_large';

            $p->get_tag('h1');
            $school{'location'} = $p->get_text('/h1');

            while (my $tag = $p->get_tag('div')) {
                last if exists $tag->[1]{'id'} and $tag->[1]{'id'} eq 'fusszeile';

                # get the school name from the heading
                next unless exists $tag->[1]{'class'} and $tag->[1]{'class'} eq 'fm_linkeSpalte';
                $p->get_tag('h2');
                $school{'name'} = $p->get_text('/h2');

                # verify format for school type
                $tag = $p->get_tag('span');
                unless (exists $tag->[1]{'class'} and $tag->[1]{'class'} eq 'schulart_text') {
                    warn "unexpected format: parsing stopped";
                    last;
                }
                $school{'type'} = $p->get_text('/span');

                # verify format for address
                $tag = $p->get_tag('p');
                unless (exists $tag->[1]{'class'} and $tag->[1]{'class'} eq 'einzel_text') {
                    warn "unexpected format: parsing stopped";
                    last;
                }
                $school{'address'} = clean_address($p->get_text('/p'));

                # find the description
                $tag = $p->get_tag('p');
                $school{'description'} = $p->get_text('/p');
            }
        }

        print qq/$school{'name'}\n/;
        print qq/$school{'location'}\n/;
        print qq/$school{'type'}\n/;
        foreach (@{$school{'address'}}) {
            print "$_\n";
        }
        print qq/\nDescription: $school{'description'}\n/;

        sub clean_address {
            my $text = shift;
            my @lines = split "\n", $text;
            foreach (@lines) {
                s/^\s+//;
                s/\s+$//;
            }
            return \@lines;
        }
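The parser above handles a single file, and the version of the script that uses $html_dir and @html_files is not shown in this post. Assuming @html_files is simply the list of HTML files found in $html_dir, the loop might look roughly like the sketch below. The temporary directory, the sample files, and the process_file placeholder are all hypothetical; in the real script the HTML::TokeParser code would run where process_file is called.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical stand-in for the real htmlfiles directory.
my $html_dir = tempdir(CLEANUP => 1);
for my $name (qw(einzelergebnis1.html einzelergebnis2.html readme.txt)) {
    my $path = File::Spec->catfile($html_dir, $name);
    open my $fh, '>', $path or die "Can't create $path: $!";
    print {$fh} "<html><body><h1>$name</h1></body></html>\n";
    close $fh;
}

# Assumption: @html_files is the list of HTML file names in $html_dir.
opendir my $dh, $html_dir or die "Can't open $html_dir: $!";
my @html_files = grep { /^einzelergebnis.*\.html$/ } readdir $dh;
closedir $dh;
@html_files = sort @html_files;

# Build the full path for each name and hand it to the per-file step.
my @processed;
for my $name (@html_files) {
    my $file = File::Spec->catfile($html_dir, $name);
    push @processed, process_file($file);
}
print "$_\n" for @processed;

# Hypothetical placeholder for the parsing step: it only reports
# the path and the file size in bytes.
sub process_file {
    my ($file) = @_;
    return sprintf '%s (%d bytes)', $file, -s $file;
}
```

Joining $html_dir and the file name with File::Spec->catfile keeps the loop independent of the current working directory.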


        Note: I can provide much more information on what the script does!

        I look forward to any and all help! This is a great place to share knowledge! Many thanks for this great place!
        perlbeginner1
