PerlMonks  

Help, directories?

by Anonymous Monk
on Dec 05, 2012 at 14:59 UTC (#1007306=perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear PerlMonks,
I'm using the module Search::VectorSpace to query a list of documents and return the relevant ones. The synopsis given in the module's documentation is as follows:

    use Search::VectorSpace;

    my @docs = ...;
    my $engine = Search::VectorSpace->new( docs => \@docs, threshold => .04 );
    $engine->build_index();

    while ( my $query = <> ) {
        my %results = $engine->search( $query );
        print join "\n", keys %results;
    }

I supplied the list of documents like this:

    $dr = "C:\\Users\\Desktop\\collection2";
    opendir(DR, "$dr") || die "error";
    @docs = <DR>;
    @docs = readdir DR;

However, the search runs only on the file names, not on the contents of the files. How do I search for something within the files of the given directory?

Re: Help, directories?
by choroba (Abbot) on Dec 05, 2012 at 15:01 UTC
    How is this question different from Vector space search?
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      The main difference I see is that this post has a much worse title than the original, and will gather more downvotes despite (or due to) the identical content.

Re: Help, directories?
by Rudolf (Monk) on Dec 05, 2012 at 15:36 UTC

    Hello! I am a bit confused about what exactly you're trying to do, so I made some assumptions. I am assuming you're just searching a bunch of files for a specific line or string and want to know which ones contain it. I wrote some code below. Since I don't know what kind of files you're searching through, the regex should be adapted to your needs, and if you're searching for strings that could span more than one line, reading line by line won't catch them. Hope this helps at least a little, unless I misunderstood your intentions. Good luck!

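    use v5.14;
    use warnings;

    my $dir = "C:\\Users\\Desktop\\collection2";

    opendir( my $dh, $dir ) or die "could not open directory: $!\n";
    # Keep plain files only; this also skips the "." and ".." entries.
    my @docs = grep { -f "$dir\\$_" } readdir $dh;
    closedir $dh;

    # Read one query per line from STDIN until end of input.
    while ( defined( my $query = <> ) ) {
        chomp $query;
        foreach my $doc (@docs) {
            open( my $fh, '<', "$dir\\$doc" ) or die "$dir\\$doc: $!\n";
            while ( my $line = <$fh> ) {
                # Match the query against each line of the file;
                # \Q...\E treats the query as a literal string, not a regex.
                if ( $line =~ /\Q$query\E/i ) {
                    print "$query appears in $doc\n";
                }
            }
            close $fh;
        }
    }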
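
As for the original question: readdir returns only the names of the directory entries, and the behaviour described in the question suggests that Search::VectorSpace indexes whatever strings it is handed in docs. A minimal sketch of one way around that is to read each file's text and pass the texts instead of the names. The helper name read_docs is hypothetical (it is not part of the module), and the sketch assumes the directory holds plain text files small enough to read into memory.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not part of Search::VectorSpace): returns two array
# references, the names of the plain files in $dir and their contents, in
# the same order.
sub read_docs {
    my ($dir) = @_;

    opendir( my $dh, $dir ) or die "could not open directory $dir: $!\n";
    # -f keeps plain files only, which also drops "." and ".."
    my @names = sort grep { -f File::Spec->catfile( $dir, $_ ) } readdir $dh;
    closedir $dh;

    my @contents;
    for my $name (@names) {
        my $path = File::Spec->catfile( $dir, $name );
        open( my $fh, '<', $path ) or die "could not open $path: $!\n";
        local $/;                  # slurp mode: read the whole file at once
        my $text = <$fh>;
        close $fh;
        push @contents, $text // '';
    }
    return ( \@names, \@contents );
}

# Usage sketch; the constructor arguments are copied from the synopsis
# quoted in the question:
#   my ( $names, $contents ) = read_docs("C:\\Users\\Desktop\\collection2");
#   my $engine = Search::VectorSpace->new( docs => $contents, threshold => .04 );
#   $engine->build_index();
```

The names are returned alongside the contents so that a match can be traced back to its file. How the search results map onto those names depends on what search returns, which this thread does not show.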
