http://www.perlmonks.org?node_id=1007306

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear PerlMonks,
I'm using the module Search::VectorSpace to query a list of documents and return the relevant ones. The description given in the module is as follows:

use Search::VectorSpace;

my @docs = ...;
my $engine = Search::VectorSpace->new( docs => \@docs, threshold => .04 );
$engine->build_index();

while ( my $query = <> ) {
    my %results = $engine->search( $query );
    print join "\n", keys %results;
}

I build the list of documents like this:

$dr="C:\\Users\\Desktop\\collection2";
opendir(DR, "$dr") || die "error";
@docs=<DR>;
@docs=readdir DR;

However, the search only looks at the file names, not at the contents of the files. How do I search for something within the files of the given directory?

Re: Help, directories?
by choroba (Cardinal) on Dec 05, 2012 at 15:01 UTC
    How is this question different from "Vector space search"?
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      The main difference I see is that this post has a much worse title than the original, and will gather more downvotes despite (or due to) the identical content.

Re: Help, directories?
by Rudolf (Pilgrim) on Dec 05, 2012 at 15:36 UTC

    Hello! I'm not sure exactly what you're trying to do, so I made some assumptions: you want to search a set of files for a specific line or string and find out which files contain it. I wrote some code below. Since I don't know what kind of files you're searching, adapt the regex to your needs; and if the strings you're looking for can span more than one line, reading line by line won't catch them. Hope this helps at least a little, unless I misunderstood your intentions. Good luck!

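    use v5.14;
    use warnings;

    my $dir = "C:\\Users\\Desktop\\collection2";
    opendir(my $dh, $dir) or die "could not open directory: $!\n";
    # keep plain files only; this also drops the . and .. entries
    my @docs = grep { -f "$dir\\$_" } readdir $dh;
    closedir $dh;

    # read one query per line from STDIN
    while ( defined(my $query = <>) ) {
        chomp $query;
        foreach my $doc (@docs) {
            open(my $fh, '<', "$dir\\$doc") or die "$dir\\$doc: $!\n";
            while (my $line = <$fh>) {
                # test the line against the query (not the query against the line);
                # \Q...\E treats the query as literal text, not as a regex
                if ($line =~ /\Q$query\E/i) {
                    print "$query appears in $doc\n";
                }
            }
            close $fh;
        }
    }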
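
To tie this back to Search::VectorSpace: readdir only returns file names, so if those names are what ends up in @docs, the names are all the engine can index. Below is a minimal sketch of reading each file's contents first and handing those to the engine instead. The helper names read_docs and search_dir are made up for illustration. The sketch assumes that the module treats each element of docs as document text and that the keys of the returned hash are those same strings, as the `keys %results` line in the synopsis suggests; check the module's documentation before relying on either. Slurping whole files also avoids the line-by-line limitation mentioned above.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: read every plain file in $dir and return
# (\@names, \@texts), where $texts->[$i] is the content of $names->[$i].
sub read_docs {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "could not open directory $dir: $!\n";
    # -f keeps plain files only, which also drops . and .. and subdirectories
    my @names = sort grep { -f File::Spec->catfile($dir, $_) } readdir $dh;
    closedir $dh;

    my @texts;
    for my $name (@names) {
        my $path = File::Spec->catfile($dir, $name);
        open(my $fh, '<', $path) or die "could not open $path: $!\n";
        local $/;                       # slurp mode: read the whole file at once
        my $text = <$fh>;
        close $fh;
        push @texts, defined $text ? $text : '';
    }
    return (\@names, \@texts);
}

# Hypothetical helper: index the file contents and return the names of the
# files the engine reports for $query.
# ASSUMPTION: search() returns a hash keyed by the document strings.
sub search_dir {
    my ($dir, $query) = @_;
    require Search::VectorSpace;        # loaded lazily; read_docs works without it

    my ($names, $texts) = read_docs($dir);
    my %name_for;
    @name_for{@$texts} = @$names;       # map document text back to its file name

    my $engine = Search::VectorSpace->new( docs => $texts, threshold => .04 );
    $engine->build_index();
    my %results = $engine->search($query);

    return grep { defined } map { $name_for{$_} } keys %results;
}
```

With the module installed, something like `print "$_\n" for search_dir("C:\\Users\\Desktop\\collection2", $query);` inside the query loop would then print matching file names rather than raw document text. Two files with identical contents would collide in %name_for, which is acceptable for a sketch but worth handling in real code.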