accessing files

by arunmep (Beadle)
on Jul 14, 2005 at 12:17 UTC

arunmep has asked for the wisdom of the Perl Monks concerning the following question:

Hi everybody. I am basically a biologist, and I am developing a database for a particular plant. I have a lot of text and PDF files, and I need to create a search tool that searches all of the files and returns those that contain a given keyword. I want to know whether it would be efficient, in terms of speed, to develop this tool in Perl (searching hundreds of files). Please reply.

Re: accessing files
by mpeters (Chaplain) on Jul 14, 2005 at 13:28 UTC
    I would look at swish-e. It's an extremely fast and flexible tool for indexing and searching various kinds of documents (HTML, XML, text, PDF, doc, etc.) and has a nice Perl interface.

Re: accessing files
by blazar (Canon) on Jul 14, 2005 at 12:28 UTC
    100's of files doesn't sound much like a thing that should scare perl. As far as text files are concerned, perl's own functions and operators are all that you need. For pdf files, you can search CPAN for pdf tools.
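
As a rough illustration of the point about text files, here is a minimal sketch that uses only core Perl (File::Find plus a regex match). The sub name `files_matching` and the restriction to `*.txt` files are assumptions made for this example, not something from the thread, and PDF files would first need their text extracted with a CPAN module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Return the sorted list of *.txt files under $dir that contain $keyword.
# The match is literal (\Q...\E) and case-insensitive.
# files_matching is a hypothetical helper name used only for this sketch.
sub files_matching {
    my ($dir, $keyword) = @_;
    my @hits;
    find(
        {
            no_chdir => 1,    # keep $File::Find::name usable as a path
            wanted   => sub {
                my $path = $File::Find::name;
                return unless -f $path && $path =~ /\.txt\z/i;
                open my $fh, '<', $path or do {
                    warn "Cannot open $path: $!";
                    return;
                };
                while (my $line = <$fh>) {
                    if ($line =~ /\Q$keyword\E/i) {
                        push @hits, $path;
                        last;    # one match is enough to report the file
                    }
                }
                close $fh;
            },
        },
        $dir
    );
    my @sorted = sort @hits;
    return @sorted;
}

# Usage: perl search.pl DIRECTORY KEYWORD
if (@ARGV == 2) {
    print "$_\n" for files_matching(@ARGV);
}
```

Reading line by line and stopping at the first match keeps memory use low, which is usually enough for a few hundred files.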

Re: accessing files
by straywalrus (Friar) on Jul 14, 2005 at 14:49 UTC
    You may also want to take a look at 'Beginning Perl for Bioinformatics' by James Tisdall, published by O'Reilly. Although it does not specifically answer your question, it may help with future problems you have as a biologist.

Re: accessing files
by garrison (Scribe) on Jul 14, 2005 at 14:24 UTC
    I use Perl to process thousands of PDF files and have no complaints about speed, although for maximal performance we store everything in a database and search that.
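
The reply above does not say which database or schema is used, so the following is only a sketch of the general idea: index the text once, then answer keyword queries from the index instead of rescanning every file. An in-memory hash stands in for the database here, and the names `build_index` and `lookup` and the sample file names are assumptions made for illustration.

```perl
use strict;
use warnings;

# Build an inverted index: lowercase word => { file name => 1 }.
# $texts is a hash ref mapping a file name to its already-extracted text.
# build_index and lookup are hypothetical names used only for this sketch.
sub build_index {
    my ($texts) = @_;
    my %index;
    for my $file (keys %$texts) {
        for my $word (lc($texts->{$file}) =~ /(\w+)/g) {
            $index{$word}{$file} = 1;
        }
    }
    return \%index;
}

# Look up a single keyword (case-insensitive); returns sorted file names.
sub lookup {
    my ($index, $keyword) = @_;
    my @files = sort keys %{ $index->{ lc($keyword) } || {} };
    return @files;
}

# Sample data, made up for the example.
my $index = build_index({
    'notes.txt'  => 'Leaf morphology of the plant',
    'genome.pdf' => 'Plant genome assembly notes',
});
print join(', ', lookup($index, 'plant')), "\n";    # prints "genome.pdf, notes.txt"
```

The cost of reading the files is paid once at indexing time, so each later search is a single hash lookup. A real setup would persist the index (for example in a database table) rather than rebuild it on every run.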

