I am new to Perl. I am doing a school project in which I'm creating a simple search engine. I give a query (a string of words), a list of files is searched, and the name of the best-matching file is displayed. The basic plan is:
1) Preprocess the files
3) Create the term-document matrix
I was able to write the preprocessing and clustering modules, but I am confused about the term-document matrix. Should I create a separate array for each term, or should I use a 2-D array? And how do I search for terms in the array? (The document that contains the most query terms should be displayed.)
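To make the question concrete, here is a rough sketch of the kind of structure I'm asking about, assuming a hash of hashes keyed by term and then by file name rather than a true 2-D array. The sample file names, their contents, and the `best_match` sub are all made up for illustration, and the text is assumed to be already preprocessed.

```perl
use strict;
use warnings;

# Hypothetical sample data: file name => already-preprocessed text.
my %docs = (
    'a.txt' => 'perl search engine project',
    'b.txt' => 'simple perl script',
    'c.txt' => 'cooking recipes',
);

# Term-document matrix as a hash of hashes: $tdm{$term}{$file} = count.
my %tdm;
for my $file (keys %docs) {
    $tdm{$_}{$file}++ for split ' ', lc $docs{$file};
}

# Return the file containing the most distinct query terms
# (undef if no file contains any of them).
sub best_match {
    my ($query) = @_;
    my (%score, %seen);
    for my $term (grep { !$seen{$_}++ } split ' ', lc $query) {
        next unless exists $tdm{$term};
        $score{$_}++ for keys %{ $tdm{$term} };
    }
    # Highest score first; ties are broken alphabetically by file name.
    my ($best) = sort { $score{$b} <=> $score{$a} or $a cmp $b } keys %score;
    return $best;
}

my $best = best_match('perl search engine');
print "$best\n";    # prints a.txt
```

With this layout only the terms that actually occur are stored, and looking up a query term is a single hash access instead of a scan over an array.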
And is there any better way to search than using a term-document matrix?
P.S. This is a pretty small project, so I don't need highly efficient search techniques; any easy ones would do.