I'm currently researching cross-lingual digital libraries, and I use Perl, although I'm fairly new to the language. I have just finished writing a light stemmer, some n-gram code, and some n-gram comparison code, so basically I'm at that 'generating stats' stage. I'm looking for similarities and differences between documents, and after that I'll look at language, context, and so on. The idea is to make documents searchable in many different languages. I did a master's where I used Java and built a system that could take an English document and retrieve similar ones in French and German... it kinda worked ;)
I'm always interested in hearing what others are up to in that area; maybe we can swap some tools and share some ideas!