PerlMonks  

Re: Perl and Linguistics

by Hanamaki (Chaplain)
on May 26, 2002 at 15:50 UTC (#169401)


in reply to Perl and Linguistics

It's definitely not a waste of time to try linguistic analysis with Perl, but it is quite hard to answer your question because we do not know your approach. Are you going to do the analysis with hand-coded rules, or do you prefer a statistical approach?

A good starting point for statistical language processing may be Dan Melamed's collection of linguistic tools. A TPJ article on Perl and Morphology may be of interest as well.

If you end up with really huge regular expressions, it may be time to implement an automaton in C or whatever, but Perl is a great tool for producing linguistic prototypes.

If you want to do research with Hidden Markov Models, try HTK for a non-Perl start.

Hanamaki
