One of my pet projects is a word segmenter, meaning a
program which takes sentences and spits out words. It is easy to
get good results for the English language, but pretty hard
for languages like Japanese, where you won't find any spaces
between words. A pretty tough task, where academic and commercial
research aren't that advanced.
I have to say that for this task Perl is the perfect language. I
don't care about speed, only about results. If I am satisfied
with my project I may implement it in C, or I may not. Call it
prototyping, call it research, call it whatever, I won't
try it with Prolog or Lisp.
Hanamaki