
Re^4: numeric representation of string

by mhearse (Hermit)
on Aug 16, 2013 at 17:47 UTC (#1049763)


in reply to Re^3: numeric representation of string
in thread numeric representation of string

I agree. My current code inserts email bodies into a compressed table, and that's it. Another simple idea I had was to split the body on word boundaries, store the words in an array, and then do a bulk INSERT IGNORE into a unique column. It might look something like the schema below, although this is probably a pipe dream. It seems logical, though, at least based on my stunted, repetitive vocabulary, and it should be fast because there is no compression step.

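-- One row per distinct word; UNIQUE lets INSERT IGNORE skip repeats.
CREATE TABLE words (
    rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    word VARCHAR(255) NOT NULL UNIQUE
) ENGINE=InnoDB CHARACTER SET=utf8;

-- One row per word occurrence, in the order it appeared in the body.
CREATE TABLE body (
    rowid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    word_order_num INT UNSIGNED NOT NULL,
    word_rowid INT UNSIGNED NOT NULL,
    FOREIGN KEY (word_rowid) REFERENCES words(rowid)
) ENGINE=InnoDB CHARACTER SET=utf8;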
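
To make the idea concrete, here is a rough sketch of the split, bulk-insert, and rebuild steps. It is only an illustration under a few assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, so INSERT OR IGNORE plays the role of MySQL's INSERT IGNORE, and the helper names (make_db, store_body, rebuild_body) are made up for this example. Splitting on \w+ drops punctuation and whitespace, so the rebuilt text is lossy.

```python
import re
import sqlite3


def make_db():
    """Create an in-memory SQLite database mirroring the two tables above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE words (
            rowid INTEGER PRIMARY KEY,
            word  TEXT NOT NULL UNIQUE
        );
        CREATE TABLE body (
            rowid          INTEGER PRIMARY KEY,
            word_order_num INTEGER NOT NULL,
            word_rowid     INTEGER NOT NULL REFERENCES words(rowid)
        );
    """)
    return conn


def store_body(conn, text):
    """Split text into words, dedupe them into `words`, record order in `body`."""
    tokens = re.findall(r"\w+", text)  # assumption: a "word" is a run of \w chars
    # Bulk insert; words already present are skipped by the UNIQUE constraint.
    conn.executemany(
        "INSERT OR IGNORE INTO words (word) VALUES (?)",
        [(t,) for t in set(tokens)],
    )
    # Look up the id of every known word (fine for a sketch, wasteful at scale).
    ids = dict(conn.execute("SELECT word, rowid FROM words"))
    conn.executemany(
        "INSERT INTO body (word_order_num, word_rowid) VALUES (?, ?)",
        [(pos, ids[t]) for pos, t in enumerate(tokens)],
    )
    conn.commit()
    return len(tokens)


def rebuild_body(conn):
    """Reassemble the stored words in their original order."""
    rows = conn.execute(
        "SELECT w.word FROM body b "
        "JOIN words w ON w.rowid = b.word_rowid "
        "ORDER BY b.word_order_num"
    )
    return " ".join(row[0] for row in rows)


if __name__ == "__main__":
    db = make_db()
    store_body(db, "the cat saw the other cat")
    print(rebuild_body(db))
```

Note that, like the schema above, this stores only a single body; telling several emails apart would need some kind of message identifier on the body table.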

