PerlMonks  

Re: Create a dictionary from wikipedia

by Old_Gray_Bear (Bishop)
on Jul 31, 2012 at 15:29 UTC (#984610)


in reply to Create a dictionary from wikipedia

So what code do you have, and where are you having problems?

If I were going to approach the problem, I'd define "Meaningful Content" in terms of its characteristics (it appears in a <title> or <subtitle>, it uses a particular CSS strophe, it has an XPath that looks like ___, etc).

Then, just do it.
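
To make that concrete, here is a rough sketch of the "define it by its characteristics" idea. It is not code from this thread, only an illustration under stated assumptions. It uses Python's standard html.parser; the rule sets MEANINGFUL_TAGS and MEANINGFUL_CLASSES, the class name "mw-headline", and the helper extract() are all hypothetical placeholders that you would replace with whatever characteristics you settle on. It also assumes reasonably well-nested markup, which real pages do not always provide.

```python
from html.parser import HTMLParser

# Hypothetical rules describing "meaningful content": adjust to taste.
MEANINGFUL_TAGS = {"title", "h1", "h2"}
MEANINGFUL_CLASSES = {"mw-headline"}  # assumed class name, for illustration only

# Elements that never have a closing tag, so they must not be tracked.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class MeaningfulText(HTMLParser):
    """Collect text found inside any element that matches the rules."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one flag per open element: does it match a rule?
        self.found = []   # collected text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        matches = tag in MEANINGFUL_TAGS or bool(classes & MEANINGFUL_CLASSES)
        self.stack.append(matches)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        # Keep the text if any enclosing element matched a rule.
        if text and any(self.stack):
            self.found.append(text)


def extract(html):
    """Return the list of text fragments that match the rules."""
    parser = MeaningfulText()
    parser.feed(html)
    parser.close()
    return parser.found
```

Calling extract() on a page's HTML returns the matching fragments as a list of strings. An XPath-style rule would need a real tree parser rather than this event-driven one; the sketch only covers the tag and CSS-class cases.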

----
I Go Back to Sleep, Now.

OGB


Re^2: Create a dictionary from wikipedia
by vit (Pilgrim) on Jul 31, 2012 at 16:13 UTC
    That's the point. Of course I can do everything from scratch using Perl, Python or Java, including working out what the meaningful content is.
    But I was hoping there is an existing solution for this that somebody could share.
