
Re: Machine learning pattern matching...

by CountZero (Bishop)
on Dec 26, 2012 at 16:41 UTC

in reply to Machine learning pattern matching...

So you want the web page you are scraping to act as a kind of configuration file that defines which content to retain. I doubt that anyone has already written such a program. I think it is a few levels above the current state of the art in AI.

But perhaps you are thinking of something more specific: real estate listings, catalogues, ...

If you can narrow down the scope of your research, there may be some hope yet.


"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

My blog: Imperial Deltronics

Re^2: Machine learning pattern matching...
by cLive ;-) (Prior) on Dec 31, 2012 at 16:39 UTC

    Yes, it's going to be user-suggested search results from shopping sites (honoring any robots.txt restrictions, obviously).

    Point is, I won't know what they're going to suggest until they do and, ideally, I'd like to automate additions where possible to minimize manual review.

    I was thinking of grabbing any possible matches on the page and presenting them to the user adding the link as a first step, but wondered what was out there already. Short of looking for patterns in the DOM, I'm not sure what else to do.
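
One possible reading of "looking for patterns in the DOM" is to count how often each tag/class path occurs on a page and treat the paths that repeat as candidate result items to show the user. The following is only a rough sketch of that idea, not something from this thread. It assumes Python and its standard `html.parser` module; the names `PathCounter`, `repeated_paths`, and `SAMPLE`, and the `min_count` threshold, are hypothetical and chosen for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PathCounter(HTMLParser):
    """Count how often each tag/class path occurs in a page."""

    def __init__(self):
        super().__init__()
        self.stack = []          # signatures of the currently open elements
        self.counts = Counter()  # path string -> number of occurrences

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class") or ""
        # Signature is the tag name plus its sorted class names, e.g. "li.item".
        sig = tag + "".join("." + c for c in sorted(classes.split()))
        self.counts[" > ".join(self.stack + [sig])] += 1
        if tag not in VOID_TAGS:
            self.stack.append(sig)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; tolerate sloppy HTML.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i].split(".")[0] == tag:
                del self.stack[i:]
                break


def repeated_paths(html, min_count=3):
    """Return (path, count) pairs for paths seen at least min_count times."""
    parser = PathCounter()
    parser.feed(html)
    parser.close()
    return [(p, n) for p, n in parser.counts.most_common() if n >= min_count]


# Made-up sample page standing in for a shopping site's search results.
SAMPLE = """
<html><body>
<div class="nav"><a href="/">Home</a></div>
<ul class="results">
  <li class="item"><a href="/p/1">Widget</a><span class="price">$5</span></li>
  <li class="item"><a href="/p/2">Gadget</a><span class="price">$7</span></li>
  <li class="item"><a href="/p/3">Gizmo</a><span class="price">$9</span></li>
</ul>
</body></html>
"""

for path, count in repeated_paths(SAMPLE):
    print(count, path)
```

On real pages the repeated paths will also include navigation menus and footers, so the output is better treated as a shortlist for the user to confirm than as an automatic answer.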
