We have all been to sites that use various techniques to keep bots away from services intended for humans (for example, Yahoo mail doesn't want spammers to set up bots to send out e-mail). Many of those tests use an image, usually of alphanumeric characters, that the user must then type into a form. Automated bots have started searching the page source for the image, running what is essentially OCR on it, and then using the result to submit the form.

I have written a way around that which uses only HTML and CSS to represent an image. If you look at the source of the page, there is no image and no sign of the text that is seen on screen. This is done by treating DIVs like pixels and recreating the image that way.
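
A minimal sketch of the DIVs-as-pixels idea, written in Python for illustration. It is not the proof-of-concept code mentioned in this post; the glyph table, function names, and pixel size are all assumptions. Each dark pixel of a small bitmap becomes one absolutely positioned DIV inside a relatively positioned container.

```python
# Hypothetical 5x5 glyphs; '#' marks a dark pixel. Illustrative data only.
GLYPHS = {
    "A": [".###.",
          "#...#",
          "#####",
          "#...#",
          "#...#"],
    "7": ["#####",
          "....#",
          "...#.",
          "..#..",
          "..#.."],
}


def text_to_bitmap(text):
    """Join glyph rows side by side, adding one blank column after each character."""
    rows = [""] * 5
    for ch in text:
        for y, glyph_row in enumerate(GLYPHS[ch]):
            rows[y] += glyph_row + "."
    return rows


def bitmap_to_divs(rows, pixel=4, color="#000"):
    """Emit a container DIV plus one absolutely positioned DIV per dark pixel."""
    width = max(len(r) for r in rows) * pixel
    height = len(rows) * pixel
    parts = [f'<div style="position:relative;width:{width}px;height:{height}px">']
    for y, row in enumerate(rows):
        for x, cell in enumerate(row):
            if cell == "#":
                parts.append(
                    f'<div style="position:absolute;left:{x * pixel}px;'
                    f'top:{y * pixel}px;width:{pixel}px;height:{pixel}px;'
                    f'background:{color}"></div>'
                )
    parts.append("</div>")
    return "\n".join(parts)


html = bitmap_to_divs(text_to_bitmap("A7"))
```

The resulting markup contains only DIV elements and inline CSS: no image tag and no literal "A7" for a source-scanning bot to find. A real page would need glyphs for every character it wants to show, and the pixel size trades legibility against the number of DIVs generated.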

I have a proof-of-concept page up here with a static example, a dynamic example (if you look at the source of either page, you will see no images and no text that matches what is on screen), and Perl source code for each (static and dynamic).

This technique could also be used on web pages to obfuscate e-mail addresses, since bots can't scan the source to pull out the text of the address.

There are some odd things afoot now, in the Villa Straylight.

In reply to Another way to get around automated bots by AssFace
