PerlMonks  

Re: Unable to connect

by hippo (Archbishop)
on Mar 22, 2025 at 16:10 UTC ( [id://11164376]=note )


in reply to Unable to connect

It's just done the same again for about 10 minutes (came back at 16:01 UTC). It is an immediate rejection of the connection, as if the web server was not running. I've not seen that before the past couple of days so it looks like a new problem to me.

This applies to any page, of course, not just RAT.


🦛

Replies are listed 'Best First'.
Re^2: Unable to connect
by marto (Cardinal) on Mar 22, 2025 at 17:59 UTC

    I'm going to go out on a limb and suggest, based on recent reports from elsewhere, that this could be caused by the ever-increasing swarms of content scrapers specifically feeding AI slop factories while actively ignoring and working around any attempts to slow them down.

      I hate to say it, but if this is the case then human verification is needed WHEN abnormal page visiting patterns (APVP) are detected. Can we check if there is any APVP in the logs around the short time interval mentioned by choroba and others?

      I would consider the following a normal visiting pattern: open 5 to 10 posts from Newest Nodes (I send them to different tabs) in a short burst (e.g. when first landing on Newest Nodes or RAT), then inactivity (voting/commenting does not count) while reading or doing other things. I don't think even someone who has not logged in for a year would open hundreds of posts in one short burst to read them all in a ... few days. Perhaps we could be allowed to read only 10 posts/day from the distant past. Of course this entails a cookie for anyone visiting, not only for those logged in, or keeping track of what each IP (not user) does and how often. Just thinking out loud.
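The burst heuristic described above could be sketched roughly as follows. This is only an illustration in Python, not anything the site actually runs: the class name `BurstDetector`, the limit of 10 hits, and the 60-second window are all assumptions picked to mirror the "5 to 10 posts in a short burst" description, and the client key could be an IP address or a cookie value.

```python
from collections import defaultdict, deque


class BurstDetector:
    """Flag clients whose page views exceed a limit within a sliding window.

    Hypothetical helper: the thresholds are illustrative assumptions,
    not values taken from the PerlMonks logs.
    """

    def __init__(self, max_hits=10, window_seconds=60.0):
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        # client key (IP or cookie) -> timestamps of recent hits
        self._hits = defaultdict(deque)

    def record(self, client, timestamp):
        """Record one hit; return True if the client is now over the limit."""
        hits = self._hits[client]
        hits.append(timestamp)
        # Drop hits that have fallen out of the sliding window.
        while hits and timestamp - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_hits
```

A caller would feed it one `(client, timestamp)` pair per request and show a human-verification step only when `record` returns True, so a reader opening a handful of tabs is never challenged.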

        Can we check if there is any APVP in the logs...

        Good idea. So I just did some log file messin', and found that there have been a huge number of hits on the site from the address range 66.249.64. to 66.249.79.
        All of the hits are from Anonymous Monk, and all are submitting queries to Super Search — which, of course, is quite resource intensive.
        A lot of the queries are somewhat Perl or PerlMonks related, but many are not. In fact, it looks a lot like someone trying to use Super Search like Google.

        Worth noting: No monks are logging in from this address range.
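For anyone wanting to repeat this kind of check on their own logs, here is a minimal sketch in Python. It makes two loud assumptions: that the range quoted above means every address from 66.249.64.0 through 66.249.79.255 (which is the block 66.249.64.0/20), and that the log has already been parsed into `(ip, user, page)` tuples. The function name `summarize` and the record layout are hypothetical, not the site's real log format.

```python
import ipaddress
from collections import Counter

# Assumption: the quoted range covers 66.249.64.0 through 66.249.79.255.
SUSPECT_NET = ipaddress.ip_network("66.249.64.0/20")


def summarize(records, network=SUSPECT_NET):
    """Count hits per (user, page) for requests whose address is in network.

    records is an iterable of (ip, user, page) tuples; this layout is a
    hypothetical stand-in for whatever the real log parser produces.
    """
    counts = Counter()
    for ip, user, page in records:
        if ipaddress.ip_address(ip) in network:
            counts[(user, page)] += 1
    return counts
```

If every key in the result is `("Anonymous Monk", "Super Search")`, that matches the observation that no logged-in monks use this range.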

        Today's latest and greatest software contains tomorrow's zero day exploits.

      Unfortunately, I had to tighten the screws on my private server as well. Most of those scrapers are really, really dumb, too. When encountering a public repository (both git and mercurial), instead of just pulling the repo (a rather efficient operation), they follow through the web pages and generate every page in every which way. I'm still working on some smarter rules, but so far I have managed to reduce traffic to my server by (very roughly) 90% without affecting most legitimate users.

      There are still a few things I want to implement to detect bot activity even better and to have the ability to automatically block specific subnets when bot activity is detected from those IPs. But that's all very specific to my private server and unfortunately won't be applicable to the monastery.
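As a rough illustration of the subnet idea (not the poster's actual rules, which are not shown), the Python sketch below groups already-flagged addresses by subnet and proposes a block only when several distinct offenders share one. The function name `subnets_to_block`, the /24 prefix, and the threshold of three offenders are all assumptions.

```python
import ipaddress
from collections import defaultdict


def subnets_to_block(flagged_ips, prefix_len=24, min_offenders=3):
    """Return subnets that contain at least min_offenders distinct flagged IPs.

    Hypothetical helper: prefix_len and min_offenders are illustrative
    defaults, not values from any real server configuration.
    """
    offenders = defaultdict(set)
    for ip in flagged_ips:
        # strict=False masks off the host bits, giving the enclosing subnet.
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        offenders[net].add(ip)
    return sorted(
        str(net) for net, ips in offenders.items() if len(ips) >= min_offenders
    )
```

Requiring several distinct offenders before blocking a whole subnet is one way to avoid locking out legitimate users who merely share a range with a single bot.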

      PerlMonks XP is useless? Not anymore: XPD - Do more with your PerlMonks XP
      Also check out my sister's artwork and my weekly webcomics

    Notices?
    erzuuli: Anonymous Monks are no longer allowed to use Super Search, due to excessive use of this resource by robots.