PerlMonks  

RE: RE: Slashdot Headline Grabber for Win32

by GridMonk (Acolyte)
on May 09, 2000 at 08:09 UTC (#10711)


in reply to RE: Slashdot Headline Grabber for Win32
in thread Slashdot Headline Grabber for Win32

Well, my first post on perlmonks.org...
( and I FSK it up.... sheesh! )

I am working from behind a corporate firewall, so I tried the LWP example above with my proxy info, and am getting:

Protocol scheme: 'http://www.slashdot.org/slashdot.xml' is not supported.
along with what looks like a 501 response in the debugger.

I tried this with a couple of different URLs and got the same result, so I suspect it may be the proxy setup. Any ideas?

Just to confirm:

My proxy settings in Netscape show

pset.tgw.canon.co.jp/proxy.pac

so I used:
#set up your proxy here
$ua->proxy('http', 'http://pset.tgw.canon.co.jp:80');

I tried both 80 and 8080 for ports, with the same result.

Any advice appreciated.


RE: RE: RE: Slashdot Headline Grabber for Win32
by marcos (Scribe) on May 09, 2000 at 21:31 UTC
    I think that the problem is that your company uses a proxy configuration script.
    If you check your Netscape proxy settings, you should find the 'Automatic proxy configuration' option enabled and the 'Configuration Location (URL)' set to 'pset.tgw.canon.co.jp/proxy.pac'. If so, my guess is correct: you are using an automatic proxy configuration script.
    You can download the proxy configuration script by pointing a browser at http://pset.tgw.canon.co.jp/proxy.pac, or with a simple Perl script that uses LWP to get that URL :-).
    The script should not be too complicated to read: there should be a function that returns a proxy, more or less like this:
    return "PROXY 151.92.12.112:8080";
    The return value may differ depending on whether you are requesting corporate intranet URLs or Internet URLs (for intranet URLs you may see something like return "DIRECT").
    So all you have to do is find, in proxy.pac, the IP address (or host name) and port of the real proxy your company uses to reach the Internet, and then use that same proxy and port in the Perl script.
    I hope this works. If you have problems, ask me, please.
    marcos
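
A rough sketch of the steps described above, for anyone in the same spot: fetch the PAC file, pull out the "PROXY host:port" entry, and hand it to LWP. The helper name pac_proxies is made up for this example, and the regex assumes the PAC file spells its proxies as literal "PROXY host:port" strings like the return line quoted above; a script that builds the string dynamically would need reading by hand.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pull every "PROXY host:port" entry out of the text of a proxy.pac file.
# Returns "host:port" strings in order of first appearance, without duplicates.
# (pac_proxies is a hypothetical helper, not part of LWP.)
sub pac_proxies {
    my ($pac_text) = @_;
    my (%seen, @proxies);
    while ($pac_text =~ /\bPROXY\s+([\w.-]+:\d+)/g) {
        push @proxies, $1 unless $seen{$1}++;
    }
    return @proxies;
}

# Only touch the network when a PAC URL is given on the command line, e.g.
#   perl pacproxy.pl http://pset.tgw.canon.co.jp/proxy.pac
if (@ARGV) {
    require LWP::UserAgent;
    my $pac_url = shift @ARGV;

    # Fetch the PAC file directly; no proxy is configured yet at this point.
    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get($pac_url);
    die "Could not fetch $pac_url: ", $res->status_line, "\n"
        unless $res->is_success;

    my @proxies = pac_proxies($res->content);
    die "No PROXY entry found in $pac_url\n" unless @proxies;

    # Use the first proxy found for plain http requests from now on.
    $ua->proxy('http', "http://$proxies[0]");
    print "Using proxy http://$proxies[0]\n";
}
```

If the PAC file lists several proxies, this simply takes the first one; which one is right for Internet URLs still has to be checked against the script's logic.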
      Thanks.

      Netscape wouldn't give me the source right from the URL, so I went with using a quick LWP script like you suggested, and nabbed the file.

      Inserted address and port, and now it works nicely.

      I don't have a proxy at home, so last night I ripped it up to grab news headlines from slashdot, cnn, antionline, japantimes.co.jp, and yahoo news and spit them all out on a webpage. Not that hard, but the first time I ever got it to work right.

      Thanks.
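
For the "spit them all out on a webpage" part, here is a minimal sketch of one way it might look; it is not the script from the post. It assumes each feed wraps its items in <story> elements with <title> and <url> children, which is a guess at the layout and will differ per site, and the function names headlines and headlines_html are invented for this example. It uses plain regexes to stay dependency-free; a real XML parser would be more robust.

```perl
use strict;
use warnings;

# Extract [title, url] pairs from feed text.
# ASSUMPTION: items look like <story><title>..</title><url>..</url></story>.
sub headlines {
    my ($xml) = @_;
    my @items;
    while ($xml =~ m{<story>(.*?)</story>}gs) {
        my $story = $1;
        my ($title) = $story =~ m{<title>\s*(.*?)\s*</title>}s;
        my ($url)   = $story =~ m{<url>\s*(.*?)\s*</url>}s;
        push @items, [$title, $url] if defined $title && defined $url;
    }
    return @items;
}

# Render one source's headlines as an HTML fragment.
# Titles and URLs are passed through as-is: the XML escaping of & and <
# in the feed is also valid inside HTML text and attribute values.
sub headlines_html {
    my ($source, @items) = @_;
    my $html = "<h2>$source</h2>\n<ul>\n";
    for my $item (@items) {
        my ($title, $url) = @$item;
        $html .= sprintf qq{  <li><a href="%s">%s</a></li>\n}, $url, $title;
    }
    return $html . "</ul>\n";
}
```

Calling headlines_html once per site and concatenating the fragments between a page header and footer gives the combined page.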
