PerlMonks  

Re: LWP::UserAgent Bad and Forbidden requests

by gman (Friar)
on Dec 15, 2011 at 18:55 UTC ( #943812=note )


in reply to LWP::UserAgent Bad and Forbidden requests

Here is the response I get:

"ScienceDirect does not support the use of the crawler software. If you have any questions please contact your helpdesk."

You could try changing your user agent string, since the server appears to be rejecting the default LWP identifier as a crawler.
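A minimal sketch of that suggestion, assuming the site is keying on LWP's default `libwww-perl/...` identifier. The agent string below is a hypothetical placeholder, not a value from the original post, and `$url` stands for whatever address you were fetching.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical browser-like identifier; not taken from the original post.
my $agent_string = 'Mozilla/5.0 (X11; Linux x86_64)';

my $browser = LWP::UserAgent->new;

# By default LWP announces itself as "libwww-perl/<version>".
print "Default agent: ", $browser->agent, "\n";

# Replace the default identifier that the server appears to reject.
$browser->agent($agent_string);
print "New agent:     ", $browser->agent, "\n";

# The request itself is unchanged:
# my $response = $browser->get($url);
```

The same value can also be passed at construction time with `LWP::UserAgent->new( agent => $agent_string )`.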

15:14:49.667268 IP web-editions.com.www > test.test.com.60155: Flags [ +P.], seq 1696:3144, ack 166, +win 4545, options [nop,nop,TS val 1125034391 ecr 518304537] + , length 1448 0x0000: 4500 05dc d2a0 4000 ee06 ed0d c651 c802 E.....@..... +.Q.. 0x0010: 976d a1ab 0050 eafb 1554 b73f 43ac 07c0 .m...P...T.? +C... 0x0020: 8018 11c1 c38f 0000 0101 080a 430e a997 ............ +C... 0x0030: 1ee4 b319 2f73 6369 6469 7269 6d67 2f6a ..../scidiri +mg/j 0x0040: 732f 7364 4c5f 636f 6d70 2e6a 7322 3e3c s/sdL_comp.j +s">< 0x0050: 2f73 6372 6970 743e 0a3c 7363 7269 7074 /script>.<sc +ript 0x0060: 2073 7263 3d22 2f73 6369 6469 7269 6d67 .src="/scidi +rimg 0x0070: 2f6a 732f 7364 582e 6a73 223e 3c2f 7363 /js/sdX.js"> +</sc 0x0080: 7269 7074 3e0a 3c73 6372 6970 7420 7372 ript>.<scrip +t.sr 0x0090: 633d 222f 7363 6964 6972 696d 672f 6a73 c="/scidirim +g/js 0x00a0: 2f73 642e 6a73 223e 3c2f 7363 7269 7074 /sd.js"></sc +ript 0x00b0: 3e0a 0a0a 3c21 2d2d 2073 7461 7469 6320 >...<!--.sta +tic. 0x00c0: 636f 6e73 7420 6368 6172 2067 656e 4572 const.char.g +enEr 0x00d0: 726f 7274 6d70 6c5f 7363 6373 4964 5b5d rortmpl_sccs +Id[] 0x00e0: 203d 2022 4028 2329 6765 6e45 7272 6f72 .=."@(#)genE +rror 0x00f0: 2e74 6d70 6c20 2020 342e 312e 322e 3120 .tmpl...4.1. +2.1. 0x0100: 2030 322f 3137 2f30 3020 3136 3a35 353a .02/17/00.16 +:55: 0x0110: 3138 2020 5265 7472 6965 7665 643a 2030 18..Retrieve +d:.0 0x0120: 322f 3232 2f30 3020 3039 3a30 313a 3338 2/22/00.09:0 +1:38 0x0130: 223b 202d 2d3e 0a3c 7469 746c 653e 0a0a ";.-->.<titl +e>.. 0x0140: 2020 0a0a 0a45 7272 6f72 0a0a 202d 2047 .....Error.. +.-.G 0x0150: 7565 7374 3c2f 7469 746c 653e 0a3c 2f68 uest</title> +.</h 0x0160: 6561 643e 0a0a 3c62 6f64 7920 6267 636f ead>..<body. 
+bgco 0x0170: 6c6f 723d 2223 6666 6666 6666 2220 7465 lor="#ffffff +".te 0x0180: 7874 3d22 2330 3030 3030 3022 3e0a 3c64 xt="#000000" +>.<d 0x0190: 6976 2073 7479 6c65 3d22 706f 7369 7469 iv.style="po +siti 0x01a0: 6f6e 3a72 656c 6174 6976 653b 223e 0a3c on:relative; +">.< 0x01b0: 6469 7620 6964 3d22 6865 6164 6572 2220 div.id="head +er". 0x01c0: 636c 6173 733d 226f 7574 6572 5772 6170 class="outer +Wrap 0x01d0: 7065 7222 3e0a 3c64 6976 2073 7479 6c65 per">.<div.s +tyle 0x01e0: 3d22 6261 636b 6772 6f75 6e64 2d63 6f6c ="background +-col 0x01f0: 6f72 3a23 4646 4646 4646 3b70 6f73 6974 or:#FFFFFF;p +osit 0x0200: 696f 6e3a 7265 6c61 7469 7665 3b7a 2d69 ion:relative +;z-i 0x0210: 6e64 6578 3a32 3b22 2063 6c61 7373 3d22 ndex:2;".cla +ss=" 0x0220: 6f75 7465 7257 7261 7070 6572 2220 3e0a outerWrapper +".>. 0x0230: 3c64 6976 2069 643d 226d 6173 7468 6561 <div.id="mas +thea 0x0240: 6422 3e0a 0a3c 212d 2d20 494e 5345 5254 d">..<!--.IN +SERT 0x0250: 5320 5448 4520 5445 4d50 4c41 5445 2056 S.THE.TEMPLA +TE.V 0x0260: 4152 4941 424c 4553 2050 4f50 554c 4154 ARIABLES.POP +ULAT 0x0270: 4544 2046 524f 4d20 4442 2066 6f72 2053 ED.FROM.DB.f +or.S 0x0280: 4349 5645 5253 4520 616e 6420 5052 4f44 CIVERSE.and. +PROD 0x0290: 5543 5420 6c61 6265 6c73 202d 2d3e 0a20 UCT.labels.- +->.. 
0x02a0: 0a3c 6469 7620 6964 3d22 6c6f 676f 5344 .<div.id="lo +goSD 0x02b0: 223e 3c61 2068 7265 663d 222f 7363 6965 "><a.href="/ +scie 0x02c0: 6e63 653f 5f6f 623d 486f 6d65 5061 6765 nce?_ob=Home +Page 0x02d0: 5552 4c26 5f6d 6574 686f 643d 7573 6572 URL&_method= +user 0x02e0: 486f 6d65 5061 6765 265f 6c67 3d59 265f HomePage&_lg +=Y&_ 0x02f0: 7665 7273 696f 6e3d 3126 5f75 726c 5665 version=1&_u +rlVe 0x0300: 7273 696f 6e3d 3026 5f75 7365 7269 643d rsion=0&_use +rid= 0x0310: 3026 6d64 353d 3161 3264 3536 3837 6638 0&md5=1a2d56 +87f8 0x0320: 6633 3766 3666 3465 3832 3762 3930 6130 f37f6f4e827b +90a0 0x0330: 6366 6639 6465 2220 7461 7267 6574 3d22 cff9de".targ +et=" 0x0340: 5f74 6f70 223e 3c69 6d67 2073 7263 3d22 _top"><img.s +rc=" 0x0350: 2f73 6369 656e 6365 2f70 6167 652f 7374 /science/pag +e/st 0x0360: 6174 6963 2f73 6369 656e 6365 2f6c 6f67 atic/science +/log 0x0370: 6f2e 6769 6622 2074 6974 6c65 3d22 2d54 o.gif".title +="-T 0x0380: 6865 2077 6f72 6c64 2623 3339 3b73 206c he.world&#39 +;s.l 0x0390: 6561 6469 6e67 2066 756c 6c2d 7465 7874 eading.full- +text 0x03a0: 2073 6369 656e 7469 6669 6320 6461 7461 .scientific. +data 0x03b0: 6261 7365 2220 616c 743d 2222 2062 6f72 base".alt="" +.bor 0x03c0: 6465 723d 2230 223e 3c2f 613e 3c2f 6469 der="0"></a> +</di 0x03d0: 763e 0a0a 3c2f 6469 7620 3e0a 3c64 6976 v>..</div.>. +<div 0x03e0: 2069 643d 2273 7569 7465 7322 2063 6c61 .id="suites" +.cla 0x03f0: 7373 3d22 636c 6561 7266 6978 2220 7374 ss="clearfix +".st 0x0400: 796c 653d 226c 6566 743a 3233 3870 783b yle="left:23 +8px; 0x0410: 223e 0a0a 3c2f 6469 763e 0a3c 6469 763e ">..</div>.< +div> 0x0420: 0909 0a0a 0909 090a 2020 2020 3c64 6976 ............ +<div 0x0430: 2069 643d 2262 616e 6e65 722d 746f 7022 .id="banner- +top" 0x0440: 3e09 090a 2020 2020 0a20 2020 2020 2020 >........... +.... 0x0450: 203c 6120 6e61 6d65 3d22 536b 6970 2042 .<a.name="Sk +ip.B 0x0460: 7574 746f 6e73 223e 3c2f 613e 0a20 2020 uttons"></a> +.... 
0x0470: 200a 2020 2020 3c2f 6469 763e 0a09 3c2f ......</div> +..</ 0x0480: 6469 763e 0a09 3c2f 6469 763e 0a0a 3c64 div>..</div> +..<d 0x0490: 6976 2069 643d 226e 6176 6967 6174 696f iv.id="navig +atio 0x04a0: 6e54 6f70 2220 636c 6173 733d 2263 6c65 nTop".class= +"cle 0x04b0: 6172 6669 7822 3e0a 2020 2020 2020 2020 arfix">..... +.... 0x04c0: 3c75 6c20 636c 6173 733d 226e 6176 206d <ul.class="n +av.m 0x04d0: 6169 6e22 3e0a 2020 2020 2020 2020 2020 ain">....... +.... 0x04e0: 2020 3c6c 6920 636c 6173 733d 226c 6173 ..<li.class= +"las 0x04f0: 7422 3e3c 6120 2068 7265 663d 222f 7363 t"><a..href= +"/sc 0x0500: 6965 6e63 653f 5f6f 623d 486f 6d65 5061 ience?_ob=Ho +mePa 0x0510: 6765 5552 4c26 5f6d 6574 686f 643d 7573 geURL&_metho +d=us 0x0520: 6572 486f 6d65 5061 6765 265f 6274 6e3d erHomePage&_ +btn= 0x0530: 5926 5f76 6572 7369 6f6e 3d31 265f 7572 Y&_version=1 +&_ur 0x0540: 6c56 6572 7369 6f6e 3d30 265f 7573 6572 lVersion=0&_ +user 0x0550: 6964 3d30 266d 6435 3d31 6132 6435 3638 id=0&md5=1a2 +d568 0x0560: 3766 3866 3337 6636 6634 6538 3237 6239 7f8f37f6f4e8 +27b9 0x0570: 3061 3063 6666 3964 6522 2074 6172 6765 0a0cff9de".t +arge 0x0580: 743d 225f 746f 7022 206e 616d 653d 2248 t="_top".nam +e="H 0x0590: 6f6d 6522 2074 6974 6c65 3d22 486f 6d65 ome".title=" +Home 0x05a0: 2220 3e48 6f6d 653c 2f61 3e3c 2f6c 693e ".>Home</a>< +/li> 0x05b0: 0a20 2020 2020 2020 2020 2020 203c 6c69 ............ +.<li 0x05c0: 3e0a 0909 093c 6120 2068 7265 >....<a..hre 15:14:49.667273 IP test.test.com.60155 > web-editions.com.www: Flags [ +.], ack 3144, win 1275, optio +ns [nop,nop,TS val 518304548 ecr 1125034391], length 0 0x0000: 4500 0034 b40e 4000 4006 bf48 976d a1ab E..4..@.@..H +.m.. 0x0010: c651 c802 eafb 0050 43ac 07c0 1554 bce7 .Q.....PC... +.T.. 0x0020: 8010 04fb c793 0000 0101 080a 1ee4 b324 ............ +...$ 0x0030: 430e a997 C... 
15:14:49.701008 IP web-editions.com.www > test.test.com.60155: Flags [ +P.], seq 3144:4468, ack 166, +win 4545, options [nop,nop,TS val 1125034426 ecr 518304548] + , length 1324 0x0000: 4500 0560 d2ee 4000 ee06 ed3b c651 c802 E..`..@....; +.Q.. 0x0010: 976d a1ab 0050 eafb 1554 bce7 43ac 07c0 .m...P...T.. +C... 0x0020: 8018 11c1 5760 0000 0101 080a 430e a9ba ....W`...... +C... 0x0030: 1ee4 b324 4272 6f77 7365 4c69 7374 5552 ...$BrowseLi +stUR 0x0040: 4c26 5f74 7970 653d 616c 6c26 5f61 7574 L&_type=all& +_aut 0x0050: 683d 7926 5f62 746e 3d59 265f 7665 7273 h=y&_btn=Y&_ +vers 0x0060: 696f 6e3d 3126 5f75 726c 5665 7273 696f ion=1&_urlVe +rsio 0x0070: 6e3d 3026 5f75 7365 7269 643d 3026 6d64 n=0&_userid= +0&md 0x0080: 353d 3964 3638 3561 3435 3332 3461 3834 5=9d685a4532 +4a84 0x0090: 6166 3233 6266 6563 6533 3530 6239 6536 af23bfece350 +b9e6 0x00a0: 3766 2220 7461 7267 6574 3d22 5f74 6f70 7f".target=" +_top 0x00b0: 2220 6e61 6d65 3d22 4272 6f77 7365 2220 ".name="Brow +se". 0x00c0: 2074 6974 6c65 3d22 4272 6f77 7365 2220 .title="Brow +se". 0x00d0: 3e42 726f 7773 653c 2f61 3e0a 0909 093c >Browse</a>. +...< 0x00e0: 2f6c 693e 0a09 0909 0a09 0909 3c6c 693e /li>........ 
+<li> 0x00f0: 3c61 2020 6872 6566 3d22 2f73 6369 656e <a..href="/s +cien 0x0100: 6365 3f5f 6f62 3d4d 6961 6d69 5365 6172 ce?_ob=Miami +Sear 0x0110: 6368 5552 4c26 5f6d 6574 686f 643d 7265 chURL&_metho +d=re 0x0120: 7175 6573 7446 6f72 6d26 5f62 746e 3d59 questForm&_b +tn=Y 0x0130: 265f 7665 7273 696f 6e3d 3126 5f75 726c &_version=1& +_url 0x0140: 5665 7273 696f 6e3d 3126 5f75 7365 7269 Version=1&_u +seri 0x0150: 643d 3026 6d64 353d 3033 3934 6239 3664 d=0&md5=0394 +b96d 0x0160: 6230 6630 6466 3538 3435 3535 3566 3135 b0f0df584555 +5f15 0x0170: 3532 6561 3137 6563 2220 7461 7267 6574 52ea17ec".ta +rget 0x0180: 3d22 5f74 6f70 2220 6e61 6d65 3d22 5365 ="_top".name +="Se 0x0190: 6172 6368 2220 7469 746c 653d 2253 6561 arch".title= +"Sea 0x01a0: 7263 6822 203e 5365 6172 6368 3c2f 613e rch".>Search +</a> 0x01b0: 3c2f 6c69 3e0a 0a0a 2020 2020 2020 3c6c </li>....... +..<l 0x01c0: 693e 3c61 2020 6872 6566 3d22 2f73 6369 i><a..href=" +/sci 0x01d0: 656e 6365 3f5f 6f62 3d55 7365 7253 7562 ence?_ob=Use +rSub 0x01e0: 7363 7269 7074 696f 6e55 524c 265f 6d65 scriptionURL +&_me 0x01f0: 7468 6f64 3d62 6567 696e 265f 6274 6e3d thod=begin&_ +btn= 0x0200: 5926 5f7a 6f6e 653d 546f 704e 6176 4261 Y&_zone=TopN +avBa 0x0210: 7226 5f6f 7269 6769 6e3d 265f 7665 7273 r&_origin=&_ +vers 0x0220: 696f 6e3d 3126 5f75 726c 5665 7273 696f ion=1&_urlVe +rsio 0x0230: 6e3d 3126 5f75 7365 7269 643d 3026 6d64 n=1&_userid= +0&md 0x0240: 353d 6336 6532 3839 3032 3636 3065 6236 5=c6e2890266 +0eb6 0x0250: 3135 3131 3831 3237 6636 3634 6535 6562 15118127f664 +e5eb 0x0260: 3230 2220 7461 7267 6574 3d22 5f74 6f70 20".target=" +_top 0x0270: 2220 6e61 6d65 3d22 6163 636f 756e 7422 ".name="acco +unt" 0x0280: 2074 6974 6c65 3d22 4d79 2073 6574 7469 .title="My.s +etti 0x0290: 6e67 7322 203e 4d79 2073 6574 7469 6e67 ngs".>My.set +ting 0x02a0: 733c 2f61 3e3c 2f6c 693e 0a0a 2020 2020 s</a></li>.. +.... 
0x02b0: 2020 3c6c 693e 3c61 2020 6872 6566 3d22 ..<li><a..hr +ef=" 0x02c0: 2f73 6369 656e 6365 3f5f 6f62 3d4d 6961 /science?_ob +=Mia 0x02d0: 6d69 5344 4955 524c 265f 6d65 7468 6f64 miSDIURL&_me +thod 0x02e0: 3d6c 6973 7441 6c65 7274 7326 5f62 746e =listAlerts& +_btn 0x02f0: 3d59 265f 7a6f 6e65 3d54 6f70 4e61 7642 =Y&_zone=Top +NavB 0x0300: 6172 265f 6f72 6967 696e 3d26 5f76 6572 ar&_origin=& +_ver 0x0310: 7369 6f6e 3d31 265f 7572 6c56 6572 7369 sion=1&_urlV +ersi 0x0320: 6f6e 3d30 265f 7573 6572 6964 3d30 266d on=0&_userid +=0&m 0x0330: 6435 3d34 6338 3433 3264 3434 3835 6337 d5=4c8432d44 +85c7 0x0340: 6533 3131 6133 6336 3931 3531 6535 6634 e311a3c69151 +e5f4 0x0350: 6566 3022 2074 6172 6765 743d 225f 746f ef0".target= +"_to 0x0360: 7022 206e 616d 653d 2261 6c65 7274 2220 p".name="ale +rt". 0x0370: 7469 746c 653d 2241 6c65 7274 7322 203e title="Alert +s".> 0x0380: 4d79 2061 6c65 7274 733c 2f61 3e3c 2f6c My.alerts</a +></l 0x0390: 693e 0a0a 2020 2020 2020 2020 0920 2020 i>.......... +.... 0x03a0: 2020 2020 2020 2020 0a20 2020 2020 2020 ............ +.... 0x03b0: 203c 2f75 6c3e 0a20 2020 2020 2020 200a .</ul>...... +.... 0x03c0: 2020 2020 2020 2020 0a20 203c 2f64 6976 ...........< +/div 0x03d0: 3e0a 3c2f 6469 763e 0a0a 2020 0a20 2020 >.</div>.... +.... 0x03e0: 203c 7461 626c 6520 626f 7264 6572 3d30 .<table.bord +er=0 0x03f0: 2063 656c 6c70 6164 6469 6e67 3d30 2063 .cellpadding +=0.c 0x0400: 656c 6c73 7061 6369 6e67 3d30 2077 6964 ellspacing=0 +.wid 0x0410: 7468 3d22 3130 3025 223e 0a20 2020 203c th="100%">.. +...< 0x0420: 7472 3e3c 7464 2073 7479 6c65 3d22 7061 tr><td.style +="pa 0x0430: 6464 696e 672d 746f 703a 302e 3065 6d22 dding-top:0. +0em" 0x0440: 3e3c 6120 6e61 6d65 3d22 536b 6970 2042 ><a.name="Sk +ip.B 0x0450: 7574 746f 6e73 223e 3c2f 613e 3c2f 7464 uttons"></a> +</td 0x0460: 3e3c 2f74 723e 0a20 2020 203c 2f74 6162 ></tr>.....< +/tab 0x0470: 6c65 3e0a 2020 0a0a 0a0a 0a3c 6469 7620 le>........< +div. 
0x0480: 6964 3d22 7364 4865 6164 6572 2220 636c id="sdHeader +".cl 0x0490: 6173 733d 226f 7574 6572 5772 6170 7065 ass="outerWr +appe 0x04a0: 7222 3e0a 3c64 6976 2073 7479 6c65 3d22 r">.<div.sty +le=" 0x04b0: 7769 6474 683a 3130 3025 223e 0a0a 0a0a width:100%"> +.... 0x04c0: 0a3c 2f64 6976 3e0a 3c64 6976 3e0a 5363 .</div>.<div +>.Sc 0x04d0: 6965 6e63 6544 6972 6563 7420 2064 6f65 ienceDirect. +.doe 0x04e0: 7320 6e6f 7420 7375 7070 6f72 7420 7468 s.not.suppor +t.th 0x04f0: 6520 7573 6520 6f66 2074 6865 2063 7261 e.use.of.the +.cra 0x0500: 776c 6572 2073 6f66 7477 6172 652e 2020 wler.softwar +e... 0x0510: 4966 2079 6f75 2068 6176 6520 616e 7920 If.you.have. +any. 0x0520: 7175 6573 7469 6f6e 7320 706c 6561 7365 questions.pl +ease 0x0530: 2063 6f6e 7461 6374 2079 6f75 7220 6865 .contact.you +r.he 0x0540: 6c70 6465 736b 2e0a 3c2f 6469 763e 0a0a lpdesk..</di +v>.. 0x0550: 3c2f 626f 6479 3e0a 3c2f 6874 6d6c 3e0a </body>.</ht +ml>. 15:14:49.701016 IP test.test.com.60155 > web-editi
After changing the user agent, I get this for the second URL:

<div class="errMsgText">Sorry, the requested document is unavailable. Contact the Help Desk if the problem persists. [SD-007]</div>

Also, it might help if you add Data::Dumper and view the output:

use Data::Dumper;
my $response = $browser->get( $url );
print Dumper($response);
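To show what inspecting the response might look like, here is a sketch that checks the status before dumping the object. It builds an `HTTP::Response` by hand so it runs without network access; the 403 status and the shortened body are illustrative assumptions, not captured output. In real use `$response` would come from `$browser->get( $url )`.

```perl
use strict;
use warnings;
use HTTP::Response;
use Data::Dumper;

# Stand-in for the object returned by $browser->get($url).
# Status code and body are illustrative only.
my $response = HTTP::Response->new(
    403, 'Forbidden',
    [ 'Content-Type' => 'text/html' ],
    '<div class="errMsgText">Sorry, the requested document is unavailable.</div>',
);

if ( $response->is_success ) {
    print $response->decoded_content;
}
else {
    # status_line combines the numeric code and the reason phrase.
    print "Request failed: ", $response->status_line, "\n";
    # The error page body often explains why the server refused.
    print "Body: ", $response->decoded_content, "\n";
}

# Full dump of the object, headers included, with stable key order.
$Data::Dumper::Sortkeys = 1;
print Dumper($response);
```

Checking `status_line` first is usually quicker to read than the full dump; the dump is more useful when you need to see the headers the server sent back.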
