You have run afoul of the Robots Exclusion Protocol. Many websites prefer that real humans with real eyeballs visit their site, and some feel strongly enough to ban software "robots" such as LWP::UserAgent. Sciencedirect.com is one of these. If you look at the robots.txt file for sciencedirect.com, you'll see they only let the big boys (Google, et al.) spider their site. All others (including you) can go suck rocks.

There is no (legit) solution to this problem except to call the webmasters and convince them that it is in their interest to allow your program to crawl their site. Good luck with that. Alternatively, see if the site has an RSS feed or an API that provides the data you need. APIs especially are less likely to be blocked by webmasters, since they are designed for program-to-program integration.
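To make the robots.txt check concrete, here is a minimal sketch of how a well-behaved client decides whether it may fetch a URL. It uses Python's standard-library `urllib.robotparser` purely for illustration; on the Perl side, `LWP::RobotUA` in libwww-perl does a similar job. The robots.txt text, the `example.com` URL, and the `can_fetch` helper are all assumptions made up for this example; this is not the actual Sciencedirect file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, loosely modelled on the policy described above:
# one named crawler is let in, every other robot is shut out.
SAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = SAMPLE_ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/some/article"  # placeholder URL
    for agent in ("Googlebot", "libwww-perl/6.0"):
        verdict = "allowed" if can_fetch(agent, url) else "disallowed"
        print(f"{agent}: {verdict}")
```

Against a live site you would load the real file with `set_url()` and `read()` instead of a hard-coded string. Either way, the check only tells you what the site asks of robots; it does not get you past the restriction.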