j.goor has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
Problem: I use Firefox 1.0 (wheee!!!), and I want to get to my online O'Reilly books via http:\\ (not via file:\\\\).
But the very odd thing is: O'Reilly apparently did not read the W3C recommendations about 'how to construct a hyperlink'.
*All* hyperlinks (and other links as well) are in backslash notation, as if they were directory paths.

I want to write a script that:
1) slurps one page into an XML parser
2) converts the '\' characters to '/' characters
3) spits it out to another place.

I do not have much experience with XML, but this seemed a nice way to fool around with it.

My questions are:
a) Do you recognize my problem regarding O'Reilly e-books?
b) Could you provide me with a working code snippet? (This should be easy for an experienced Perl/XML programmer.)
c) If XML parsing is overkill, what's the best RE I can use?

P.S. Internet Explorer works just fine, but then again: who really wants to use *that*?? ;-)

Thanks in advance!

Re: Convert backslash to slash using XML parsing
by Taulmarill (Deacon) on Nov 12, 2004 at 08:47 UTC
    If you only want to translate '\' to '/', why don't you just use tr!\\!/!;?
      There must be more to it than that; presumably the books contain other, non-URL backslashes also.

        Yeah, OK, that may be. But why XML?!?
        If I wanted to be clean I would use HTML::Parser.

      Why would you assume that it is safe to convert every backslash in the entire Perl CD Bookshelf to a forward slash? This would blindly ruin every example in the suite of CD books! That's a case of throwing out the baby with the bathwater. Could you imagine reading the Camel book where someone has gone through it and changed every single backslash to a forward slash? The first hello world script would look like this:

      #!/usr/bin/perl -w
      print "Hello world!/n";

      ...and for the record, "/n" is not the same thing as "\n".

      No, the OP probably does need a token parser like HTML::TokeParser, and a routine a little smarter than blind transliteration.
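
      To illustrate what "a little smarter" could mean, here is a minimal sketch. It is not HTML::TokeParser: it swaps in a plain regex (core Perl only) that rewrites backslashes only inside quoted href/src attribute values. The names fix_links and to_slashes, the sample page, and the assumption that every link sits in a quoted href or src attribute are mine, not something known about the actual book files.

```perl
use strict;
use warnings;

# Hypothetical helper: turn every backslash in a string into a slash.
sub to_slashes {
    my ($value) = @_;
    $value =~ tr{\\}{/};
    return $value;
}

# Hypothetical helper: rewrite backslashes only inside quoted href/src
# attribute values, leaving all other text (such as code listings) alone.
sub fix_links {
    my ($html) = @_;
    $html =~ s{
        (\b(?:href|src)\s*=\s*)   # attribute name and equals sign
        (["'])                    # opening quote
        (.*?)                     # attribute value (non-greedy)
        \2                        # same quote character closes it
    }{ $1 . $2 . to_slashes($3) . $2 }gexis;
    return $html;
}

# Made-up sample: one backslashed link and one code listing.
my $page = join "\n",
    '<a href="ch01\index.htm">Chapter 1</a>',
    '<pre>print "Hello world!\n";</pre>',
    '';

print fix_links($page);
```

      On this sample the link becomes ch01/index.htm while the "\n" in the listing stays put. A regex like this can still be fooled, for instance by a code listing that itself contains href="..." text, which is where a real token parser earns its keep.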

      One thing about the OP's post does bother me though. Does this conversion of file:\\\ to http:// mean that he's going to be making available ONLINE the entire Perl CD Bookshelf, in violation of O'Reilly's copyright?


Re: Convert backslash to slash using XML parsing
by iburrell (Chaplain) on Nov 12, 2004 at 18:11 UTC
    What online O'Reilly books are you talking about? The ones at use normal slashes in the URLs.