PerlMonks  

Applying XSL stylesheet specified in XML file to the XML

by blm (Hermit)
on Mar 24, 2009 at 07:41 UTC (#752807, perlquestion)
blm has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I have a file that I fetch with WWW::Mechanize, and it is XML. It starts with:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/foo/test.xsl"?>

I want to parse the XML with XML::LibXML, find the XSL file URL, retrieve it, and apply it. At least that is what I think I have to do. I don't know how to get XML::LibXML to give me the declarations at the top (i.e. the <? ?> things).

All I really want is the HTML that results from applying the XSL to the XML. (I know, all you really want is a pony but this is about my question at the moment ;-) )

Thanks for any and all help. I am not fixed on using XML::LibXML so I would be interested in any other useful modules.

Re: Applying XSL stylesheet specified in XML file to the XML
by dHarry (Abbot) on Mar 24, 2009 at 08:50 UTC

    You need libxslt for that (in a libxml context). There are alternatives of course, see CPAN. I'm not too familiar with those alternatives. My personal favorite is Xalan.

    HTH
    dHarry
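For reference, the libxslt route mentioned above might look roughly like this with XML::LibXSLT. This is a minimal sketch rather than the poster's own code: the XML and the stylesheet are inline placeholder strings invented for illustration, so no network access is involved.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

# Placeholder input and stylesheet (assumptions, not from the thread).
my $xml = '<greeting>Hello</greeting>';
my $xsl = join "\n",
    '<xsl:stylesheet version="1.0"',
    '    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">',
    '  <xsl:output method="html"/>',
    '  <xsl:template match="/greeting">',
    '    <p><xsl:value-of select="."/></p>',
    '  </xsl:template>',
    '</xsl:stylesheet>';

my $parser    = XML::LibXML->new;
my $source    = $parser->parse_string($xml);
my $style_doc = $parser->parse_string($xsl);

# Compile the stylesheet, then call transform() on the compiled object.
my $xslt       = XML::LibXSLT->new;
my $stylesheet = $xslt->parse_stylesheet($style_doc);
my $result     = $stylesheet->transform($source);

# output_string() serializes according to the stylesheet's xsl:output settings.
my $html = $stylesheet->output_string($result);
print $html;
```

Note that transform() belongs to the compiled stylesheet object, and output_string() is what turns the result document back into text.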

      Hi, Thanks for the reply. I realized that I need libxslt. But unless I am missing something I don't see how to pull the xsl uri out of the xml and feed it to libxslt (XML::LibXSLT). (Maybe I just need to grep for it.) That is my problem. Can you show me some code?

      Here is my code:
      use lib qw|/home/blm/perl/lib|;
      use strict;
      use WWW::Mechanize;
      use XML::LibXML;
      use XML::LibXSLT;

      my $mech = WWW::Mechanize->new(
          agent => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008070206 Firefox/3.0.1'
      );
      my $url = 'https://some.url.here/';
      $mech->delete_header('accept-encoding');
      $mech->get($url);
      $mech->update_html($mech->content());
      print $mech->content;

      my $parser       = XML::LibXML->new();
      my $style_parser = XML::LibXML->new();
      my $xslt         = XML::LibXSLT->new();

      my $doc = $parser->parse_string($mech->content());
      print $doc->toString();

      my $stylesheet_location = ***Here is my problem***
      $mech->get($stylesheet_location);
      my $stylesheet_string = $mech->content();
      my $styledoc   = $style_parser->parse_string($stylesheet_string);
      my $stylesheet = $xslt->parse_stylesheet($styledoc);

      # transform() is a method of the compiled stylesheet, not of XML::LibXSLT
      my $results = $stylesheet->transform($doc);
      # serialize the result document instead of printing the object
      print $stylesheet->output_string($results);

        Ah, I see. I have to disappoint you: I use XML::Twig for XML processing in Perl and my own tools for XSLT stuff. I would not grep for it; there must be a more XML-ish way of doing things. After all, it's just a node of a specific type, i.e. node type 'processing-instruction'. So I imagine parsing the file and walking the DOM tree to retrieve the information should do the trick. Suddenly the grepping doesn't sound so bad anymore ;) Another option is to use SAX; I see a processingInstructionSAXFunc in the libxml2 API. However, there is another Perl module that might come in handy, XML::LibXML::PI: can't you call getData on it?

        Mind you, in "my" environment it's as simple as one method call: getAssociatedStylesheet()!
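The "walk the DOM tree and call getData" idea above could be sketched as follows. It assumes the xml-stylesheet instruction sits in the prolog, so only the document's top-level children are inspected; the helper name stylesheet_pi_data and the sample XML are made up for illustration.

```perl
use strict;
use warnings;
use XML::LibXML qw(:libxml);   # :libxml exports node-type constants such as XML_PI_NODE

# Return the data string of the first xml-stylesheet processing instruction
# among the document's top-level children, or undef if there is none.
sub stylesheet_pi_data {
    my ($doc) = @_;
    for my $node ($doc->childNodes) {
        next unless $node->nodeType == XML_PI_NODE;
        next unless $node->nodeName eq 'xml-stylesheet';
        return $node->getData;
    }
    return undef;
}

# Sample document built from joined lines (an assumption for the demo).
my $xml = join "\n",
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<?xml-stylesheet type="text/xsl" href="/foo/test.xsl"?>',
    '<root/>';

my $doc  = XML::LibXML->new->parse_string($xml);
my $data = stylesheet_pi_data($doc);
print defined $data ? "$data\n" : "no stylesheet PI found\n";
```

The returned value is the raw pseudo-attribute string, so the href still has to be picked out of it in a separate step.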

Re: Applying XSL stylesheet specified in XML file to the XML
by ForgotPasswordAgain (Deacon) on Mar 25, 2009 at 09:26 UTC

    I don't know how to tell what href is relative to, but even just to get the value of href is a bit gimpy for "processing instructions", which is what <? ... ?> are.

    #!/usr/bin/perl -w
    use strict;
    use XML::LibXML;

    my $parser = XML::LibXML->new;
    my $doc = $parser->parse_string(<<'EOX');
    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href ='abc"efg'?>
    <_/>
    EOX

    foreach my $node ($doc->findnodes('//processing-instruction()')) {
        my $name = $node->nodeName;
        if ($name eq 'xml-stylesheet') {
            # getData is a string like q{type="text/xsl" href="/test.xsl"}
            # which is what makes it annoying
            my $attr_str = $node->getData;

            # manually parse the string like href='abc"efg';
            # there might be a better way of doing this.
            # (.*?)\1 stops at the first matching closing quote; \1 inside a
            # character class such as [^\1] is not a backreference.
            my $href = $attr_str =~ m{href\s*=\s*(['"])(.*?)\1} ? $2 : '';
            print "$name href: >>>$href<<<\n";
        }
    }
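On the question of what href is relative to: as far as I read the W3C xml-stylesheet recommendation, a relative href is resolved against the base URI of the processing instruction, which normally means the URL the XML document was fetched from. A minimal sketch of that step, assuming the URI module is installed (WWW::Mechanize depends on it); the helper parse_pseudo_attrs and the base URL are hypothetical, and entity references inside the values are not decoded here.

```perl
use strict;
use warnings;
use URI;

# Split PI data such as q{type="text/xsl" href="/foo/test.xsl"} into a hash.
# Values may be quoted with either ' or "; the closing quote must match.
sub parse_pseudo_attrs {
    my ($data) = @_;
    my %attr;
    while ($data =~ /([\w:.-]+)\s*=\s*(["'])(.*?)\2/g) {
        $attr{$1} = $3;
    }
    return %attr;
}

my %attr = parse_pseudo_attrs(q{type="text/xsl" href="/foo/test.xsl"});

# Hypothetical URL of the XML document, used as the base for resolution.
my $base = 'https://some.url.here/data/report.xml';
my $abs  = URI->new_abs($attr{href}, $base);
print "$abs\n";
```

In a WWW::Mechanize script the base would presumably come from $mech->uri, read right after fetching the XML and before fetching the stylesheet.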
