I'm trying to fix a Perl plugin for the Squeezebox server.
I need to split the following text into an id (from itag=) and a URL (from url=).
These itag={id}&url={url} pairs are delimited by commas, but commas can also appear inside the URL, and so can duplicate 'itag=' elements, which must be kept. So I can't split on either of those alone.
An example of the text (it arrives as a single line):
itag=44&url=http://o-o---preferred---sn-u5a3u5a3-h5oe---v13---lscache3
+.c.youtube.com/videoplayback?upn=8kbZJLkF5PA&sparams=cp%2Cid%2Cip%2Ci
+pbits%2Citag%2Cratebypass%2Csource%2Cupn%2Cexpire&fexp=927101%2C92300
+6%2C922401%2C920704%2C912806%2C913419%2C913546%2C913556%2C919349%2C91
+9351%2C925109%2C919003%2C920201%2C912706&key=yt1&expire=1348823962&it
+ag=44&ipbits=8&sver=3&ratebypass=yes&mt=1348800611&ip=92.22.37.231&mv
+=m&source=youtube&ms=au&cp=U0hTTVhNUV9LTENOM19QR1VKOkFyQWNVSVFNbmNL&i
+d=1100a4b92b939cd6&type=video/webm;+codecs="vp8.0,+vorbis"&fallback_h
+ost=tc.v13.cache3.c.youtube.com&sig=8353F6329CDA8168C4F7F29E20F2AE3F6
+509D85F.C582D63C02534232CE8E28D5ADC5B119AAEF2963&quality=large,itag=3
+5&url=http://o-o---preferred---sn-u5a3u5a3-h5oe---v11---lscache4.c.yo
+utube.com/videoplayback?upn=8kbZJLkF5PA&sparams=algorithm%2Cburst%2Cc
+p%2Cfactor%2Cid%2Cip%2Cipbits%2Citag%2Csource%2Cupn%2Cexpire&fexp=927
+101%2C923006%2C922401%2C920704%2C912806%2C913419%2C913546%2C913556%2C
+919349%2C919351%2C925109%2C919003%2C920201%2C912706&expire=1348823962
+&algorithm=throttle-factor&burst=40&ip=92.22.37.231&itag=35&sver=3&ke
+y=yt1&mt=1348800611&mv=m&source=youtube&ms=au&ipbits=8&factor=1.25&cp
+=U0hTTVhNUV9LTENOM19QR1VKOkFyQWNVSVFNbmNL&id=1100a4b92b939cd6&type=vi
+deo/x-flv&fallback_host=tc.v11.cache4.c.youtube.com&sig=885C9C098DF9D
+80E780177E01CF944BC4F9564FE.9A374618A2BE8C2E562C8622DCB449A7071E37BD&
+quality=large,itag= ...AND SO ON
I'd like the data to end up in a hash mapping each id to its URL.
I first tried this, but it only splits off the first pair it finds, and not correctly:
for my $stream (split(/itag=(.*)&url=/, $streams)) {
print $stream;
}
I expected it to print out the following for the first pair:
44
http://o-o---preferred---sn-u5a3u5a3-h5oe---v13---lscache3.c.youtube.c
+om/videoplayback?upn=8kbZJLkF5PA&sparams=cp%2Cid%2Cip%2Cipbits%2Citag
+%2Cratebypass%2Csource%2Cupn%2Cexpire&fexp=927101%2C923006%2C922401%2
+C920704%2C912806%2C913419%2C913546%2C913556%2C919349%2C919351%2C92510
+9%2C919003%2C920201%2C912706&key=yt1&expire=1348823962&itag=44&ipbits
+=8&sver=3&ratebypass=yes&mt=1348800611&ip=92.22.37.231&mv=m&source=yo
+utube&ms=au&cp=U0hTTVhNUV9LTENOM19QR1VKOkFyQWNVSVFNbmNL&id=1100a4b92b
+939cd6&type=video/webm;+codecs="vp8.0,+vorbis"&fallback_host=tc.v13.c
+ache3.c.youtube.com&sig=8353F6329CDA8168C4F7F29E20F2AE3F6509D85F.C582
+D63C02534232CE8E28D5ADC5B119AAEF2963&quality=large
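For reference, here is a minimal sketch of one way this might be approached. It assumes every pair starts with itag={digits}&url= and that a literal comma in front of that exact prefix never occurs inside a URL. The attempt above likely misbehaves because (.*) is greedy, so the pattern matches from the first itag= to the last &url= in the string, and split also returns the captured group. The sample string, the example hostnames and the %url_for name are made up for illustration and are much shorter than the real data.

```perl
use strict;
use warnings;

# Shortened, hypothetical sample with the same shape as the real text:
# encoded commas (%2C), a literal comma inside codecs="...", and a
# repeated itag= parameter inside each URL.
my $streams = join ',',
    'itag=44&url=http://a.example/videoplayback?sparams=cp%2Cid&itag=44&type=video/webm;+codecs="vp8.0,+vorbis"&quality=large',
    'itag=35&url=http://b.example/videoplayback?itag=35&type=video/x-flv&quality=large';

my %url_for;

# Split only on a comma that is immediately followed by "itag=<digits>&url=".
# The lookahead (?=...) consumes nothing, so each piece keeps its own prefix.
for my $pair (split /,(?=itag=\d+&url=)/, $streams) {
    # Peel off the leading id; everything after "&url=" is the URL,
    # including any later "itag=" parameters.
    if ($pair =~ /\Aitag=(\d+)&url=(.*)\z/s) {
        $url_for{$1} = $2;
    }
}

for my $id (sort { $a <=> $b } keys %url_for) {
    print "$id\n$url_for{$id}\n";
}
```

If two pairs can share the same itag, a plain hash would keep only the last one, so an array of [id, url] pairs might be a better fit in that case.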