Difficulty with Logic parsing ICAL feed

by ChocolateCake (Initiate)
on Aug 17, 2012 at 16:16 UTC ( #988015=perlquestion )
ChocolateCake has asked for the wisdom of the Perl Monks concerning the following question:

I have been working on code that parses event information from an iCal feed. The feed is one huge block of data that I want to divide by key term, in an orderly way. I tried finding the index of each key term and then having the program print what lies between those indexes. However, for some reason it became an infinite loop that printed all the data, and I don't know how to fix it. DO NOT RUN MY CODE; IT KEEPS FREEZING MY COMPUTER. I was hoping someone could show me what my problem is.

DO NOT RUN THIS PROGRAM

    use strict;
    use warnings;
    use LWP::Simple;
    use HTML::TreeBuilder;
    use HTML::FormatText;

    my $URL= get(" +sponsor%5B%5D=&audience%5B%5D=&category%5B%5D=");
    my $Format=HTML::FormatText->new;
    my $TreeBuilder=HTML::TreeBuilder->new;
    $TreeBuilder->parse($URL);
    my $Parsed=$Format->format($TreeBuilder);
    open(FILE, ">UOTSUMMER.txt");
    print FILE "$Parsed";
    close (FILE);
    open (FILE, "UOTSUMMER.txt");
    my @array=<FILE>;
    my $string ="@array";
    my $offset = 0; # Where are we in the string?
    my $numResults = 0;
    while (1) {
        my $idxSummary = index($string, "SUMMARY", $offset);
        my $result = "";
        my $idxDescription = index ($string, "DESCRIPTION", $offset);
        my $result2= "";
        if ($idxSummary > -1) {
            $offset = $idxSummary + length("SUMMARY");
            my $idxDescription = index($string, "DESCRIPTION", $offset);
            if ($idxDescription == -1) {
                print "(Data malformed: missing DESCRIPTION line.)\n";
                last;
            }
            if ($idxDescription > -1) {
                $offset = $idxDescription+ length("DESCRIPTION");
                my $idxLocation= index($string, "LOCATION", $offset);
                if ($idxLocation == -1) {
                    print "(Data malformed: missing LOCATION line.)\n";
                    last;
                }
                my $length = $idxDescription - $offset;
                my $length2= $idxLocation - $offset;
                $result = substr($string, $offset, $length);
                $result2= substr ($string, $offset, $length2);
                $offset = $idxDescription + length("DESCRIPTION");
                $result =~ s/^\s+|\s+$//g ;  # Strip leading and trailing whitespace,
                $result2 =~ s/^\s+|\s+$//g ; # including newlines.
                $numResults++;
            }
            else {
                print "(All done. $numResults result(s) found.)\n";
                last;
            }
            open (FILE2, "UOT123.txt")
            print FILE2 "TITLE: <$result>\n DESCRIPTION: <$result2>\n";

Any guidance you may have will be greatly appreciated! Thanks!

Replies are listed 'Best First'.
Re: Difficulty with Logic parsing ICAL feed
by roboticus (Chancellor) on Aug 17, 2012 at 16:30 UTC


    It looks like it doesn't find "SUMMARY", so it keeps resetting $offset to the same location--character 10: -1 + length("DESCRIPTION"). I'd suggest terminating the loop if it can't find SUMMARY or DESCRIPTION.
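A minimal sketch of that suggestion, not the original poster's program: the loop leaves as soon as any keyword is missing, and $offset always moves past the last match, so it cannot spin forever. The sub name extract_events, the sample text, and the stripping of a leading colon are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [summary, description] pairs.
sub extract_events {
    my ($string) = @_;
    my @events;
    my $offset = 0;

    while (1) {
        my $idx_summary = index($string, "SUMMARY", $offset);
        last if $idx_summary == -1;              # no more events: leave the loop

        my $summary_start   = $idx_summary + length("SUMMARY");
        my $idx_description = index($string, "DESCRIPTION", $summary_start);
        last if $idx_description == -1;          # malformed record: stop

        my $description_start = $idx_description + length("DESCRIPTION");
        my $idx_location      = index($string, "LOCATION", $description_start);
        last if $idx_location == -1;             # malformed record: stop

        # Lengths are measured from the start of each field, not from $offset.
        my $summary     = substr($string, $summary_start,
                                 $idx_description - $summary_start);
        my $description = substr($string, $description_start,
                                 $idx_location - $description_start);

        # Strip a leading colon and surrounding whitespace (assumed layout).
        s/^[:\s]+|\s+$//g for $summary, $description;

        push @events, [ $summary, $description ];

        # Always advance past the last keyword found.
        $offset = $idx_location + length("LOCATION");
    }
    return @events;
}

# Made-up sample data, only to exercise the loop.
my $sample = "SUMMARY:Picnic\nDESCRIPTION:Bring food\nLOCATION:Park\n"
           . "SUMMARY:Talk\nDESCRIPTION:Perl parsing\nLOCATION:Room 1\n";

for my $event (extract_events($sample)) {
    print "TITLE: <$event->[0]>\nDESCRIPTION: <$event->[1]>\n";
}
```

The key point is that every path through the loop either advances $offset or exits.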


    When your only tool is a hammer, all problems look like your thumb.



        I should've mentioned: If you're having trouble with a loop like this, you might find it helpful to print the "interesting" variables (e.g. $offset) at the top of the loop. For example:

        while (1) {
            my $idxSummary = index($string, "SUMMARY", $offset);
            my $result = "";
            my $idxDescription = index ($string, "DESCRIPTION", $offset);
            my $result2= "";
            print "idxSum:$idxSummary, idxDesc:$idxDescription, offs:$offset\n";

        Then I'd imagine you'd see something like:

        idxSum:47, idxDesc:62, offs:0
        idxSum:122, idxDesc:143, offs:73
        idxSum:-1, idxDesc:-1, offs:10
        idxSum:-1, idxDesc:-1, offs:10
        .....

        Then you'd probably figure it processed the first two records, and had a problem with the third. I didn't download your code, nor the ICAL url or anything, so the numbers are entirely fictitious.
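One way to try this tracing safely is sketched below, assuming a hard cap on the number of passes so that a runaway loop cannot freeze the machine. The sub name trace_scan, the cap, and the sample string are all hypothetical; only the index() calls and the trace format come from the snippet above.

```perl
use strict;
use warnings;

# Hypothetical helper: scans for SUMMARY and records one trace line per pass.
# The for loop caps the number of passes, so it always terminates.
sub trace_scan {
    my ($string, $max_passes) = @_;
    my @trace;
    my $offset = 0;

    for my $pass (1 .. $max_passes) {
        my $idxSummary     = index($string, "SUMMARY", $offset);
        my $idxDescription = index($string, "DESCRIPTION", $offset);

        # Record the "interesting" variables at the top of each pass.
        push @trace, "idxSum:$idxSummary, idxDesc:$idxDescription, offs:$offset";

        last if $idxSummary == -1;    # nothing left to find
        $offset = $idxSummary + length("SUMMARY");
    }
    return @trace;
}

# Made-up input, only to show the trace.
print "$_\n" for trace_scan("SUMMARY a DESCRIPTION b SUMMARY c", 10);
```

If the trace shows the same offset on consecutive passes, the loop is not advancing.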


        When your only tool is a hammer, all problems look like your thumb.

Re: Difficulty with Logic parsing ICAL feed
by davido (Archbishop) on Aug 17, 2012 at 16:42 UTC

    Crossposted on Stack Overflow:

    There's nothing wrong with minimal crossposting. But it's polite and useful to link to the other copies so that people don't put effort into a question that already has a solution elsewhere, and so that the collaborative effort can be based on responses from all incarnations of the question.


      You are absolutely right, I am new to this and don't know the proper etiquette. Thank you for sharing that with me! How do I link the crosspost?
Re: Difficulty with Logic parsing ICAL feed
by tobyink (Abbot) on Aug 17, 2012 at 16:28 UTC

    Why on earth are you parsing iCalendar by hand? Use Text::vFile::asData.

    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
Re: Difficulty with Logic parsing ICAL feed
by davido (Archbishop) on Aug 17, 2012 at 16:55 UTC

    If you don't have https support installed this line will be a problem (maybe you do, as it's not the problem you're posting about):

    my $URL= get("");

    You should be checking the return value from LWP::Simple::get(). If it's undef, you didn't get a successful response. In this case you can solve this part of the problem simply by switching to http:// instead of https://, or by installing LWP::Protocol::https. But still get in the habit of checking for undef after using LWP::Simple::get.

    You should be able to enable https support in LWP::Simple by following the advice in the README for LWP:

    If you want to access sites using the https protocol, then you need to install the LWP::Protocol::https module from CPAN.

    Once this is done, LWP::Simple is able to fetch the https request. (And maybe you've done this already).
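The undef check might look like the sketch below. The sub name fetch_or_die and the example.com address are hypothetical, and the optional second argument is an assumption added so the check can be exercised without a network connection; by default it falls back to LWP::Simple::get, which returns undef on failure.

```perl
use strict;
use warnings;

# Hypothetical wrapper: dies with a clear message instead of passing
# undef on to the parser.
sub fetch_or_die {
    my ($url, $fetch) = @_;

    # Default to LWP::Simple::get; load it only when it is actually needed.
    $fetch ||= do { require LWP::Simple; \&LWP::Simple::get };

    my $content = $fetch->($url);
    die "Could not fetch $url\n" unless defined $content;
    return $content;
}

# Offline demonstration with a stand-in fetcher (placeholder URL and data).
my $body = fetch_or_die('http://example.com/feed.ics',
                        sub { "BEGIN:VCALENDAR\n" });
print length($body), " bytes fetched\n";
```

In the real script the call would simply be fetch_or_die($feed_url), with no second argument.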


Re: Difficulty with Logic parsing ICAL feed
by linuxkid (Sexton) on Aug 17, 2012 at 16:26 UTC

    take a look at the url you're trying to get, it looks malformed.


      It fetches an iCal record. Pretty easy to copy it and try it. What looks malformed about it?

        Seems like there are some extra '='s.


Node Type: perlquestion [id://988015]
Approved by GrandFather