I recently watched a thread in Meditations that got a serious case of "right-shifting": lots of replies to replies, with a long sequence of "Re: Re: Re: Re:" as the result. This kind of disturbs the picture, and in this case the Re:s came to dominate the titles. I also remembered the fairly recent node Feature request: Collapsing, which drew few replies and no action, which is maybe not so surprising. There was little interest, and no code.

Anyway, I started wondering whether it was possible to programmatically collapse sequences of "Re: Re:" into the more readable (IMO) "Re(2):" and so on. It turned out not to be that trivial, but some thinking and tinkering solved the problem, or at least it appears so to me.

I have some code below that features two subs:

  • re_collapse takes a string (presumably a title) and merges all the combinations of "Re: Re(2): (Author) Re: Re: " that I could come up with (please add more tests if you wish).
  • re_add adds a "Re: " to the start of the title, with the added twist that if there is already one "Re: " there, it switches to "Re(2): ", and if there is a "Re(2): " there, it switches to "Re(3): ", and so on.

Those are some mighty ugly regexps IMO, but whatever gets the job done... :) Update: Maybe I should mention that the combination of while and /g in some of the regexps is not really redundant. The while alone would suffice, but with /g each pass gets more done whenever there are several matching cases.
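To make the while plus /g point concrete, here is a small sketch that counts how many loop passes the 'Re(n): Re(m): ' rule needs with and without /g. The helper name count_passes and the sample title are made up for illustration, and the pattern uses \d+ so that multi-digit counts keep merging.

```perl
use strict;
use warnings;

# Hypothetical helper: merge adjacent "Re(n): Re(m): " pairs and report
# the final title plus how many passes actually changed something.
sub count_passes {
    my ($title, $global) = @_;
    my $passes = 0;
    while (1) {
        my $changed = $global
            ? ($title =~ s{Re\((\d+)\): Re\((\d+)\): }{"Re(" . ($1 + $2) . "): "}ge)
            : ($title =~ s{Re\((\d+)\): Re\((\d+)\): }{"Re(" . ($1 + $2) . "): "}e);
        last unless $changed;
        $passes++;
    }
    return ($title, $passes);
}

my $sample = "Re(1): Re(1): Re(1): Re(1): Foo";
my ($with_g, $n_g)        = count_passes($sample, 1);
my ($without_g, $n_plain) = count_passes($sample, 0);

print "with /g:    $with_g in $n_g passes\n";
print "without /g: $without_g in $n_plain passes\n";
```

Both variants end at the same title; the /g variant simply merges every non-overlapping pair in a pass instead of only the first one.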

The collapsing sub will only collapse "pure" sequences of "Re". When someone has entered their name or similar in the middle of the title to keep track, it simply finishes the job there and starts collapsing again on the other side. There is not much to do about that without altering the author's intent too seriously.

Anyhow, this should probably be a User Setting, not the default. Some people may not like having their stuff altered, but for myself, I'd really appreciate having my titles collapsed, and also getting "Re(2): " by default instead of "Re: Re: " when replying at the second level. Others probably want it as it is, thus a user setting.

#!/usr/bin/perl -w
use strict;
use Test;

BEGIN { plan tests => 18; }

# Collapses sequences of Re: together into one Re(\d):
sub re_collapse {
    my $title = shift;

    # Normal 'Re: Re: ' sequences (each 'Re: ' is 4 characters long)
    $title =~ s{(Re: ){2,}}{"Re(" . length($&)/4 . "): "}ge;

    # Already renumbered:
    # 'Re: Re(\d+): '
    while($title =~ s{Re: Re\((\d+)\): }{"Re(" . ($1 + 1) . "): "}ge) {};

    # 'Re(\d+): Re: '
    while($title =~ s{Re\((\d+)\): Re: }{"Re(" . ($1 + 1) . "): "}ge) {};

    # 'Re(\d+): Re(\d+): '
    # (\d+ rather than \d, so that counts of 10 and above still merge)
    while($title =~ s{Re\((\d+)\): Re\((\d+)\): }{"Re(" . ($1 + $2) . "): "}ge) {};

    return $title;
}

# Adds a new Re: or Re(\d): at the start of a string:
sub re_add {
    my $title = shift;
    $title =~ s{^(Re(\((\d+)\))?: )?}{($1) ? ("Re(" . (($3||1) + 1) . "): ") : "Re: "}e;
    return $title;
}

###########################
#
# Tests:
#
# Testing re_collapse:

# Should not change:
ok(&re_collapse("(DaP) Re: Foo is not Bar") eq "(DaP) Re: Foo is not Bar");
ok(&re_collapse("(DaP) Foo is not Bar") eq "(DaP) Foo is not Bar");
ok(&re_collapse("Re(3): (DaP) Re: Foo is not Bar") eq "Re(3): (DaP) Re: Foo is not Bar");

# Normal Re: sequences:
ok(&re_collapse("Re: Re: Foo is not Bar") eq "Re(2): Foo is not Bar");
ok(&re_collapse("Re: Re: Re: Foo is not Bar") eq "Re(3): Foo is not Bar");
ok(&re_collapse("Re: Re: (DaP) Re: Foo is not Bar") eq "Re(2): (DaP) Re: Foo is not Bar");
ok(&re_collapse("Re: Re: (DaP) Re: Re: Foo is not Bar") eq "Re(2): (DaP) Re(2): Foo is not Bar");

# Already renumbered:
ok(&re_collapse("Re: Re(2): Foo is not Bar") eq "Re(3): Foo is not Bar");
ok(&re_collapse("Re(2): Re: Foo is not Bar") eq "Re(3): Foo is not Bar");
ok(&re_collapse("Re: Re(2): (DaP) Re: Re(2): Foo is not Bar")
    eq "Re(3): (DaP) Re(3): Foo is not Bar");
ok(&re_collapse("Re(2): Re: Re: Re(2): Foo is not Bar") eq "Re(6): Foo is not Bar");
ok(&re_collapse("Re: Re(2): (DaP) Re(2): Re: Foo is not Bar")
    eq "Re(3): (DaP) Re(3): Foo is not Bar");
ok(&re_collapse("Re: Re: Re(2): (DaP) Re(2): Re: Re(4): Re: Foo is not Bar")
    eq "Re(4): (DaP) Re(8): Foo is not Bar");

# Example from node id 168373:
# (Nothing much to do about such).
ok(&re_collapse("Re: Re: (Someone) Re: (Someoneelse) Re: (Someotherelse) Re: Re: Something")
    eq "Re(2): (Someone) Re: (Someoneelse) Re: (Someotherelse) Re(2): Something");

#######################
# Testing re_add:
ok(&re_add("Foo is not Bar") eq "Re: Foo is not Bar");
ok(&re_add("Re: Foo is not Bar") eq "Re(2): Foo is not Bar");
ok(&re_add("Re(2): Foo is not Bar") eq "Re(3): Foo is not Bar");

# This one has to be collapsed later, so this is ok:
ok(&re_add("Re: Re: Foo is not Bar") eq "Re(2): Re: Foo is not Bar");
#######################
Update: Thanks to jeffa for pointing out a subtle bug in a few of my test cases, now fixed.

Update 2: There is now an updated version of this code posted in this node below. It handles and translates back and forth between the two major styles, "Re(\d):" and "Re^\d:". It also removes the pesky $& as per request. :)
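The updated code itself lives in that other node and is not reproduced here. Purely as an illustration of the two ideas, the sketch below shows one way to count a run of "Re: " through a capture group instead of $&, and one way to convert between "Re(n):" and "Re^n:". The sub names collapse_plain, to_caret and to_paren are hypothetical, and this is not necessarily how the updated version does it.

```perl
use strict;
use warnings;

# Collapse a run of two or more "Re: " using a capture group, so the
# match variable $& is never touched. Each "Re: " is 4 characters long.
sub collapse_plain {
    my $title = shift;
    $title =~ s{((?:Re: ){2,})}{"Re(" . length($1) / 4 . "): "}ge;
    return $title;
}

# Rewrite every "Re(n): " as "Re^n: " ...
sub to_caret {
    my $title = shift;
    $title =~ s{Re\((\d+)\): }{Re^$1: }g;
    return $title;
}

# ... and every "Re^n: " back to "Re(n): ".
sub to_paren {
    my $title = shift;
    $title =~ s{Re\^(\d+): }{Re($1): }g;
    return $title;
}

my $collapsed = collapse_plain("Re: Re: Re: Foo is not Bar");
my $caret     = to_caret($collapsed);
my $paren     = to_paren($caret);
print "$collapsed\n$caret\n$paren\n";
```

Only the plain "Re: Re: " rule is shown; the rules for already renumbered prefixes would need the same kind of treatment.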

Is there anyone else that would like this implemented? Please speak up. :)

Also, if you can come up with examples that break the above code (preferably common cases), then let me know so I can try to fix it.


You have moved into a dark place.
It is pitch black. You are likely to be eaten by a grue.

In reply to Title Re: collapsing revisited by Dog and Pony
