I recently watched a thread in Meditations that got a serious case of "right-shifting", that is, lots of replies to replies, with a long sequence of "Re: Re: Re: Re:" as the result. This kind of disturbs the picture, and in this case the Re:s came to dominate the titles. And I remembered the quite recent node Feature request: Collapsing, which drew only a few replies and no action. Maybe not so surprising: there was little interest, and no code.

Anyways, I started wondering if it was possible to programmatically collapse sequences of "Re: Re:" into the more readable (IMO) "Re(2):" etc. It turned out to be not that trivial, but some thinking/tinkering solved the problem, at least it appears so to me.

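I have some code below that features two subs: re_collapse, which collapses sequences of "Re: " together into one "Re(N): ", and re_add, which adds a new "Re: " or "Re(N): " at the start of a title.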

Those are some mighty ugly regexps IMO, but whatever gets the job done... :) Update: Maybe I should mention that the combination of while and /g in some of the regexps is deliberate, even if it looks redundant: the while alone would suffice, but adding /g gets more done in each pass when several matching cases exist.
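To make the while-plus-/g point concrete, here is a small sketch that counts how many substitution passes the 'Re(N): Re(M): ' rule needs with and without /g. The helper name count_passes and the sample title are made up for this demonstration; they are not part of the posted code.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: apply the 'Re(N): Re(M): ' merge rule until it no
# longer matches. Returns the final title and the number of s/// passes
# that actually changed something.
sub count_passes {
    my ($title, $global) = @_;
    my $passes = 0;
    while (1) {
        my $changed = $global
            ? ($title =~ s{Re\((\d+)\): Re\((\d+)\): }{"Re(" . ($1 + $2) . "): "}ge)
            : ($title =~ s{Re\((\d+)\): Re\((\d+)\): }{"Re(" . ($1 + $2) . "): "}e);
        last unless $changed;
        $passes++;
    }
    return ($title, $passes);
}

my $sample = "Re(2): Re(2): (DaP) Re(2): Re(2): Foo is not Bar";
my ($with_g,    $n_with)    = count_passes($sample, 1);
my ($without_g, $n_without) = count_passes($sample, 0);

print "with /g:    $with_g (passes: $n_with)\n";
print "without /g: $without_g (passes: $n_without)\n";
```

Both variants end at the same title; with /g the two independent pairs are merged in a single pass, without it each pair costs its own trip around the loop.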

The collapsing sub will only collapse "pure" sequences of "Re", which means that when someone enters their name or such in the middle of the title to keep track, it simply finishes the job there, and starts to recompress on the other side. Nothing much to do about that without too seriously altering the author's intent.

Anyhow, this should probably be a User Setting, not the default. Some people may not like that you alter their stuff, but for myself, I'd really appreciate having my titles collapsed, and also getting "Re(2): " by default instead of "Re: Re: " when replying at the second level. Others probably want it as it is, thus a user setting.

#!/usr/bin/perl -w
use strict;
use Test;
BEGIN { plan tests => 18; }

# Collapses sequences of Re: together into one Re(\d):
sub re_collapse {
    my $title = shift;

    # Normal 'Re: Re: ' sequences
    $title =~ s{(Re: ){2,}}{"Re(" . length($&)/4 . "): "}ge;

    # Already renumbered:
    # 'Re: Re(\d+): '
    while($title =~ s{Re: Re\((\d+)\): }{"Re(" . ($1 + 1) . "): " }ge) {};

    # 'Re(\d+): Re: '
    while($title =~ s{Re\((\d+)\): Re: }{"Re(" . ($1 + 1) . "): " }ge) {};

    # 'Re(\d+): Re(\d+): '
    # (\d+ rather than \d, so that counts of 10 or more also merge)
    while($title =~ s{Re\((\d+)\): Re\((\d+)\): }{"Re(" . ($1 + $2) . "): "}ge) {};

    return $title;
}

# Adds a new Re: or Re(\d): at the start of a string:
sub re_add {
    my $title = shift;
    $title =~ s{^(Re(\((\d+)\))?: )?}{($1) ? ("Re(" . (($3||1) + 1) . "): ") : "Re: "}e;
    return $title;
}

###########################
#
# Tests:
#
# Testing re_collapse:

# Should not change:
ok(&re_collapse("(DaP) Re: Foo is not Bar") eq "(DaP) Re: Foo is not Bar");
ok(&re_collapse("(DaP) Foo is not Bar") eq "(DaP) Foo is not Bar");
ok(&re_collapse("Re(3): (DaP) Re: Foo is not Bar") eq "Re(3): (DaP) Re: Foo is not Bar");

# Normal Re: sequences:
ok(&re_collapse("Re: Re: Foo is not Bar") eq "Re(2): Foo is not Bar");
ok(&re_collapse("Re: Re: Re: Foo is not Bar") eq "Re(3): Foo is not Bar");
ok(&re_collapse("Re: Re: (DaP) Re: Foo is not Bar") eq "Re(2): (DaP) Re: Foo is not Bar");
ok(&re_collapse("Re: Re: (DaP) Re: Re: Foo is not Bar") eq "Re(2): (DaP) Re(2): Foo is not Bar");

# Already renumbered:
ok(&re_collapse("Re: Re(2): Foo is not Bar") eq "Re(3): Foo is not Bar");
ok(&re_collapse("Re(2): Re: Foo is not Bar") eq "Re(3): Foo is not Bar");
ok(&re_collapse("Re: Re(2): (DaP) Re: Re(2): Foo is not Bar") eq "Re(3): (DaP) Re(3): Foo is not Bar");
ok(&re_collapse("Re(2): Re: Re: Re(2): Foo is not Bar") eq "Re(6): Foo is not Bar");
ok(&re_collapse("Re: Re(2): (DaP) Re(2): Re: Foo is not Bar") eq "Re(3): (DaP) Re(3): Foo is not Bar");
ok(&re_collapse("Re: Re: Re(2): (DaP) Re(2): Re: Re(4): Re: Foo is not Bar")
    eq "Re(4): (DaP) Re(8): Foo is not Bar");

# Example from node id 168373:
# (Nothing much to do about such).
ok(&re_collapse("Re: Re: (Someone) Re: (Someoneelse) Re: (Someotherelse) Re: Re: Something")
    eq "Re(2): (Someone) Re: (Someoneelse) Re: (Someotherelse) Re(2): Something");

#######################
# Testing re_add:
ok(&re_add("Foo is not Bar") eq "Re: Foo is not Bar");
ok(&re_add("Re: Foo is not Bar") eq "Re(2): Foo is not Bar");
ok(&re_add("Re(2): Foo is not Bar") eq "Re(3): Foo is not Bar");

# This one has to be collapsed later, so this is ok:
ok(&re_add("Re: Re: Foo is not Bar") eq "Re(2): Re: Foo is not Bar");
#######################

Update: Thanks to jeffa for pointing out a subtle bug in a few of my test cases, now fixed.

Update 2: There is now an updated version of this code posted at this node below, that handles and translates back and forth between the two major styles "Re(\d):" and "Re^\d:". It also removes the pesky $& as per request. :)

Is there anyone else that would like this implemented? Please speak up. :)

Also, if you can come up with examples that break the above code (preferably common cases), then let me know so I can try to fix it.


You have moved into a dark place.
It is pitch black. You are likely to be eaten by a grue.

Re: Title Re: collapsing revisited
by tadman (Prior) on Jun 08, 2002 at 15:36 UTC
    Great effort, but what about "depersonalising" the title as well? While some find it valuable to put their handle in the reply title, these are often left in there by careless responses. So, what you have is Re: (bob) Re: Donut instead of what should be Re(2): Donut.

    BTW, $&? Yikes.
      Well, I guess that could be done, as another user setting. Something like
      s{(?<=Re:) \(\w+\)}{}g;
      might even get the job done, although I haven't tried it enough. And then you put it through the collapse sub to put those Re:s together.
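A quick sketch of that two-step idea, for anyone who wants to try it: strip the "(handle)" tags first, then collapse. The sub name re_depersonalise_and_collapse is invented for this example, and the collapse step only handles plain "Re: Re: " runs (using a capture group to measure the run), not titles that are already renumbered.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper combining the two steps.
sub re_depersonalise_and_collapse {
    my $title = shift;

    # Step 1: drop a " (handle)" tag that directly follows a "Re:".
    # \w+ only matches simple handles; "(Bob's bro)" would be left alone.
    $title =~ s{(?<=Re:) \(\w+\)}{}g;

    # Step 2: collapse each run of two or more "Re: " into "Re(N): ".
    $title =~ s{((?:Re: ){2,})}{"Re(" . length($1)/4 . "): "}ge;

    return $title;
}

print re_depersonalise_and_collapse("Re: (bob) Re: Donut"), "\n";
# prints: Re(2): Donut
```

A handle at the very start of the title, as in "(DaP) Foo is not Bar", is not preceded by "Re:" and therefore stays untouched.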

      $& has its uses too, in this case it was the best way I could come up with. ;-)

      You have moved into a dark place.
      It is pitch black. You are likely to be eaten by a grue.
        As tadman++ noted, the problem is that once you use it, all regexes anywhere in the script become slower, which is not likely to be tolerable in the context of the Everything engine. And this one in particular is pretty simple:
        $title =~ s{((?:Re: ){2,})}{"Re(" . length($1)/4 . "): "}ge;
        Works a treat for your test cases.

        Makeshifts last the longest.

        I suppose what I was wondering was why you left your match unmemorized, and then used $&. I've been led to believe that's bad form, since once you open that Pandora's box ($`, $& and $'), all your regexes become slower as a result.
      Actually, if we could just all agree to get rid of "vanity re's" like "*Re" and "(its_me) Re" entirely, we would be much better off re-wise. It's terribly re-dundant.


Re: Title Re: collapsing revisited
by Aristotle (Chancellor) on Jun 08, 2002 at 15:42 UTC
    A very good effort; I've just recently started going through my writeups and collapsing anything with more than double "Re:"s. My interjection is that I prefer the "Re^3:" style collapsing - but if collapsing happened automatically, I would likely not care enough to edit the node title. A bigger concern is that your code would break apart where it encounters a "Re^X:". It should probably try to match not /Re\((\d+)\):/ but rather something akin to /Re.(\d+).?:/.

    Makeshifts last the longest.

      Yep, jeffa pointed out a similar example ("2Re: "). That is another good reason to have it as a user setting. It isn't meant to enforce a style upon someone, after all...

      The solution would probably be to be able to set which style you prefer for output, then have the engine look for all those cases and transform them.
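Roughly, such a setting could look like the sketch below. Everything in it is an assumption made for illustration: the sub name re_restyle, the style names 'paren' and 'caret', and the choice to expand every known collapsed form back into plain "Re: " before collapsing again. It does not handle the "2Re: " style.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical formatters, one per output style.
my %format = (
    paren => sub { "Re($_[0]): " },
    caret => sub { "Re^$_[0]: "  },
);

sub re_restyle {
    my ($title, $style) = @_;
    my $fmt = $format{$style} or die "unknown style '$style'";

    # Expand "Re(N): " and "Re^N: " back into N plain "Re: " prefixes.
    $title =~ s{Re(?:\((\d+)\)|\^(\d+)): }{"Re: " x (defined $1 ? $1 : $2)}ge;

    # Collapse each run of two or more "Re: " into the chosen style.
    $title =~ s{((?:Re: ){2,})}{$fmt->(length($1)/4)}ge;

    return $title;
}

print re_restyle("Re: Re^2: Foo is not Bar", "paren"), "\n";
print re_restyle("Re(2): Re: Foo is not Bar", "caret"), "\n";
```

Going through the expanded form keeps the collapsing step identical for every style, so adding another output style would only mean adding another formatter.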

      I might take a stab at it later, but to justify that extra complexity, it would have to be both viable to do and wanted by enough people.

      I had seen a few posts with Re^2 and friends, but I didn't think they were that common (plus this is more of a draft and first example). If we can have all the ways, I'm all for it. :)

      Personally though, I find the Re^2: style harder to read, even though perhaps more logically correct. :)


      Ok, I took the challenge. Following here is a new version, that handles both types (though not the "2Re:" style, at least not yet). It is about time to start refactoring though, it would seem, as it is getting really hairy. As a demonstration though, it will serve I think.

      Now we have some new subs, instead of the old ones, we call the same ones with an added _caret for the "Re^2:" style, and _paranthesis for the "Re(2):" style. This goes for both re_collapse_* and re_add_*. As an added bonus, it will also translate from one type to the chosen one, so you can have it all your style.

      Given this approach, and some refactoring, it should be possible to add yet more types. I'm not 100% sure this will hold for every possible case, but the tests I do have seem to indicate it would hold for most stuff.

      It also has removed the $&, as was pointed out, though I didn't understand why at first... doh. :) But then, are @- and @+ free from this? I hope so.
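As far as I can tell from perlvar, @- and @+ are the suggested way to get at the matched substring without paying the $& penalty. A minimal sketch of the first collapsing step written that way follows; the sub name re_collapse_plain is made up here, and this is not the version referred to in this node.

```perl
#!/usr/bin/perl -w
use strict;

# Collapse plain "Re: Re: ..." runs. The length of the whole match is
# taken from the offset arrays: $-[0] is where the match starts and
# $+[0] is where it ends, so no capture group and no $& are needed.
sub re_collapse_plain {
    my $title = shift;
    $title =~ s{(?:Re: ){2,}}{"Re(" . ($+[0] - $-[0])/4 . "): "}ge;
    return $title;
}

print re_collapse_plain("Re: Re: Re: Foo is not Bar"), "\n";
# prints: Re(3): Foo is not Bar
```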

      Anyhow, here it is - enjoy:

Re: Title Re: collapsing revisited
by Aristotle (Chancellor) on Jun 08, 2002 at 21:38 UTC
    This works against all your test cases.
    sub re_collapse {
        local $_ = $_[0];

        # explode all collapsed in one go
        s{Re.(\d+).?: }
         { "Re: " x $1 }ge;

        # re-collapse (no pun intended)
        s{ ( (?:Re:\ ) {2,} ) }
         { "Re(" . length($1)/4 . "): " }xge;

        return $_;
    }
    A dedicated re_add seems pointless to me since you want to collapse all existing "Re:"s at that point anyway, so I suggest just passing it "Re: $title"; it seems cumbersome to me to implement that functionality using extra code.
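In code, that suggestion amounts to something like the following sketch. The wrapper name reply_title is hypothetical, and re_collapse is repeated (in compact form, without /x) only so the example runs on its own.

```perl
#!/usr/bin/perl -w
use strict;

# The two-step collapse from above: expand, then re-collapse.
sub re_collapse {
    my $title = shift;
    $title =~ s{Re.(\d+).?: }{"Re: " x $1}ge;
    $title =~ s{((?:Re: ){2,})}{"Re(" . length($1)/4 . "): "}ge;
    return $title;
}

# Hypothetical wrapper: adding a level is just prefixing and collapsing.
sub reply_title {
    my $parent_title = shift;
    return re_collapse("Re: $parent_title");
}

print reply_title("Foo is not Bar"), "\n";         # Re: Foo is not Bar
print reply_title("Re: Foo is not Bar"), "\n";     # Re(2): Foo is not Bar
print reply_title("Re(2): Foo is not Bar"), "\n";  # Re(3): Foo is not Bar
```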

    Makeshifts last the longest.

Re: Title Re: collapsing revisited
by mojotoad (Monsignor) on Jun 08, 2002 at 19:00 UTC
    I'd love to see this added. I tend to pay more attention to indentation as an indicator of depth, rather than counting Re: beads.

    While we're on the topic, why doesn't "Note Depth" work in the user settings? It appears to be hard-coded to max out at 3, despite your indicated preference. Perhaps more terse subject lines down in the murky depths would improve the look of indentation.


Re: Title Re: collapsing revisited
by Juerd (Abbot) on Jun 08, 2002 at 15:56 UTC

    I *like* the duplicated Re:s. They make a nice visual representation of the depth. Numbers don't. There's a space, so browsers can wrap the text. I'm not seeing the problem here. If Re-collapsing is going to be introduced can it please be made configurable? Update - It is proposed to be configurable and not default. Maybe I should learn to read.

    - Yes, I reinvent wheels.
    - Spam: Visit eurotraQ.

      Well, once someone's settings have collapsed the title, his note and those from everyone below his will appear collapsed regardless of their settings. Moreover, since subjects can be edited manually, you have that problem already. So there isn't much you can do for your cause, unfortunately.

      Makeshifts last the longest.

Re: Title Re: collapsing revisited
by cjf (Parson) on Jun 09, 2002 at 05:50 UTC

    Why not just leave the 'Re:'s out of the title field when submitting a post and have them added automatically after? I can see a few complications with this, but it would still be simpler than the alternatives, wouldn't it?
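A rough sketch of what that might look like, assuming the site stores only the bare subject and knows each node's depth in the thread. Both sub names and the depth numbering (0 for the root node) are assumptions made for this example, not anything the engine is known to provide.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical: drop any leading "Re: ", "Re(N): " or "Re^N: " markers the
# poster typed, so that only the bare subject is stored.
sub strip_leading_re {
    my $title = shift;
    $title =~ s{^(?:Re(?:\(\d+\)|\^\d+)?: )+}{};
    return $title;
}

# Hypothetical: build the displayed title from the stored subject and the
# node's depth (0 = root node, 1 = direct reply, and so on).
sub render_title {
    my ($subject, $depth) = @_;
    return $subject       if $depth == 0;
    return "Re: $subject" if $depth == 1;
    return "Re($depth): $subject";
}

my $stored = strip_leading_re("Re: Re(2): Foo is not Bar");
print render_title($stored, 3), "\n";
# prints: Re(3): Foo is not Bar
```

Handles placed in the middle of a title are not touched by this; only markers at the very start are stripped.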

      You know, this is why I am starting to think that one should post all their code to this site, no matter what it does. Every time, smart people come up with different approaches, and possibly better or more elegant solutions. It is far too easy to continue down the path that you started on without ever stopping to think.

      One other such example is Screamer's suggestion here.

      I am not sure which, if any approach is the best or even viable yet, but it is always cool to see this process take place.

      The only complication I see with your suggestion is the authors' names, which will be put together even if someone else has put a post in the middle, like:

      Re: (Bob) Re: Re: (Bob's bro) Re: Foo is Bar, too!
      That would become:
      Re(4): (Bob) (Bob's bro) Foo is Bar, too!
      Which may not be what was intended.

      Another thing is the fact that I don't really want to impose my or anyone else's style upon others, but rather make it possible for me to display custom titles, while the titles themselves are unchanged for those who want them raw (or some other way).

      You have moved into a dark place.
      It is pitch black. You are likely to be eaten by a grue.
      That would destroy the possibility of changing the subject when a branch of the node has gone offtopic; or, it wouldn't, but the Re:'s would show the wrong depth then.

      Makeshifts last the longest.

        Nope, it would only leave the 'Re:'s out of the title field, people could still change the titles, just not the number of 'Re:'s (stripping out any leading 'Re:'s would also be an idea). The depth wouldn't be off either if it was done properly, a thread's nodes aren't grouped by their titles.

        As for display, I like the Re(3): format.

        Update: Oops, might have misinterpreted your post. Are you referring to when a thread goes off-topic, someone changes the title to say "An off-topic node about cheese", and then the next reply is "Re: An off-topic node about cheese?" If this is the case, you're right, with my suggestion the reply might be "Re(3): An off-topic node about cheese" instead. It's a good point, but I don't think it's a major problem.