Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

Re: Sorting Strings By Vowel Sequence

by choroba (Cardinal)
on Nov 17, 2014 at 21:52 UTC ( [id://1107487]=note: print w/replies, xml ) Need Help??


in reply to Sorting Strings By Vowel Sequence

I used the Schwartzian Transform:
#!/usr/bin/perl use warnings; use strict; use feature qw{ say }; my @words = qw( fanfare apparate panacea parmesan albatross albacore false vermeil candelabra beans ); say for map $_->[0], sort { $a->[1] cmp $b->[1] } map [ $_, join q(), /[aeiou]/g ], @words;
لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

Replies are listed 'Best First'.
Re^2: Sorting Strings By Vowel Sequence
by Zucan (Beadle) on Nov 18, 2014 at 02:17 UTC

    UPDATED: Second edge case added.

    I am curious how this deals with words with vowel sequences being the same? For example, running the above script with the following words:
    my @words = qw( xaxexix babebib xaxexi babebibb );
    Produces the following output:
    xaxexix babebib xaxexi babebibb
    Since the vowel positions for xaxexix and babebib are the same, I would expect a secondary alpha-sort to occur here. Another potential issue is how the algorithm deals with words where the vowel sequences are in the same order but not in the same positions. For example:
    my @words = qw( babebib baebbib );
    Produces the following output:
    babebib baebbib
    Even though babebib is alphabetically before baebbib, the e occurs one position sooner in baebbib, which should put it before babebib.

    Maybe I am reading too much into the initial question. Having done Computer Programming Contests a number of times before, these are the kinds of things that the judges are looking for to see if the teams caught all the possibilities.

    Zucan

      Well it wasn't part of the original problem statement, so the solution allows sort to pick any order it chooses for elements with the same vowel string. You can tack on the original word to the end of the vowels string to sort by consonants when vowel strings match.

      Here is that idea, but using GRT Sort instead of ST:

      use 5.014; say join ', ', map {s/.*\0//r} sort map {tr/aeiou//cdr."\0$_"} qw( xexix babebib xaxexi babebibb );
      output:
      babebib, babebibb, xaxexi, xexix

        Nice. Since I added the second edge case, the above still doesn't address the sorting of "babebib" and "baebbib". Even though the first one is alphabetically first, the second should be first based on the positioning of the second vowel. That is at least how I would read the problem.

        You are right, however, the initial problem was loosely stated, so there is definitely room for interpretation.

        Update: The second case may not apply here because the initial question provides sample output that doesn't include the positioning of the vowels in the words. If positioning mattered, the output would actually be the following instead:

        apparate albacore albatross panacea fanfare false parmesan candelabra +beans vermeil
      You can combine many sorts. Here's something quick and dirty:
      #!/usr/bin/perl use warnings; use strict; use feature qw{ say }; my @words = qw( xaxexix babebib xaxexi babebibb baebbib ); sub prepare_sort { my ( $vowels, $positions ) = ( '', '' ); while (/([aeiou])/g) { $vowels .= $1; $positions .= sprintf '%03d', pos; } return [ $_, $positions, $vowels ]; } say for map $_->[0], sort { $a->[2] cmp $b->[2] } # vowels sort sort { $a->[1] cmp $b->[1] } # positional sort sort { $a->[0] cmp $b->[0] } # alphabetical sort map prepare_sort(), @words;
      Output:
      baebibb babebib babebibb xaxexi xaxexix
        Now that I think of it, it can be greatly simplified:
        sub prepare_sort { return [ $_, tr/aeiou/z/cr, tr/aeiou//dr ]; }
        Thanks to Zucan for the tip!

        So we dont need a sub now:

        #!/usr/bin/perl use warnings; use strict; use feature qw{ say }; my @words = qw( xaxexix babebib xaxexi babebibb baebbib ); say for map $_->[0], sort { $a->[2] cmp $b->[2] } # vowels sort sort { $a->[1] cmp $b->[1] } # positional sort sort { $a->[0] cmp $b->[0] } # alphabetical sort map [ $_, tr/aeiou/z/cr, tr/aeiou//dr ], @words;
      Hi Zucan

      If you want to sort alphabetically the words with the vowels at the same position, you can make just a very simple change to just one line of the original ST code suggested by choroba:

      #!/usr/bin/perl use warnings; use strict; use feature qw{ say }; my @words = qw( fanfare apparate panacea parmesan albatross albacore false vermeil candelabra beans ); say for map $_->[0], sort { $a->[1] cmp $b->[1] || $a->[0] cmp $b->[0]} map [ $_, join q(), /[aeiou]/g ], @words;
      For each time comparison step in the sorting algorithm, if the vowels compare equal, then
      $a->[1] cmp $b->[1]
      return 0. And it returns -1 or +1 in all the other cases. If you get a 0, then the logical or (||) will execute the second part of the sort statement:
      $a->[0] cmp $b->[0]
      to try to see if the whole expression is true. But if you get -1 or +1 with the first part, then the whole expression if already known to be true, so that the second part of the statement will not be executed (short-circuited).

      I do not think that the other edge case you mention is compatible with the initial question.

      That is a very good point, I never stopped to think if the words were the same. Realizing now that could be a possibility I will fix that. Thank you for pointing that out. I am new to Perl so I am just taking programming contest-esque problems and am trying to work them out to learn.

Re^2: Sorting Strings By Vowel Sequence
by NewToPerl777 (Novice) on Nov 17, 2014 at 21:56 UTC

    Wow, I did not know this was a known method. Thank you for the link, insight, and example I am going to research this more. Thank you very much for your help!

      The Schwartzian Transform is a fairly common idiom in Perl and is, to a large extent, using some principles of functional programming (languages such as Lisp or Scheme) in Perl. If Choroba had not done it before, I would have suggested almost the same solution (with probably just some very minor syntactical variations).

      To understand such a construct, you have to read it from bottom to top (and sometimes, depending on the exact layout, from right to left).

      To understand it better, you can split it mentally into three separate instructions:

      my @temp_array = map [ $_, join q(), /[aeiou]/g ], @words; @temp_array = sort { $a->[1] cmp $b->[1] } @temp_array; say for map $_->[0], @temp_array;
      The first instruction creates a temporary array of array references whose content will be something like this:
      (["fanfare", "aae"], [apparate, "aaae"], etc.)
      The second instruction will sort the array references in accordance with the second field of each array ref (i.e. the vowels). At the end of this, the temporary array contains the array refs sorted according to the vowels.

      The final step is to print out the original words (stripped of the data added in the first step), which will be now sorted according to the vowels.

      This process is sometimes referred to as : "decorate, sort, undecorate", because it is adding new data to the original one to enable the sort to work efficiently, and then removes the added data.

      Now, the beauty of the Schwartzian Transform is that you don't really need the temporary array: you can just pipeline the data from one step to the next. So if you look back at the code suggested by choroba, the map at the bottom creates a list of array references which are fueled into the sort line above, which sorts in accordance with what the map at the bottom has added to the data, and the result of the sort is fed into the map of the first line, which removes what the first map had added and returns the original list duly sorted in accordance to the vowels of each word.

      I hope this helps understanding this slightly advanced construct. Once understood, it becomes very natural to use it.

        Thank you so much for taking the time to break this down. I am new to Perl and this breakdown really helps me understand in more detail to what is actually going on behind the scenes.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1107487]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (3)
As of 2024-04-25 06:48 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found