Are strings lists of characters?

Re: Are strings lists of characters? by Ovid (Cardinal) on Oct 17, 2002 at 18:54 UTC
If you really want a lazy list, could you just use an iterator? The following returns the individual characters, but only as you need them. It doesn't do a split, but it does reverse the string internally, so a very long string may be an issue. I just hacked it together to demonstrate one strategy; it could use some cleanup. `#!/usr/bin/perl -w use strict; sub NEXT { $_[0]->() } sub string_to_char_iter { my $string = shift; $string = reverse $string; sub { '' ne $string ? chop $string : undef } } my $string = join '', 'a' .. 'z'; my $iter = string_to_char_iter $string; while ( defined ( my $char = NEXT $iter ) ) { print "$char\n"; }` Cheers, Ovid Update: Just in case it's not clear, this was just some demo code. Obviously, for a string of 26 characters, creating an iterator would be overkill. Iterators are more useful if you have a large amount of data that is difficult to fit into memory, such as when reading from a file, or if you need to keep track of where you are in your data while reading it. Oh, and I tweaked the code just a hair.
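As a side note on the "reverse the string internally" caveat above, here is a minimal sketch of an index-based variant. It is not Ovid's code; the name `string_to_char_iter_by_index` is made up for illustration, and it assumes that walking the string with `substr` is acceptable for your data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the post above): walk the string by index
# with substr, so no reversed copy is built and nothing is chopped away.
sub string_to_char_iter_by_index {
    my $string = shift;    # private copy captured by the closure
    my $pos    = 0;
    return sub {
        return undef if $pos >= length($string);    # exhausted
        return substr( $string, $pos++, 1 );
    };
}

my $iter = string_to_char_iter_by_index( join '', 'a' .. 'e' );
my @seen;
while ( defined( my $char = $iter->() ) ) {
    push @seen, $char;
}
print "@seen\n";    # prints "a b c d e"
```

Because the loop tests `defined`, a character such as `0` does not end the iteration early.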
Re: Re: Are strings lists of characters? by John M. Dlugosz (Monsignor) on Oct 17, 2002 at 20:11 UTC
That's fine for a while loop, but map wouldn't know what to do with it. That's why we need the iterator at the language level. I suppose the implementation of the iterator would not be any different; rather, Perl 6 "knows" that the iterator is in fact an iterator and will use it transparently.
Re: Re: Re: Are strings lists of characters? by Ovid (Cardinal) on Oct 17, 2002 at 20:48 UTC
Yes, but you can write your own version of map that takes a code reference as the first argument and an iterator as the second argument, thus solving your problem for Perl 5, rather than having to wait for Perl 6 to come out during Christmas :) For more information on this, you can go to http://perl.plover.com/book/, subscribe to the mailing list and read the sample chapter. While I don't think that Dominus would mind my posting a brief code snippet to illustrate, I'm not entirely certain that it's appropriate, because he has asked that the chapter not be distributed (or even saved). However, if you check it out, search for the `&imap` function. It seems to resolve what you're looking for. Update: I contacted Dominus via email to inquire about the appropriateness of this and he replied that his only reason for wanting to prevent distribution is to revise and correct the chapter so as to avoid error-filled drafts floating around the 'Net. Posting a snippet is therefore okay. `#!/usr/bin/perl -w use strict; sub NEXT { $_[0]->() } sub imap (&$) { my ($transform, $it) = @_; return sub { my $next = NEXT($it); return unless defined $next; return $transform->($next); } } sub string_to_char_iter { my $string = shift; $string = reverse $string; sub { '' ne $string ? chop $string : undef } } my $string = join '', 'a' .. 'z'; my $iter = string_to_char_iter $string; my $uc_chars = imap { uc $_[0] } $iter; while ( defined ( my $char = NEXT $uc_chars ) ) { print "$char\n"; }` For that code, we pass in a subref and an iterator (which is also a subref). We return yet another subref that applies the first subref to the value returned from the iterator. In other words, we use the `imap()` function to transform one iterator into another, getting the results that you may need.
Cheers, Ovid
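The same wrapping idea extends to filtering. The following is only a sketch under assumptions: `igrep` is a hypothetical name chosen here by analogy with `imap`, and it is not claimed to match anything in the book chapter mentioned above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub NEXT { $_[0]->() }

# Hypothetical companion to imap (the name igrep is an assumption):
# returns a new iterator that yields only the values passing the test.
sub igrep (&$) {
    my ( $test, $it ) = @_;
    return sub {
        while ( defined( my $next = NEXT($it) ) ) {
            return $next if $test->($next);
        }
        return;    # underlying iterator is exhausted
    };
}

sub string_to_char_iter {
    my $string = shift;
    $string = reverse $string;
    return sub { '' ne $string ? chop $string : undef };
}

my $vowels = igrep { $_[0] =~ /[aeiou]/ } string_to_char_iter( join '', 'a' .. 'z' );
my @found;
while ( defined( my $char = NEXT($vowels) ) ) {
    push @found, $char;
}
print "@found\n";    # prints "a e i o u"
```

Like `imap`, it turns one iterator into another, so the two can be chained without ever building the full list.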
Re: Re: Re: Re: Are strings lists of characters? by Chmrr (Vicar) on Oct 18, 2002 at 05:43 UTC
Re: Re: Re: Re: Are strings lists of characters? by John M. Dlugosz (Monsignor) on Oct 18, 2002 at 14:34 UTC
(jeffa) 3Re: Are strings lists of characters? by jeffa (Bishop) on Oct 17, 2002 at 20:47 UTC
How about wrapping Ovid's while loop in another subroutine: `use strict; sub NEXT { $_[0]->() } sub string_to_char_iter { my $string = shift; $string = reverse $string; sub { chop $string } } sub get_all { my $iter = shift; my (@list,$char); push @list,$char while $char = NEXT $iter; return @list; } my $string = join '', 'a' .. 'z'; my $iter = string_to_char_iter $string; print $_,$/ for map uc, get_all($iter);` UPDATE: changed while loop to one liner to irk the Java types >:) jeffa L-LL-L--L-LL-L--L-LL-L-- -R--R-RR-R--R-RR-R--R-RR B--B--B--B--B--B--B--B-- H---H---H---H---H---H--- (the triplet paradiddle with high-hat)
Re: (jeffa) 3Re: Are strings lists of characters? by John M. Dlugosz (Monsignor) on Oct 18, 2002 at 14:29 UTC
Re: Re: (jeffa) 3Re: Are strings lists of characters? by Ovid (Cardinal) on Oct 18, 2002 at 17:01 UTC
Re: Re: Re: Are strings lists of characters? by shotgunefx (Parson) on Oct 17, 2002 at 23:01 UTC
While I personally miss built-in support for iterators (I was working on a patch to Want.pm for Iterator context, but haven't had the time to finish), you can get the DWIM with iterator closures. `#!/usr/bin/perl use warnings; use strict; sub char_iterator(\$){ my $str = shift; my $count = 0; return sub { if (wantarray){ my ($tc,$len) = ($count, (length($$str) - $count) ); $count = length($$str); return split('', substr($$str,$tc,$len)); }else{ return substr($$str,$count++,1); } } } my $string = join('', ('a'..'z') x 3); my $chariter = char_iterator($string); # Get one char at a time. while(my $char = $chariter->() ){ print "Got $char\n"; } # Get it in list context. my $mapiter = char_iterator($string); my @upper = map { uc($_)."\n" } $mapiter->(); print @upper;` Update: I had a possible workaround to the "lazy evaluate" situation here. I still plan to explore this further as time permits. I'd be interested to hear your thoughts. -Lee "To be civilized is to deny one's nature."
(jeffa) Re: Are strings lists of characters? by jeffa (Bishop) on Oct 17, 2002 at 20:40 UTC
This reply is not an answer to your question, but instead another question. One of the first things that stuck out when I read the Cookbook was the recipe for treating a string like an array of characters. Code was given, but with the caveat "don't do that." Why? Because you don't need to in Perl. I am curious to see some arguments that insist we need to treat strings as arrays in Perl. Are regexes that daunting? jeffa L-LL-L--L-LL-L--L-LL-L-- -R--R-RR-R--R-RR-R--R-RR B--B--B--B--B--B--B--B-- H---H---H---H---H---H--- (the triplet paradiddle with high-hat)
Re: Are strings lists of characters? by Aristotle (Chancellor) on Oct 18, 2002 at 03:54 UTC
Why split? Perl 5 can already do all that with an iterator that is context aware to boot. `$_ = join '', 'a' .. 'z'; print "$1\n" while /(.)/sg; print map "$_\n", /(.)/sg;` Did I miss anything? Makeshifts last the longest.
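To spell out what "context aware" means for `/(.)/sg`, here is a small sketch. The variable names are illustrative only; it assumes nothing beyond core Perl's documented behaviour of `//g` and `pos`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = "abc";
my @log;

# In scalar context, //g resumes from pos($string) on each call,
# so the match itself acts as the iterator.
while ( $string =~ /(.)/sg ) {
    push @log, "$1 ends at " . pos($string);
}
print "$_\n" for @log;    # "a ends at 1", "b ends at 2", "c ends at 3"

# In list context, the same match returns every capture at once.
my @chars = $string =~ /(.)/sg;
print scalar(@chars), "\n";    # prints 3
```

The position is stored on the scalar itself, which is why no separate iterator object is needed.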
Re: Re: Are strings lists of characters? by petral (Curate) on Oct 18, 2002 at 21:06 UTC
Did I miss anything? `print map chr."\n", unpack "C", $_;` ? (TIMTOWTDI :-) ) Update: that should be `"C*"`, and: `print chr for unpack "C*", $_` p
Re^3: Are strings lists of characters? by Aristotle (Chancellor) on Oct 18, 2002 at 21:08 UTC
Now put that in a `while` loop condition. Makeshifts last the longest.
Re: Are strings lists of characters? by Juerd (Abbot) on Oct 18, 2002 at 12:34 UTC
`package NeedsAName; sub TIEHASH { my ($class, $ref) = @_; return bless \$ref, $class; } sub FETCH { my ($self, $key) = @_; return substr $$$self, $key, 1; } sub STORE { my ($self, $key, $data) = @_; return substr($$$self, $key, 1) = $data; } sub CLEAR { my ($self) = @_; $$$self = ''; } sub DELETE { my ($self, $key) = @_; $self->STORE($key, ''); } sub EXISTS { my ($self, $key) = @_; return $key < length $$$self; } sub FIRSTKEY { my ($self) = @_; return length $$$self ? 0 : undef; } sub NEXTKEY { my ($self, $lastkey) = @_; return $lastkey + 1 < length($$$self) ? $lastkey + 1 : undef; } # tie %foo, 'NeedsAName', \$string;` I'm in a hurry, so I did not test and I used a hash because implementing FETCHSIZE, STORESIZE, EXTEND, PUSH, POP, SHIFT, UNSHIFT and SPLICE is too much work :) And there's no error checking. Hmmm... maybe I should just have said 'Why not tie?'... - Yes, I reinvent wheels.
Re: Are strings lists of characters? by Dominus (Parson) on Oct 18, 2002 at 18:10 UTC
Says juerd: `sub DELETE { my ($self, $key) = @_; $self->STORE($key, ''); }` That, unfortunately, doesn't work well. Suppose `%h` is tied to the string `converted`, and then you do `delete @h{2,5}`. You'd like to delete the `n` and the `r`, yielding `coveted`, but that's not what happens. Instead, Perl calls `DELETE(2)`, which deletes the `n`, leaving `coverted`, and then `DELETE(5)`, which deletes the `t`, not the `r`, leaving `covered` instead of `coveted`. Of course, that's not your fault, but at present it can't really be made to work right. I was going to put in a patch to fix this (motivated by the same problem using `delete` with `Tie::File`) but I haven't gotten around to it yet. The easy solution is that if you're deleting a list of values, Perl should delete them in order from last to first instead of from first to last. That fixes the `delete @h{2,5}` problem, but unfortunately the same problem persists with `delete @h{2,5,3}`. The patch I planned to make would allow the tied hash class to request that Perl call a special `DELETESLICE` method instead of making multiple calls to `DELETE` in such cases. It would follow the same form as the `NEGATIVE_INDICES` feature in the current bleadperl. -- Mark Dominus Perl Paraphernalia
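The index-shifting problem described above can be seen, and sidestepped, outside of `tie` altogether. This is a minimal sketch, not the `DELETESLICE` patch: `delete_chars` is a hypothetical helper that receives all positions at once and sorts them numerically, which is exactly the information a per-key `DELETE` method never gets.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: delete the characters at the given positions by
# working from the highest index down, so earlier removals never shift
# the positions still waiting to be processed.
sub delete_chars {
    my ( $string, @indices ) = @_;
    my %seen;
    for my $i ( sort { $b <=> $a } grep { !$seen{$_}++ } @indices ) {
        substr( $string, $i, 1 ) = '' if $i < length($string);
    }
    return $string;
}

print delete_chars( 'converted', 2, 5 ), "\n";       # prints "coveted"
print delete_chars( 'converted', 2, 5, 3 ), "\n";    # prints "coeted"
```

Sorting numerically, rather than just reversing the argument order, is what makes the `2,5,3` case come out right.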
Re: Re: Are strings lists of characters? by Thelonius (Priest) on Oct 29, 2002 at 20:58 UTC
`package StringArray; require Tie::Array; use base 'Tie::Array'; sub TIEARRAY { bless $_[1], $_[0] } sub FETCH { substr(${$_[0]}, $_[1], 1) } sub STORE { substr(${$_[0]}, $_[1], 1) = $_[2] } sub FETCHSIZE { length(${$_[0]}) } sub STORESIZE { ${$_[0]} = substr(${$_[0]}, 0, $_[1]) } sub DELETE { substr(${$_[0]}, $_[1], 1) = '' } 1;` Example: `#!perl -w use StringArray; use strict; my $test = "Hello dolly"; my @testa; tie @testa, 'StringArray', \$test; print "\$#testa = $#testa\n"; print "testa[1] = $testa[1]\n"; $testa[1] = 'b'; print "testa[1] = $testa[1]\n"; map {$_ = uc $_} @testa; print "after map: test = $test\n"; delete $testa[1]; print "after delete: test = $test\n"; push @testa, "C"; print "after push: test = $test\n";`
Re^3: Are strings lists of characters? by Aristotle (Chancellor) on Oct 29, 2002 at 22:52 UTC
Nope - still the same problem. `#!/usr/bin/perl -w use strict; $_ = "converted"; tie my @test, 'StringArray', $_; print map "$_\n", map { delete @$_[2,5]; join '', grep defined, @$_; } \@test, [ /(.)/sg ]; package StringArray; require Tie::Array; use base 'Tie::Array'; sub TIEARRAY { my $str = pop; bless \$str, shift } sub FETCH { substr(${$_[0]}, $_[1], 1) } sub STORE { substr(${$_[0]}, $_[1], 1) = $_[2] } sub FETCHSIZE { length(${$_[0]}) } sub STORESIZE { ${$_[0]} = substr(${$_[0]}, 0, $_[1]) } sub DELETE { substr(${$_[0]}, $_[1], 1) = '' } 1; __END__ covered coveted` Makeshifts last the longest.
Re: Are strings lists of characters? by jepri (Parson) on Oct 18, 2002 at 14:45 UTC
Petruchio was expanding on this in the chatterbox a while back... but he wanted it as part of a grander plan to have polymorphic functions that worked on all data types, e.g. delete should delete entries from arrays and strings, we should be able to push and pop strings, length should work properly on arrays, etc. So in that sense, it's a desire for the language to be a bit more consistent, rather than an implementation issue. ____________________ Jeremy I didn't believe in evil until I dated it.
Re: Are strings lists of characters? by gjb (Vicar) on Oct 20, 2002 at 17:09 UTC
There happens to be a CPAN module that allows one to tie a list to a string: Tie::CharArray. I've never used it, since I have generally gotten by with `substr` for iterating over strings, but some tests I ran just now seem to show that it works pretty well. There have been a few situations where I really wished I could treat strings as arrays, but since `substr` can be assigned to, I could get around this limitation. A specific example that comes to mind is the implementation of a genetic programming algorithm where I preferred to have strings rather than lists as datatypes for the chromosomes. Hope this helps, -gjb-
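For readers who haven't used assignable `substr`, here is a small sketch of the workaround mentioned above. The "chromosome" string and its contents are made up for illustration; only core `substr` behaviour is assumed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $chromosome = "AAAAAAAA";    # illustrative data only

# substr as an lvalue: overwrite one "gene" in place.
substr( $chromosome, 3, 1 ) = 'T';

# Four-argument substr does the same kind of replacement
# and returns the text that was replaced.
my $old = substr( $chromosome, 0, 2, 'GG' );

# Walk the string by index without splitting it into a list first.
my @genes = map { substr( $chromosome, $_, 1 ) } 0 .. length($chromosome) - 1;

print "$chromosome\n";         # prints "GGATAAAA"
print "$old\n";                # prints "AA"
print scalar(@genes), "\n";    # prints 8
```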
Re: Are strings lists of characters? by Aristotle (Chancellor) on Oct 20, 2002 at 17:53 UTC
I had to think very long and hard for a case where having strings be character arrays would offer a syntactically superior, more concise way of expressing something than using substr. I have finally come up with something. Consider this: `my @string = "hubris" =~ /(.)/sg; @string[0,2,5] = "etv" =~ /(.)/sg; print reverse @string; __END__ virtue` This would be very awkward to achieve with substrs, particularly for more complex examples. (Bioinformatics might be an area where such could be useful.) But thanks to `/(.)/sg` expressiveness doesn't suffer much even here; the only concern I see is efficiency, if you do this a lot. But if that really is a problem, a class with a real array in its guts and an overloaded stringification operator would probably suffice. And that one is downright trivial, something like: `sub stringify { local $"; "@{$_[0]}" }` Assignment needs to be overloaded too, I guess, and would be slightly less trivial. All in all, I conclude that there's not much need for such a feature at the language level. A module should suffice. Makeshifts last the longest.
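A rough sketch of the "class with a real array in its guts" idea, under stated assumptions: `CharArrayString` is a hypothetical class name, the stringifier uses `join` rather than the `local $"` form shown above, and assignment is not overloaded, so this covers only the trivial half.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package CharArrayString;    # hypothetical class name

# Only stringification is overloaded; the object itself is a blessed
# array reference, so ordinary array slices work on it directly.
use overload '""' => \&stringify;

sub new {
    my ( $class, $string ) = @_;
    return bless [ $string =~ /(.)/sg ], $class;
}

sub stringify { join '', @{ $_[0] } }

package main;

my $word = CharArrayString->new("hubris");
@{$word}[ 0, 2, 5 ] = "etv" =~ /(.)/sg;    # same slice trick as in the post
@$word = reverse @$word;
print "$word\n";                           # prints "virtue"
```

The object can be interpolated like a string while still supporting slices, `reverse`, `push` and friends through its underlying array.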
Re: Re: Are strings lists of characters? by gjb (Vicar) on Oct 21, 2002 at 16:00 UTC
A few quite common tasks map more naturally to a string-as-array approach. As an example, consider determining the common prefix (or suffix) of two strings. It can be done by regex matching: `my @str = ('ABCD', 'ABEF'); my $str = join('-', @str); if ($str =~ /^([A-Z]+)[A-Z]*\-\1[A-Z]*$/) { print "common: '$1'\n"; } else { print "no match\n"; }` but it's much more natural to do it with a simple `for` over the characters. Of course one can get by with `substr`, but it looks decidedly weird. Regards, -gjb-
Re^3: Are strings lists of characters? by Aristotle (Chancellor) on Oct 21, 2002 at 18:02 UTC
Way too hackish, not to mention it breaks if the chosen delimiter appears in your input strings - I'll get back to that in a bit though. To find the common prefix, you have to iterate over two variables, be they scalars or arrays. Using a `for` loop: `my (@str1, @str2); my ($i, @prefix) = (0); for(@str1) { last if $_ ne $str2[$i++]; push @prefix, $_; }` Or a `while` loop: `my (@str1, @str2); my ($i, @prefix) = (0); push @prefix, $str1[$i++] while($str1[$i] eq $str2[$i]);` I'd definitely prefer the `while` version, simply because the arrays are treated equally. Now let's look at how you'd do that over scalars: `my ($s1, $s2) = ("ABCD", "ABEF"); my $i = 0; $i++ while $i < length($s1) and substr($s1, $i, 1) eq substr($s2, $i, 1); my $prefix = substr $s1, 0, $i;` That's hardly any different to read, way clearer than the regex solution, shorter and more idiomatic to boot, and doesn't break regardless of input. OK, using the ternary operator for your code would shorten the regex approach, but if anything, it would probably conceal the code's intent even further. No, the only advantage strings-as-arrays would offer as far as I can see is for simultaneously replacing multiple non-contiguous parts of the string with parts of some other string. But then, that's such a rare circumstance that it shouldn't be unacceptably painful to just listify the strings using `/(.)/sg` for that, then glue the result back together. I do see compelling reasons to syntactically extend push, shift and friends for dealing with strings (although I also see reasons not to), but definitely not for making strings full-blown arrays. Makeshifts last the longest.
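The `substr` approach also generalizes to more than two strings. What follows is a sketch only: `common_prefix` is a hypothetical helper name, and it makes no attempt at efficiency.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: longest common prefix of any number of strings,
# comparing one character position at a time with substr.
sub common_prefix {
    my ( $first, @rest ) = @_;
    return '' unless defined $first;
    my $i = 0;
    POSITION: while ( $i < length($first) ) {
        my $char = substr( $first, $i, 1 );
        for my $other (@rest) {
            last POSITION
                if $i >= length($other) or substr( $other, $i, 1 ) ne $char;
        }
        $i++;
    }
    return substr( $first, 0, $i );
}

print common_prefix( 'ABCD', 'ABEF' ), "\n";        # prints "AB"
print common_prefix( 'ABCD', 'ABCD' ), "\n";        # prints "ABCD"
print common_prefix( 'A-B', 'A-C', 'A-' ), "\n";    # prints "A-"
```

The explicit length checks keep the loop from running past the end of identical strings, and no delimiter is involved, so input containing `-` is handled like any other character.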