PerlMonks  

Best Way to Get Length of UTF-8 String in Bytes?

by Jim (Curate)
on Apr 23, 2011 at 20:53 UTC ( #900994=perlquestion )
Jim has asked for the wisdom of the Perl Monks concerning the following question:

What's the best way to get the length of a UTF-8 string in bytes? Is there a canonical "right" way?

perldoc -f length states:

Like all Perl character operations, length() normally deals in logical characters, not physical bytes. For how many bytes a string encoded as UTF-8 would take up, use "length(Encode::encode_utf8(EXPR))" (you'll have to "use Encode" first). See Encode and perlunicode.

Is this what I want?

Re: Best Way to Get Length of UTF-8 String in Bytes?
by ikegami (Pope) on Apr 23, 2011 at 22:45 UTC
    Yes. To find the number of bytes text would take encoded as UTF-8, encode it using UTF-8, then use length.
    use charnames qw( :full );
    use feature   qw( say );
    use Encode    qw( encode_utf8 );

    my $text = "\N{LATIN SMALL LETTER E WITH ACUTE}";
    say length $text;               # 1

    my $utf8 = encode_utf8($text);
    say length $utf8;               # 2

      Thank you, ikegami.

      Here's what I had tried before posting my inquiry:

      #!perl
      
      use strict;
      use warnings;
      use open qw( :utf8 :std );
      use utf8;
      
      # 'China' in Simplified Chinese
      #          中        国
      # Unicode  U+4E2D    U+56FD
      # UTF-8    E4 B8 AD  E5 9B BD
      
      my $text = '中国';
      my $length_in_characters = length $text;
      print "Length of text '$text' in characters is $length_in_characters\n";
      
      {
          use bytes;
          my $length_in_bytes = length $text;
          print "Length of text '$text' in bytes is $length_in_bytes\n";
      }
      
      {
          require Encode;
          my $bytes = Encode::encode_utf8($text);
          my $length_in_bytes = length $bytes;
          print "Length of text '$bytes' in bytes is $length_in_bytes\n";
      }
      

      And here's its output:

      Length of text '中国' in characters is 2
      Length of text 'ä¸­å›½' in bytes is 6
      Length of text 'ä¸­å›½' in bytes is 6
      

      (I couldn't use <code> tags here due to the Chinese characters in both the script and its output.)

      Jim

        Are you trying to suggest you could use bytes? That would be incorrect. bytes does not give UTF-8, it gives the internal storage format of the string. That may be utf8 (similar to UTF-8) or just bytes. Here's an example of it giving the incorrect answer:

        #!perl

        use strict;
        use warnings;
        use open qw( :encoding(cp437) :std );
        use utf8;

        my $text = chr(0xC9);
        my $length_in_characters = length $text;
        print "Length of text '$text' in characters is $length_in_characters\n";

        {
            use bytes;
            my $length_in_bytes = length $text;
            print "Length of text '$text' in bytes is $length_in_bytes\n";
        }

        {
            require Encode;
            my $bytes = Encode::encode_utf8($text);
            my $length_in_bytes = length $bytes;
            print "Length of text '$bytes' in bytes is $length_in_bytes\n";
        }
        Length of text 'É' in characters is 1
        Length of text 'É' in bytes is 1
        "\x{00c3}" does not map to cp437 at a.pl line 22.
        "\x{0089}" does not map to cp437 at a.pl line 22.
        Length of text '\x{00c3}\x{0089}' in bytes is 2
Re: Best Way to Get Length of UTF-8 String in Bytes?
by tchrist (Pilgrim) on Apr 23, 2011 at 23:11 UTC
    Is this what I want?

    Probably. The problem is that it depends on one important thing:

    Whatever in the world do you want it for, anyway?

    I cannot ever remember needing it myself: dealing with low-level bytes instead of logical characters is nearly always the wrong way to go about matters.

    It's quite possible that there might be a better approach you just don't know about.

      The problem is that it depends on one important thing: Whatever in the world do you want it for, anyway?

      To fit UTF-8 text into a column in a database management system that does not quantify the size of text in characters, but in bytes. The database management system is not relational, not conventional, and probably not one you've ever heard of.

      It's quite possible that there might be a better approach you just don't know about.

      Very possible. So if I have a VARCHAR column limit of 32,767 bytes, not characters, how do I trim a UTF-8 string to ensure I don't wrongly try to put more than 32,767 bytes worth of it into a column?

      Thank you for your reply. I appreciate it.

      Jim

        To fit UTF-8 text into a column in a database management system that does not quantify the size of text in characters, but in bytes. The database management system is not relational, not conventional, and probably not one you've ever heard of.
        I feel lucky that all my database experiences in recent memory have involved ones that had no fixed limits on any of their sizes. One still had to encode/decode to UTF-8, but I didn't have your particular problem.
        It's quite possible that there might be a better approach you just don't know about.

        Very possible. So if I have a VARCHAR column limit of 32,767 bytes, not characters, how do I trim a UTF-8 string to ensure I don't wrongly try to put more than 32,767 bytes worth of it into a column?

        Well, Jim, that's quite a pickle. I think I'm going to renege on my advice in perlfunc. If you encode to UTF-8 bytes, then you won't know whether, and most especially where, to truncate your string, because you've lost the character information. And you really have to have character information, plus more, too.

        It is vaguely possible that you might be able to arrange something with the \C regex escape for an octet, somehow combining a bytewise assertion of \A\C{0,32767} with one that fits a charwise \A.* or better yet a grapheme-wise ^\X* within that.

        But that isn't the approach I finally ended up using, because that sounded too complicated and messy. I decided to do something really simple. My "Pac-Man®" algorithm is simple: chop until short enough. More specifically, remove the last grapheme until the string has fewer than your maximum bytes.

        You can do a bit better than blind naïveté by realizing that even at the maximum efficiency of one byte per character (pure ASCII), if the actual character length is more than the maximum allowed byte length, you can pre-truncate to that many characters. That way you don't come slowly pac-manning back from 100k strings.

        There are a few things complicating your life. Just as you do not wish to chop off a byte in the middle of a character, neither do you want to chop off a character in the middle of a grapheme. You don't want "\cM\cJ" to get split if that's in your data, and you very most especially do not wish to lose a Grapheme_Extend code point like a diacritic or an underline/overline off of its Grapheme_Base code point.

        What I ended up doing, therefore, was this:

        require bytes;

        if (length($s) > $MAX) {
            substr($s, $MAX) = "";
        }
        $s =~ s/\X\z// until $MAX > bytes::length($s);
        That assumes that the strings are Unicode strings with their UTF-8 flags on. I have done the wickedness of using the bytes namespace, which breaks the encapsulation of abstract characters: I am relying on knowing that the internal byte length is in UTF-8. If that changes (and there is no guarantee at all that it will not do so someday), then this code will break.

        Also, it is critical that it be required, not used. You do not want byte semantics for your operations; you just want to be able to get a bytes::length on its own.
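        For contrast, here is a minimal sketch of that distinction: `require bytes` only makes `bytes::length` callable, while `use bytes` would silently make plain `length` count bytes for the rest of the lexical scope.

```perl
#!perl
use strict;
use warnings;

my $s = "\x{4E2D}\x{56FD}";      # "China": two characters, stored internally as UTF-8

require bytes;                   # loads bytes.pm but does NOT change length()
print length($s), "\n";          # 2  -- still logical characters
print bytes::length($s), "\n";   # 6  -- explicit byte view only when asked for
```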

        I haven't benchmarked this against doing it the "pure" way with a bunch of calls to encode_utf8. You might want to do that.
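        For reference, the "pure" variant would look something like the following sketch (`shorten_pure` is an illustrative name, not from the thread); it re-encodes on each pass instead of peeking at the internal representation, so it works regardless of how the string is stored:

```perl
use strict;
use warnings;
use Encode qw( encode_utf8 );

# Same Pac-Man loop, but measured with encode_utf8 rather than bytes::length.
sub shorten_pure {
    my ($s, $max) = @_;
    # Pre-truncate: a character is at least one byte, so more than $max
    # characters can never fit in $max bytes.
    substr($s, $max) = "" if length($s) > $max;
    # Chop the trailing grapheme until the UTF-8 encoding fits.
    $s =~ s/\X\z// while length(encode_utf8($s)) > $max;
    return $s;
}
```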

        But what I have done is run the algorithm against a bunch of strings: some in NFD form, some in NFC; some French and Chinese; some with multiple combining characters; some even with very fat graphemes from up in the SMP with their marks (math letters with overlines). I ran it with MAX == 25 bytes, but I see no reason why it shouldn't work set to your own 32,767. Here are some examples of the traces:

        String was <NFD: tête‐à‐tête tête‐à‐tête>
                start string has graphlen 28, charlen 34, bytelen 48
                CHARLEN 34 > 25, truncating to 25 CHARS
                bytelen 33 still too long, chopping last grapheme
                deleted grapheme <e> U+0065, charlen -1, bytelen -1
                bytelen 32 still too long, chopping last grapheme
                deleted grapheme <t> U+0074, charlen -1, bytelen -1
                bytelen 31 still too long, chopping last grapheme
                deleted grapheme <ê> U+0065.0302, charlen -2, bytelen -3
                bytelen 28 still too long, chopping last grapheme
                deleted grapheme <t> U+0074, charlen -1, bytelen -1
                bytelen 27 still too long, chopping last grapheme
                deleted grapheme < > U+0020, charlen -1, bytelen -1
                bytelen 26 still too long, chopping last grapheme
                deleted grapheme <e> U+0065, charlen -1, bytelen -1
                final string has graphlen 15, charlen 18, bytelen 25
        Trunc'd is <NFD: tête‐à‐têt>
        
        String was <NFD 蓝 lán and 绿 lǜ>
                start string has graphlen 18, charlen 21, bytelen 28
                bytelen 28 still too long, chopping last grapheme
                deleted grapheme <ǜ> U+0075.0308.0300, charlen -3, bytelen -5
                final string has graphlen 17, charlen 18, bytelen 23
        Trunc'd is <NFD 蓝 lán and 绿 l>
        
        String was <Chinese: 青天,白日,满地红>
                start string has graphlen 18, charlen 18, bytelen 36
                bytelen 36 still too long, chopping last grapheme
                deleted grapheme <红> U+7EA2, charlen -1, bytelen -3
                bytelen 33 still too long, chopping last grapheme
                deleted grapheme <地> U+5730, charlen -1, bytelen -3
                bytelen 30 still too long, chopping last grapheme
                deleted grapheme <满> U+6EE1, charlen -1, bytelen -3
                bytelen 27 still too long, chopping last grapheme
                deleted grapheme <,> U+FF0C, charlen -1, bytelen -3
                final string has graphlen 14, charlen 14, bytelen 24
        Trunc'd is <Chinese: 青天,白日>
        
        String was <NFD: hã̂ç̌k hã̂ç̌k hẫç̌k hẫç̌k>
                start string has graphlen 24, charlen 35, bytelen 51
                CHARLEN 35 > 25, truncating to 25 CHARS
                bytelen 37 still too long, chopping last grapheme
                deleted grapheme <ç̌> U+00E7.030C, charlen -2, bytelen -4
                bytelen 33 still too long, chopping last grapheme
                deleted grapheme <ẫ> U+1EAB, charlen -1, bytelen -3
                bytelen 30 still too long, chopping last grapheme
                deleted grapheme <h> U+0068, charlen -1, bytelen -1
                bytelen 29 still too long, chopping last grapheme
                deleted grapheme < > U+0020, charlen -1, bytelen -1
                bytelen 28 still too long, chopping last grapheme
                deleted grapheme <k> U+006B, charlen -1, bytelen -1
                bytelen 27 still too long, chopping last grapheme
                deleted grapheme <ç̌> U+0063.0327.030C, charlen -3, bytelen -5
                final string has graphlen 12, charlen 16, bytelen 22
        Trunc'd is <NFD: hã̂ç̌k hã̂>
        
        String was <𝐂̅ = sqrt[𝐀̅² + 𝐁̅²]>
                start string has graphlen 17, charlen 20, bytelen 34
                bytelen 34 still too long, chopping last grapheme
                deleted grapheme <]> U+005D, charlen -1, bytelen -1
                bytelen 33 still too long, chopping last grapheme
                deleted grapheme <²> U+00B2, charlen -1, bytelen -2
                bytelen 31 still too long, chopping last grapheme
                deleted grapheme <𝐁̅> U+1D401.0305, charlen -2, bytelen -6
                final string has graphlen 14, charlen 16, bytelen 25
        Trunc'd is <𝐂̅ = sqrt[𝐀̅² + >
        Here's the complete program. I've uniquoted the strings, so the program itself is actually in pure ASCII, which means I can put it in <code> tags here instead of messing around with icky <pre> and weird escapes. You can download it easily enough if you want, play with the numbers and all, but the heart of the algorithm is just the one-liner that throws out the last grapheme and checks the byte length. You'll see that I've left the debugging in.
        #!/usr/bin/env perl

        use 5.12.0;
        use strict;
        use autodie;
        use warnings;
        use utf8;
        use open      qw< :std :utf8 >;
        use charnames qw< :full >;

        require bytes;

        my $MAX_BYTES = 25;
        my ($MIN_BPC, $MAX_BPC) = (1, 4);
        my $MAX_CHARS = $MAX_BYTES / $MIN_BPC;

        sub bytelen(_) {
            require bytes;
            return bytes::length($_[0]);
        }

        sub graphlen(_) {
            my $count = 0;
            $count++ while $_[0] =~ /\X/g;
            return $count;
        }

        sub charlen(_) {
            return length($_[0]);
        }

        sub shorten(_) {
            my $s = $_[0];
            printf "\tstart string has graphlen %d, charlen %d, bytelen %d\n",
                graphlen($s), charlen($s), bytelen($s);
            if (charlen($s) > $MAX_CHARS) {
                printf "\tCHARLEN %d > %d, truncating to %d CHARS\n",
                    length($s), $MAX_BYTES, $MAX_CHARS;
                substr($s, $MAX_CHARS) = "";
            }
            while (bytelen($s) > $MAX_BYTES) {
                printf "\tbytelen %d still too long, chopping last grapheme\n",
                    bytes::length($s);
                $s =~ s/(\X)\z//;
                printf "\tdeleted grapheme <%s> U+%v04X, charlen -%d, bytelen -%d\n",
                    $1, $1, length($1), bytes::length($1);
            }
            printf "\tfinal string has graphlen %d, charlen %d, bytelen %d\n",
                graphlen($s), charlen($s), bytelen($s);
            return $s;
        }

        my @strings = (
            "this lines starts a bit too long",
            "NFC: cr\N{LATIN SMALL LETTER E WITH GRAVE}me br\N{LATIN SMALL LETTER U WITH CIRCUMFLEX}l\N{LATIN SMALL LETTER E WITH ACUTE}e et cr\N{LATIN SMALL LETTER E WITH GRAVE}me br\N{LATIN SMALL LETTER U WITH CIRCUMFLEX}l\N{LATIN SMALL LETTER E WITH ACUTE}e",
            "NFC: t\N{LATIN SMALL LETTER E WITH CIRCUMFLEX}te\N{HYPHEN}\N{LATIN SMALL LETTER A WITH GRAVE}\N{HYPHEN}t\N{LATIN SMALL LETTER E WITH CIRCUMFLEX}te t\N{LATIN SMALL LETTER E WITH CIRCUMFLEX}te\N{HYPHEN}\N{LATIN SMALL LETTER A WITH GRAVE}\N{HYPHEN}t\N{LATIN SMALL LETTER E WITH CIRCUMFLEX}te",
            "NFD: cre\N{COMBINING GRAVE ACCENT}me bru\N{COMBINING CIRCUMFLEX ACCENT}le\N{COMBINING ACUTE ACCENT}e et cre\N{COMBINING GRAVE ACCENT}me bru\N{COMBINING CIRCUMFLEX ACCENT}le\N{COMBINING ACUTE ACCENT}e",
            "NFD: te\N{COMBINING CIRCUMFLEX ACCENT}te\N{HYPHEN}a\N{COMBINING GRAVE ACCENT}\N{HYPHEN}te\N{COMBINING CIRCUMFLEX ACCENT}te te\N{COMBINING CIRCUMFLEX ACCENT}te\N{HYPHEN}a\N{COMBINING GRAVE ACCENT}\N{HYPHEN}te\N{COMBINING CIRCUMFLEX ACCENT}te",
            "NFC \N{U+84DD} l\N{LATIN SMALL LETTER A WITH ACUTE}n and \N{U+7EFF} l\N{LATIN SMALL LETTER U WITH DIAERESIS AND GRAVE}",
            "NFD \N{U+84DD} la\N{COMBINING ACUTE ACCENT}n and \N{U+7EFF} lu\N{COMBINING DIAERESIS}\N{COMBINING GRAVE ACCENT}",
            "XXX NFC q\N{LATIN SMALL LETTER I WITH MACRON}ng ti\N{LATIN SMALL LETTER A WITH MACRON}n, b\N{LATIN SMALL LETTER A WITH ACUTE}i r\N{LATIN SMALL LETTER I WITH GRAVE}, m\N{LATIN SMALL LETTER A WITH CARON}n d\N{LATIN SMALL LETTER I WITH GRAVE} h\N{LATIN SMALL LETTER O WITH ACUTE}ng",
            "XXX NFD qi\N{COMBINING MACRON}ng tia\N{COMBINING MACRON}n, ba\N{COMBINING ACUTE ACCENT}i ri\N{COMBINING GRAVE ACCENT}, ma\N{COMBINING CARON}n di\N{COMBINING GRAVE ACCENT} ho\N{COMBINING ACUTE ACCENT}ng",
            "Chinese: \N{U+9752}\N{U+5929}\N{FULLWIDTH COMMA}\N{U+767D}\N{U+65E5}\N{FULLWIDTH COMMA}\N{U+6EE1}\N{U+5730}\N{U+7EA2}",
            "normal \N{FULLWIDTH LATIN SMALL LETTER W}\N{FULLWIDTH LATIN SMALL LETTER I}\N{FULLWIDTH LATIN SMALL LETTER D}\N{FULLWIDTH LATIN SMALL LETTER E} normal \N{FULLWIDTH LATIN SMALL LETTER W}\N{FULLWIDTH LATIN SMALL LETTER I}\N{FULLWIDTH LATIN SMALL LETTER D}\N{FULLWIDTH LATIN SMALL LETTER E}",
            "NFC: h\N{LATIN SMALL LETTER A WITH TILDE}\N{COMBINING CIRCUMFLEX ACCENT}\N{LATIN SMALL LETTER C WITH CEDILLA}\N{COMBINING CARON}k ha\N{COMBINING TILDE}\N{COMBINING CIRCUMFLEX ACCENT}c\N{COMBINING CEDILLA}\N{COMBINING CARON}k h\N{LATIN SMALL LETTER A WITH CIRCUMFLEX AND TILDE}\N{LATIN SMALL LETTER C WITH CEDILLA}\N{COMBINING CARON}k ha\N{COMBINING CIRCUMFLEX ACCENT}\N{COMBINING TILDE}c\N{COMBINING CEDILLA}\N{COMBINING CARON}k",
            "NFD: ha\N{COMBINING TILDE}\N{COMBINING CIRCUMFLEX ACCENT}c\N{COMBINING CEDILLA}\N{COMBINING CARON}k ha\N{COMBINING TILDE}\N{COMBINING CIRCUMFLEX ACCENT}c\N{COMBINING CEDILLA}\N{COMBINING CARON}k ha\N{COMBINING CIRCUMFLEX ACCENT}\N{COMBINING TILDE}c\N{COMBINING CEDILLA}\N{COMBINING CARON}k ha\N{COMBINING CIRCUMFLEX ACCENT}\N{COMBINING TILDE}c\N{COMBINING CEDILLA}\N{COMBINING CARON}k",
            "\N{MATHEMATICAL BOLD CAPITAL C}\N{COMBINING OVERLINE} = sqrt[\N{MATHEMATICAL BOLD CAPITAL A}\N{COMBINING OVERLINE}\N{SUPERSCRIPT TWO} + \N{MATHEMATICAL BOLD CAPITAL B}\N{COMBINING OVERLINE}\N{SUPERSCRIPT TWO}]",
            "4\N{FRACTION SLASH}3\N{INVISIBLE TIMES}\N{GREEK SMALL LETTER PI}\N{INVISIBLE TIMES}r\N{SUPERSCRIPT THREE} 4\N{FRACTION SLASH}3\N{INVISIBLE TIMES}\N{GREEK SMALL LETTER PI}\N{INVISIBLE TIMES}r\N{SUPERSCRIPT THREE} 4\N{FRACTION SLASH}3\N{INVISIBLE TIMES}\N{GREEK SMALL LETTER PI}\N{INVISIBLE TIMES}r\N{SUPERSCRIPT THREE} 4\N{FRACTION SLASH}3\N{INVISIBLE TIMES}\N{GREEK SMALL LETTER PI}\N{INVISIBLE TIMES}r\N{SUPERSCRIPT THREE}",
        );

        printf "MAX byte length is %d\n\n", $MAX_BYTES;

        for my $line (@strings) {
            chomp $line;
            say "String was <$line>";
            my $trunk = shorten($line);
            say "Trunc'd is <$trunk>\n";
        }

        exit 0;
        There are other ways to go about this, but this seemed to work well enough. Hope it helps.

        Oh, BTW, if you really want to do print-columns instead of graphemes, look to the Unicode::GCString module; it comes with Unicode::LineBreak. Both are highly recommended. I use them in my unifmt program to do intelligent linebreaking of Asian text per UAX#14.

        It is an intentional property of the UTF-8 encoding that, although it is variable length, you can easily figure out whether you're in the middle of a character and where whole characters begin. Continuation bytes always start with the bits 10xxxxxx. Single-byte characters always have a high bit of 0 (0xxxxxxx), and multi-byte characters always start with a byte that has as many leading 1 bits as there are bytes total: 110xxxxx for two bytes, 1110xxxx for three bytes, etc.

        So, start at position N of the UTF-8 encoded byte string, where N is the maximum length. While the byte at position N is a continuation byte, decrement N. Now you can truncate to length N.

        To prevent clipping the accents off a base character or something like that, you can furthermore look at the whole character beginning at N. Check the Unicode properties to see if it's a modifier or something. If it is, decrement N again and repeat.
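        The byte-backup step above might be sketched like this in Perl (`truncate_utf8_bytes` is an illustrative name; it handles only the character-boundary part, not the grapheme/modifier refinement):

```perl
use strict;
use warnings;
use Encode qw( encode_utf8 decode_utf8 );

# Truncate $text so its UTF-8 encoding fits in $max bytes,
# never cutting in the middle of a multi-byte character.
sub truncate_utf8_bytes {
    my ($text, $max) = @_;
    my $bytes = encode_utf8($text);
    return $text if length($bytes) <= $max;
    my $n = $max;
    # Back up while the byte at position $n is a continuation byte (10xxxxxx),
    # so the cut lands on the start of a whole character.
    $n-- while $n > 0 && (ord(substr($bytes, $n, 1)) & 0xC0) == 0x80;
    return decode_utf8(substr($bytes, 0, $n));
}
```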

      Whatever in the world do you want it for, anyway?

      Obtaining knowledge of the storage requirements for a piece of data does not seem such an unusual requirement to me. Whether it is for sizing a buffer for interfacing to a C (or other language) library; or for length-prefixing a packet for a transmission protocol; or for indexing a file; or any of a dozen other legitimate uses.

      Indeed, given that this information is readily & trivially available to Perl:

      #! perl -slw
      use strict;
      use Devel::Peek;
      use charnames qw( :full );

      my $text = "\N{LATIN SMALL LETTER E WITH ACUTE}";
      print length $text;
      Dump $text;

      __END__
      C:\test>junk.pl
      1
      SV = PVMG(0x27fea8) at 0x237ab8
        REFCNT = 1
        FLAGS = (PADMY,SMG,POK,pPOK,UTF8)
        IV = 0
        NV = 0
        PV = 0x3c78e8 "\303\251"\0 [UTF8 "\x{e9}"]
        CUR = 2
        LEN = 8
        MAGIC = 0x23c648
          MG_VIRTUAL = &PL_vtbl_utf8
          MG_TYPE = PERL_MAGIC_utf8(w)
          MG_LEN = 1

      the absence of a simple built-in mechanism for obtaining it seems both remiss and arbitrary.

      But then, this is just another in a long list of reasons why the whole Unicode thing should be dumped in favour of a standard that throws away all the accumulated cruft of transitional standards and yesteryear's physical and financial restrictions.

      Given the cheapness of today's RAM, variable-length encodings make no sense given the restrictions and overheads they impose. And any 'standard' that means that it is impossible to tell what a piece of data actually represents without reference to some external metadata is an equal nonsense.

      With luck, the current mess will be consigned to the bitbucket of history along with all the other evolutionary dead ends like 6-bit bytes and 36-bit words.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        But then, this is just another in a long list of reasons why the whole Unicode thing should be dumped in favour of a standard that throws away all the accumulated cruft of transitional standards and yesteryear's physical and financial restrictions.

        Given the cheapness of today's RAM, variable-length encodings make no sense given the restrictions and overheads they impose. And any 'standard' that means that it is impossible to tell what a piece of data actually represents without reference to some external metadata is an equal nonsense.

        With luck, the current mess will be consigned to the bitbucket of history along with all the other evolutionary dead ends like 6-bit bytes and 36-bit words.

        Yes and no.

        The no parts are that you seem to have confused UTF-8 with Unicode. Unicode is here to stay, and does not share in UTF-8's flaws. But realistically, you are simply never going to get rid of UTF-8 as a transfer format. Do you truly think that people are going to put up with the near-quadrupling in space that the gigabytes and gigabytes of large corpora would require if they were stored or transferred as UTF-32? That will never happen.

        The yes part is that I agree that int is the new char. No character data type of fixed width should ever be smaller than the number of bits needed to store any and all possible Unicode code points. Because Unicode is a 21‑bit charset, that means you need 32‑bit characters.

        It also means that everyone who jumped on the broken UCS‑2 or UTF‑16 bandwagon is paying a really wicked price, since UTF‑16 has all the disadvantages of UTF‑8 but none of its advantages.

        At least Perl didn't make that particular brain-damaged mistake! It could have been much worse. UTF-8 is now the de facto standard, and I am very glad that Perl didn't do the stupid thing that Java and so many others did: just try matching non-BMP code points in character classes, for example. Can't do it in the UTF-16 languages. Oops! :(

Node Type: perlquestion [id://900994]
Approved by toolic
Front-paged by Corion