PerlMonks  

Re^5: passing data structures from java to perl

by exussum0 (Vicar)
on Jul 28, 2010 at 23:25 UTC ( #851823=note )


in reply to Re^4: passing data structures from java to perl
in thread passing data structures from java to perl

That's really really bad advice. You can get encoding errors like that on the Java side. Yes, the bytes in memory are all that matter, and you're free to interpret them however you like, but on the way in and out the data is being transformed.

It's the same thing as binmode. You're affecting the data as the I/O moves it into or out of memory. See...

http://www.docjar.com/docs/api/java/nio/charset/UnmappableCharacterException.html

I've run into this exact problem using the XML feeds for PerlMonks while working in Java.
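
To make the binmode comparison concrete, here is a minimal sketch of an encoding layer on a Java writer. The class and method names (`WriterLayerDemo`, `writeAsUtf8`) are made up for illustration, and `StandardCharsets` assumes Java 7 or later; this is not the code from the earlier posts in the thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class WriterLayerDemo {
    // Hypothetical helper: pushes chars through a UTF-8 encoding layer
    // and returns the bytes that actually reached the underlying stream.
    static byte[] writeAsUtf8(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            out.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory sink; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";            // 4 chars in memory
        byte[] bytes = writeAsUtf8(text);     // 5 bytes on the wire
        System.out.println(text.length() + " chars, " + bytes.length + " bytes");
    }
}
```

The in-memory string and the bytes written differ in length because the writer encodes on the way out, which is roughly what an I/O layer set with binmode does in Perl.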


Re^6: passing data structures from java to perl
by almut (Canon) on Jul 28, 2010 at 23:37 UTC

    Not sure what you're talking about. UTF-8 is a variable-width (multi-byte-if-needed) encoding that can encode the full Unicode character set, so why should there be encoding errors, or an UnmappableCharacterException? An UnmappableCharacterException is thrown if a certain character can't be represented in the specified target encoding, but as all Unicode characters can be encoded in UTF-8, this exception cannot occur.

    UnmappableCharacterExceptions may happen if you try to encode Unicode data to Latin-1, for example, but not with UTF-8.
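
    The difference between the two target encodings can be checked with a small sketch along these lines. The class and helper names (`EncodeDemo`, `tryEncode`) are invented for the example, and it assumes Java 7+ for `StandardCharsets`; the encoder is set to REPORT so that errors surface as exceptions instead of being silently replaced.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class EncodeDemo {
    // Hypothetical helper: returns "ok" if the text encodes cleanly,
    // otherwise the simple class name of the exception that was thrown.
    static String tryEncode(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            encoder.encode(CharBuffer.wrap(text));
            return "ok";
        } catch (CharacterCodingException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // e-acute exists in Latin-1; the euro sign and the emoji do not.
        String text = "caf\u00e9 \u20ac \uD83D\uDE00";
        System.out.println("UTF-8:   " + tryEncode(text, StandardCharsets.UTF_8));
        System.out.println("Latin-1: " + tryEncode(text, StandardCharsets.ISO_8859_1));
    }
}
```

    With well-formed input like this, the UTF-8 line reports "ok" and the Latin-1 line reports UnmappableCharacterException.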

      One instance of using the Java API incorrectly is probably fine. Sure, there are lots of clever things you can do, such as inserting the same two items into a HashSet and getting them back in the same order every single time you iterate, even though no order is guaranteed. (You're reaching the same state.)

      You can serialize a singleton, and deserialize it to get two copies.

      I can also assume 86400 seconds in a day, which is inaccurate twice a year, but most of the time it's fine.
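
      The two days a year are the daylight-saving transitions. A rough sketch, assuming Java 8+ (`java.time`) and picking America/New_York in 2010 purely as an example zone and year; `DayLengthDemo` and `secondsInDay` are hypothetical names:

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.ZoneId;

class DayLengthDemo {
    // Elapsed seconds between local midnight on `date` and the next local midnight.
    static long secondsInDay(LocalDate date, ZoneId zone) {
        return Duration.between(
                date.atStartOfDay(zone),
                date.plusDays(1).atStartOfDay(zone)).getSeconds();
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("America/New_York"); // example zone, an assumption
        System.out.println(secondsInDay(LocalDate.of(2010, 3, 14), zone)); // clocks go forward
        System.out.println(secondsInDay(LocalDate.of(2010, 7, 28), zone)); // ordinary day
        System.out.println(secondsInDay(LocalDate.of(2010, 11, 7), zone)); // clocks go back
    }
}
```

      Only the ordinary day comes out at 86400; the transition days are an hour shorter and an hour longer.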

      Sure, in Perl I can do... new Foo instead of Foo->new.

      Yeah, in the end it all may work out. In the I/O case, Java is validating the output as it's writing it, and if someone does Unicode->Latin-1 following that exact pattern, then even if it works in this instance, you're teaching people a bad habit that can yield errors if they rubber-stamp it all over the place.

      You seem set on doing this anyhow in your code. That's fine. Don't be surprised if other Java devs look at it and go... yeah, that looks wrong.

        And what has all this got to do with creating UTF-8 output? I was explicitly talking about UTF-8, not Latin-1.

        If you think there's something wrong with what I suggested, please provide a compilable, self-contained sample that demonstrates an UnmappableCharacterException when encoding a Unicode string to UTF-8. Otherwise this is plain FUD.

Re^6: passing data structures from java to perl
by Anonymous Monk on Jul 29, 2010 at 03:38 UTC
    That's really really bad advice. You can get encoding errors like that on the Java side.

    How do you figure? Please explain.
