char2oem

by Anonymous Monk
on Sep 15, 2001 at 22:28 UTC (#112645=perlquestion)

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I'm looking for a Win-API-like function to convert a string (i.e. one with ANSI chars) into OEM chars. I need this because I've got a Linux database full of these ANSI chars and I want to convert them for a DOS program, so the charsets don't match. Any suggestions? Thanks

Replies are listed 'Best First'.
Re: char2oem
by Anonymous Monk on Sep 16, 2001 at 04:33 UTC
    Hello, I'm looking for a Win-API-like function to convert a string (i.e. one with ANSI chars) into OEM chars.
    If your conversion program is running on Windows, you can get at the AnsiToOem Win32 API function (named CharToOem in current Win32 headers) from Perl via the Win32::API module. However, a more portable (and perhaps even simpler, considering the complexity of using Win32::API) solution is to write a function that converts CP1252 (WinLatin1) to CP437 (DOSLatinUS). To do this, download the text files for those two codepages from Czyborra and write a program like this:
    use strict;
    use warnings;

    my $from_name = "cp1252.txt";
    my $to_name   = "cp437.txt";

    # Load a codepage file into an array: byte value => Unicode code point
    sub load_cp {
        my ($filename) = @_;
        my @map;
        open(my $cp, '<', $filename) or die "load_cp: $filename: $!";
        while (<$cp>) {
            # Expect lines of the form =XX<TAB>U+XXXX; skip anything else
            my ($byte, $unicode) = m/^=(..)\tU[+](....)/ or next;
            $map[hex $byte] = hex $unicode;
        }
        close $cp;
        return \@map;
    }

    # Map text from codepage $from to codepage $to.
    # Characters with no mapping are dropped with a warning.
    sub map_cp {
        my ($from, $to, $text) = @_;
        my $new_text = '';

        # Invert the "to" table: Unicode code point => byte value
        my %to_byte;
        foreach my $byte (0 .. $#{$to}) {
            $to_byte{ $to->[$byte] } = $byte if defined $to->[$byte];
        }

        foreach my $char (split //, $text) {
            # First map the codepage "$from" byte to a Unicode code point
            my $unicode = $from->[ord $char];
            if (!defined $unicode) {
                warn sprintf "no %s entry for byte 0x%02X\n", $from_name, ord $char;
                next;
            }
            # Now map the Unicode code point to a codepage "$to" byte
            my $byte = $to_byte{$unicode};
            if (!defined $byte) {
                warn sprintf "no %s char for U+%04X\n", $to_name, $unicode;
                next;
            }
            $new_text .= chr $byte;
        }
        return $new_text;
    }

    # Load the "from" and "to" codepages
    my $from_map = load_cp($from_name);
    my $to_map   = load_cp($to_name);

    # Replace \x80 with your text
    print map_cp($from_map, $to_map, "\x80");
    Note that CP1252 has characters that CP437 doesn't have, and vice versa. You may want to replace all your ANSI character codes with UTF-8 Unicode characters. It makes things much easier, and then you could use the cp437.txt from Czyborra to generate text capable of being viewed in a DOS environment with the CP437 codepage.
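    As a rough sketch of the "go through Unicode" idea above: the Encode module (bundled with Perl since 5.8, so newer than this thread) already ships CP1252 and CP437 tables. Encode is not something the post names, so treat it as an assumption; the sub names ansi_to_oem and ansi_to_utf8 and the $fallback parameter are made up for illustration. Characters that CP437 lacks are replaced by the fallback string.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Convert CP1252 ("ANSI") bytes to CP437 ("OEM") bytes by way of Unicode.
# $fallback (hypothetical parameter) is emitted for every character that
# CP437 cannot represent; it defaults to "?".
sub ansi_to_oem {
    my ($bytes, $fallback) = @_;
    $fallback = '?' unless defined $fallback;
    my $chars = decode('cp1252', $bytes);    # bytes => Unicode characters
    # A code reference as the third argument supplies the replacement
    # for each unmappable character.
    return encode('cp437', $chars, sub { $fallback });
}

# Re-encode CP1252 bytes as UTF-8, e.g. to keep a Unicode master copy.
sub ansi_to_utf8 {
    my ($bytes) = @_;
    return encode('UTF-8', decode('cp1252', $bytes));
}

# Show the converted bytes in hex: e-acute is 0xE9 in CP1252, 0x82 in CP437
my $oem = ansi_to_oem("caf\xE9");
print join(' ', map { sprintf '%02X', ord } split //, $oem), "\n";
# prints "63 61 66 82"
```

    With the default fallback, a CP1252 euro sign (\x80) comes out as "?" because CP437 has no euro sign.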
Re: char2oem
by RhetTbull (Curate) on Sep 16, 2001 at 07:34 UTC
    The solution above recommending code pages and unicode sounds robust and I would explore using a similar solution. However, just for grins, here is a poor man's version that might work for you. I just created a mapping of the DOS extended characters to the Windows ASCII extended characters and use that mapping to replace the appropriate characters. It's not overly robust and I'm not 100% sure it's correct (since I did the mapping myself by looking at DOS and ASCII character charts). Your mileage may vary.

    Update: Note that there is not a 1 to 1 mapping between the DOS and ANSI code pages. I mostly just mapped the accented letters and a couple of symbols (such as "cents" etc.) For most text (including Latin foreign languages) this mapping should work fairly well. However, it's not a very robust solution and not very pretty code so if you need to do a lot of this I would recommend one of the other solutions suggested in this thread.

    Update #2: In case it's not clear, the hash %asc2dos maps the ANSI (e.g. Windows) ASCII value to the equivalent DOS ASCII value. I then reverse the hash so %dos2asc contains the mapping from DOS back to Windows. As an aside, does anyone have any suggestions for a better or more idiomatic way to reverse the hash (i.e. use the keys as values and vice versa) than what I did below?
    #!/usr/local/bin/perl
    use strict;
    use warnings;

    #mapping of ASCII to DOS
    my %asc2dos = (
        131,159, 149,250, 150,196, 161,173, 162,155, 163,156,
        165,157, 170,166, 171,174, 172,170, 176,248, 177,241,
        178,253, 183,249, 186,167, 187,175, 188,172, 189,171,
        191,168, 196,142, 197,143, 198,146, 199,128, 201,144,
        209,165, 214,153, 220,154, 223,225, 224,133, 225,160,
        226,131, 228,132, 229,134, 230,145, 231,135, 232,138,
        233,130, 234,136, 235,137, 236,141, 237,161, 238,140,
        239,139, 241,164, 242,149, 243,162, 244,147, 246,148,
        247,246, 249,151, 250,163, 251,150, 252,129, 255,152,
    );

    #create the reverse mapping (DOS to ASCII)
    my %dos2asc;
    foreach my $key (sort keys %asc2dos) {
        $dos2asc{$asc2dos{$key}} = $key;
    }

    #here's a test:
    #create a string with some accented characters
    my $string = pack("C10",223,224,225,232,231,236,237,241,243,244);
    print "ASCII string = $string\n";
    $string = asc2dos($string);
    print "DOS string = $string\n";
    $string = dos2asc($string);
    print "ASCII string = $string\n";

    #convert ASCII extended characters to DOS extended characters
    sub asc2dos {
        my $str = shift;
        foreach my $i (0..length($str)-1) {
            my $val = ord substr($str,$i,1);
            #parentheses matter: chr binds tighter than ||, so without
            #them unmapped characters would turn into chr(0)
            substr($str,$i,1) = chr($asc2dos{$val} || $val);
        }
        return $str;
    }

    #convert DOS extended characters to ASCII characters
    sub dos2asc {
        my $str = shift;
        foreach my $i (0..length($str)-1) {
            my $val = ord substr($str,$i,1);
            substr($str,$i,1) = chr($dos2asc{$val} || $val);
        }
        return $str;
    }
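    On the aside about a more idiomatic way to invert the hash: one common idiom is Perl's built-in reverse applied to the hash in list context. The sketch below uses only a three-pair excerpt of the table so it stays short; the duplicate check is an extra precaution, not something from the post.

```perl
use strict;
use warnings;

# Excerpt of the ANSI => DOS table from the post (three of its pairs)
my %asc2dos = (228 => 132, 233 => 130, 252 => 129);

# In list context a hash flattens to (key, value, key, value, ...);
# reverse turns that into (value, key, ...), which builds the inverse hash.
my %dos2asc = reverse %asc2dos;

# Inverting is only lossless if no DOS value occurs twice
warn "duplicate values, reverse mapping is lossy\n"
    if keys(%dos2asc) != keys(%asc2dos);

print "$_ => $dos2asc{$_}\n" for sort { $a <=> $b } keys %dos2asc;
```

    If two ANSI codes ever mapped to the same DOS code, only one of them would survive the inversion, which is why the key counts are compared.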
Re: char2oem
by John M. Dlugosz (Monsignor) on Sep 16, 2001 at 08:53 UTC
    I've got a Linux database full of these ANSI chars and I want to convert them for a DOS program

    You might also just change the code page used in the Console window to match what the program's generating.

    That's SetConsoleCP and SetConsoleOutputCP in the Win32 API.

    —John

Re: char2oem
by John M. Dlugosz (Monsignor) on Sep 16, 2001 at 08:50 UTC
    Aren't there already modules that do all that? I recall seeing mentions of them on XML-related questions.
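    One candidate, offered as an assumption since the reply names no module: PerlIO's :encoding() layer, backed by the Encode module that has shipped with Perl since 5.8. A minimal sketch that converts a whole file on the fly; the sub name convert_file and both file names are hypothetical.

```perl
use strict;
use warnings;

# Copy $src to $dst, decoding CP1252 on input and encoding CP437 on output.
# Both file names are supplied by the caller.
sub convert_file {
    my ($src, $dst) = @_;
    open(my $in,  '<:encoding(cp1252)', $src) or die "$src: $!";
    open(my $out, '>:encoding(cp437)',  $dst) or die "$dst: $!";
    print {$out} $_ while <$in>;
    close $in;
    close $out or die "$dst: $!";
}

# Usage: perl convert.pl ansi.txt oem.txt
convert_file(@ARGV) if @ARGV == 2;
```

    Characters that CP437 cannot represent are not silently dropped here; the output layer warns about them, so it is worth checking stderr after a run.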
