PerlMonks  

Writing data into columns

by $new_guy (Acolyte)
on Aug 25, 2010 at 22:30 UTC (#857322)
$new_guy has asked for the wisdom of the Perl Monks concerning the following question:

Currently the file looks like this:

    0. 1 1 7 2 2 9 4 6 8 10 2
    1. 10 1 6 3 4 9
    2. 5 8 59 77 81 3
    3. 16 17 19 23 3

Desired output:

    0. 1 2 - 4 - 6 8 9 10 - - - - - -
    0. 1 2 - - - - - - - - - - - - -
    0. - 2 - - - - - - - - - - - - -
    0. - 2 - - - - - - - - - - - - -
    1. 1 - 3 4 - 6 - 9 10 - - - - - -
    2. - - 3 - 5 - 8 - - - - - -59 77 81
    3. - - 3 - - - - - - 17 19 23 - - -

PS: The numbers (e.g. 1 2 3 77 81) could have letters in them, e.g. 1s 2w1 3_f 77_gg 81_str7. There are about 700 columns and 4000 rows of indices.

PS: The first column is an index of the entries. The numbers/entries (e.g. 1 2 3 77 81) could have letters and hyphens in them, e.g. 1s 2w1 3_f 77_gg 81_str7.

Re: Writing data into columns
by psini (Deacon) on Aug 25, 2010 at 22:34 UTC
Re: Writing data into columns
by toolic (Chancellor) on Aug 25, 2010 at 22:39 UTC
    • Read perlintro
    • Write some code which tries to do what you want.
    • If you still have problems, post the code you have tried.
    • Clarify the description of your problem. I do not understand how you get that output from that input.
Re: Writing data into columns
by graff (Chancellor) on Aug 25, 2010 at 22:51 UTC
    To expand a bit on that last point raised by toolic above: I have reason to suspect that your "Desired output" might contain some typos (missing numbers, bad spacing, etc). But it would be unwise for me to guess at what the correct form should be, because I have no way of knowing that. I'm just reacting to some apparent discrepancies in your "data" that aren't explained in your "description".

    Also, when you say "... the numbers ... could have letters in them ... the numbers/entries ... could have letters and hyphens in them ...", it really is impossible for us to make any sense of this unless you show some relevant examples of actual input and (correct) desired output.

    Apart from that, you will need to show some perl code that you have tried.

Re: Writing data into columns
by davido (Archbishop) on Aug 26, 2010 at 01:51 UTC

    This feels like one of those IQ tests where you have to look at a sequence and figure out what comes next. Unlike those tests, the more I look at your desired output, the less I understand how it relates to the input.


    Dave

Re: Writing data into columns
by dasgar (Deacon) on Aug 26, 2010 at 02:52 UTC

    As others have alluded to, you haven't given enough information for others to understand what you're trying to do and/or where you're encountering problems. I'll venture a guess that you want formatted output. If I'm right, you'll want to check out sprintf. However, if you need more help than that, you've got to take the first step and provide more information.
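
To illustrate the sprintf suggestion, here is a minimal sketch, assuming the goal is fixed-width, right-aligned columns with "-" standing in for a missing entry. The helper name format_row and the sample values are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: right-align every cell in a field of $width
# characters and join the cells with single spaces.
sub format_row {
    my ($width, @cells) = @_;
    return join ' ', map { sprintf "%${width}s", $_ } @cells;
}

# Sample row with one missing entry shown as "-".
print format_row(3, 1, 2, '-', 10), "\n";
```

The width would normally be the length of the longest entry, so that every column lines up regardless of the data.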

Re: Writing data into columns
by JavaFan (Canon) on Aug 26, 2010 at 09:52 UTC
    Assuming your example contains some typos, you may want to go for:
    use 5.010;
    use strict;
    use warnings;
    use List::Util 'max';

    my (%data, %all, @keys);
    while (<DATA>) {
        chomp;
        my @chunks = split;
        my $key = shift @chunks;
        push @keys, $key unless exists $data{$key};
        foreach my $chunk (@chunks) {
            $data{$key}{$chunk}++;
            $all{$chunk} = 1;
        }
    }

    my @fields = sort {$a <=> $b} keys %all;
    my $l = max map {length} @fields;

    foreach my $key (@keys) {
        while (keys %{$data{$key}}) {
            print $key, " ";
            foreach my $field (@fields) {
                if ($data{$key}{$field}) {
                    printf "%${l}s ", $field;
                    delete $data{$key}{$field} unless --$data{$key}{$field};
                }
                else {
                    printf "%${l}s ", "-";
                }
            }
            print "\n";
        }
    }

    __DATA__
    0. 1 1 7 2 2 9 4 6 8 10 2
    1. 10 1 6 3 4 9
    2. 5 8 59 77 81 3
    3. 16 17 19 23 3

    The result:

    0.  1  2  -  4  -  6  7  8  9 10  -  -  -  -  -  -  -
    0.  1  2  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
    0.  -  2  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
    1.  1  -  3  4  -  6  -  -  9 10  -  -  -  -  -  -  -
    2.  -  -  3  -  5  -  -  8  -  -  -  -  -  - 59 77 81
    3.  -  -  3  -  -  -  -  -  -  - 16 17 19 23  -  -  -

    Changing the sorting of fields if you have non-numeric fields is left as an exercise to the reader.
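
One possible take on that exercise, as a sketch only: order the fields by their leading number and fall back to a string comparison for ties. It assumes entries start with digits, as in the examples from the question (1s, 2w1, 3_f, 77_gg, 81_str7); the helper name sort_fields is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: order fields by their leading number, then as strings.
# Assumes each field starts with digits (e.g. "3_f", "77_gg"); a field with
# no leading number is treated as if its number were 0.
sub sort_fields {
    my @fields = @_;
    my %num;
    for my $f (@fields) {
        $num{$f} = $f =~ /^(\d+)/ ? $1 : 0;
    }
    return sort { $num{$a} <=> $num{$b} or $a cmp $b } @fields;
}

print join(' ', sort_fields(qw(77_gg 3_f 1s 81_str7 2w1 10))), "\n";
```

Extracting the number once per field, rather than inside the comparison, keeps the sort cheap when there are hundreds of distinct fields.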
      Hi JavaFan, thanks a lot for the useful reply. I have run the script and it works well! I modified the script so that it reads its input from a file and writes its output to a .txt file (any improvements appreciated):
      use 5.010;
      use strict;
      use warnings;
      use List::Util 'max';

      if (scalar(@ARGV) != 1) {
          print "\n";
          print "Usage: monk.pl <input_file>\n";
          print "\n";
          exit();
      }

      # Read in the file
      my ($FILENAME) = @ARGV;
      open(DATA, $FILENAME);

      my (%data, %all, @keys);
      while (<DATA>) {
          chomp;
          my @chunks = split;
          my $key = shift @chunks;
          push @keys, $key unless exists $data{$key};
          foreach my $chunk (@chunks) {
              $data{$key}{$chunk}++;
              $all{$chunk} = 1;
          }
      }

      # now make a file for the output
      my $outputfile = "organised_columns.txt";
      if (! open(POS, ">>$outputfile") ) {
          print "Cannot open file \"$outputfile\" to write to!!\n\n";
          exit;
      }

      my @fields = sort {$a <=> $b} keys %all;
      #my @fields = sort {lc($a) cmp lc($b)} keys %all;
      my $l = max map {length} @fields;

      foreach my $key (@keys) {
          while (keys %{$data{$key}}) {
              print POS $key, " ";
              foreach my $field (@fields) {
                  if ($data{$key}{$field}) {
                      printf POS "%${l}s ", $field;
                      delete $data{$key}{$field} unless --$data{$key}{$field};
                  }
                  else {
                      printf POS "%${l}s ", "-";
                  }
              }
              print POS "\n";
          }
      }

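
Since improvements were requested, here is a hedged sketch of the file handling only: three-argument open, lexical filehandles, and an error check on both opens (the script above opens its input without checking the result). The helper copy_lines is hypothetical and simply copies lines, standing in for the real processing.

```perl
use strict;
use warnings;

# Hypothetical helper: copy lines from $in_name to $out_name, showing
# three-argument open with lexical filehandles and error checks.
sub copy_lines {
    my ($in_name, $out_name) = @_;
    open(my $in, '<', $in_name)
        or die "Cannot open '$in_name' for reading: $!\n";
    # '>' truncates the file on every run; '>>' would append to the
    # output left behind by earlier runs.
    open(my $out, '>', $out_name)
        or die "Cannot open '$out_name' for writing: $!\n";
    my $count = 0;
    while (my $line = <$in>) {
        print {$out} $line;
        $count++;
    }
    close($out) or die "Cannot close '$out_name': $!\n";
    close($in);
    return $count;
}

# Only runs when a file name is given on the command line.
copy_lines($ARGV[0], 'organised_columns.txt') if @ARGV == 1;
```

Note that the append mode (">>") used in the script above keeps adding to organised_columns.txt each time the script runs, so the file can end up holding the output of several runs.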
      The actual file I am working on looks like this:
      0 3850_1_12_00403 3850_1_12_01148 3850_1_12_01185 3850_1_12_01591 3850_1_12_01967 3850_1_12_01968 3850_1_2_00804 3850_1_2_01074 3850_1_3_00078 3850_1_3_01217 3850_1_3_01911 3850_1_3_02337 3850_2_11_00015 3850_2_11_00064 3850_2_11_00267 3850_2_11_00831 3850_2_11_01101 3850_2_11_01229 3850_2_11_02298 3850_2_12_00022 3850_2_12_00972 3850_2_12_00973 3850_2_12_01106 3850_2_12_01255 3850_2_12_01566 3850_2_12_01938 3850_2_1_00600 3850_2_1_00601 3850_2_1_00779 3850_2_1_00781 3850_2_1_01243 3850_2_1_01309 3850_2_1_01310 3850_2_1_01515 3850_2_1_01770 3850_2_2_00784 3850_2_2_00959 3850_2_2_01130 3850_2_2_01131 3850_3_10_00057 3850_3_10_00409 3850_3_10_01230 3850_3_10_01375 3850_3_12_00117 3850_3_12_00766 3850_3_12_01370 3850_3_12_01956 3850_3_1_00816 3850_3_1_00868 3850_3_1_01284 3850_3_1_01663 3850_3_1_01670 3850_3_1_01994 3850_3_1_02128 3850_3_5_01110 3850_3_5_01111 3850_3_5_01647 3850_3_6_00010 3850_3_6_00306 3850_3_6_01151 3850_3_6_01188 3850_3_6_01928 3850_3_7_00081 3850_3_7_00105 3850_3_7_00565 3850_3_7_01010 3850_3_7_01166 3850_3_8_00221 3850_3_8_01011 3850_3_8_01047 3850_3_8_01198 3850_3_9_00014 3850_3_9_00782 3850_3_9_00958 3850_3_9_01566 3850_3_9_01904 3850_6_12_00046 3850_6_12_01028 3850_6_12_01063 3850_6_12_01064 3850_6_2_01091 3850_6_2_01126 3850_6_2_01127 3850_6_2_01128 3850_6_2_01826 3850_6_3_00696 3850_6_3_01055 3850_6_3_02294 3850_6_4_00056 3850_6_4_00914 3850_6_5_00014 3850_6_5_01061 3850_6_6_00477 3850_6_6_00897 3850_6_6_01041 3850_6_7_00537 3850_6_7_00538 3850_6_7_01713 3850_7_10_01021 3850_7_10_01164 3850_7_10_01661 3850_7_11_00630 3850_7_11_01744 3850_7_12_00670 3850_7_12_01111 3850_7_12_01114 3850_7_12_01251 3850_7_12_01523 3850_7_1_00576 3850_7_1_00972 3850_7_1_01006 3850_7_1_01008 3850_7_2_00943 3850_7_2_01586 3850_7_3_00695 3850_7_3_01086 3850_7_3_01119 3850_7_3_01829 3850_7_4_01013 3850_7_4_01672 3850_7_5_01043 3850_7_5_01174 3850_7_6_00019 3850_7_6_00055 3850_7_6_00866 3850_7_6_01008 3850_7_6_01513 3850_7_7_00177 3850_7_7_00332 3850_7_7_00795 3850_7_8_01029 3850_7_8_01063 3850_7_8_01193 3850_7_8_01499 3850_7_8_01802 3850_7_8_01803 3850_7_9_01793 3850_8_10_00107 3850_8_10_00688 3850_8_10_01026 3850_8_10_01836 3850_8_11_00715 3850_8_11_00716 3850_8_11_01205 3850_8_11_01660 3850_8_12_00167 3850_8_12_00168 3850_8_12_01209 3850_8_1_01024 3850_8_1_01577 3850_8_2_00980 3850_8_2_01126 3850_8_3_00006 3850_8_3_00101 3850_8_3_00729 3850_8_3_01058 3850_8_3_01770 3850_8_3_01773 3850_8_4_00112 3850_8_4_01157 3850_8_4_01158 3850_8_4_01306 3850_8_9_00146 3850_8_9_00260 3850_8_9_01205 3850_8_9_01206 3850_8_9_01352 3850_8_9_01837
      1 3850_1_12_00470 3850_1_12_01878 3850_1_12_01879 3850_1_12_01880 3850_1_12_02239 3850_1_12_02240 3850_1_2_00310 3850_1_2_01813 3850_1_2_01814 3850_1_2_02173 3850_1_3_00300 3850_1_3_02100 3850_2_11_02142 3850_2_12_00256 3850_2_12_00317 3850_2_12_02333 3850_2_12_02335 3850_2_1_00014 3850_2_1_00183 3850_2_1_00248 3850_2_1_02169 3850_2_2_00060 3850_2_2_00347 3850_2_2_00405 3850_2_2_02185 3850_2_2_02186 3850_2_2_02187 3850_3_10_00332 3850_3_10_00466 3850_3_10_01851 3850_3_10_02175 3850_3_10_02176 3850_3_12_00043 3850_3_12_00107 3850_3_12_00325 3850_3_12_00407 3850_3_12_00910 3850_3_12_02237 3850_3_1_00436 3850_3_1_02373 3850_3_1_02374 3850_3_1_02375 3850_3_5_01952 3850_3_6_00283 3850_3_6_01859 3850_3_6_01861 3850_3_6_02226 3850_3_6_02227 3850_3_7_00115 3850_3_8_00211 3850_3_8_00295 3850_3_8_01759 3850_3_8_01760 3850_3_8_02200 3850_3_9_00151 3850_3_9_00464 3850_3_9_01820 3850_3_9_01821 3850_3_9_02176 3850_3_9_02177 3850_6_12_00143 3850_6_12_00151 3850_6_12_02052 3850_6_12_02053 3850_6_2_00317 3850_6_2_02085 3850_6_3_01682 3850_6_3_01683 3850_6_3_02060 3850_6_3_02062 3850_6_4_00084 3850_6_4_01904 3850_6_5_01754 3850_6_5_02116 3850_6_5_02118 3850_6_6_01835 3850_6_7_02071 3850_7_10_00124 3850_7_10_00157 3850_7_10_01956 3850_7_10_01958 3850_7_10_01959 3850_7_11_02105 3850_7_12_00023 3850_7_12_00076 3850_7_12_01745 3850_7_12_02087 3850_7_12_02088 3850_7_1_00010 3850_7_1_00101 3850_7_1_00113 3850_7_1_02000 3850_7_1_02001 3850_7_2_00171 3850_7_2_01930 3850_7_2_01932 3850_7_3_01761 3850_7_3_02133 3850_7_3_02134 3850_7_4_00173 3850_7_4_00188 3850_7_4_01600 3850_7_4_02006 3850_7_5_00054 3850_7_5_02098 3850_7_5_02099 3850_7_6_00184 3850_7_6_00448 3850_7_7_00103 3850_7_7_02055 3850_7_7_02056 3850_7_8_01714 3850_7_8_01720 3850_7_8_02048 3850_7_8_02049 3850_7_9_02111 3850_8_10_00075 3850_8_10_00327 3850_8_10_02050 3850_8_11_00133 3850_8_11_01911 3850_8_11_01912 3850_8_11_02436 3850_8_12_00214 3850_8_12_00223 3850_8_12_02055 3850_8_1_00134 3850_8_1_00181 3850_8_1_00238 3850_8_1_01576 3850_8_1_01824 3850_8_1_01825 3850_8_2_02040 3850_8_3_00223 3850_8_3_02075 3850_8_3_02076 3850_8_4_00179 3850_8_4_00189 3850_8_4_00246 3850_8_4_02182 3850_8_4_02183 3850_8_4_02184 3850_8_9_00233 3850_8_9_01835 3850_8_9_02200

      So I want to sort them into columns using their prefixes: e.g. the 3850_8_9_ entries should all be in one column, and so should the 3850_8_4_ entries, etc. Note: That was my first post to PerlMonks and I am really sorry if it did not make sense.

      Have a splendid day!

      Many thanks,

      $new_guy
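
The grouping by prefix described in that post could be sketched as follows. This is an assumption-laden illustration, not the thread's answer: it takes the prefix to be everything up to and including the last underscore, and the helper names prefix_of and group_row are hypothetical. The sample entries are taken from row 0 of the data above.

```perl
use strict;
use warnings;

# Hypothetical helper: the column prefix is everything up to and including
# the last underscore, e.g. "3850_8_9_01837" -> "3850_8_9_".
sub prefix_of {
    my ($entry) = @_;
    return $entry =~ /^(.*_)/ ? $1 : $entry;
}

# Group one input line by prefix. Returns the row index and a reference to
# a hash that maps each prefix to the list of entries sharing it.
sub group_row {
    my ($line) = @_;
    my ($key, @entries) = split ' ', $line;
    my %by_prefix;
    push @{ $by_prefix{ prefix_of($_) } }, $_ for @entries;
    return ($key, \%by_prefix);
}

my ($key, $groups) = group_row('0 3850_8_9_00146 3850_8_4_00112 3850_8_9_00260');
for my $prefix (sort keys %$groups) {
    print "$key $prefix: @{$groups->{$prefix}}\n";
}
```

Because one row can hold several entries with the same prefix, a column-per-prefix layout would still need a rule for the extra entries, for example repeating the row once per extra entry as JavaFan's script does for duplicate values.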
        Could you supply the data that gives you the repeating/non-aligned output?
