PerlMonks  

Sorting the numbers: A little tricky.

by oxydeepu (Novice)
on Jan 25, 2013 at 08:40 UTC ( #1015288=perlquestion )
oxydeepu has asked for the wisdom of the Perl Monks concerning the following question:

Hi all perl monks,

I have a basic question. I have a file of numbers with two columns:

##############
109026 3
109027 28
109028 30
116958 15
116960 35
116961 39
116962 70
116963 72
147184 2
147588 1
153087 32
#############

I need to reformat this data, considering only those lines whose value in the second column is greater than 4. The output should look like the following. I have tried several things, but none of them worked.

#############

109027 109078 (which is 109028 + 50) 30
116958 117013 (which is 116963 + 50) 72
153087 153137 (which is 153087 + 50) 32

#############

I hope that explains the problem; I couldn't make it work.
Any help would be greatly appreciated. This is not an assignment: I am trying to learn Perl by myself, and this is a problem I just came across.

Thank you in advance,
Deepak

Re: Sorting the numbers: A little tricky.
by Anonymous Monk on Jan 25, 2013 at 09:28 UTC

    I tried different things it is not working out.

    Show your efforts :)

Re: Sorting the numbers: A little tricky.
by choroba (Abbot) on Jan 25, 2013 at 09:53 UTC
    Can you explain the algorithm in greater detail? Why did you pick 116963 after 116958?
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Sorting the numbers: A little tricky.
by Rahul6990 (Beadle) on Jan 25, 2013 at 10:07 UTC
    Sorry, but I didn't get your question.
      We don't understand what you are trying to do. Picking out the numbers that are higher than 4 is OK, but please explain how you arrive at that output. We don't see the math.
Re: Sorting the numbers: A little tricky.
by flexvault (Parson) on Jan 25, 2013 at 14:30 UTC

    oxydeepu,

    Like everyone else, I find it hard to understand the math. But your original question can be answered by reading the file one line at a time, doing a 'chomp', and then doing a 'split' of the line into 2 strings. Then, using a hash, populate it with the keys and values where the value is greater than 4.

    Now you can process your hash using a 'foreach' with a numeric 'sort' of the 'keys' of the hash, and use whatever Perl math code you want to get whatever results you want.

    Like all things in Perl, this is one of many ways to do it, so experiment.
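
A minimal sketch of the approach described above, not flexvault's own code. The helper name load_filtered and the in-memory sample lines are assumptions made for illustration; a real script would read the lines from a filehandle instead.

```perl
use strict;
use warnings;

# Hypothetical helper: build a hash of position => value from
# "position value" lines, keeping only values greater than $min.
sub load_filtered {
    my ($min, @lines) = @_;
    my %value_for;
    for my $line (@lines) {
        chomp $line;                          # drop the trailing newline
        my ($pos, $val) = split ' ', $line;   # split on whitespace
        next unless defined $val;             # skip blank or malformed lines
        $value_for{$pos} = $val if $val > $min;
    }
    return %value_for;
}

# Sample lines standing in for the input file (assumption).
my @sample = ("109026 3\n", "109027 28\n", "109028 30\n", "147184 2\n");
my %value_for = load_filtered(4, @sample);

# Numeric sort of the keys; a plain sort would compare them as strings.
for my $pos (sort { $a <=> $b } keys %value_for) {
    print "$pos $value_for{$pos}\n";
}
```

With the sample lines above, only the rows for 109027 and 109028 survive the filter. Note that a hash keeps one value per position, so duplicate positions in the input would overwrite each other.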

    Good Luck

    "Well done is better than well said." - Benjamin Franklin

Re: Sorting the numbers: A little tricky.
by thundergnat (Deacon) on Jan 25, 2013 at 14:40 UTC

    Your problem is poorly specified. When we have to make guesses as to how to derive your output from your input, it is likely that we will guess the simplest thing that could possibly work and do that, or, more likely, just ask for clarification. Since there have already been several of the latter, I'll take a shot at the former.

    Using the following specs - for a file with 2 columns of numbers; let's call them pointer and value:

    • look for consecutive runs where the value (in the second column) is:
      1. Greater than 4
      2. Greater than the value of the previous entries. *** <-- ASSUMPTION
    • Print the pointer of the start of the run, the pointer + 50 of the end of the run, and the value of the end of the run

    If it was me, I would do something like:

    use warnings;
    use strict;

    my ($start, $lastp, $lastv);

    while ( my $line = <DATA> ){
        my ( $pointer, $value ) = split /\s+/, $line;
        # A run ends when the value is too small or drops below the previous one.
        flush() if ( $value <= 4 or $value > 4 && $value < $lastv );
        # Start a new run on the first qualifying value.
        $start = $pointer unless ( defined $start || $value <= 4 );
        ( $lastp, $lastv ) = ( $pointer, $value );
    }
    flush();

    # Print the current run (if any) and reset the state.
    sub flush {
        if ( defined $start ){
            printf "%d %d (which is %d + 50) %d\n", $start, $lastp + 50, $lastp, $lastv;
        }
        undef $_ for ( $start, $lastp, $lastv );
    }

    __DATA__
    109026 3
    109027 28
    109028 30
    116958 15
    116960 35
    116961 39
    116962 70
    116963 72
    147184 2
    147588 1
    153087 32

    Yields:

    109027 109078 (which is 109028 + 50) 30
    116958 117013 (which is 116963 + 50) 72
    153087 153137 (which is 153087 + 50) 32
    
Re: Sorting the numbers: A little tricky.
by oxydeepu (Novice) on Jan 25, 2013 at 15:13 UTC

    Thank you all for the comments.

    So what I want is to consider only those lines which have a value greater than 4 in the second column.

    #########
    109027 28
    109028 30
    116958 15
    116960 35
    116961 39
    116962 70
    116963 72
    147184 2
    147588 1
    153087 32
    ########
    In this set,
    109028 - 109027 is not greater than 50
    116958 - 109028 is greater than 50

    so I take
    109027 109078 (109028 + 50)
    and the next one starts from 116958.
    Similarly,
    153087 - 116963 > 50
    so
    116958 117013 (which is 116963 + 50)
    Since 153087 does not have any neighbours:
    153087 153137

    I hope that makes a little more sense.
    I am sorry for the vague explanation.
    Thank you in advance,
    Deepak

      And if this problem is "A little tricky", you must have already gotten a start on it. Please show the code that you've started with so we can diagnose where you're going wrong, and how to correct it.


      Dave

Re: Sorting the numbers: A little tricky.
by choroba (Abbot) on Jan 25, 2013 at 21:46 UTC
    I am still not sure I understand your specification. You should test this code with more data to see whether it behaves well in all the border cases:
    #!/usr/bin/perl
    use warnings;
    use strict;

    my ($first, $last, $last_small) = ('0e0', 0, 0);
    while (<DATA>) {
        my ($big, $small) = split;
        next if 4 >= $small;
        if ($big > $first + 50) {
            show($first, $last, $last_small);
            $first = $big;
        }
        $last = $big;
        $last_small = $small;
    }
    show($first, $last, $last_small);

    sub show {
        my ($first, $last, $last_small) = @_;
        print "$first ", $last + 50, " $last_small\n" unless '0e0' eq $first;
    }

    __DATA__
    109026 3
    109027 28
    109028 30
    116958 15
    116960 35
    116961 39
    116962 70
    116963 72
    147184 2
    147588 1
    153087 32
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Sorting the numbers: A little tricky.
by oxydeepu (Novice) on Jan 28, 2013 at 09:39 UTC

    Hi all,
    I will try to explain it one last time. The file has two columns: the first is the position, and the second is the cumulative frequency within 50 numbers of the position.

    For example,

    109026 3
    109027 25
    109028 2

    became

    109026 3
    109027 28
    109028 30.
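
The step from 3, 25, 2 to 3, 28, 30 looks like a plain running total. A minimal sketch, assuming all three positions fall inside the same 50-position window (the exact windowing rule is not spelled out in the post); running_totals is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: turn per-position counts into running totals.
sub running_totals {
    my @counts = @_;
    my ($sum, @totals) = (0);
    for my $count (@counts) {
        $sum += $count;
        push @totals, $sum;
    }
    return @totals;
}

print join(' ', running_totals(3, 25, 2)), "\n";   # prints "3 28 30"
```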

    So what I have to do is iterate through the positions (column 1), collecting positions as they increase, until the difference between the current position and (the previous position + 50) becomes greater than 50.

    for example

    109027 28
    109028 30
    116958 15
    116960 35
    116961 39
    116962 70
    116963 72
    153087 32

    In the above set,
    I start with 109027, then 109028, and then 116958. Here 116958 - (109028 + 50) is greater than 50,
    so the first line in the output will be

    109027 109078 (which is 109028 + 50) 30

    The 30 is the value of position 109028.

    In the next step I start from 116958 and go through 116963 until 153087, where the difference 153087 - (116963 + 50) becomes > 50.
    So I stop the iteration and output the next line, which is

    116958 117013 (which is 116963 + 50) 72

    where 72 is the value for 116963.

    Then I start from 153087. Since there are no further increasing positions, I have to stop the iteration and output this:

    153087 153137 (which is 153087 + 50) 32

    This is the problem. I don't know whether I explained it better than last time. I don't have any code; I'm still stuck on how to implement it. Hoping for help.

    Thank you in advance.
    regards,
    Deepak
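
One way to read the rule described above, as a hedged sketch rather than a definitive solution: skip rows whose value is not above the minimum, and close the current run whenever the gap to the previous kept position exceeds a threshold. The helper name group_runs and its parameters ($min_value, $max_gap, $pad) are assumptions introduced for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: group [position, value] rows into runs.
# Rows with value <= $min_value are ignored. A run is closed when the
# gap to the previous kept position exceeds $max_gap. Each run is
# returned as [start position, last position + $pad, last value].
sub group_runs {
    my ($min_value, $max_gap, $pad, @rows) = @_;
    my (@runs, $start, $last_pos, $last_val);
    for my $row (@rows) {
        my ($pos, $val) = @$row;
        next unless $val > $min_value;
        if (defined $start && $pos - $last_pos > $max_gap) {
            push @runs, [$start, $last_pos + $pad, $last_val];
            undef $start;
        }
        $start = $pos unless defined $start;
        ($last_pos, $last_val) = ($pos, $val);
    }
    # Close the final run, if there is one.
    push @runs, [$start, $last_pos + $pad, $last_val] if defined $start;
    return @runs;
}

my @rows = map { [ split ' ', $_ ] } (
    '109026 3',  '109027 28', '109028 30', '116958 15',
    '116960 35', '116961 39', '116962 70', '116963 72',
    '147184 2',  '147588 1',  '153087 32',
);
print "@$_\n" for group_runs(4, 50, 50, @rows);
```

The condition "current - (previous + 50) > 50" is the same as a gap greater than 100, so passing 100 as $max_gap matches that wording, while 50 matches the earlier explanation. On the sample data both settings give the same three runs.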

      As far as I understand, that is what my code here does.
      لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
