PerlMonks  

Re: Text file manupulation to create test files

by ELISHEVA (Prior)
on Jan 10, 2011 at 13:43 UTC


in reply to Text file manupulation to create test files

If I'm reading what you wrote above correctly, you want to split the first line of each subscriber loop into four or more lines, let's say 1a, 1b, 1c, 1d. These will be followed by the remaining lines of the subscriber loop.

To do this task, you are going to need to review working with hashes if you haven't done so already. The algorithm would look something like this:

  1. Read line 1 of the subscriber loop and convert it into a hash whose entries are the property-value pairs found on that line. At the end of this step, line 1 would look like this:
    $hLine = {
        SBSB_ID       => '123456782',
        First_name    => "Ryan",
        Last_name     => "George",
        WMDS_SEQ_NO1  => 2,
        WMDS_SEQ_NO22 => 3,
        # .... and so on for each field ...
    };
  2. Next print line 1a, extracting and printing property names and values you want (SBSB_ID, First_name, Last_name) from the hash.
  3. Delete the hash keys that you've printed. What you'll have left are the properties that you want to have one per line.
  4. For each remaining key-value pair in the hash, print one per line. That completes the processing of line 1 of the subscriber loop.
  5. For the remaining lines of the subscriber loop, read in each line and print it as is.
  6. When you detect the end of the subscriber loop, return to step 1, unless you are at the end of the file.
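The steps above can be sketched roughly as follows. This is only an illustration under assumptions that are not in the original thread: the fields on the first line are taken to be comma-separated KEY=value pairs, a line starting with "SBSB_ID=" is taken to open a new subscriber loop, and the sub names (parse_first_line, split_first_line, process_lines) are made up for the example. Adjust the parsing to whatever your real file looks like.

```perl
use strict;
use warnings;

# ASSUMPTIONS (hypothetical, not from the thread): the first line of a loop
# holds comma-separated KEY=value pairs, and any line starting with
# "SBSB_ID=" opens a new subscriber loop.
my @LINE_1A_KEYS = qw(SBSB_ID First_name Last_name);

# Step 1: turn "K1=v1,K2=v2,..." into a hash reference.
sub parse_first_line {
    my ($line) = @_;
    my %h;
    for my $pair (split /\s*,\s*/, $line) {
        my ($k, $v) = split /=/, $pair, 2;
        $h{$k} = $v if defined $v;
    }
    return \%h;
}

# Steps 2-4: build line 1a, delete its keys, then emit the rest one per line.
sub split_first_line {
    my ($hLine) = @_;
    my %h = %$hLine;    # work on a copy so the caller's hash is untouched
    my @present = grep { exists $h{$_} } @LINE_1A_KEYS;
    my @out = (join ',', map { "$_=$h{$_}" } @present);
    delete @h{@LINE_1A_KEYS};
    push @out, map { "$_=$h{$_}" } sort keys %h;
    return @out;
}

# Steps 5-6: copy other lines unchanged; start over at each new first line.
sub process_lines {
    my @out;
    for my $line (@_) {
        if ($line =~ /^SBSB_ID=/) {
            push @out, split_first_line(parse_first_line($line));
        }
        else {
            push @out, $line;
        }
    }
    return @out;
}

my @sample = (
    'SBSB_ID=123456782,First_name=Ryan,Last_name=George,WMDS_SEQ_NO1=2,WMDS_SEQ_NO22=3',
    'second line of the loop, copied unchanged',
);
print "$_\n" for process_lines(@sample);
```

The subs return lists of lines rather than printing to a filehandle, which makes them easy to check on small samples before pointing them at a real file.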

Of course, there are many ways to do this, depending on what you know about the properties that belong on lines 1b, 1c, 1d, etc. For example, if you know that the one-property-per-line properties always have the format WMDS_SEQ_NO# where # is some number, then you can skip the "delete each key" part of step 3 and just do a grep with a regex, something like this (not checked for typos):

foreach my $k (grep { /^WMDS_SEQ_NO\d+$/ } sort keys %$hLine) { print $fh "$k=".$hLine->{$k}."\n" }
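One thing to watch with the one-liner above: a plain sort compares keys as strings, so WMDS_SEQ_NO10 would come out before WMDS_SEQ_NO2. If the order of the numbered lines matters to you, a sketch along these lines sorts by the numeric suffix instead. The sub name seq_keys_in_numeric_order is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the WMDS_SEQ_NO# keys of a hash reference,
# ordered by their numeric suffix rather than as strings.
sub seq_keys_in_numeric_order {
    my ($hLine) = @_;
    return map  { $_->[1] }                                  # unwrap the key
           sort { $a->[0] <=> $b->[0] }                      # numeric compare
           map  { /^WMDS_SEQ_NO(\d+)$/ ? [ $1, $_ ] : () }   # keep matches only
           keys %$hLine;
}

my $hLine = { SBSB_ID => '123456782', WMDS_SEQ_NO10 => 7, WMDS_SEQ_NO2 => 3 };
print "$_=$hLine->{$_}\n" for seq_keys_in_numeric_order($hLine);
```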

Update: I just noticed the requirement to replicate each subscriber loop. This requires some changes to the above algorithm.

  • In step 1, use two hashes: one for the keys that belong in line 1a and another for the keys that belong one-per-line, i.e. $hLine1a and $hOnePerLine. This way you don't need to delete any keys.
  • In step 2, print the property-value pairs in $hLine1a
  • Step 3, skip - no longer applicable.
  • In step 4, print the property-value pairs in $hOnePerLine
  • In step 5, instead of immediately printing each line as is, save it in an array @aRestOfSubscriberLoop and then print as is. This will preserve the remaining lines of the subscriber loop. At the end of step 5, duplicate the subscriber loop as follows:
    1. Replace the value of $hLine1a->{SBSB_ID} with a random value
    2. Repeat steps 2 & 4 to print out lines 1a, 1b, 1c, etc. with the new SBSB_ID
    3. Print the remaining lines of the subscriber loop by printing out all of the lines in @aRestOfSubscriberLoop, one per line.
    4. Repeat for as many times as you want to duplicate this particular subscriber loop
  • In step 6, no change :-)
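Here is a rough sketch of the revised steps, assuming the two hashes and the array have already been filled in. The output format (comma-separated KEY=value pairs on line 1a), the 9-digit random ID, and the sub names format_first_line and replicate_loop are assumptions made for the example, not something taken from your data.

```perl
use strict;
use warnings;

# ASSUMPTION (hypothetical): line 1a is written as comma-separated
# KEY=value pairs, in this key order.
my @LINE_1A_KEYS = qw(SBSB_ID First_name Last_name);

# Revised steps 2 and 4: line 1a first, then one property per line.
sub format_first_line {
    my ($hLine1a, $hOnePerLine) = @_;
    my @present = grep { exists $hLine1a->{$_} } @LINE_1A_KEYS;
    my @out = (join ',', map { "$_=$hLine1a->{$_}" } @present);
    push @out, map { "$_=$hOnePerLine->{$_}" } sort keys %$hOnePerLine;
    return @out;
}

# Revised step 5: emit the original loop, then $copies duplicates that
# differ only in SBSB_ID.
sub replicate_loop {
    my ($hLine1a, $hOnePerLine, $aRestOfSubscriberLoop, $copies) = @_;
    my @out = (format_first_line($hLine1a, $hOnePerLine), @$aRestOfSubscriberLoop);
    for (1 .. $copies) {
        # Copy the hash so the caller's SBSB_ID is not overwritten.
        # ASSUMPTION: a zero-padded 9-digit number is an acceptable ID.
        my %dup = (%$hLine1a, SBSB_ID => sprintf('%09d', int(rand(1_000_000_000))));
        push @out, format_first_line(\%dup, $hOnePerLine), @$aRestOfSubscriberLoop;
    }
    return @out;
}

my @lines = replicate_loop(
    { SBSB_ID => '123456782', First_name => 'Ryan', Last_name => 'George' },
    { WMDS_SEQ_NO1 => 2, WMDS_SEQ_NO22 => 3 },
    [ 'second line of the loop, copied unchanged' ],
    2,
);
print "$_\n" for @lines;
```

Random IDs can repeat by chance. If your test files need unique SBSB_ID values, keep the generated ones in a hash and draw again on a collision.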

Update: fixed instruction numbering error.

Replies are listed 'Best First'.
Re^2: Text file manupulation to create test files
by ajguitarmaniac (Sexton) on Jan 10, 2011 at 15:10 UTC

    Thanks Elisheva, your reply was very helpful.
