Re: Removing Duplicates from a multiline entry

by Kenosis (Priest)
on Feb 27, 2013 at 19:21 UTC

in reply to Removing Duplicates from a multiline entry

...I need to remove duplicate product entries...

Perhaps your mentioning the address comparison was only a solution proposal. If I'm understanding you correctly--that you only want to "remove duplicate product entries"--then consider the following:

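use strict;
use warnings;

local $/ = '';    # paragraph mode: read one blank-line-separated record at a time
my ( %products, %records );

while (<>) {
    if (/(Product.+)/) {
        $products{$1}++;      # count occurrences of this product line
        $records{$1} = $_;    # keep the most recent record for it
    }
}

# Print only the records whose product occurred exactly once.
print $records{$_} for grep $products{$_} == 1, keys %records;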

Usage: perl script.pl dataFile [>outFile]

Output on your data set:

Product 3
------------------------------------------------------------------
storeId = 2123
phoneNumber = (111) 111-1111
availbilityCode = 1
stockStatus = Limited stock
distance = 8.83
city = some city
fullStreet = some address

Product 2
------------------------------------------------------------------
storeId = 2117
phoneNumber = (111) 111-1111
availbilityCode = 2
stockStatus = In stock
distance = 7.49
city = some city
fullStreet = some address

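The script builds two hashes, both keyed on the product number: one to track the number of times a product number occurs (%products) and one to hold the records (%records). A record is printed only if its product number was seen exactly once, so every copy of a duplicated product is dropped. Because the keys come from a hash, the output order is not guaranteed to match the input order.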
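To make the counting logic easier to try without a data file, here is a minimal sketch of the same idea wrapped in a sub. The name unique_records and the sample records are made up for illustration and are not part of the original script. Unlike the script above, this version also remembers the input order, so the output order is predictable.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns the records
# whose "Product N" line occurs exactly once, in input order.
sub unique_records {
    my @records = @_;
    my ( %count, @keyed );
    for my $record (@records) {
        next unless $record =~ /(Product.+)/;    # same pattern as the script
        $count{$1}++;                            # tally this product line
        push @keyed, [ $1, $record ];            # remember key and record in order
    }
    return map { $_->[1] } grep { $count{ $_->[0] } == 1 } @keyed;
}

# Made-up sample data: Product 1 appears twice, Products 2 and 3 once.
my @sample = (
    "Product 1\nstoreId = 2122\n",
    "Product 2\nstoreId = 2117\n",
    "Product 1\nstoreId = 2122\n",
    "Product 3\nstoreId = 2123\n",
);

print for unique_records(@sample);
```

With this sample, only the Product 2 and Product 3 records are printed; both copies of Product 1 are skipped, which matches the behavior described above.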

Hope this helps!
