PerlMonks  

Re: Remove duplicate from the same line..

by rpnoble419 (Pilgrim)
on Jun 01, 2013 at 15:14 UTC ( [id://1036457]=note )


in reply to Remove duplicate from the same line..

Are you getting this duplication only on the Company Name line? If so, wrap the solution from Athanasius in an if test that you apply as you read each line from your file. Otherwise you risk damaging the address information, as Athanasius warned. Can you get a look at the system that is causing the problem in the first place? Fixing it there might be your better long-term solution.
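A minimal sketch of that guard, under stated assumptions: the "Company Name:" label and the helper name fix_line are hypothetical, and the substitution is only a stand-in for the one in Athanasius' reply, which is not reproduced here. It collapses a phrase that is written twice at the end of the labelled line and leaves every other line alone.

```perl
use strict;
use warnings;

# Hypothetical label; adjust to whatever marks the company line in the real file.
my $LABEL = qr/^Company Name:\s*/;

sub fix_line {
    my ($line) = @_;
    if ($line =~ $LABEL) {
        # Stand-in substitution (an assumption, not the code from the thread):
        # collapse a phrase that is immediately repeated at the end of the line.
        $line =~ s/($LABEL)(.+?)\s+\2\s*$/$1$2/;
    }
    return $line;    # address lines and everything else pass through untouched
}

print fix_line("Company Name: Acme Corp Acme Corp"), "\n";
print fix_line("Address: 12 Main Main Street"), "\n";
```

When reading from a file, chomp each line first, because the trailing \s*$ in the substitution would otherwise also swallow the newline.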

Replies are listed 'Best First'.
Re^2: Remove duplicate from the same line..
by Anonymous Monk on Jun 01, 2013 at 15:36 UTC

    Thanks, Athanasius. That was good.

    But now I have one more problem. I have a company name like "Goldman Sachs Group, ... Goldman Sachs Group, Inc." and here I want only 'Goldman Sachs Group'. Is there any option for that?

    Basically, if a word appears a second time in a single line, delete the rest of the line from that word onward, and trim any trailing , or . that is left. Is that possible?
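One way to read that rule, as a rough sketch: walk the words of the line, remember the ones already seen, and cut at the first word that shows up again. The helper name cut_at_repeat is hypothetical, and treating words case-insensitively is an assumption.

```perl
use strict;
use warnings;

sub cut_at_repeat {
    my ($line) = @_;
    my %seen;
    while ($line =~ /\b(\w+)\b/g) {
        # First sighting of a word: remember it and keep scanning.
        next unless $seen{lc $1}++;
        # $-[1] is the offset where the repeated word starts; keep what precedes it.
        $line = substr($line, 0, $-[1]);
        # Trim whitespace, commas and periods left dangling at the cut.
        $line =~ s/[\s,.]+$//;
        last;
    }
    return $line;
}

print cut_at_repeat('Goldman Sachs Group, ... Goldman Sachs Group, Inc.'), "\n";
```

Lines without a repeated word are returned unchanged, so a trailing period such as the one in "Inc." is only trimmed when a cut actually happened.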

    And thanks, rpnoble, I will do it like that... :)

    Thx.

      As a counter example I submit: 'Smith Smith & Feeley LLP'
      -- gam3
      A picture is worth a thousand words, but takes 200K.

        This is a very good objection to the whole exercise. There is not really any way to know whether 'Smith Smith, Inc' is a duplication or a valid company name. Without some real-world knowledge I cannot see a way to distinguish between the two. As a remedy, one could write all replacements to a log file for review and build a list of exceptions.
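A rough sketch of the log-and-exceptions idea. Everything named here is an assumption: the exception list, the helper clean_name, the tab-separated log format, and the stand-in cleanup rule, which only drops a phrase repeated at the start of the name.

```perl
use strict;
use warnings;

# Hypothetical exception list, grown by reviewing the log after each run.
my %is_exception = map { $_ => 1 } ('Smith Smith & Feeley LLP');

sub clean_name {
    my ($name, $log_fh) = @_;
    return $name if $is_exception{$name};      # reviewed and known to be valid
    # Stand-in cleanup rule: drop an immediately repeated leading phrase.
    (my $fixed = $name) =~ s/^(.+?)\s+\1\b/$1/;
    # Record every change as "original<TAB>replacement" for later review.
    print {$log_fh} "$name\t$fixed\n" if $fixed ne $name;
    return $fixed;
}

# Log to an in-memory buffer here; a real run would open a file instead.
my $log = '';
open my $log_fh, '>', \$log or die "Cannot open log: $!";
my @cleaned = map { clean_name($_, $log_fh) }
    ('Acme Corp Acme Corp', 'Smith Smith & Feeley LLP');
close $log_fh;

print "$_\n" for @cleaned;
print $log;
```

Without the exception entry, this particular rule would turn 'Smith Smith & Feeley LLP' into 'Smith & Feeley LLP', which is exactly the kind of change the log is meant to surface.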

      Adding .* after \1 in Athanasius' solution should do the trick, as it matches everything up to the trailing \n.
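For illustration only, here is a backreference pattern with a trailing .* that cuts a line at the repeated word. It is a hypothetical pattern written for this example, not the one from Athanasius' reply, so it cannot confirm whether the suggested change works in that code; the helper name cut_with_backref is likewise made up.

```perl
use strict;
use warnings;

sub cut_with_backref {
    my ($line) = @_;
    # (\w+) captures a word as \2, the lazy .*? runs up to its next whole-word
    # occurrence, and the final .* swallows the rest of the line. Without the /s
    # modifier, . does not match "\n", so a trailing newline survives.
    if ($line =~ s/^(.*?\b(\w+)\b.*?)\b\2\b.*/$1/) {
        # Trim blanks, commas and periods left at the cut, keeping the newline.
        $line =~ s/[ \t,.]+(\n?)\z/$1/;
    }
    return $line;
}

print cut_with_backref("Goldman Sachs Group, ... Goldman Sachs Group, Inc.\n");
```

Note that this pattern keys on the earliest word that has a later repeat, which is not always the earliest repeated position in the line, so it can differ from a word-by-word scan on lines with several repeated words.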

        No, it didn't work!
