
Regex to remove data

by Anonymous Monk
on Nov 06, 2012 at 14:04 UTC ( #1002478=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have a large file. I'd like to know how to create a regex that will remove lines if they consist of only uppercase alphabetic characters, examples:

Re: Regex to remove data
by Athanasius (Chancellor) on Nov 06, 2012 at 14:13 UTC

    Something like this should do the trick:

    #! perl
    use strict;
    use warnings;

    my @lines = <DATA>;
    s/ ^ [A-Z\s]+ $ //x for @lines;
    print for @lines;

    __DATA__
    AAAAAAAA AAAA AAAAAAA AAAA AAA AAA
    Leave me intact PLEASE


    0:10 >perl
    Leave me intact PLEASE

    0:13 >

    Update 1: Note that this will also remove blank (i.e. empty) lines.

    Update 2: Changed

    print "@lines";

    to

    print for @lines;

    to address the issue of leading spaces raised by Anonymous Monk, below.

    Hope that helps,

    Athanasius <°(((><contra mundum

      This is good, but the OP did say he wanted to remove the lines from a "large file", and your approach reads the entire file into an in-memory array (@lines) before processing it.

      Check out my reply below for an example that loads only one line into memory at a time (the -n switch wraps the code in an implicit while (<>) loop).

      Thanks, though it adds a leading space to each line.
Re: Regex to remove data
by sundialsvc4 (Abbot) on Nov 06, 2012 at 14:37 UTC

    Also, don’t overlook the obvious grep (or egrep) commands, if you have them on your system... You might not have to “write a program” to do this at all. Simply use the -v option to output all lines which don’t match the pattern.

Re: Regex to remove data
by rjt (Deacon) on Nov 06, 2012 at 17:10 UTC

    It looks like you want to remove lines that (optionally) contain spaces in addition to uppercase letters. This one-liner will do the trick:

    perl -ne 'print unless /^[A-Z\s]+$/' <in.txt >out.txt

    Of course if you are including this in a larger Perl program, you can just nab the regex out of that, and use it in a loop construct of some kind. For example:

    while (<>) { print if !/^[A-Z\s]+$/ }
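    As a rough sketch of what that loop might look like inside a larger program, here is one possible arrangement. The subroutine name filter_uppercase_lines and the sample data are made up for illustration and are not from this thread. It reads one line at a time, so memory use should stay small even for a large file. Like the one-liner, it also drops blank lines, because \s matches the newline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption, not from the thread):
# copy lines from $in to $out, skipping any line that consists only of
# uppercase letters and whitespace. Returns the number of skipped lines.
sub filter_uppercase_lines {
    my ($in, $out) = @_;
    my $removed = 0;
    while (my $line = <$in>) {           # one line in memory at a time
        if ($line =~ /^[A-Z\s]+$/) {     # same regex as the one-liner
            $removed++;
            next;
        }
        print {$out} $line;
    }
    return $removed;
}

# Demo on made-up in-memory data. For a real file, open ordinary
# filehandles on the input and output paths instead.
my $input  = "AAAAAAAA\nAAAA AAAAAAA\nLeave me intact PLEASE\n";
my $output = '';
open my $in,  '<', \$input  or die "Cannot open input: $!";
open my $out, '>', \$output or die "Cannot open output: $!";

my $removed = filter_uppercase_lines($in, $out);
close $in;
close $out;

print $output;
print STDERR "removed $removed line(s)\n";
```

    Passing filehandles rather than file names keeps the helper usable with STDIN/STDOUT, real files, or in-memory strings as above.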
Re: Regex to remove data
by space_monk (Chaplain) on Nov 06, 2012 at 17:10 UTC
    perl -pe 's/ ^ [A-Z\s]+ $ //x' <your_data

    Anybody?
Re: Regex to remove data
by trizen (Hermit) on Nov 07, 2012 at 07:55 UTC
    This variant keeps lines that are empty or that contain only whitespace characters:

    perl -ne 'print if not /[A-Z]/ && /^[A-Z\s]+$/' file.txt
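    To see how the extra /[A-Z]/ check changes which lines survive, here is a small sketch. The helper name is_uppercase_only and the sample lines are assumptions made up for illustration. A line is dropped only when it contains at least one uppercase letter and nothing other than uppercase letters and whitespace.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption, not from the thread):
# returns 1 when the line has at least one uppercase letter AND contains
# nothing but uppercase letters and whitespace, otherwise 0.
sub is_uppercase_only {
    my ($line) = @_;
    return ( $line =~ /[A-Z]/ && $line =~ /^[A-Z\s]+$/ ) ? 1 : 0;
}

# Made-up sample: an all-caps line, an empty line, a whitespace-only
# line, and a mixed-case line.
my @sample = (
    "AAAA AAA\n",
    "\n",
    "   \n",
    "Leave me intact PLEASE\n",
);

# Keep everything that is not an uppercase-only line.
my @kept = grep { !is_uppercase_only($_) } @sample;
print @kept;
```

    With this sample, only the all-caps line is dropped; the empty, whitespace-only, and mixed-case lines are all kept.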
