PerlMonks  

Removing extra spaces

by rickoy (Initiate)
on Jul 31, 2012 at 00:59 UTC ( [id://984543]=perlquestion )

rickoy has asked for the wisdom of the Perl Monks concerning the following question:

I have a string: 2 0 1 2 - 7 - 2 7  9 : 3 7 : 3 1 As you can see, each character in the string is separated by 1 space, and the date and time are separated by 2 spaces. I would like to reduce the spaces so that where there are 2 spaces between characters there will be 1 space, and where there is 1 space between characters it will be removed.

Replies are listed 'Best First'.
Re: Removing extra spaces
by davido (Cardinal) on Jul 31, 2012 at 02:26 UTC

    s/\s(\s?)/$1/g

    Match a single space, and optionally a second space. Capture that second space if it exists. Replace with the capture, which will be either nothing, or the second space.


    Dave
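To see davido's substitution at work, here is a minimal sketch applied to the string from the question. It assumes the date and time really are separated by exactly two spaces, as the question describes; the helper name `strip_spaces` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping davido's s/\s(\s?)/$1/g:
# a lone whitespace character is removed, a pair is reduced to one.
sub strip_spaces {
    my ($text) = @_;
    $text =~ s/\s(\s?)/$1/g;
    return $text;
}

# Assumed input: single spaces between characters, two before the 9.
my $input = '2 0 1 2 - 7 - 2 7  9 : 3 7 : 3 1';
print strip_spaces($input), "\n";    # prints 2012-7-27 9:37:31
```

Because the optional second `\s` is consumed along with the first, the scan moves past both characters of a pair in one step, which is why the capture can simply be put back.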

Re: Removing extra spaces
by Rudolf (Pilgrim) on Jul 31, 2012 at 01:56 UTC

    Being lazy, I would abuse the power of regexes and say:

    my $string = '2 0 1 2 - 7 - 2 7  9 : 3 7 : 3 1';
    $string =~ s/  /x/g;    # mark each double space with a placeholder
    $string =~ s/ //g;      # remove the remaining single spaces
    $string =~ s/x/ /g;     # turn each placeholder into one space
    print $string;

    I just did it out in steps. Since you want to remove all the single spaces, I put a placeholder wherever the double spaces are, then later replaced the 'x' with ' '. (This relies on 'x' never appearing in the data.) Perhaps give tr/// a look; it switches out sets of characters, but I'm not sure how to switch out spaces with it.
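On the tr/// question, a small sketch of what it can do with spaces may help. It assumes the same input string with two spaces before the 9. As far as I can tell, tr/// can squeeze runs or delete characters, but it has no way to treat a lone space differently from a pair, so on its own it does not solve the original problem.

```perl
use strict;
use warnings;

# Assumed input: two spaces between the date and the time.
my $input = '2 0 1 2 - 7 - 2 7  9 : 3 7 : 3 1';

# tr/ //s squeezes every run of spaces down to a single space.
(my $squeezed = $input) =~ tr/ //s;

# tr/ //d deletes every space outright.
(my $deleted = $input) =~ tr/ //d;

print "$squeezed\n";    # 2 0 1 2 - 7 - 2 7 9 : 3 7 : 3 1
print "$deleted\n";     # 2012-7-279:37:31
```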

Re: Removing extra spaces
by johngg (Canon) on Jul 31, 2012 at 09:16 UTC

    You could use a negative look-ahead to replace any space that is not followed by a space with nothing. This will break down if there are more than two spaces though.

    knoppix@Microknoppix:~$ perl -E '
    > $dateStr = q{ 2 0 1 2 - 7 - 2 7  9 : 3 7 : 3 1 };
    > $dateStr =~ s{\s(?!\s)}{}g;
    > say $dateStr;'
    2012-7-27 9:37:31
    knoppix@Microknoppix:~$

    Cheers,

    JohnGG
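The "more than two spaces" caveat can be illustrated with a short sketch. The wrapper names and the test string are invented for the example; the two substitutions are the ones posted by johngg and davido in this thread.

```perl
use strict;
use warnings;

# Hypothetical wrappers around two substitutions from this thread.
sub lookahead_strip { my ($t) = @_; $t =~ s{\s(?!\s)}{}g; return $t; }   # johngg
sub capture_strip   { my ($t) = @_; $t =~ s/\s(\s?)/$1/g; return $t; }   # davido

# Made-up input: one single space, then a run of three spaces.
my $three = 'a b   c';

print "'", lookahead_strip($three), "'\n";    # 'ab  c'
print "'", capture_strip($three), "'\n";      # 'ab c'
```

The look-ahead only ever removes the last space of a run, so a run of three is left as two, whereas the capturing version works through the run in pairs.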

Re: Removing extra spaces
by NetWallah (Canon) on Jul 31, 2012 at 01:28 UTC
    Try this regex:
    s/\s\s?(\S)/$1/g
    Update: See the correction below. Thanks Anonymonk and davido.

                 I hope life isn't a big joke, because I don't get it.
                       -SNL

      Close, just drop the \s? and it works.

      $ perl -E '$s="2 0 1 2 - 7 - 2 7  9 : 3 7 : 3 1"; $s =~ s/\s\s?(\S)/$1/g; say $s'
      2012-7-279:37:31
      $ perl -E '$s="2 0 1 2 - 7 - 2 7  9 : 3 7 : 3 1"; $s =~ s/\s(\S)/$1/g; say $s'
      2012-7-27 9:37:31
Re: Removing extra spaces
by Athanasius (Archbishop) on Jul 31, 2012 at 02:09 UTC

    Update: rickoy, welcome to the Monastery!

    The specification is a little unclear, but assuming you want to (a) remove all single spaces, and (b) squash all sequences of 2 or more spaces down to a single space:

    #! perl
    use strict;
    use warnings;

    my $string = ' 2 0 1 2 - 7 - 2 7  9 : 3 7 : 3 1 ';
    # NB: 2 spaces here             ^^

    # (a) Remove single spaces
    1 while $string =~ s/(^|[^ ])[ ]([^ ]|$)/$1$2/g;

    # (b) Squash multiple spaces down to one
    $string =~ s/[ ]{2,}/ /g;

    print "'", $string, "'\n";

    Outputs:

    '2012-7-27 9:37:31'

    HTH,

    Athanasius <°(((><contra mundum

Re: Removing extra spaces
by GrandFather (Saint) on Aug 02, 2012 at 01:47 UTC

    Where did your string come from? Strangeness of that sort looks like a 16-bit Unicode string (or some such) imported into Perl in some odd fashion, where the high 0 byte (for an ASCII character) has been replaced by a space. You might be better off getting the conversion right, if possible, rather than trying to fix it up later.

    True laziness is hard work
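If GrandFather's guess is right, decoding at the source would look roughly like the sketch below. This is only an illustration: the thread never confirms that the data was UTF-16, the little-endian byte order is an assumption, and `$path` in the trailing comment is a placeholder.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Simulate raw UTF-16LE bytes as they might arrive from a file or socket
# (assumption: the original data was UTF-16LE).
my $bytes = encode('UTF-16LE', '2012-7-27 9:37:31');

# Every ASCII character occupies two bytes, the second being NUL.
printf "%d bytes\n", length $bytes;    # 34 bytes

# Decoding the bytes yields the intended text, with no clean-up needed.
my $text = decode('UTF-16LE', $bytes);
print "$text\n";                       # 2012-7-27 9:37:31

# When reading from a file, an I/O layer can do the same job, e.g.
#   open my $fh, '<:encoding(UTF-16LE)', $path or die $!;
```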
Re: Removing extra spaces
by harangzsolt33 (Chaplain) on Aug 25, 2019 at 05:32 UTC
    I know, this question was asked more than 7 years ago, but I would
    like to post a sub that I wrote that does exactly what you want:

    sub CollapseWhitespace{@_ or return'';my$T=shift;defined$T or return'';my$L=length($T);$L or return'';my$c;my$N=0;my$P =0;my$U=1;for(my$i=0;$i<$L;$i++){$c=vec($T,$i,8);if($c<33){ $U=0;if($N++==1){vec($T,$P++,8)=32;}}else{$N=0;$U or vec($T ,$P,8)=$c;$P++;}}return$U?$T:substr($T,0,$P);}

    ^^ This looks a bit obfuscated, so here is a nicer expanded version:

    ##############################################################
    #
    # This function removes single instances of whitespace and
    # converts multiple adjacent whitespace characters to a single
    # space. In this function, "whitespace" is defined as a character
    # whose ASCII value is less than 33. (This includes many special
    # characters such as new line characters, nul, bel, etc.)
    #
    # Usage: STRING = CollapseWhitespace(STRING)
    #
    # Example:
    #   CollapseWhitespace("\n\t abc 123  xxx\n") --> " abc123 xxx"
    #
    sub CollapseWhitespace
    {
      @_ or return '';
      my $T = shift;
      defined $T or return '';
      my $L = length($T);
      $L or return '';
      my $c;
      my $N = 0;  # consecutive whitespace counter
      my $P = 0;  # target pointer to overwrite original str $T
      my $U = 1;  # string length will be left unchanged
      for (my $i = 0; $i < $L; $i++)
      {
        $c = vec($T, $i, 8);
        if ($c < 33)
        {
          $U = 0;
          if ($N++ == 1) { vec($T, $P++, 8) = 32; }
        }
        else
        {
          $N = 0;
          $U or vec($T, $P, 8) = $c;
          $P++;
        }
      }
      return $U ? $T : substr($T, 0, $P);
    }

      A more concise alternative is:

      c:\@Work\Perl\monks>perl -wMstrict -le
      "use warnings;
       use strict;
       ;;
       use Test::More 'no_plan';
       use Test::NoWarnings;
       ;;
       use Data::Dump qw(pp);
       ;;
       note qq{perl version: $]};
       ;;
       my @TESTS = (
         [ undef                      , qq{}            ],
         [ qq{}                       , qq{}            ],
         [ qq{ }                      , qq{}            ],
         [ qq{\n}                     , qq{}            ],
         [ qq{\n\t}                   , qq{ }           ],
         [ qq{\n\t\x00}               , qq{ }           ],
         [ qq{\n\t \x00}              , qq{ }           ],
         [ qq{\n\t abc 123  xxx\n}    , qq{ abc123 xxx} ],
         [ qq{\nabc 123\a\b\fxxx\n\t }, qq{abc123 xxx } ],
         [ qq{abc 123\n\r xxx}        , qq{abc123 xxx}  ],
         );
       ;;
       note 'special case';
       is CollapseWhitespace(), '', 'no arguments';
       ;;
       note 'general cases';
       VECTOR:
       for my $ar_vector (@TESTS) {
         if (not ref $ar_vector) {
           note $ar_vector;
           next VECTOR;
           }
       ;;
         my ($str, $expected) = @$ar_vector;
       ;;
         is CollapseWhitespace($str), $expected,
           pp($str) . ' -> ' . pp($expected)
           ;
         }
       ;;
       done_testing;
       ;;
       exit;
       ;;
       sub CollapseWhitespace {
         my $s = shift;
         return '' unless defined $s;
         $s =~ s{ [\x00-\x20]+ }{ $+[0] - $-[0] == 1 ? '' : ' ' }xmsge;
         return $s;
         }
      "
      # perl version: 5.008009
      # special case
      ok 1 - no arguments
      # general cases
      ok 2 - undef -> ""
      ok 3 - "" -> ""
      ok 4 - " " -> ""
      ok 5 - "\n" -> ""
      ok 6 - "\n\t" -> " "
      ok 7 - "\n\t\0" -> " "
      ok 8 - "\n\t \0" -> " "
      ok 9 - "\n\t abc 123  xxx\n" -> " abc123 xxx"
      ok 10 - "\nabc 123\a\b\fxxx\n\t " -> "abc123 xxx "
      ok 11 - "abc 123\n\r xxx" -> "abc123 xxx"
      1..11
      ok 12 - no warnings
      1..12
      If you have Perl version 5.14+, a slightly conciserer variation is:
      sub CollapseWhitespace {
        my $s = shift;
        return defined $s
          ? $s =~ s{ [\x00-\x20]+ }{ $+[0] - $-[0] == 1 ? '' : ' ' }xmsger
          : ''
          ;
        }
      See the s/// /r modifier in perlop. I leave it to you to Benchmark whether the s///e version is actually faster than the for-loop version.


      Give a man a fish:  <%-{-{-{-<
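For anyone who wants to take up the Benchmark suggestion, here is one way it might be set up with the core Benchmark module. The sub names `collapse_loop` and `collapse_regex` and the sample data are made up for this sketch; the loop is a condensed form of harangzsolt33's version and the regex is the s///e version quoted above. No timings are shown because they depend on the Perl build and the input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Loop version, condensed from harangzsolt33's expanded sub.
sub collapse_loop {
    my $T = shift;
    return '' unless defined $T && length $T;
    my ($N, $P, $U) = (0, 0, 1);    # run length, write position, "unchanged" flag
    for my $i (0 .. length($T) - 1) {
        my $c = vec($T, $i, 8);
        if ($c < 33) {
            $U = 0;
            vec($T, $P++, 8) = 32 if $N++ == 1;    # second char of a run: emit one space
        }
        else {
            $N = 0;
            vec($T, $P, 8) = $c unless $U;
            $P++;
        }
    }
    return $U ? $T : substr($T, 0, $P);
}

# Regex version: a run of length 1 vanishes, a longer run becomes one space.
sub collapse_regex {
    my $s = shift;
    return '' unless defined $s;
    $s =~ s{ [\x00-\x20]+ }{ $+[0] - $-[0] == 1 ? '' : ' ' }xmsge;
    return $s;
}

# Made-up sample data; real input may shift the balance either way.
my $sample = "\n\t abc 123  xxx\n" x 100;

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    loop  => sub { collapse_loop($sample) },
    regex => sub { collapse_regex($sample) },
});
```

Checking that both subs return the same string for the sample before trusting the timings is a sensible first step.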

Node Type: perlquestion [id://984543]
Approved by davido