ZWcarp has asked for the wisdom of the Perl Monks concerning the following question:
I am having trouble splitting a file properly because some fields contain embedded spaces. Each "header" line has the following structure:
>cds:ADD75048 A/Brussels/INS71/2009 2009/10/30 HA
>cds:ADF58353 A/Germany-MV/HGW4/2009 2009/12/ HA
>cds:ADF58351 A/Germany-MV/HGW6/2009 2009/12/ HA
>cds:ADU76781 A/England/94780010/2009 2009/10/22 HA
>cds:AEA30293 A/Netherlands/2223b/2009 2009/11/18 HA
>cds:ADD23250 A/District of Columbia/INS17/2009 2009/10/26 HA
>cds:ADX98640 A/San Diego/INS13/2009 2009/10/19 HA
>cds:ADD74978 A/San Diego/INS54/2009 2009/10/12 HA
>cds:ADF27925 A/Texas/JMS407/2010 2010/01/11 HA
>cds:ADM95824 A/Finland/661/2009 2009/10/26 HA
>cds:ADD97035 A/Wisconsin/629-D00036/2009 2009/09/15 HA
Normally you could just split on spaces, but I realized that the location sometimes contains a space ("San Diego", for example). I want to remove these spaces specifically. I think this can be done by telling Perl to remove all spaces between the first and second forward slashes it encounters on each line. Does anyone know how to do this or, even better, how to do it in bash? This is the structure of the headers, and my goal is to remove the spaces ONLY from D:
A:B C/D/E/F G/H/I J
Any ideas? I hope this is clearer. Thanks so much!