PerlMonks  

Re: =~ matches non-existent symbols

by graff (Chancellor)
on Nov 17, 2014 at 02:12 UTC ( [id://1107364] )


in reply to =~ matches non-existent symbols

Here's how I would modify the OP script:
#!/usr/bin/perl
use strict;
use warnings;

$/ = undef;                  # slurp-mode for input, just in case
while ( <> ) {               # reads stdin or all file names in ARGV
    s/\s+//g;                # remove whitespace
    tr/ACGTacgt//d;          # remove all acgt
    if ( length() ) {        # anything left?
        print "$ARGV bad content: $_\n";
    }
    else {
        print "$ARGV all clean!\n";
    }
}
Using while (<>) is good (even with slurp-mode input) because you can pipe data from any other process into the script, or you can put one or more file names on the command line (e.g. "*.txt").

When you read multiple input files in one run, putting "$ARGV" in the print statements tells you which files are good or bad.

Replies are listed 'Best First'.
Re^2: =~ matches non-existent symbols
by Anonymous Monk on Nov 17, 2014 at 04:20 UTC
    Thanks, graff! But what if I need to do further manipulations with the data from the file later in the same program?
      what if I need to do further manipulations with the data from the file later in the same program?

      Presumably, the manipulation will depend on whether the file content is "good" or "bad" - in either case, just save a copy of $_ to some other variable after white-space removal but before removing "acgt"; then pass that copy to whatever function you write to do the manipulation (either good or bad).
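A minimal sketch of the "save a copy first" idea described above. The helper name split_sequence is hypothetical (it is not part of the posted script); it only illustrates keeping the whitespace-stripped data around while a second copy is used for the acgt check.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): returns the
# whitespace-stripped sequence plus whatever remains after deleting
# every A, C, G and T (either case) from a copy of it.
sub split_sequence {
    my ($raw) = @_;
    ( my $seq      = $raw ) =~ s/\s+//g;          # copy, then strip whitespace
    ( my $leftover = $seq ) =~ tr/ACGTacgt//d;    # copy, then delete valid bases
    return ( $seq, $leftover );
}

my ( $seq, $leftover ) = split_sequence("ACGT\nacgt\r\n");
if ( length $leftover ) {
    print "bad content: $leftover\n";
}
else {
    # $seq is still intact here and can be handed to later processing
    print "clean sequence of ", length($seq), " bases\n";
}
```

Because the deletion runs on a copy, $seq survives the check and can be passed on to whatever routine handles the "good" or "bad" case.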

      That can wait for a second step. Right now, you are saying that your file contains only /ACGT/i but that your validation procedure fails. Many of us think it is likely that your file contains at least one line feed or carriage return character, or a combination of both. The important thing right now is to find out which hidden characters cause your validation subroutine to fail. Once you know that, you can modify your original program or your regex to take the findings into account.
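One possible way to make such hidden characters visible is to report each character outside ACGT (either case) by its code point. This is only a sketch; the helper name hidden_chars is an assumption made for illustration, and the input string stands in for the real file content.

```perl
use strict;
use warnings;

# Hypothetical helper: report every character that is not A, C, G or T
# (either case) as "U+XXXX xN", where N is how often it occurs, so that
# invisible characters such as CR or LF show up in the output.
sub hidden_chars {
    my ($text) = @_;
    my %seen;
    $seen{$_}++ for $text =~ /([^ACGTacgt])/g;
    return map { sprintf 'U+%04X x%d', ord($_), $seen{$_} } sort keys %seen;
}

# Sample data with Windows-style line endings (CR + LF)
print "$_\n" for hidden_chars("ACGT\r\nacgt\r\n");
```

For the sample string this lists U+000A (line feed) and U+000D (carriage return), each occurring twice.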
Re^2: =~ matches non-existent symbols
by ikegami (Patriarch) on Nov 19, 2014 at 15:31 UTC

    That considers acegt a valid input.
      I saved my script as posted to "/tmp/j.pl", and ran it as follows:
      echo acegt | /tmp/j.pl
      The output was:
      - bad content: e
      Did you find some other way to run it that yields different results?
        Me bad. Comment deleted.
