Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling

Not A Rockstar File Manipulator Today

by koolgirl (Hermit)
on Nov 14, 2008 at 22:13 UTC ( #723735=perlquestion: print w/replies, xml ) Need Help??
koolgirl has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

OK, according to Randal, (Learning Perl 1st Edition), and brian d foy, (perlfaq5), this code I've got should work. It prompts an in file, an out file, a pattern and a replacement from the user, then executes the replacement based on STDIN between the two files.

#!usr/bin/perl use strict; my $in_file; my $out_file; my $answer; my $pattern; my $replacement; # This program prompts user for an input file, an output file, a searc +h pattern and a replacement string, then executes the search and repl +acement between the two files. print "Enter the in_put file"; chomp($in_file = <STDIN>); print "Enter the out_put file"; chomp($out_file = <STDIN>); print "Enter the pattern"; chomp($pattern = <STDIN>); print "Enter the replacement"; chomp($replacement = <STDIN>); open (IN, "$in_file") || die "not a rockstar today"; die "will not overwrite $out_file" if -e $out_file; open (OUT, ">$out_file") || die "not a rockstar today"; while (<IN>) { # S/$pattern/$replacement/g; print OUT $_; } # end while close (IN); close (OUT);

Please note, I understand how ridiculous it is for the user to have enter the entire path of the file, and everything else exactly, but that's just really not the point here, I'm trying to learn basic file manipulation. I had a separate program without the substitution going on, or the user prompts, but everything else the same, and it worked fine, however in this one, even if I adjust it just to copy lines from the in file to the out, it still won't...?

As I said, in all examples and documentations I can find, all the examples are the same as my code, aside from the user prompts and the substitutions of course, so I'm rather confused at this point, not too mention my currently glazed over eyes are slowly dying from staring at the screen for so, anyone, buellar, buellar....?

update: sorry brian, fixed it, I think

Replies are listed 'Best First'.
Re: Not A Rockstar File Manipulator Today
by toolic (Bishop) on Nov 14, 2008 at 22:36 UTC
    Can you be specific as to how it doesn't work?
    • what does your input file contain?
    • what is your pattern?
    • what is your replace string?
    • what OS are you on?
    • what output do you get (or error messages)?
    • what output do you expect to get?
    • what is the meaning of life?
    S///g should be lowercase: s///g

    use warnings; is good too.

    If your pattern has special characters, you may need to escape them (quotemeta).

Re: Not A Rockstar File Manipulator Today
by runrig (Abbot) on Nov 14, 2008 at 22:38 UTC
    Making all of your error messages be "not a rockstar today" is not very helpful if there is an error. On system calls (e.g. open) you should include $!, e.g. die "not a rockstar today: $!";
Re: Not A Rockstar File Manipulator Today
by oko1 (Deacon) on Nov 15, 2008 at 04:28 UTC

    First, try simplifying your code. Troubleshooting overly-verbose code larded with unnecessary variables and routines is difficult.

    perl -wpe's/my_pattern/my_replacement/g' in_file > out_file

    This will do what you're looking for (the '-p' option creates a printing loop around the specified in_file.) After that, if you want to expand it, you can - one step at a time and checking the output after every change. That should be a pretty good way to learn about these functions and their command line equivalents (see perlrun for more info on the various options to perl.)

    "Language shapes the way we think, and determines what we can think about."
    -- B. L. Whorf
      You have a very good point about my code needing to be less verbose, however, just for future reference, because in all of my write-ups this point usually comes into discussion, Perl is my first programming language. Because of this, I lack the structure most programmers have, because I've yet to work with a strongly typed language. When you have a skill, it's perfectly understandable, to omit it when helpful or unnecessary, however, if you are trying to learn a skill (such as coding with structure and discipline), using it even when not absolutely necessary, helps develop good programming habits.

      In short, my code is always going to be verbose, because I am trying to teach myself good habits, as well as be careful not to learn bad ones, such as getting used to using lots of short cuts and having bad structure, or rather, lose structure, because my research, as well as my fellow programmer's, have shown me that Perl being your first programming language, can cause lots of problems unless you are careful.

      I suppose perhaps ya'll might think I go over-board, but I'd rather do that, than become a messy coder, and wind up having serious issues when I begin learning strongly typed languages in school this spring. However, I do realize it's just as important to learn the short cuts and usefulness of Perl as the discipline, structure, good habits, etc., so I do appreciate the other monks pointing these things out, so my knowledge expands in those areas as well, I just believe, for now, until I become seasoned as far as other languages are concerned, the discipline of being very structured and verbose in all the code I write, is something I will stick with, in an attempt to become a Rockstar Programmer ;)

        Don't fret too much about style and clarity. It's not that there's no such thing as ugly code; but a learner has a lot of trouble knowing which is which. To be honest, even among the Enlightened, "clean code" is not a universally agreed-upon standard. And academic computing (i.e. what you learn in school) isn't what you do in industry. I work at a university, and what they teach the students varies significantly from what actually keeps the system running.

        If you take the advice here, you'll have clean code for some acceptable value of "clean". The Monks aren't infallible, but they seem to have reached concensus on important points.

        Whatever you do, don't let the urge to write "clean" or "maintainable" code lure you into writing some ugly abomination of brain-deadness. Don't be afraid to write good code just because it might be hard to understand. "Easy to maintain" is confused far too often with "grossly inefficient and overly verbose", especially in academic circles. Far too many people are scared off by map or grep, because they wrongly consider it obscure. Good code is frequently not immediately transparent: that's not a bad thing. If you set out to write beautiful code, you'll find it's both clear and efficient, for acceptable values of "clear" and "efficient".

        And Perl's more strongly typed than people seem to think. We just blur the issue with "contexts" and limit the types to scalars, lists, hashes, functions, etc. Lisp, sh, and Forth are loosely typed: Perl is somewhere between them and Java.

Re: Not A Rockstar File Manipulator Today
by Illuminatus (Curate) on Nov 14, 2008 at 22:28 UTC
    Could you provide your sample input, as well as pattern/replacement inputs? I tried the following:
    line1 line2 line3
    as the input file, 'line' as the pattern, and 'foo' as the replacement. All my line's get replaced with foo's. Isn't that what you expect?

      OK, well mine was a bit different, but just to see if my input was the problem, I changed the input file to the same as yours, numbered lines. But when I ran it as such, with line as pattern and foo as replacement, the result was a completely empty output file :|. Which is even stranger, to me, because that's the first thing it did, and it makes no sense anyway...I'm gonna need a straight jacket soon...:\

Re: Not A Rockstar File Manipulator Today
by ikegami (Pope) on Nov 15, 2008 at 01:06 UTC

    Please note, I understand how ridiculous it is for the user to have enter the entire path of the file

    What makes you think they need to do that? Paths relative to the current directory also works. For example, if the script is called from the same directory as the file to manipulate, only the file name needs to be entered.

    And why die "not a rockstar today" instead of die $!? (Oops, someone already mentioned this)

Re: Not A Rockstar File Manipulator Today
by ikegami (Pope) on Nov 15, 2008 at 01:08 UTC

    Oh and the capital S on my s/pat/rep line is a typo, not actually in my code

    So you're asking us to fix a problem in code we haven't seen? Don't do that. Show us the code you tried, or try the code you're going to show.

      No, I didn't re-type the code, you are looking at the code I'm using, when I pasted, I realized the typo, which was also in my code, but when I changed it, with high hopes of course that I wouldn't even need to ask because I had found the problem, and it made no difference either way, I just forgot to change it on the posted code as well. I am a newbie, as I'm sure all can see, and retarded in many areas, but give me some credit, I can cut and paste :p ;)

      Thanks for the responses on debugging, though...I swear I'm having the hardest time with that part...

        Thanks for the responses on debugging, though...I swear I'm having the hardest time with that part...

        Welcome to the wonderful world of programming. Debugging is usually the hardest part. When debugging, one of the most useful techniques I've found is to verify the pieces that I know beyond a shadow of a doubt are right. Quite often they are not and debugging the rest of the code won't get you anywhere.

        Several of the other responses suggest adding print statements to show what is happening. Have you done this? Don't skip one because you know it's right.

        G. Wade
Re: Not A Rockstar File Manipulator Today
by koolgirl (Hermit) on Nov 14, 2008 at 23:33 UTC
    OK, when asked for input, I give the path to the input and output file which happens to be (E:/passwords/moonlight_mile) and (E:/passwords/jumpin_jack_flash) then the pattern I give is "line", and the replacement I give is "foo". Both files merely contain these four lines.
    line1 line2 line3 line4
    I only get an empty output file and an unchanged input file afterwards. Please anyone? Everyone keeps telling me it works but it doesn't....should I just put the straight jacket on now or....

    UPDATE: Oh and the capital S on my s/pat/rep line is a typo, not actually in my code :) sorry, that's not the mystery.

      S a typo? Does that mean you didn't copy and paste the code but wrote it down a second time? In that case are you sure both programs are identical now?

      One thing you might do now is debugging. I.e. put print statements into you code to check the changes in $_. Your loop might look like this:

      print "REPLACE <$pattern> with <$replacement>\n"; while (<IN>) { # print "<$_>\n"; S/$pattern/$replacement/g; print "AFTER <$_>\n"; print OUT $_; } # end while

      Or you might use perls debugger where you can execute a line, look at any and all variables you like, execute the next line, look at .... Just call your script with perl -d and use 's' or 'n' to single step and 'p $_' to see the contents of variable $_, for example. 'h' gives you a help page. It is really simple and easy to work with, read 'man perldebug if you want to know more'

        Inline debugging is really, really helpful in cases like this. You might also want to change the

        print OUT $_;
        print ;
        to print your lines on the screen.

        And for a sanity check, you might put something like:

        print "\$in_file: $in_file \n"; print "\$out_file: $out_file \n"; print "\$pattern : $pattern\n"; print "\$replacement: $replacement\n";
        before your open commands. Just to make sure you're opening the file(s) you think you're opening.

Re: Not A Rockstar File Manipulator Today
by ptoulis (Scribe) on Nov 15, 2008 at 16:17 UTC
    The only problem I had with your code is the 'S' symbol. Under the 'strict' pragma the program requires 's' for the substitution expression.
    Now I can't match oko1 in code size (nor anywhere else), but inspired from the funny discovery that your arguments are in alphabetic order, here is my code
    use strict; my @params=qw(input output pattern replacement); my %settings; @settings{@params}=("0")x @params; foreach my $key (sort keys %settings) { print "$key="; chomp($settings{$key}=<>); } open IN,"<",$settings{"input"} || die "not a rockstar today"; -e $settings{"output"} and die("File Exists..will not override!") or o +pen(OUT,">",$settings{"output"}) or die "Error while opening file: $! +"; while (<IN>) { # s/$settings{"pattern"}/$settings{"replacement"}/g; print OUT $_; } # end while close (IN); close (OUT);

      The good:

      • 3 argument open. Very much safer than the two arg flavor.
      • automating the parameter collection

      The bad:

      • input parsing is overly complicated.
      • Big chains of and/or logic are hard to maintian.
      • <> has extra magic. I'd just use STDIN. Unless you want the magic.
      • No quotes needed in hash keys.
      • High precedence logical or (||) in the line where you open the input file doesn't do what you think it does.
      use strict; use warnings; my %settings; foreach my $key (qw(input output pattern replacement)) { print "$key = "; my $setting = <STDIN>; chomp $setting; $settings{$key}=$setting; } open IN, '<', $settings{input} or die "Can't open input file $settings{input}: $!"; die "Output file $settings{output} already exists\n" if -e $settings{output}; open OUT, '>', $settings{output} or die "Can't open output file $settings{output}: $!"; while (<IN>) { s/$settings{pattern}/$settings{replacement}/g; print OUT $_; } close (IN); close (OUT);

      Now, there are still some problems with this code. The big one is that using a hash and keys for storing the settings, looses us the benefits of using strictures in the first place. If I misspell 'output' when I use it, I'll get no warnings. Perl happily autovivifies a value in the hash and returns an undef.

      There are ways around this issue. The easiest is to use Hash::Util to lock the hash, so that attempts to change the hash or access nonexistent keys become a fatal error.

      use Hash::Util qw(lock_hash); my %settings; foreach my $key (qw(input output pattern replacement)) { print "$key = "; my $setting = <STDIN>; chomp $setting; $settings{$key}=$setting; } lock_hash %settings;

      Another thing is the use of global file handles. I should really be using lexical filehandles. Global handles are global variables, with all the attendant problems.

      TGI says moo

        Wow! I see your points TGI, thanks! About the '||' and 'or' thing this was my bad, I was aware of the difference. I think using and's and or's is pretty cool. Coming from a Java/C# background I'd say they are most expressive features. I have not been into big Perl projects yet, so I guess I have not faced the drawbacks.
        Anyway, I cleaned up more code using the map{} magic and current 'version' is:
        use strict; use warnings; my @params=qw(input output pattern replacement); my %settings; @settings{@params}=("0")x @params; foreach my $key (sort keys %settings) { print "$key="; chomp($settings{$key}=<>); } open IN,"<",$settings{input} or die "not a rockstar today"; -e $settings{output} and die("File Exists..will not override!") or open(OUT,">",$settings{output}) or die "Error while opening file: $!"; print OUT map { if(s/$settings{pattern}/$settings{replacement}/g) {$_} else {$_} } (<IN>); close (IN) and close (OUT);

        As a freshman, I just can't enough of this kind of writing! :)

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://723735]
Approved by GrandFather
Front-paged by Arunbear
and dust plays in a shaft of sunlight...

How do I use this? | Other CB clients
Other Users?
Others romping around the Monastery: (2)
As of 2018-05-28 04:33 GMT
Find Nodes?
    Voting Booth?