Hi, your code did not output any strange characters when I ran it. I suspect there is something wrong with the input file, but I cannot tell without further information.

I would like to focus on your other problem (inserting tabs in some places but not in others). My suggestion is below; for completeness I have included large parts of what you already wrote. It's late, so the code is not completely polished: for example, the first two substitutions should be written with inline comments, and there may be more elegant ways to write some of the regexes. But here it is:

    print "\nThis program reformats scripts produced by SQL Server 2000 Enterprise Manager\n";
    print "to remove brackets and tab out data types and null settings.\n\n";
    print "You provide a file name, this program reads it and produces a new file\n";
    print "with a .out extension.\n\n";
    print "File name to process? (<enter> to end program.) ";
    chomp($sqlfile = <STDIN>);
    $outfile = $sqlfile . ".out";
    open(IN, $sqlfile) || die "cannot open $sqlfile for input: $!";
    open(OUT, ">$outfile") || die "cannot open $outfile for output: $!";
    while (<IN>) {
        #remove square brackets
        s/(\[|\])//g;
        #remove whitespace before round brackets...
        s/\s+((\(|\)))/$1/g;
        #...and commas
        s/\s+,/,/g;
        #remove some keywords
        s/COLLATE SQL_Latin1_General_CP1_CI_AS//g;
        s/ON PRIMARY//g;
        #remove duplicate whitespace
        s/\s+(\s)/$1/g;
        #THE MOST INTERESTING PART:
        #Only for lines that do not start with a non-whitespace character
        #(hopefully only the first line starts with one; otherwise you
        #would have to track the line number or analyze keywords):
        #replace the whitespace before a word by three tabs, unless
        #the following expression is "NULL" or "NOT NULL"
        s/\s+(?!(?:NOT )?NULL)([a-zA-Z]\w*)/\t\t\t$1/g if !/^\S/;
        print;
        print OUT;
    }
    END {
        close(OUT) || die "problem closing new $outfile: $!";
        close(IN) || die "problem closing original $sqlfile: $!";
    }

Some comments:
1. The most interesting part is the negative lookahead, which keeps the three tabs from being inserted before "NULL" and "NOT NULL".
2. You do not need to chomp the input lines; if you did, you would have to add the newlines back again afterwards.
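
To see what the lookahead in comment 1 does in isolation, here is a small sketch that applies only that one substitution to a few lines. The helper name tab_out and the sample lines are made up for illustration; they are not from your script, and the earlier cleanup substitutions are left out.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): applies only the
# tab-inserting substitution to a single line and returns the result.
sub tab_out {
    my ($line) = @_;
    # Skip lines that start with a non-whitespace character.
    # The negative lookahead (?!(?:NOT )?NULL) rejects any match
    # where the word after the whitespace is "NULL" or "NOT NULL".
    $line =~ s/\s+(?!(?:NOT )?NULL)([a-zA-Z]\w*)/\t\t\t$1/g
        if $line !~ /^\S/;
    return $line;
}

# Made-up sample input, roughly what remains after the cleanup steps.
my @sample = (
    "CREATE TABLE dbo.Customers(\n",
    " CustomerID int NOT NULL,\n",
    " Name varchar(50) NULL\n",
);

for my $line (@sample) {
    my $visible = tab_out($line);
    $visible =~ s/\t/<TAB>/g;    # make the inserted tabs visible
    print $visible;
}
```

The column name and the data type each get three tabs in front of them, while "NOT NULL" and "NULL" keep their single space. Note that the lookahead also skips any word that merely begins with "NULL", which should not matter for generated scripts like yours.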

Hope this helps a bit and gave you some new ideas.

In reply to Re: Problem with larger files (and s/) by jds17
in thread Problem with larger files (and s/) by Cloudster
