

by Marshall (Abbot) on Sep 28, 2012 at 05:49 UTC (#996112)


The problem with a tab-delimited file is that the tabs are hard to see in a normal text editor. Is that gap a space followed by a tab, two spaces, or something else?

So the basic problem is that tabs are not easily "visible". My programming editor also converts "tabs" to "spaces" when I write a program file. No program file that I work with has tab characters in it. When I "save it" all the tabs disappear.

There is no "standard" for the number of spaces that a tab character represents. In the "olden days", tabs made a difference because they saved disk space. In a practical sense, that space saving makes no difference now, and output full of tabs is hard to read.
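To illustrate the "no standard width" point, here is a minimal sketch using the core Text::Tabs module. The helper name `expand_at` and the sample line are my own for illustration; the same tab-containing line comes out with different spacing depending on the tab-stop width chosen.

```perl
use strict;
use warnings;
use Text::Tabs;    # core module; exports expand() and $tabstop

# Expand the tabs in one line using the given tab-stop width.
# (expand_at is a hypothetical helper name, not part of Text::Tabs.)
sub expand_at {
    my ($line, $width) = @_;
    $tabstop = $width;               # Text::Tabs reads this package variable
    my ($expanded) = expand($line);  # list context: one line in, one line out
    return $expanded;
}

# The same tab-separated line rendered at two common widths.
my $line = "\tname\tvalue";
for my $width (4, 8) {
    printf "%d: [%s]\n", $width, expand_at($line, $width);
}
```

Note that `$tabstop` stays at whatever value was set last, so a longer program would probably want to restore it afterwards.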

Many of the DB output formats that I work with use "|" as the field separator. That is not a valid character for a name or an address. This works well for many types of DB fields that you might want to import or export, and you can just use a simple split() for input. Perl also has a number of CSV parsers and they work very, very well. That is another option.
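A minimal sketch of the simple split() approach, assuming the data itself never contains a "|". The sub name `parse_pipe_record` and the sample record are made up for illustration.

```perl
use strict;
use warnings;

# Split one pipe-delimited record into its fields.
# The pipe must be escaped: a bare | in a regex means alternation.
# The limit of -1 keeps trailing empty fields, which split drops by default.
# (parse_pipe_record is a hypothetical name.)
sub parse_pipe_record {
    my ($line) = @_;
    chomp $line;
    return split /\|/, $line, -1;
}

# Hypothetical record: name, street, city, and an empty last field.
my @fields = parse_pipe_record("Jane Doe|12 Elm St|Springfield|\n");
print scalar(@fields), " fields\n";
print "[$_]\n" for @fields;
```

The two usual gotchas are both in that one split line: forgetting the backslash, and losing empty fields at the end of a record.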

This tab idea is a problem because it is hard to see! Yes, I can deal with it and I can set editor settings to allow me to see the difference between 2 spaces versus one space and tab, but this is a hassle.
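When the editor cannot show the difference, a few lines of Perl can. This is only a sketch: the sub name `show_whitespace` and the markers (a literal `\t` for a tab, `_` for a space) are arbitrary choices of mine, and the output is ambiguous if the data already contains those characters.

```perl
use strict;
use warnings;

# Return a copy of the text with tabs and spaces made visible.
# Tabs become the two characters \t, spaces become underscores.
# (show_whitespace is a hypothetical helper name.)
sub show_whitespace {
    my ($text) = @_;
    (my $visible = $text) =~ s/\t/\\t/g;   # tabs first; the marker has no spaces
    $visible =~ tr/ /_/;
    return $visible;
}

# Two strings that can look identical in an editor.
print show_whitespace("a  b"),   "\n";    # two spaces
print show_whitespace("a \tb"),  "\n";    # one space, then a tab
```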

by tobyink (Abbot) on Sep 28, 2012 at 07:06 UTC

    In many situations it's hard to tell the difference between lower-case L and the number one, or upper-case O and naught (the proximity of the latter pair on the keyboard makes this a particularly dangerous issue). But I don't eschew those characters; I choose fonts that make it easier to distinguish between them, and my text-editor's syntax highlighting will often (though not always) catch the difference. The tools can save you if you let them.

    Similarly my text editor has an easy toggle (Ctrl+Shift+A) which can be done with one hand (almost with one finger on this laptop keyboard!) to show or hide whitespace characters (and Ctrl+Shift+D does line break characters) when I need to do a quick visual check.

    But for the most part, when working on files that I've authored, I don't need to visually check because I know which characters will be tabs, and which will be spaces. In source code, the indents will all be tabs, and everywhere else will be spaces.

    'There is not a "standard" for the number of spaces for a tab character.'

    Indeed; that's kind of the point of them. You can set tab stops to whatever is most convenient for you. I like to use 3 column tab stops; other people might prefer 2, 4 or 8. If we all use a single tab character to indent source code, then we can all work on the same source code and see it with our preferred indentation.

    "Or in a practical sense, the space saving makes no difference."

    Indeed; if I were using tabs as a compression mechanism, I'd be an idiot. (Bzip2 works much better.) But that's not what I use them for; I use them because they make more sense in certain contexts (delimiting fields; indenting source code) than space characters. If there were a filesize penalty for using tabs, I'd continue to use them.

    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
      There is a disagreement of opinion here - not any disagreement on the facts of the situation.

      OK, there is more than one way to do it. I think that's fine.

      I personally prefer fixed width font and no tabs within code. My normal program editor actually converts tabs to the appropriate number of spaces when I save the code to a file. I indent the code like I want. When I work in MS Visual Studio, it doesn't do that and I find it annoying - sometimes I want to take a MS .C file and use it on a Unix system and then we get into this "how many spaces does a tab mean?" thing. You see it as a plus. I see it as a hassle.

      So I guess mileage varies. I have personally found the "|" (pipe character) to be a good field separator in many circumstances. When that doesn't work, I go to full-blown CSV with all the complications that involves. There are some very good Perl modules that can parse it, albeit more slowly than a simple split() or a global match.
