PerlMonks  

Re: Thoughts on designing a file format.

by greenFox (Vicar)
on Sep 13, 2005 at 06:15 UTC ([id://491485])


in reply to Thoughts on designing a file format.

All good points ++

I once read a paper that explained very clearly why the two-character line ending (CR, LF) in DOS was a mistake. I have no idea where it was, but Wikipedia echoes the sentiment. Either way, documenting the line ending is OK, but using the line endings appropriate for your OS is a better approach.

Two points I would add:

  • Allow comments; my preference is for # comments
  • Ignore blank and whitespace-only lines

Hence any data file parsing I do usually ends up beginning like this:

next if /^\s*#/; next if /^\s*$/;
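
For context, here is a minimal sketch of how those two checks might sit inside a parsing loop. The helper name data_lines and the sample input are made up for illustration; a real parser would read from a filehandle rather than a list.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the data lines from a list of raw lines,
# skipping comments and blank or whitespace-only lines.
sub data_lines {
    my @kept;
    for (@_) {
        next if /^\s*#/;    # comment line (leading whitespace allowed)
        next if /^\s*$/;    # blank or whitespace-only line
        chomp(my $line = $_);    # copy first so the caller's data is untouched
        push @kept, $line;
    }
    return @kept;
}

# Made-up sample input standing in for the lines of a data file.
my @sample = (
    "# settings\n",
    "\n",
    "name=value\n",
    "   \n",
    "  # indented comment\n",
    "colour=green\n",
);
print "$_\n" for data_lines(@sample);
```

Only the two key=value lines are printed; the comment, blank, and whitespace-only lines are dropped.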

--
Murray Barton
Do not seek to follow in the footsteps of the wise. Seek what they sought. -Basho

Replies are listed 'Best First'.
Re^2: Thoughts on designing a file format.
by demerphq (Chancellor) on Sep 13, 2005 at 07:27 UTC

    I prefer to use CRLF line endings because that is the standard network line ending, and because quite simply there will come a day when your file needs to be read by someone whose most advanced tool for reading it is Excel. Likewise I tend to use csv so that cutting and pasting from the file into an Excel workbook works correctly, not to mention that for the type of data I use, embedded tabs are never a problem, but embedded commas occasionally are.

    ---
    $world=~s/war/peace/g

      I'm missing something here. On DOS, if you write print FILE "some text\n"; you will get "\r\n" in the file. If you do the same thing on Unix, you get just "\n". What are you outputting? Are you setting $INPUT_RECORD_SEPARATOR and $OUTPUT_RECORD_SEPARATOR to something other than the default? Otherwise chomp is going to break, for example: it will remove "\r\n" on DOS and just "\n" on Unix, leaving a "\r" at the end of every line. It seems like a lot of trouble to deal with something that ftp clients do automatically. If I copy your program and data file over to Unix, do I then have to change the line endings back to CR/LF before it works?
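
      To make the chomp concern concrete, here is a small sketch. The helper strip_eol is hypothetical, not something from this thread; it assumes the default $/ and simply removes one trailing LF or CRLF, whichever is present.

```perl
use strict;
use warnings;

# Hypothetical helper: strips one trailing LF or CRLF, whatever the platform.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\015?\012\z//;    # optional CR followed by LF at end of string
    return $line;
}

# A CRLF-terminated line as it would arrive on Unix (no text-mode translation).
my $crlf = "some text\015\012";

chomp(my $chomped = $crlf);      # default $/ is "\n", so the "\r" stays behind
my $clean = strip_eol($crlf);

printf "chomp leaves %d chars, strip_eol leaves %d chars\n",
    length($chomped), length($clean);
```

      With the default record separator, chomp leaves the carriage return on the line, while the explicit substitution handles both kinds of file.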

      --
      Murray Barton
      Do not seek to follow in the footsteps of the wise. Seek what they sought. -Basho

Re: Thoughts on designing a file format.
by jonadab (Parson) on Sep 13, 2005 at 17:30 UTC
    I read a paper once that explained very clearly why the two character line endings (CR, LF) in DOS was a mistake

    Now, let me explain why two-character line endings in DOS were *not* a mistake...
