PerlMonks  

Re: Thoughts on designing a file format.

by greenFox (Vicar)
on Sep 13, 2005 at 06:15 UTC ( #491485=note )


in reply to Thoughts on designing a file format.

All good points ++

I read a paper once that explained very clearly why the two character line endings (CR, LF) in DOS was a mistake. I have no idea where it was, but Wikipedia echoes the sentiment. Either way, documenting it is OK, but using the line endings appropriate for your OS is a better approach.

Two points I would add

  • Allow comments; my preference is for # comments
  • Ignore blank and whitespace only lines

Hence any data file parsing I do usually ends up beginning like this:

next if /^\s*#/;    # skip comment lines
next if /^\s*$/;    # skip blank and whitespace-only lines
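For anyone who wants to see those two tests in context, here is a minimal sketch of a reader built around them. The helper name `data_lines` and the sample data are made up for illustration; this is not from any particular module, just one way the loop might look.

```perl
use strict;
use warnings;

# Hypothetical helper: return only the meaningful lines of a data file
# that has already been read into a string.
sub data_lines {
    my ($text) = @_;
    my @records;
    for (split /\n/, $text) {
        next if /^\s*#/;    # skip comment lines, even indented ones
        next if /^\s*$/;    # skip blank and whitespace-only lines
        push @records, $_;
    }
    return @records;
}

# Made-up sample data with comments, a blank line and a whitespace-only line.
my $sample = "# settings\n\nname=fox\n   \n  # indented comment\ncolour=green\n";
print "$_\n" for data_lines($sample);
```

Only the name=fox and colour=green lines should survive; everything else is treated as a comment or as padding.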

--
Murray Barton
Do not seek to follow in the footsteps of the wise. Seek what they sought. -Basho


Re^2: Thoughts on designing a file format.
by demerphq (Chancellor) on Sep 13, 2005 at 07:27 UTC

    I prefer to use network line endings because that is the standard on the network, and because quite simply there will come a day when your file needs to be read by someone whose most advanced tool for reading it will be Excel. Likewise I tend to use csv so that cutting and pasting from the file to an Excel workbook works correctly, not to mention the fact that for the type of data I use embedded tabs are never a problem, but occasionally embedded commas are.
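A rough sketch of what writing such a file could look like, assuming "network line endings" means CR LF. The helper name `write_crlf` and the sample records are hypothetical. The endings are written explicitly and the handle is put in binary mode, so the output should be the same bytes on any platform.

```perl
use strict;
use warnings;

# Hypothetical helper: write each record followed by an explicit CR LF.
sub write_crlf {
    my ($fh, @records) = @_;
    binmode $fh;    # raw bytes, so no platform newline translation is applied
    print {$fh} "$_\015\012" for @records;
}

# An in-memory filehandle stands in for a real file in this sketch.
open my $out, '>', \my $buffer or die "cannot open in-memory handle: $!";
write_crlf($out, 'name,colour', 'fox,green');
close $out;
```

Using the octal escapes rather than "\r\n" is a deliberate choice here: it spells out the exact bytes wanted instead of relying on what "\n" means locally.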

    ---
    $world=~s/war/peace/g

      I'm missing something here. On DOS, if you write print FILE "some text\n"; you will get "\r\n" in the file. If you do the same thing on Unix you get just "\n". What are you outputting? Are you setting $INPUT_RECORD_SEPARATOR and $OUTPUT_RECORD_SEPARATOR to something other than the default? Otherwise chomp, for example, is going to break: it will remove "\r\n" on DOS and just "\n" on Unix, leaving a "\r" at the end of every line. It seems like a lot of trouble to deal with something that ftp clients do automatically... if I copy your program and data file over to Unix, do I then have to change the line endings back to CR/LF before it works?
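To make the chomp worry concrete, here is a small sketch assuming the default $/ of "\n" and a line that arrives with a CR LF ending. The helper `strip_eol` is a hypothetical name; it is just one common way of tolerating both conventions, not something anyone in this thread said they use.

```perl
use strict;
use warnings;

# Hypothetical helper: remove one trailing line ending, whether LF or CR LF.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

# A line as it might arrive from a CR LF file read without translation.
my $raw = "some text\r\n";

my $chomped = $raw;
chomp $chomped;                  # only the "\n" goes; the "\r" stays behind
my $stripped = strip_eol($raw);  # both characters go

printf "chomp leaves %d characters, strip_eol leaves %d\n",
    length $chomped, length $stripped;
```

The same sub leaves a plain "\n" line correctly trimmed too, so the reading code does not need to know which system wrote the file.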

      --
      Murray Barton
      Do not seek to follow in the footsteps of the wise. Seek what they sought. -Basho

Re: Thoughts on designing a file format.
by jonadab (Parson) on Sep 13, 2005 at 17:30 UTC
    I read a paper once that explained very clearly why the two character line endings (CR, LF) in DOS was a mistake

    Now, let me explain why two-character line endings in DOS was *not* a mistake...
