Amen to that! The more languages I learn, the more I can see the strengths and weaknesses of each language. | [reply] |
Your solution also breaks down if there is punctuation in the file. (OS HP-UX 11.0)
File
This is a test file. How many unique words are in this
file? Do you know? Does the file contain more than
ten words?
Results
1
1 a
1 are
1 contain
1 do
1 does
1 file
1 file.
1 file?
1 how
1 in
1 is
1 know?
1 many
1 more
1 ten
1 test
1 than
1 the
2 this
1 unique
1 words
1 words?
1 you
Update: Changed the test file.
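To make the breakdown concrete, here is a minimal Python sketch (not the solution under discussion, which isn't shown in this thread) that counts the test file above two ways: splitting on whitespace only, and pulling out runs of word characters. The function names and the definition of a "word" as letters, digits and apostrophes are my own assumptions.

```python
import re
from collections import Counter

TEXT = """This is a test file. How many unique words are in this
file? Do you know? Does the file contain more than
ten words?"""

def count_whitespace_tokens(text):
    # Naive approach: split on whitespace only, so trailing punctuation
    # stays attached and "file", "file." and "file?" are counted separately.
    return Counter(text.lower().split())

def count_words(text):
    # Assumption: a "word" is a run of ASCII letters, digits or apostrophes.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

naive = count_whitespace_tokens(TEXT)
clean = count_words(TEXT)

print(naive["file"], naive["file."], naive["file?"])  # 1 1 1
print(clean["file"], clean["words"], clean["this"])   # 3 2 2
```

The first line of output reproduces the split counts in the results above; the second shows what you get once punctuation is no longer treated as part of the word.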
| [reply] |
Here are the requirements I was given...
Program Purpose
The goal of the program is to count the occurrences of all words in a file, and write this count into a new file.
Requirements
- The input file will contain 1 word per line (lines will be terminated by the newline character), and the file will contain an arbitrary number of lines.
- The file will be terminated by an end of file character.
- The word count must be case insensitive, as there may be varying case throughout the file.
- The output file must write each word once, and include the number of occurrences of that word on the same line.
- The lines in the output file must be sorted in ascending order.
Sample Input:
Chicago
Paris
chicago
London
red
blue
Green
Red
REd
london
Sample output:
blue;1
Chicago;2
Green;1
London;2
Paris;1
red;3
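For reference, a rough Python sketch of one way to meet these requirements as written. Two things are inferred from the sample output rather than stated in the requirements, so treat them as assumptions: the spelling printed is the first one seen in the input ("Chicago", "red"), and the sort ignores case ("blue" comes before "Chicago"). The function name count_lines is made up for this example, and it returns the output lines instead of writing a file, to keep the sketch short.

```python
from io import StringIO

def count_lines(lines):
    # Maps lowercased word -> [first-seen spelling, count].
    counts = {}
    for line in lines:
        word = line.strip()
        if not word:
            continue  # skip blank lines
        entry = counts.setdefault(word.lower(), [word, 0])
        entry[1] += 1
    # Assumption: "ascending order" means sorted by the lowercased word.
    return [f"{spelling};{n}" for _, (spelling, n) in sorted(counts.items())]

# StringIO stands in for the input file; an open file handle works the same way.
sample = StringIO(
    "Chicago\nParis\nchicago\nLondon\nred\nblue\nGreen\nRed\nREd\nlondon\n"
)
report = count_lines(sample)
print("\n".join(report))
```

Run on the sample input, this prints the six lines of the sample output in the same order.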
| [reply] |
So your original solution works for the narrow scope of the requirements. It fails if the requirement that there is one word per line is changed. This explains perfectly why the questions above arose about lines versus words -- according to the spec, they can be considered the same.
Now only one question remains: do you code to exactly match a questionable spec? Or, more to the point, wouldn't it be better to write something that satisfies the exact spec and still behaves correctly if the questionable part of the spec changes?
I think that, when possible, a restrictively narrow spec should be answered with a more general solution that works for the spec at hand and for likely future changes. In some instances the likely future cases are hard to determine; in this one they are not. In the spirit of a job interview, I'd like to see either both ways implemented or a comment in the code explaining that one way was chosen over the other because of the nature of the spec.
Of course, redsquirrel, since you already went above and beyond what the question asked, it wouldn't be fair to complain that you didn't do even more work. I'm just making points about more general cases again. ;-)
Come to think of it, much of my life as a programmer, and even much of my life outside programming (probably because of habits learned from programming), is about taking solutions that already work for one case and making them more general. I think this is probably the goal of a large percentage of programming effort overall.
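As an illustration of "both ways implemented", here is a small Python sketch with the choice exposed as a flag and the reason recorded in a comment. The name tally, the one_word_per_line parameter, and the word pattern are all hypothetical, and it returns (word, count) pairs keyed on the lowercased word, not the exact output format from the requirements.

```python
import re

def tally(lines, one_word_per_line=True):
    # The spec promises one word per line, so that is the default.
    # Passing one_word_per_line=False covers the likely change where a
    # line holds several words, possibly with punctuation.
    counts = {}
    for line in lines:
        if one_word_per_line:
            stripped = line.strip()
            words = [stripped] if stripped else []
        else:
            # Assumption: a word is a run of letters, digits or apostrophes.
            words = re.findall(r"[A-Za-z0-9']+", line)
        for word in words:
            key = word.lower()  # case-insensitive count
            counts[key] = counts.get(key, 0) + 1
    return sorted(counts.items())

print(tally(["Red", "red", "REd"]))
print(tally(["Do you know? Do you?"], one_word_per_line=False))
```

With the default, a line like "ten words?" would be counted as a single entry, which is exactly the behavior the narrow spec permits and the general case does not.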
Update: fixed a tpyo.