So, how about dealing with the memory issue? I thought the beauty of Haskell was that it was non-strict or lazy. That it dealt with infinite lists. Then why does getContents insist on loading the whole darn file?
I was curious about why this happened, so I tried my own variations to see if I could overcome it. Nothing I tried worked; Hugs ran out of memory with a large file every time. So I did some more searching online and through the Haskell mailing list. There's a thread that talks about the 'wc' program, and gives the reasons why it uses so much memory.
In essence, it's not that Haskell is not being lazy enough and reading in the entire file. The problem is that it's being too lazy, and not evaluating the '+1's in the recursive function call. So it keeps all these '+1's around in memory, waiting to be evaluated until it hits the base case, but you run out of memory before it gets there.
The solution they came up with on the mailing list was to create a data type to hold the stats, and put strictness markers on the data type so that the '+1's would get evaluated immediately. Below is your example converted to that style. The strictness markers are the exclamation points before the parametric types in the data constructor Stats. They tell Haskell to evaluate whatever value is about to be put into that slot in the Stats value being created, instead of keeping the "thunks" around to evaluate later. This version does run in constant memory space with a huge file.
data Stats = Stats !Int !Int !Int
wcf :: Stats -> String -> Stats
wcf stats [] = stats
wcf (Stats cc w lc) (' ' : xs) = wcf (Stats (cc+1) (w+1) (lc ) ) xs
wcf (Stats cc w lc) ('\t' : xs) = wcf (Stats (cc+1) (w+1) (lc ) ) xs
wcf (Stats cc w lc) ('\n' : xs) = wcf (Stats (cc+1) (w+1) (lc+1) ) xs
wcf (Stats cc w lc) ( x : xs) = wcf (Stats (cc+1) (w ) (lc ) ) xs
wc :: IO()
wc = do
putStr ( "Filename: " )
name <- getLine
contents <- readFile name
let (Stats cc w lc) = wcf (Stats 0 0 0) contents
putStrLn ( "The file " ++ name ++ " has " )
putStrLn ( show cc ++ " chars " )
putStrLn ( show w ++ " words " )
putStrLn ( show lc ++ " lines." )
main = wc