best and fastest way to read lines from a file?
Date: 2007-10-01 (21:27)
From: YC <yinso.chen@g...>
Subject: best and fastest way to read lines from a file?
Hi all -

Newbie question: what is the most efficient way to read a file line by
line?  I wrote a line-count routine in both Python and OCaml, ran each on a
file with 345K lines, and was surprised that the Python code ran roughly
3x faster.

I thought the speeds should be equivalent, or somewhat in OCaml's favor,
given that this is an IO-bound comparison.  Perhaps Python's simple for
loop has a read-ahead buffer built in and OCaml's input channel is
unbuffered, but I'm not sure how to write buffered code that reads a file
line by line.

Any insight is appreciated, thanks ;)


Python code:

file = <345k-line.txt>
count = 0
for line in open (file, "r"):
    count = count + 1
print "Done: ", count

OCaml code:
(* *)
let rec line_count filename =
  let f = open_in filename in
  let rec loop file count =
    try
      ignore (input_line file);
      loop file (count + 1)
    with
      End_of_file -> count
  in
    loop f 0;;

let count = line_count <345k-line.txt> in
    Printf.printf "Done: %d" count;;

$ time ./
Done: 345001

real    0m0.416s
user    0m0.101s
sys    0m0.247s

$ ocamlopt -o test
$ time ./test
Done: 345001

real    0m1.483s
user    0m0.631s
sys     0m0.685s
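
For comparison, here is a minimal sketch of two other ways to write the line count in OCaml. It rests on two points: an OCaml in_channel is already buffered by the runtime, and a recursive call made inside the body of a try ... with is not a tail call, so a loop written that way pushes one exception handler per line. Whether either point explains the timings above has not been measured here. The function names count_lines_input_line and count_lines_chunked and the 64 KiB chunk size are illustrative choices, and the Bytes module assumes OCaml 4.02 or later.

```ocaml
(* Variant 1: still uses input_line, but converts End_of_file into an
   option before recursing, so the recursive call is a real tail call. *)
let count_lines_input_line filename =
  let ic = open_in filename in
  let rec loop count =
    match (try Some (input_line ic) with End_of_file -> None) with
    | Some _ -> loop (count + 1)
    | None -> count
  in
  let n = loop 0 in
  close_in ic;
  n

(* Variant 2: reads fixed-size chunks and counts '\n' bytes, so no
   string is allocated per line. The chunk size is an arbitrary choice. *)
let count_lines_chunked filename =
  let ic = open_in_bin filename in
  let buf = Bytes.create 65536 in
  let rec loop count =
    (* input returns 0 only at end of file *)
    let n = input ic buf 0 (Bytes.length buf) in
    if n = 0 then count
    else begin
      let c = ref count in
      for i = 0 to n - 1 do
        if Bytes.get buf i = '\n' then incr c
      done;
      loop !c
    end
  in
  let n = loop 0 in
  close_in ic;
  n
```

The two variants agree only when the file ends with a newline: the chunked version counts newline characters, so a final line without a trailing newline is counted by input_line but not by the chunked loop.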