<?xml version="1.0" encoding="ISO-8859-1"?>

<!DOCTYPE message PUBLIC
  "-//MLarc//DTD MLarc output files//EN"
  "../../mlarc.dtd"[
  <!ATTLIST message
    listname CDATA #REQUIRED
    title CDATA #REQUIRED
  >
]>

  <?xml-stylesheet href="../../mlarc.xsl" type="text/xsl"?>


<message 
  url="2003/11/b6fed40b3f32e144dbacec0e814d0d52"
  from="Brian Hurt &lt;bhurt@s...&gt;"
  author="Brian Hurt"
  date="2003-11-10T01:49:08"
  subject="Re: [Caml-list] Arbitrarily throwing End_of_file"
  prev="2003/11/a33bcd6e98a9ec945c42f2e24c4f7045"
  next="2003/11/a4fa114bc35fac9ad8b2bce95978fe86"
  prev-in-thread="2003/11/09e582c31663ff8716cce84f773350e2"
  prev-thread="2003/11/01e1b29e39f68e928829da73207a8c30"
  next-thread="2003/11/c950cf7764ddb8e256d4f95c9c48ff10"
  root="../../"
  period="month"
  listname="caml-list"
  title="Archives of the Caml mailing list">

<thread subject="[Caml-list] Arbitrarily throwing End_of_file">
<msg 
  url="2003/11/09e582c31663ff8716cce84f773350e2"
  from="Michael Hoisie &lt;mbh@O...&gt;"
  author="Michael Hoisie"
  date="2003-11-10T01:16:41"
  subject="[Caml-list] Arbitrarily throwing End_of_file">
<msg 
  url="2003/11/b6fed40b3f32e144dbacec0e814d0d52"
  from="Brian Hurt &lt;bhurt@s...&gt;"
  author="Brian Hurt"
  date="2003-11-10T01:49:08"
  subject="Re: [Caml-list] Arbitrarily throwing End_of_file">
</msg>
</msg>
</thread>

<contents>
On Sun, 9 Nov 2003, Michael Hoisie wrote:

&gt; I have a file which is approximately 278,440 lines of text (more
&gt; specifically, it is the result of doing 'ls -lAR /')

The size column that -l prints is in *bytes*, not lines.  Use
'wc -l longfile.dat' to count the lines.  If each line is ~10.6 bytes long
(including the newline), a 278,000 byte file will be about 26,000 lines
long.  The -A means "almost all" (everything except . and ..), and the -R
means recursive (list subdirectories as well).

&gt; 
&gt; I was trying to write this relatively simple program to analyze it but
&gt; it seems that End_of_file was thrown very early.
&gt; 
&gt; To test, it, I made a simple function:  
&gt; 
&gt; let rec count_lines file n =
&gt;     try let str = input_line file in
&gt;     count_lines file (n + 1)
&gt;     with End_of_file -&gt; Printf.printf "The file is %d\n lines long" n

This function isn't tail recursive: its call to itself sits inside a
try/with block, and the exception handler has to stay installed until that
call returns, so every line read adds another stack frame.  That isn't the
problem you're hitting, but you're not far from hitting it.  I generally
run out of stack at about 30,000 calls deep or so.  Try the following
instead:

let rec count_lines file n =
    let line, eof = try (input_line file), false
                   with End_of_file -&gt; "", true
    in
    if not eof then
        begin
            (* do something with line here *)
            count_lines file (n + 1)
        end
    else
        n

(* Wrapped in "let () =" so this compiles in a source file without ";;" *)
let () =
    let file = open_in "longfile.dat" in
    Printf.printf "The file is %d lines long.\n" (count_lines file 0);
    close_in file

Note that the recursive call is now outside of the try/with block, so it
is a genuine tail call, and this function will work with any length file.
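
For anyone reading this on a newer compiler, here is a rough sketch of the
same loop, assuming OCaml 4.02 or later (which postdates this message) so
that match accepts an exception case.  The recursive call sits in an
ordinary match branch rather than under the handler, so it should still be
a tail call.  The _line binding is only a placeholder for whatever
per-line work you need.

let rec count_lines file n =
    match input_line file with
    | _line -&gt; count_lines file (n + 1)  (* tail call, outside the handler *)
    | exception End_of_file -&gt; n         (* end of input: return the count *)

It has the same type as the version above (in_channel -&gt; int -&gt; int), so
the open_in / Printf.printf lines can call it unchanged.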

Brian



</contents>

</message>

