New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cannot close properly a file with memory mapped blocks (Bigarray.map_file) #6567
Comments
Comment author: Christoph Bauer My current solution is: I defined a flush function: external flush : ('a, 'b, 'c) Bigarray.Array1.t -> unit = "ml_flush_bigarray" It is implemented like this: #ifdef WIN32 static void ml_flush_mapped_bigarray(void * addr, uintnat len) GetSystemInfo(&sysinfo); FlushViewOfFile((void )((uintnat)addr - delta), #else static void ml_flush_mapped_bigarray(void * addr, uintnat len) #if defined(_POSIX_SYNCHRONIZED_IO) #endif CAMLprim value ml_flush_bigarray( value v ) if( b->flags & CAML_BA_MANAGED_MASK ) { CAMLreturn( Val_unit ); |
Comment author: Christoph Bauer If needed I could prepare a patch against the OCaml sources. |
Comment author: @damiendoligez Christoph, what does your |
Comment author: Christoph Bauer I my case I must do both:flush the file and close the file handle with Unix.close. Then everything works. You can rename the file it contains the content you expected. Just closing the file is not enough. |
Comment author: gerd This issue isn't restricted to Windows. On Unix, you need the msync() call (as in the patch), which in particular writes NFS-mounted files out. Actually, POSIX doesn't specify that modifications of mapped regions are immediately visible in file operations at all (not even for local files, although current OS guarantee this). However, I suggest to use the MSYNC_SYNC flag, because the programmer normally wants a guarantee that the synchronization is complete at some point. Of course, there is no locking problem on Unix. It's just a synch problem, i.e. file reads could see old data. |
Comment author: Christoph Bauer I agree with the MSYNC_SYNC flag. I edited the patch above. |
Comment author: @alainfrisch The Windows version is not synchronous, isn't it? MSDN says one should call FlushFileBuffers to get something closer to MS_SYNC (for local files at least), but we need to know the file handle. Can you also comment on the locking part for network files? Is FlushViewOfFile followed by close enough / required so that the server releases the lock on the file? Do you have some references to document this behavior? It's somehow surprising that closing is not enough... |
Comment author: Christoph Bauer The patch solved our problems (file locks or files full of zeros) reliable. The problem, which was mentioned in the MSDN documentation is, that the file is not written to disk on the remote system, but the data could be still is the cache. Also for Linux you cannot sync reliable over the network AFAIK. I would say: forget it. Unfortunately I have longer access to the windows setup, where I could reproduce the problem. |
Comment author: @alainfrisch So the function would be synchronous on Unix, waiting for the data to on disk; while on Windows, it would be asynchronous, with no guarantee of completion (even for local files?). If we go this direction, this should be properly documented. In order to move forward on this one, can you prepare a pull request, probably adding the flush function everywhere where there is a map_file function in Bigarray? |
Comment author: @xavierleroy Two remarks and one question on the patch:
|
Comment author: Christoph Bauer I made a patch against the sources of 4.02.3 (bigarray_flush2.patch). The new flush function gets now a second [sync] parameter. It raises an exception on MS Windows when sync is true. The async (sync == false) call should work on all systems. Also the three remarks from 0015089 are implemented (correct flag, rounding on unix, Sys_error). Thanks for the hints! Patch can be applied with: Unfortunately I cannot test this patch against in the original setup, because I have no longer access to it. |
Comment author: @gasche Xavier or Alain, do you think we could take a decision on this patch? Given that how late we are in the 4.03 release cycle, I think it's too late to integrate this in the release branch. I would, however, be happy to merge it in trunk if you have no qualms against it -- it's a more satisfying solution than just postponing. |
Comment author: @damiendoligez I'm in favour of merging into trunk. |
Comment author: @damiendoligez What has changed for this PR now that #1077 ( #1077 ) is merged? |
Comment author: @xavierleroy
|
Comment author: @xavierleroy See pull request at #1432 |
This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc. |
The explicit "flush" operation was implemented in PR #1432, then got such a barrage of questions from reviewers (hi @alainfrisch) that I rage-closed it. Anyone who cares about this issue is welcome to start again from where I left. |
Original bug ID: 6567
Reporter: Christoph Bauer
Assigned to: @xavierleroy
Status: assigned (set by @xavierleroy on 2017-10-19T13:27:11Z)
Resolution: open
Priority: normal
Severity: minor
Platform: Windows MSVC
OS: Windows
OS Version: 7
Version: 4.01.0
Target version: 4.07.0+dev/beta2/rc1/rc2
Category: otherlibs
Monitored by: @hcarty gerd
Bug description
My program maps several blocks of a file into memory with Bigarray.Array1.map_file for writing.
Finally the file handle is closed with Unix.close.
On local files system everything works as expected. But if the file is on a network drive,
the resulting file is after the close still locked and it contains just zeros.
My workaround is, to call Gc.full_major() before the close. Then all remaining bigarrays are
finalized (the finalizer calls caml_ba_unmap_file(), see bigarray_stubs.c).
Calling Gc.full_major () is somewhat dissatisfying. But I’m not sure how it could be solved more cleanly.
I would expect that Unix.close should unmap all remaining views of the file. But Unix.close knows nothing
about the bigarray mapping.
Maybe a special Bigarray.unmap is the solution.
File attachments
The text was updated successfully, but these errors were encountered: