Interfaces for writing to files using mmap#237

Draft
pranav1344 wants to merge 4 commits into MPLLang:main from pranav1344:file

pranav1344 commented Feb 7, 2026

Add functionality to file.sig and file.sml that allows writes through mmap and munmap, replacing PosixWriteFile.

(WIP and open to review)

pranav1344 (Author)

I've added writing to files through a record-based interface, and added an example of parallel writes to examples/lib/WriteFile.sml.

A rough test of this write functionality using this code, generating 1 GB arrays that contain a single repeated letter, gives the following results:

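| Processes | Filename | User time (s) | System time (s) | CPU % | Elapsed time (s) |
|-----------|----------|---------------|-----------------|-------|------------------|
| 1         | 1.txt    | 2.02          | 0.10            | 91%   | 2.306            |
| 2         | 2.txt    | 2.10          | 0.08            | 196%  | 1.112            |
| 4         | 4.txt    | 2.24          | 0.12            | 383%  | 0.615            |
| 8         | 8.txt    | 2.77          | 0.25            | 711%  | 0.424            |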
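
To make the scaling in these numbers easier to read, here is a small sketch that derives speedup and parallel efficiency from the elapsed times reported above. The names `runs`, `baseline`, and `scaling` are illustrative only and are not part of the PR.

```sml
(* Elapsed times copied from the results above: (processes, seconds). *)
val runs = [(1, 2.306), (2, 1.112), (4, 0.615), (8, 0.424)]

(* The single-process run is used as the baseline. *)
val baseline = 2.306

(* speedup = T1 / Tp; efficiency = speedup / p *)
fun scaling (p, t) =
  let
    val s = baseline / t
  in
    {procs = p, speedup = s, efficiency = s / real p}
  end

val results = List.map scaling runs

(* Print one line per run, rounded to two decimal places. *)
val () =
  List.app
    (fn {procs, speedup, efficiency} =>
       print (Int.toString procs ^ " procs: speedup "
              ^ Real.fmt (StringCvt.FIX (SOME 2)) speedup
              ^ ", efficiency "
              ^ Real.fmt (StringCvt.FIX (SOME 2)) efficiency ^ "\n"))
    results
```

By this arithmetic the 8-process run is roughly 5.4 times faster than the single-process run, so the efficiency per process drops as the process count grows.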

…mmap and virtual memory changes to support the interfaces.
pranav1344 (Author)

I'm not sure whether this part is too convoluted, but I wanted it to have parity with the read functionality, which supports reading from a given offset.

val writeChar : {file: t, file_offset: int, array_slice_offset: int} -> char -> unit
val writeWord8s : {file: t, file_offset: int, array_slice_offset: int} -> Word8.word ArraySlice.slice -> unit 
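
Offset-based writes are also what would let several writers share one mapped file, provided each writer is given its own byte range. The sketch below shows one way to compute such ranges in plain Standard ML. `chunkRanges` is a hypothetical helper, not something defined in this PR, and the sketch makes no claim about how examples/lib/WriteFile.sml actually partitions its work.

```sml
(* Hypothetical helper: split [0, total) into p contiguous, disjoint ranges.
   Returns (offset, length) pairs; the first (total mod p) ranges get one
   extra byte so the lengths differ by at most one. *)
fun chunkRanges (total: int) (p: int) : (int * int) list =
  if p <= 0 orelse total < 0 then raise Domain
  else
    let
      val base = total div p
      val extra = total mod p
      fun go (i, off) =
        if i >= p then []
        else
          let
            val len = base + (if i < extra then 1 else 0)
          in
            (off, len) :: go (i + 1, off + len)
          end
    in
      go (0, 0)
    end

(* Example: 10 bytes split across 4 writers. *)
val example = chunkRanges 10 4
```

Each pair could then be handed to a separate task that writes only within its own range, so no two tasks touch the same bytes.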

@pranav1344 pranav1344 marked this pull request as ready for review February 12, 2026 12:54
@pranav1344 pranav1344 marked this pull request as draft February 12, 2026 12:54
val readChars: t -> int -> char ArraySlice.slice -> unit
val readWord8s: t -> int -> Word8.word ArraySlice.slice -> unit
val writeChar : {file: t, file_offset: int, array_slice_offset: int} -> char -> unit
val writeWord8s : {file: t, file_offset: int, array_slice_offset: int} -> Word8.word ArraySlice.slice -> unit

Collaborator

Here I find the two offsets confusing; I think it would be easier to just pass a single offset.

My understanding is that openFileWriteable s n gives us a file of n bytes. So, then, writeChar should be able to pass a single offset, somewhere in the range [0,n).

Author

The two offsets were meant to handle, first, the index in the file at which writing starts and, second, the index in the array slice from which data is taken. That was a leaky abstraction and probably user-hostile, so I've changed the interface to take just file_offset, the position where writing into the file starts, and the array slice.

The new interfaces are:

  val writeChar : {file: t, file_offset: int} -> char -> unit
  val writeWord8s : {file: t, file_offset: int} -> Word8.word ArraySlice.slice -> unit
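
To illustrate the single-offset semantics discussed in this thread, here is a minimal sketch that models the file as an in-memory byte array. This is an assumption made purely for illustration: the PR's `t` is backed by mmap, and nothing here reflects its actual implementation. Only the two signatures are taken from the comment above.

```sml
(* Stand-in for the PR's file type: a fixed-size byte array of n bytes.
   The real type is mmap-backed; this model only mirrors the indexing. *)
type t = Word8Array.array

(* Write one character at file_offset, which must lie in [0, n).
   Word8Array.update raises Subscript when the offset is out of range. *)
fun writeChar {file: t, file_offset: int} (c: char) : unit =
  Word8Array.update (file, file_offset, Byte.charToByte c)

(* Copy the whole slice into the file starting at file_offset.
   The write is rejected up front if it would leave [0, n). *)
fun writeWord8s {file: t, file_offset: int}
                (src: Word8.word ArraySlice.slice) : unit =
  let
    val n = Word8Array.length file
    val len = ArraySlice.length src
  in
    if file_offset < 0 orelse file_offset + len > n then
      raise Subscript
    else
      (* appi passes indices relative to the start of the slice. *)
      ArraySlice.appi
        (fn (i, b) => Word8Array.update (file, file_offset + i, b))
        src
  end
```

A caller who wants to write only part of an array passes a narrower slice, so the second offset from the earlier interface is no longer needed.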

fun openFileWriteable path final_size =
  let
    open Posix.FileSys
    val file = createf (path, O_RDWR, O.append, S.flags [S.irusr, S.iwusr, S.irgrp, S.iroth])

Collaborator

I'm curious, what happens here if the file already exists? Does createf behave like openf in that case?

pranav1344 (Author), Feb 17, 2026

Yes. If the file already exists, createf behaves like openf and opens the file in append mode. The user can then start writing from where the file originally ended.
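
The behaviour described here can be checked with a small sketch built only on the Basis Library's Posix.FileSys, reusing the createf call quoted in this thread. `openForAppend` is a hypothetical name and is not part of the PR. The sketch assumes a POSIX system and a compiler that provides the Posix structure.

```sml
(* Hypothetical helper: open (or create) the file, then report how many
   bytes it already holds. A freshly created file reports 0. *)
fun openForAppend (path: string) : Posix.FileSys.file_desc * int =
  let
    open Posix.FileSys
    (* createf creates the file if it is missing; otherwise it opens the
       existing file, as openf would. *)
    val fd = createf (path, O_RDWR, O.append,
                      S.flags [S.irusr, S.iwusr, S.irgrp, S.iroth])
    val existing = Position.toInt (ST.size (fstat fd))
  in
    (fd, existing)
  end
```

The returned size is the natural starting offset for new data when the file already has content. The caller is responsible for closing the descriptor, for example with Posix.IO.close.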

    open Posix.FileSys
    val file = createf (path, O_RDWR, O.append, S.flags [S.irusr, S.iwusr, S.irgrp, S.iroth])
    val fileSize = Position.toInt (ST.size (fstat file))
    val size = final_size + fileSize

Collaborator

Shouldn't this just be final_size?

Author

The idea was to take the size of the data the user expects to write as final_size and append that much space to the file. The code then returns the file, mapped into memory at its increased size, along with the "offset" from which the user can start writing. I've renamed this variable to buffer_size to make clearer what the code is trying to do.
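
The size arithmetic described in this reply can be written out as a tiny sketch. `appendLayout` and its field names are hypothetical and chosen only for illustration. `buffer_size` follows the renaming mentioned above, and `existing_size` plays the role of fileSize in the quoted code.

```sml
(* Hypothetical helper: given the bytes already in the file and the bytes
   the caller intends to append, compute the total size to map and the
   offset at which new writes should begin. *)
fun appendLayout {existing_size: int, buffer_size: int} =
  { mapped_size = existing_size + buffer_size
  , start_offset = existing_size
  }

(* Example: a 100-byte file that will receive 28 more bytes. *)
val layout = appendLayout {existing_size = 100, buffer_size = 28}
```

For a file that does not exist yet, existing_size is 0, so the mapped size equals buffer_size and writing starts at offset 0.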
