Monday 9 August 2021

How can I use os.posix_fadvise to prevent file caching on Linux?

I have a script that generally operates on entire block devices. If every block it reads is cached, the cache evicts data that other applications are using. To prevent this, I added support for using mmap(2) together with posix_fadvise(2), with the following logic:

Function for indicating that blocks are no longer needed:

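import mmap
import os

# Assumption: the original snippet does not show how PAGE_SIZE is defined;
# the system page size reported by the mmap module is used here.
PAGE_SIZE = mmap.PAGESIZE


def advise_dont_need(fd, offset, length):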
    """
    Announce that data in a particular location is no longer needed.

    Arguments:
    - fd (int): File descriptor.
    - offset (int): Beginning of the unneeded data.
    - length (int): Length of the unneeded data.
    """
    # TODO: macOS support
    if hasattr(os, "posix_fadvise"):
        # posix_fadvise(2) states that "If the application requires that data
        # be considered for discarding, then offset and len must be
        # page-aligned." When this code aligns the offset and length, the
        # advised area is widened under the presumption it is better to discard
        # more memory than needed than to leak it which could cause resource
        # issues.

        # If the offset is unaligned, extend it toward 0 to align it and adjust
        # the length to compensate for the change.
        aligned_offset = offset - offset % PAGE_SIZE
        length += offset - aligned_offset
        offset = aligned_offset

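        # If the length is unaligned, widen it to align it. Python's % takes
        # the sign of the divisor, so length % -PAGE_SIZE lies in
        # (-PAGE_SIZE, 0] and subtracting it rounds length up to the next
        # multiple of PAGE_SIZE.
        length -= length % -PAGE_SIZE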

        os.posix_fadvise(fd, offset, length, os.POSIX_FADV_DONTNEED)
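
The alignment arithmetic in the function above can be checked on its own, without touching a file descriptor. The sketch below pulls it into a hypothetical helper, align_range, which is not part of the original script, and assumes a 4096-byte page as the default.

```python
def align_range(offset, length, page_size=4096):
    """Widen (offset, length) so both ends fall on page boundaries.

    Hypothetical helper mirroring the arithmetic in advise_dont_need;
    the 4096-byte default page size is an assumption.
    """
    # Move the start down to the previous page boundary and grow the
    # length by the same amount so the end of the range stays put.
    aligned_offset = offset - offset % page_size
    length += offset - aligned_offset

    # Round the length up to the next multiple of page_size. With a
    # negative divisor, Python's % yields a value in (-page_size, 0].
    length -= length % -page_size
    return aligned_offset, length


if __name__ == "__main__":
    for args in [(0, 1048576), (5000, 100), (4095, 2), (8192, 0)]:
        print(args, "->", align_range(*args))
```

A length of 0 stays 0 after alignment, which matters because posix_fadvise(2) treats a length of 0 as "until the end of the file".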

Logic that reads the file:

            with open(path, "rb", buffering=0) as file, \
              ProgressBar("Reading file") as progress, timer() as read_loop:
                size = file_size(file)

                if mmap_file:
                    # At the time of this writing, mmap.mmap in CPython uses
                    # st_size to determine a file's size. That does not work
                    # for every file type (block devices, for example), so
                    # size autodetection (passing a length of 0) cannot be
                    # used here.
                    fd = file.fileno()
                    view = mmap.mmap(fd, size, prot=mmap.PROT_READ)

                try:
                    while writer.error is None and hash_queue.error is None:
                        # Skip offsets that are already in the block map.
                        if offset in blocks:
                            while offset in blocks:
                                if mmap_file:
                                    advise_dont_need(fd, offset, block_size)

                                offset += block_size

                            if not mmap_file:
                                file.seek(offset)

                        if mmap_file:
                            block = view[offset:offset + block_size]
                            advise_dont_need(fd, offset, len(block))
                        else:
                            block = file.read(block_size)

                        if not block:
                            break

                        bytes_read += len(block)

                        while hash_queue.error is None:
                            try:
                                hash_queue.put((offset, block), timeout=0.1)
                                offset += len(block)
                                progress.update(offset / size)
                                break
                            except queue.Full:
                                pass
                finally:
                    if mmap_file:
                        view.close()
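
For experimenting outside the full script, the mmap path of the loop above can be reduced to a self-contained sketch. The function name read_with_advice is hypothetical, os.fstat stands in for the file_size helper (which is not shown and would be needed for block devices), and the writer, hash queue, block map and progress bar are left out. It is meant only as a smaller test case for watching cache behaviour, not as a fix.

```python
import mmap
import os
import tempfile


def read_with_advice(path, block_size=1 << 20):
    """Read path block by block through a read-only mmap.

    Hypothetical reduction of the loop above; returns the number of
    bytes read. After each block is copied out of the mapping, the
    kernel is advised that the range is no longer needed.
    """
    bytes_read = 0
    with open(path, "rb", buffering=0) as file:
        fd = file.fileno()
        # Assumption: a regular file, so st_size is usable. The original
        # script uses its own file_size() helper instead.
        size = os.fstat(fd).st_size
        if size == 0:
            return 0  # mmap cannot map an empty file

        view = mmap.mmap(fd, size, prot=mmap.PROT_READ)
        try:
            offset = 0
            while offset < size:
                block = view[offset:offset + block_size]
                if hasattr(os, "posix_fadvise"):
                    os.posix_fadvise(
                        fd, offset, len(block), os.POSIX_FADV_DONTNEED
                    )
                bytes_read += len(block)
                offset += len(block)
        finally:
            view.close()
    return bytes_read


if __name__ == "__main__":
    # Demonstration on a throwaway file rather than a block device.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(3 * (1 << 20) + 123))
    try:
        print(read_with_advice(tmp.name))
    finally:
        os.unlink(tmp.name)
```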

When I run the script and monitor the output of free -h, buffer cache usage still increases despite this logic. Is my logic incorrect, or is this because posix_fadvise(2) is just that: advice rather than a mandate?
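
To track the same numbers from inside a script rather than by watching free -h, one option is to sample /proc/meminfo, which is where free gets its figures on Linux. The sketch below is only an illustration; parse_meminfo and cache_snapshot are hypothetical names, and the values are reported in kB as they appear in the file.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_snapshot(path="/proc/meminfo"):
    """Return the Buffers and Cached values (in kB) from /proc/meminfo."""
    with open(path) as meminfo:
        info = parse_meminfo(meminfo.read())
    return {key: info.get(key) for key in ("Buffers", "Cached")}
```

Calling cache_snapshot() before and after a run, and comparing the two dictionaries, gives a rough view of how much the buffer and page cache figures moved. Other processes also affect these counters, so the difference is only indicative.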

Here are some logs showing the values of the length and offset toward the end of the script's execution with block_size set to 1048576:

offset=107296587776; length=1048576
offset=107297636352; length=1048576
offset=107298684928; length=1048576
offset=107299733504; length=1048576
offset=107300782080; length=1048576
offset=107301830656; length=1048576
offset=107302879232; length=1048576
offset=107303927808; length=1048576
offset=107304976384; length=0
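
As a sanity check on the log above, the sketch below parses the last three entries and confirms that every offset and length is a multiple of the page size, assuming 4096-byte pages. It also picks out the final call, whose length is 0; per posix_fadvise(2), a length of 0 means the advice applies until the end of the file. The names parse_log and ASSUMED_PAGE_SIZE are illustrative.

```python
import re

# The last three entries from the log above.
LOG = """\
offset=107302879232; length=1048576
offset=107303927808; length=1048576
offset=107304976384; length=0
"""

ASSUMED_PAGE_SIZE = 4096  # assumption; the real value comes from the system


def parse_log(text):
    """Return a list of (offset, length) tuples found in the log text."""
    pattern = re.compile(r"offset=(\d+); length=(\d+)")
    return [(int(m.group(1)), int(m.group(2))) for m in pattern.finditer(text)]


calls = parse_log(LOG)

# Entries whose offset or length is not page-aligned.
unaligned = [
    (offset, length)
    for offset, length in calls
    if offset % ASSUMED_PAGE_SIZE or length % ASSUMED_PAGE_SIZE
]

# Entries with a zero length, which posix_fadvise(2) reads as "to end of file".
zero_length = [(offset, length) for offset, length in calls if length == 0]

if __name__ == "__main__":
    print("unaligned:", unaligned)
    print("zero length:", zero_length)
```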

