__getdents_common Only Returns linux_dirent on 32-bit x86 #1328

@Z1pburg3r

Describe the bug

When emulating 32-bit x86 programs, calls to readdir and readdir_r fail and throw UC_ERR_READ_UNMAPPED. After some debugging with GDB, the issue appears to center around Qiling's implementation of getdents64, which readdir and readdir_r call under the hood. Details and a potential fix included below.

Sample Code

The test file:

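#include <dirent.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

int main()
{
    DIR *dirp = NULL;
    struct dirent *entry = NULL;
    struct dirent *result = NULL;

    dirp = opendir("/");
    entry = malloc(offsetof(struct dirent, d_name) + NAME_MAX + 1);

    if(dirp == NULL)
    {
        puts("Couldn't open root...");
        return 1;
    }

    /* readdir_r takes the stream, a caller-allocated entry, and an out pointer */
    readdir_r(dirp, entry, &result);
    return 0;
}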

Qiling emulation:

import qiling

exe = '/path/to/test'.split()
rootfs = '/'
ql = qiling.Qiling(exe, rootfs)
ql.run()

Expected behavior

I'd expect Qiling to encounter a call to readdir or readdir_r and execute it successfully. More specifically, I'd expect Qiling to intercept the getdents64 syscall within these two functions and return the appropriately formatted struct.

Screenshots
N/A

Additional context

As mentioned in #773, getdents uses linux_dirent, whereas getdents64 uses linux_dirent64. The main difference between the two structures is the type of their leading two members, d_ino and d_off. linux_dirent declares both as unsigned long, which is stored as a 4-byte value on 32-bit x86. linux_dirent64 declares d_ino and d_off as ino64_t and off64_t (respectively), which are stored as 8-byte values even on 32-bit x86. I believe the bug I'm encountering lies in Qiling's general implementation for getdents, __getdents_common, which attempts to handle calls to both getdents and getdents64, with its behavior depending on the is_64 parameter:

def __getdents_common(ql: Qiling, fd: int, dirp: int, count: int, *, is_64: bool):
    # TODO: not sure what is the meaning of d_off, should not be 0x0
    # but works for the example code from linux manual.
    #
    # https://stackoverflow.com/questions/16714265/meaning-of-field-d-off-in-last-struct-dirent

    def _type_mapping(ent):
        methods_constants_d = {
            'is_fifo'         : 0x1,
            'is_char_device'  : 0x2,
            'is_dir'          : 0x4,
            'is_block_device' : 0x6,
            'is_file'         : 0x8,
            'is_symlink'      : 0xa,
            'is_socket'       : 0xc
        }

        ent_p = pathlib.Path(ent.path) if isinstance(ent, os.DirEntry) else ent

        for method, constant in methods_constants_d.items():
            if getattr(ent_p, method)():
                t = constant
                break
        else:
            t = 0x0 # DT_UNKNOWN

        return bytes([t])

    if ql.os.fd[fd].tell() == 0:
        n = ql.arch.pointersize
        total_size = 0
        results = os.scandir(ql.os.fd[fd].name)
        _ent_count = 0

        for result in itertools.chain((pathlib.Path('.'), pathlib.Path('..')), results): # chain special directories with the results
            d_ino = result.inode() if isinstance(result, os.DirEntry) else result.stat().st_ino
            d_off = 0
            d_name = (result.name if isinstance(result, os.DirEntry) else result._str).encode() + b'\x00'
            d_type = _type_mapping(result)
            d_reclen = n + n + 2 + len(d_name) + 1

            # TODO: Dirty fix for X8664 MACOS 11.6 APFS
            # For some reason MACOS return int value is 64bit
            try:
                packed_d_ino = (ql.pack(d_ino), n)
            except:
                packed_d_ino = (ql.pack64(d_ino), n)

            if is_64:
                fields = (
                    (ql.pack(d_ino), n),
                    (ql.pack(d_off), n),
                    (ql.pack16(d_reclen), 2),
                    (d_type, 1),
                    (d_name, len(d_name))
                )
            else:
                fields = (
                    packed_d_ino,
                    (ql.pack(d_off), n),
                    (ql.pack16(d_reclen), 2),
                    (d_name, len(d_name)),
                    (d_type, 1)
                )

            p = dirp
            for fval, flen in fields:
                ql.mem.write(p, fval)
                p += flen

            ql.log.debug(f"Write dir entries: {ql.mem.read(dirp, d_reclen)}")

            dirp += d_reclen
            total_size += d_reclen
            _ent_count += 1

        regreturn = total_size
        ql.os.fd[fd].seek(0, os.SEEK_END) # mark as end of file for dir_fd
    else:
        _ent_count = 0
        regreturn = 0

    ql.log.debug("%s(%d, /* %d entries */, 0x%x) = %d" % ("getdents64" if is_64 else "getdents", fd, _ent_count, count, regreturn))

    return regreturn
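
As a quick sanity check of the two layouts compared before the listing, here is a minimal sketch that computes where d_reclen and d_name land in each header. It uses only Python's struct module. The format strings and the describe helper are my own illustration, not part of Qiling, and assume little-endian 32-bit x86 where unsigned long is 4 bytes.

```python
import struct

# Header formats, little-endian with no implicit padding ('<').
# Assumption: 32-bit x86, where unsigned long is 4 bytes.
LINUX_DIRENT_32 = '<IIH'   # d_ino, d_off, d_reclen (d_name follows)
LINUX_DIRENT64 = '<QQHB'   # d_ino, d_off, d_reclen, d_type (d_name follows)

def describe(fmt: str) -> dict:
    """Return the offsets of d_reclen and d_name for a header format."""
    width = struct.calcsize(fmt[:2])  # size of d_ino (same as d_off)
    return {
        'd_reclen_offset': 2 * width,
        'd_name_offset': struct.calcsize(fmt),
    }

print(describe(LINUX_DIRENT_32))
print(describe(LINUX_DIRENT64))
```

A consumer of linux_dirent64 therefore looks for d_reclen at offset 16, while a buffer packed with 4-byte leading fields puts it at offset 8.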

The culprit seems to be this part:

if is_64:
    fields = (
        (ql.pack(d_ino), n),
        (ql.pack(d_off), n),
        (ql.pack16(d_reclen), 2),
        (d_type, 1),
        (d_name, len(d_name))
    )

Here n = ql.arch.pointersize, which is 4 on 32-bit x86, but linux_dirent64 requires d_ino and d_off to be packed as 8-byte values regardless of pointer size. As a result, when readdir or readdir_r process the Qiling-packed structs after the getdents64 call, they read d_reclen from the wrong offset, subtract a constant from it (0x13, the offset of d_name in linux_dirent64), and pass the result as the nbytes argument to memmove. The value read in place of d_reclen can be less than 0x13, in which case the unsigned subtraction wraps around to a value a few bytes below UINT_MAX. memmove then runs past a VMA boundary, and the resulting fault surfaces as UC_ERR_READ_UNMAPPED.
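
To make the wraparound concrete, here is a rough sketch that packs two entries the way the is_64 branch does (with a configurable field width) and then reads d_reclen where a linux_dirent64 consumer would look for it. pack_entry and memmove_length are hypothetical helpers written for this issue, the inode value 2 is arbitrary, and modelling the libc subtraction as 32-bit unsigned arithmetic is my assumption based on the behavior seen in GDB.

```python
import struct

D_NAME_OFFSET_64 = 0x13  # offsetof(struct linux_dirent64, d_name)

def pack_entry(d_ino: int, name: bytes, width: int) -> bytes:
    """Pack one entry like the is_64 branch, with `width`-byte d_ino/d_off."""
    fmt = '<IIHB' if width == 4 else '<QQHB'
    d_name = name + b'\x00'
    d_reclen = width + width + 2 + len(d_name) + 1  # same formula as the source
    return struct.pack(fmt, d_ino, 0, d_reclen, 0x4) + d_name

def memmove_length(buf: bytes) -> int:
    """Length a consumer would derive: d_reclen at offset 16, minus 0x13."""
    (d_reclen,) = struct.unpack_from('<H', buf, 16)
    return (d_reclen - D_NAME_OFFSET_64) & 0xFFFFFFFF  # 32-bit unsigned wrap

# Current behavior on 32-bit x86: 4-byte leading fields.
bad = pack_entry(2, b'.', 4) + pack_entry(2, b'..', 4)
# Expected layout: 8-byte leading fields.
good = pack_entry(2, b'.', 8) + pack_entry(2, b'..', 8)

print(hex(memmove_length(bad)))
print(hex(memmove_length(good)))
```

With 4-byte fields, offset 16 falls inside the second entry, so the value read there depends on the inode numbers; with the values chosen here it is 0, which wraps to just below UINT_MAX.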

Potential Fix

If n == 8 and the first two members are packed as 8-byte values, then the emulation completes without issue. As a potential fix, I was thinking of implementing something like the following:

...

if ql.os.fd[fd].tell() == 0:
    if is_64 and ql.arch.bits == 32:
        n = ql.arch.pointersize * 2
    else:
        n = ql.arch.pointersize
    total_size = 0
    results = os.scandir(ql.os.fd[fd].name)
    _ent_count = 0

...

if is_64:
    fields = (
        (ql.pack64(d_ino), n),
        (ql.pack64(d_off), n),
        (ql.pack16(d_reclen), 2),
        (d_type, 1),
        (d_name, len(d_name))
    )

...
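
One way to check a change like this without running a full binary is to walk the written buffer as a linux_dirent64 consumer would. The sketch below is a standalone helper, not Qiling code: iter_dirent64 and make_record are hypothetical names, and it assumes little-endian records with no alignment padding, matching what __getdents_common currently writes.

```python
import struct

HEADER = struct.Struct('<QQHB')  # d_ino, d_off, d_reclen, d_type

def iter_dirent64(buf: bytes):
    """Yield (d_ino, d_off, d_type, name) for each linux_dirent64 record."""
    pos = 0
    while pos + HEADER.size <= len(buf):
        d_ino, d_off, d_reclen, d_type = HEADER.unpack_from(buf, pos)
        # A record can never be shorter than its header or overrun the buffer.
        if d_reclen < HEADER.size or pos + d_reclen > len(buf):
            raise ValueError(f"bad d_reclen {d_reclen} at offset {pos}")
        raw_name = buf[pos + HEADER.size:pos + d_reclen]
        yield d_ino, d_off, d_type, raw_name.split(b'\x00', 1)[0]
        pos += d_reclen

def make_record(d_ino: int, d_type: int, name: bytes) -> bytes:
    """Build a sample record with 8-byte d_ino/d_off (d_off left at 0)."""
    d_name = name + b'\x00'
    return HEADER.pack(d_ino, 0, HEADER.size + len(d_name), d_type) + d_name

sample = make_record(2, 0x4, b'.') + make_record(2, 0x4, b'..')
entries = list(iter_dirent64(sample))
print(entries)
```

In a debugging session, the bytes returned by ql.mem.read(dirp, total_size) could be passed through bytes() and fed to iter_dirent64; a buffer packed with 4-byte fields should fail the d_reclen check rather than parse cleanly.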

I'd like to know what you guys think.
