Add cross-assembler and linker for FLUX bytecode #23
SuperInstance wants to merge 1 commit into main from
Conversation
- Full cross-assembler with 100+ opcodes, label resolution (`@Label` and `name:`), two-pass assembly
- Macro preprocessor: `#define`, `#ifdef`/`#ifndef`/`#else`/`#endif`, `.set`, `.include`
- Multiple output formats: binary, hex, JSON, Intel HEX, Python list
- Linker with object file serialization, symbol resolution, relocation table
- BinaryPatcher for post-assembly binary patching
- ELF-like header generation
- Branch aliases: BEQ/JE, BNE/JNE, BLT/JL, BGE/JGE, BGT/JG, BLE/JLE
- Arithmetic aliases: ADD, SUB, MUL, DIV, MOD, NEG, NOT, AND, OR, XOR, SHL, SHR
- `#` comment support (non-preprocessor lines)
- Data directives: `.byte`, `.word`, `.dword`, `.ascii`, `.asciz`, `.fill`, `.align`, `.org`
- SIMD vector ops, A2A protocol ops, trust/capability ops, float ops
- 96 tests covering all features
```python
# Build record: :LLAAAATT[DD...]CC
# LL = byte count, AAAA = address, TT = record type (00=data)
checksum = chunk_size + (addr >> 8) & 0xFF + addr & 0xFF + 0x00
```
🔴 Intel HEX checksum calculation incorrect due to operator precedence
The checksum calculation at src/flux/asm/cross_assembler.py:582 uses `chunk_size + (addr >> 8) & 0xFF + addr & 0xFF + 0x00`. In Python, `+` has higher precedence than `&`, so this evaluates as `((chunk_size + (addr >> 8)) & (0xFF + addr)) & (0xFF + 0x00)` instead of the intended `chunk_size + ((addr >> 8) & 0xFF) + (addr & 0xFF)`. Verified: for addr=16, chunk_size=2, the buggy expression yields 2 while the correct value is 18. This produces corrupt Intel HEX output for any data larger than 16 bytes.
Suggested change:

```diff
- checksum = chunk_size + (addr >> 8) & 0xFF + addr & 0xFF + 0x00
+ checksum = chunk_size + ((addr >> 8) & 0xFF) + (addr & 0xFF) + 0x00
```
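The precedence difference is easy to reproduce in isolation (a standalone sketch using the values from the example above, not code from the PR):

```python
# '+' binds more tightly than '&' in Python, so the unparenthesized form
# collapses to ((chunk_size + (addr >> 8)) & (0xFF + addr)) & (0xFF + 0x00).
addr, chunk_size = 16, 2
buggy = chunk_size + (addr >> 8) & 0xFF + addr & 0xFF + 0x00
fixed = chunk_size + ((addr >> 8) & 0xFF) + (addr & 0xFF) + 0x00
print(buggy, fixed)  # 2 18
```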
```python
                 shdr_offset: int, timestamp: float) -> bytes:
    """Build the ELF64-like file header."""
    ident = bytearray(16)
    ident[0:4] = FLUX_MAGIC
```
🔴 FLUX_MAGIC is 5 bytes but assigned to 4-byte slice, causing 65-byte header
FLUX_MAGIC = b"\x7fFLUX" is 5 bytes, but ident[0:4] = FLUX_MAGIC at src/flux/asm/elf_header.py:243 assigns it into a 4-byte slice. Python's bytearray slice assignment resizes the array, making ident 17 bytes instead of 16. This cascades: header[0:16] = ident makes header 65 bytes instead of 64. The rest of generate() assumes a 64-byte header for offset calculations (phdr_offset = header_size = 64), so all program headers and section data are misaligned by 1 byte in the output binary.
Suggested change:

```diff
- ident[0:4] = FLUX_MAGIC
+ ident[0:5] = FLUX_MAGIC
```
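The resizing behavior can be demonstrated in isolation (only `FLUX_MAGIC` is taken from the review; the rest is a sketch):

```python
FLUX_MAGIC = b"\x7fFLUX"  # 5 bytes

ident = bytearray(16)
ident[0:4] = FLUX_MAGIC   # slice shorter than the value: the bytearray grows
print(len(ident))         # 17

ident = bytearray(16)
ident[0:5] = FLUX_MAGIC   # slice length matches the value: size preserved
print(len(ident))         # 16
```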
```python
elif line.startswith(".set"):
    self._handle_set(line, loc)
elif line.startswith(".include"):
    self._handle_include(line, loc, filename)
```
🔴 .include directive discards preprocessed content — included files have no effect
_handle_include (line 231) returns the preprocessed included file content as a str, but _handle_directive (line 128) calls it without capturing the return value: self._handle_include(line, loc, filename). Since _handle_directive returns None and the caller in preprocess() does continue after calling it, the included file's assembly lines are completely lost. Only side effects on self.macros persist — no code from the included file is emitted.
Prompt for agents
The _handle_include method returns the preprocessed included file content as a string, but _handle_directive discards the return value at line 128. The output_lines list in the preprocess() method never receives the included content.
To fix this, the architecture needs rethinking. One approach: instead of returning the content, _handle_include should directly append to output_lines. But output_lines is local to preprocess(). Options:
1. Make output_lines an instance variable so _handle_include's recursive preprocess() call can contribute to it.
2. Have _handle_directive return the included content, and have preprocess() check and append it to output_lines.
3. Change _handle_include to not recursively call preprocess() but instead inline the include content into the current preprocess()'s line list.
The simplest fix might be option 2: have _handle_directive return Optional[str], and in preprocess(), capture the return and split/extend output_lines when non-None.
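A minimal sketch of option 2 with stubbed bodies; the names (`preprocess`, `_handle_directive`, `_handle_include`, `output_lines`) follow the review, while everything else — including the stubbed include content — is assumed for illustration:

```python
from typing import Optional

class Preprocessor:
    """Sketch only: _handle_directive now returns Optional[str],
    and preprocess() splices any returned include content into output_lines."""

    def _handle_include(self, line: str) -> str:
        # Stub: a real implementation would read the named file and
        # recursively preprocess it, returning the resulting text.
        return "NOP\nNOP"

    def _handle_directive(self, line: str) -> Optional[str]:
        if line.startswith(".include"):
            return self._handle_include(line)  # propagate included content
        # Other directives (#define, .set, ...) only mutate state.
        return None

    def preprocess(self, source: str) -> list[str]:
        output_lines: list[str] = []
        for line in source.splitlines():
            stripped = line.strip()
            if stripped.startswith((".set ", ".include ", "#")):
                included = self._handle_directive(stripped)
                if included is not None:
                    output_lines.extend(included.splitlines())
                continue
            output_lines.append(line)
        return output_lines
```

With this shape, `.include "foo.asm"` followed by `HALT` yields the included lines before `HALT` instead of silently dropping them.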
```python
if stripped.startswith("#") or stripped.startswith(".set ") or stripped.startswith(".include "):
    self._handle_directive(stripped, loc, filename)
    continue
```
🔴 #define, #undef, .set, .include processed inside inactive #ifdef blocks
In preprocess() (lines 85-87), all preprocessor directives are dispatched to _handle_directive without checking self.is_active. The is_active guard at line 90 only protects non-directive lines. This means #define, #undef, .set, and .include inside inactive #ifdef blocks are still executed. Verified: #ifdef UNDEFINED\n#define LEAKED 42\n#endif results in LEAKED being defined even though the conditional block is inactive.
Prompt for agents
In _handle_directive (macros.py:111-129), the conditional directives (#ifdef, #ifndef, #else, #endif) must always be processed to maintain the conditional stack, but #define, #undef, .set, and .include should only be processed when self.is_active is True.
The fix should add a guard in _handle_directive after the conditional directives are handled. For example, after the #else elif block and before the #undef elif block, add:

```python
elif not self.is_active:
    return  # Skip non-conditional directives in inactive blocks
```

This ensures the conditional stack is always maintained correctly while preventing side effects from #define/#undef/.set/.include inside inactive blocks.
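A condensed sketch of the guard in context, assuming a conditional stack where `is_active` means every enclosing conditional is true; names follow the review, but the real class handles many more directives:

```python
class MacroProcessor:
    """Sketch: conditional directives always maintain the stack;
    everything else is skipped while inside an inactive block."""

    def __init__(self):
        self.macros = {}
        self.cond_stack = []  # one bool per open #ifdef/#ifndef

    @property
    def is_active(self):
        return all(self.cond_stack)

    def _handle_directive(self, line):
        if line.startswith("#ifdef"):
            self.cond_stack.append(line.split()[1] in self.macros)
        elif line.startswith("#endif"):
            self.cond_stack.pop()
        elif not self.is_active:
            return  # skip non-conditional directives in inactive blocks
        elif line.startswith("#define"):
            parts = line.split(maxsplit=2)
            self.macros[parts[1]] = parts[2] if len(parts) > 2 else ""

mp = MacroProcessor()
for line in ["#ifdef UNDEFINED", "#define LEAKED 42", "#endif"]:
    mp._handle_directive(line)
print("LEAKED" in mp.macros)  # False
```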
```python
elif directive == ".ascii":
    match = re.search(r'"([^"]*)"', line)
    return len(match.group(1)) if match else 0
```
🔴 .ascii/.asciz size estimate doesn't account for escape sequences, corrupting label addresses
_estimate_directive_size (line 322-324) returns the raw string length from the regex match for .ascii/.asciz directives, but _emit_directive processes escape sequences via _unescape_string, which converts two-character sequences like \t and \n into single bytes. This causes Pass 1 label addresses to be larger than the actual emitted byte count in Pass 2. Verified: .ascii "hello\tworld\n" estimates 14 bytes but emits 12, causing subsequent labels to have incorrect offsets (e.g., end label recorded at offset 14 but HALT actually emitted at offset 12).
Suggested change:

```diff
 elif directive == ".ascii":
     match = re.search(r'"([^"]*)"', line)
-    return len(match.group(1)) if match else 0
+    return len(self._unescape_string(match.group(1)).encode("utf-8")) if match else 0
```
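The discrepancy between the raw match length and the emitted byte count can be reproduced standalone; `unescape` here is a stand-in using `unicode_escape`, not the PR's actual `_unescape_string` helper:

```python
import re

def unescape(s: str) -> str:
    # Stand-in for the PR's _unescape_string; adequate for \t, \n, etc.
    return s.encode("utf-8").decode("unicode_escape")

line = r'.ascii "hello\tworld\n"'
text = re.search(r'"([^"]*)"', line).group(1)
estimated = len(text)                          # Pass 1 (buggy): '\' and 't' counted separately
emitted = len(unescape(text).encode("utf-8"))  # Pass 2: actual bytes written
print(estimated, emitted)  # 14 12
```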
```python
struct.pack_into("<H", header, 56, n_phdrs)
struct.pack_into("<H", header, 58, ELF64_SECTION_HEADER_SIZE)
struct.pack_into("<H", header, 60, n_sections)
struct.pack_into("<H", header, 62, 4)  # shndx of .shstrtab (index 4 in our layout)
```
🟡 shstrtab section index hardcoded as 4 but is actually at index 3
At src/flux/asm/elf_header.py:265, the section header string table index (e_shstrndx) is hardcoded as 4, but all_sections at line 151 is [null(0), code(1), data(2), strtab(3), symtab(4), ...]. The .shstrtab section is at index 3, not 4. This causes ELF loaders/readers to look at the .symtab section instead of .shstrtab for section name resolution.
Suggested change:

```diff
- struct.pack_into("<H", header, 62, 4)  # shndx of .shstrtab (index 4 in our layout)
+ struct.pack_into("<H", header, 62, 3)  # shndx of .shstrtab (index 3 in our layout)
```
| """Build a string table for section names.""" | ||
| table = bytearray(b'\x00') # Start with null byte | ||
| for name in names: | ||
| table.extend(name.encode("utf-8")) | ||
| table.append(0x00) | ||
| return bytes(table) |
🟡 String table has extra leading null byte causing all section name indices to be off by 1
_build_string_table (line 312) starts with a \x00 byte, then iterates through names appending each name + null. Since names starts with "" (the null section), this produces two consecutive null bytes at the start. The _build_section_header name index calculation (line 290-295) iterates through all_names accumulating len(name) + 1 without accounting for the extra initial null byte. Verified: .flux.code is at actual table offset 2, but the computed name_idx is 1 (pointing to an empty string).
| """Build a string table for section names.""" | |
| table = bytearray(b'\x00') # Start with null byte | |
| for name in names: | |
| table.extend(name.encode("utf-8")) | |
| table.append(0x00) | |
| return bytes(table) | |
| def _build_string_table(self, names: list[str]) -> bytes: | |
| """Build a string table for section names.""" | |
| table = bytearray() | |
| for name in names: | |
| table.extend(name.encode("utf-8")) | |
| table.append(0x00) | |
| return bytes(table) |
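With the corrected builder, the accumulated `len(name) + 1` name indices line up with the table, assuming the name list starts with `""` for the null section (a standalone sketch, not the PR's code):

```python
def build_string_table(names):
    # Corrected: no extra leading null; names[0] == "" already contributes one.
    table = bytearray()
    for name in names:
        table.extend(name.encode("utf-8"))
        table.append(0x00)
    return bytes(table)

names = ["", ".flux.code", ".flux.data"]
table = build_string_table(names)

offset = 0
for name in names:
    # name_idx scheme from the review: accumulate len(name) + 1
    end = table.index(0x00, offset)
    assert table[offset:end].decode("utf-8") == name
    offset += len(name) + 1
print(table)  # b'\x00.flux.code\x00.flux.data\x00'
```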
Cross-assembler with 100+ opcodes, label resolution (@Label and name: syntax), macros (#define, #ifdef, .include), multiple output formats (binary, hex, JSON, Intel HEX, Python list), linker with symbol resolution and relocation, binary patcher, and ELF header generation.
Features
- `name:` and `@label` syntax with forward references
- `#define`, `#ifdef`/`#ifndef`/`#else`/`#endif`, `#undef`, `.set`, `.include`
- `;`, `//`, and `#` (non-preprocessor) comment support
- `.byte`, `.word`, `.dword`, `.ascii`, `.asciz`, `.fill`, `.align`, `.org`

Tests

96 tests covering all features — errors, macros, assembler, patcher, linker, ELF headers, and integration.