Skip to content

Enhance: 247-opcode ISA, execution tracing, bounds checking, error messages#3

Open
SuperInstance wants to merge 1 commit intomainfrom
superz/c-runtime-enhance
Open

Enhance: 247-opcode ISA, execution tracing, bounds checking, error messages#3
SuperInstance wants to merge 1 commit intomainfrom
superz/c-runtime-enhance

Conversation

@SuperInstance
Copy link
Copy Markdown
Owner

@SuperInstance SuperInstance commented Apr 13, 2026

Summary

Significant enhancement of the FLUX runtime C implementation:

Extended ISA (80 → 247 opcodes)

  • Single-register: ABS, SQRT
  • Register pair: MIN, MAX, CLZ, CTZ, POPCNT, BSWAP, SIGN_EXT, ZERO_EXT
  • Immediate arithmetic: ADDI, SUBI, MULI, ANDI, ORI, XORI, SHLI
  • Rotate: ROL, ROR
  • Extended branches: JLE, JGE, JEQ
  • Extended jumps: JMP_REG, CALL_REG, RET_IMM, TAILCALL_INDIRECT
  • Extended A2A: REQUEST, REPLY, SUBSCRIBE, PUBLISH, MERGE, SPLIT, GATHER, SCATTER
  • Float: FSQRT, FMOD, FEQ, FLT, FLE, FGT, FGE
  • Extended memory: LOAD16, STORE16, LOAD32, STORE32
  • Vector: VDOT, VMIN, VMAX, VAND, VOR, VXOR, VSHL, VSHR, VCMPEQ
  • Atomic: ATOMIC_LOAD/STORE/ADD/SUB/CMPXCHG/FENCE, MUTEX, SEMAPHORE
  • Crypto: HASH, HMAC, ENCRYPT, DECRYPT, SIGN, RANDOM, CRC32
  • I/O: IO_READ/WRITE, PRINT_INT/STR/FLOAT/HEX, SPRINTF
  • String: STR_LEN/CAT/CMP/COPY/SUB/FIND
  • Debug: TRACE_ON/OFF, LOG, ASSERT, PERF_BEGIN/END, WATCHPOINT

Execution Tracing

  • Configurable trace callback with cycle-based start and max-line limits
  • Default stderr tracer showing PC, opcode, registers, flags
  • Runtime trace enable/disable via TRACE_ON/TRACE_OFF opcodes

Safety Improvements

  • Bounds checking on all bytecode reads (PC boundary validation)
  • Register index bounds checking on all GP/FP register accesses
  • Stack overflow/underflow detection with descriptive messages
  • Bytecode payload OOB checks for variable-length opcodes

Better Error Messages

  • Human-readable error detail strings for all error conditions
  • Error PC and opcode captured on failure
  • Full error enum coverage in error_string()

Build

  • Debug build target with -g and FLUX_TRACE_ENABLED
  • All 39 existing tests pass (22 VM + 5 memory + 12 assembler)

Tests

  • 39 tests passing

Staging: Open in Devin

…nds checking, improved errors

- Extended ISA from ~80 to 247 opcodes
- Execution tracing with configurable callback
- Memory bounds checking on all bytecode reads
- Register bounds checking on all GP/FP accesses
- Stack overflow/underflow protection
- Human-readable error detail strings
- Opcode info table for debugging/disassembly
- Debug build target in Makefile
- All 39 existing tests pass
Copy link
Copy Markdown

@beta-devin-ai-integration beta-devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 3 potential issues.

View 6 additional findings in Devin Review.

Staging: Open in Devin

Comment thread src/vm.c
case FLUX_STORE8: { rd=f8(v); if(!v->running) break; rs1=f8(v); if(!v->running) break; CHECK_GPR(rs1); h=flux_mem_get(&v->mem,"heap"); if(h) flux_mem_write_u8(h,(size_t)GPR[rd],(uint8_t)GPR[rs1]); } break;
case FLUX_LOAD16: { rd=f8(v); if(!v->running) break; rs1=f8(v); if(!v->running) break; CHECK_GPR(rd); CHECK_GPR(rs1); h=flux_mem_get(&v->mem,"heap"); GPR[rd]=h?(int32_t)(uint16_t)(h->data[GPR[rs1]] | (h->data[GPR[rs1]+1]<<8)):0; } break;
case FLUX_STORE16: { rd=f8(v); if(!v->running) break; rs1=f8(v); if(!v->running) break; CHECK_GPR(rs1); h=flux_mem_get(&v->mem,"heap"); if(h) { uint16_t val=(uint16_t)GPR[rs1]; h->data[GPR[rd]]=val&0xFF; h->data[GPR[rd]+1]=(val>>8)&0xFF; } } break;
case FLUX_LOAD32: { rd=f8(v); if(!v->running) break; rs1=f8(v); if(!v->running) break; CHECK_GPR(rd); CHECK_GPR(rs1); GPR[rd]=h?(int32_t)flux_mem_read_i32(h,(size_t)GPR[rs1]):0; break; }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 FLUX_LOAD32 uses stale heap pointer h instead of fetching it

The FLUX_LOAD32 handler at line 776 uses the local variable h without first calling h=flux_mem_get(&v->mem,"heap"). The variable h is initialized to NULL at src/vm.c:554, so unless a prior operation in the same flux_vm_execute call happened to set h, this will always read 0 (the NULL fallback). Compare with FLUX_LOAD8 at src/vm.c:772 and FLUX_STORE32 at src/vm.c:777, both of which correctly fetch the heap pointer before use.

Suggested change
case FLUX_LOAD32: { rd=f8(v); if(!v->running) break; rs1=f8(v); if(!v->running) break; CHECK_GPR(rd); CHECK_GPR(rs1); GPR[rd]=h?(int32_t)flux_mem_read_i32(h,(size_t)GPR[rs1]):0; break; }
case FLUX_LOAD32: { rd=f8(v); if(!v->running) break; rs1=f8(v); if(!v->running) break; CHECK_GPR(rd); CHECK_GPR(rs1); h=flux_mem_get(&v->mem,"heap"); GPR[rd]=h?(int32_t)flux_mem_read_i32(h,(size_t)GPR[rs1]):0; break; }
Staging: Open in Devin

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Comment thread src/vm.c
Comment on lines +642 to +643
case FLUX_ROL: rd=f8(v); if(!v->running) break; rs1=f8(v); if(!v->running) break; CHECK_GPR(rd); CHECK_GPR(rs1); { int s=GPR[rs1]&31; GPR[rd]=(GPR[rd]<<s)|(GPR[rd]>>(32-s)); } break;
case FLUX_ROR: rd=f8(v); if(!v->running) break; rs1=f8(v); if(!v->running) break; CHECK_GPR(rd); CHECK_GPR(rs1); { int s=GPR[rs1]&31; GPR[rd]=(GPR[rd]>>s)|(GPR[rd]<<(32-s)); } break;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 ROL/ROR trigger undefined behavior when shift amount is 0

When GPR[rs1] & 31 == 0, the expression 32 - s evaluates to 32, and shifting a 32-bit int32_t by 32 is undefined behavior in C (C11 §6.5.7: "If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined"). This affects both FLUX_ROL and FLUX_ROR. A rotate by 0 should be a no-op but may produce arbitrary results or crash under optimization.

Suggested change
case FLUX_ROL: rd=f8(v); if(!v->running) break; rs1=f8(v); if(!v->running) break; CHECK_GPR(rd); CHECK_GPR(rs1); { int s=GPR[rs1]&31; GPR[rd]=(GPR[rd]<<s)|(GPR[rd]>>(32-s)); } break;
case FLUX_ROR: rd=f8(v); if(!v->running) break; rs1=f8(v); if(!v->running) break; CHECK_GPR(rd); CHECK_GPR(rs1); { int s=GPR[rs1]&31; GPR[rd]=(GPR[rd]>>s)|(GPR[rd]<<(32-s)); } break;
case FLUX_ROL: rd=f8(v); if(!v->running) break; rs1=f8(v); if(!v->running) break; CHECK_GPR(rd); CHECK_GPR(rs1); { int s=GPR[rs1]&31; if(s) { uint32_t u=(uint32_t)GPR[rd]; GPR[rd]=(int32_t)((u<<s)|(u>>(32-s))); } } break;
case FLUX_ROR: rd=f8(v); if(!v->running) break; rs1=f8(v); if(!v->running) break; CHECK_GPR(rd); CHECK_GPR(rs1); { int s=GPR[rs1]&31; if(s) { uint32_t u=(uint32_t)GPR[rd]; GPR[rd]=(int32_t)((u>>s)|(u<<(32-s))); } } break;
Staging: Open in Devin

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Comment thread src/vm.c
Comment on lines +774 to +775
case FLUX_LOAD16: { rd=f8(v); if(!v->running) break; rs1=f8(v); if(!v->running) break; CHECK_GPR(rd); CHECK_GPR(rs1); h=flux_mem_get(&v->mem,"heap"); GPR[rd]=h?(int32_t)(uint16_t)(h->data[GPR[rs1]] | (h->data[GPR[rs1]+1]<<8)):0; } break;
case FLUX_STORE16: { rd=f8(v); if(!v->running) break; rs1=f8(v); if(!v->running) break; CHECK_GPR(rs1); h=flux_mem_get(&v->mem,"heap"); if(h) { uint16_t val=(uint16_t)GPR[rs1]; h->data[GPR[rd]]=val&0xFF; h->data[GPR[rd]+1]=(val>>8)&0xFF; } } break;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 LOAD16/STORE16 bypass bounds-checked memory API, risking buffer overflow

LOAD16 and STORE16 directly access h->data[GPR[rs1]] and h->data[GPR[rs1]+1] without any bounds checking, unlike LOAD8/STORE8 which use the bounds-checked flux_mem_read_u8/flux_mem_write_u8 API (src/vm.c:772-773). Since GPR[rs1] is int32_t, a negative value or a value near the region size can cause an out-of-bounds read/write on the heap data buffer. The flux_mem_read/flux_mem_write functions at src/memory.c:26-27 explicitly check o + l <= r->size before accessing data.

Prompt for agents
LOAD16 and STORE16 at src/vm.c:774-775 directly index into h->data[] without bounds checking, unlike LOAD8/STORE8 which use flux_mem_read_u8/flux_mem_write_u8. These should be rewritten to use the bounds-checked memory API. For example, LOAD16 could use flux_mem_read(h, offset, &tmp, 2) and reassemble the 16-bit value, or a new flux_mem_read_u16 helper could be added to src/memory.c following the pattern of flux_mem_read_u8 at memory.c:30. STORE16 similarly should use flux_mem_write with bounds checking.
Staging: Open in Devin

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant