Summary
I made some changes for 5.0.1 to the CS_OPT_SYNTAX_NOREGNAME handling which I thought were working, but things have become more broken in 5.0.1.
It looks like NOREGNAME produces the same output as DEFAULT.
Example code
This example code prints out the default register form and the 'noregname' form.
#!/usr/bin/env python
import sys
from capstone import *
import capstone.arm_const
code = bytearray([0, 0x10, 0x90, 0xe5]) # LDR r1,[r0]
# A decoder for 'no regname'
mdnr = Cs(CS_ARCH_ARM, CS_MODE_ARM)
mdnr.detail = True
mdnr.syntax = capstone.CS_OPT_SYNTAX_NOREGNAME
# A decoder for default format
mddef = Cs(CS_ARCH_ARM, CS_MODE_ARM)
mddef.detail = True
mddef.syntax = capstone.CS_OPT_SYNTAX_DEFAULT
optype_names = dict((getattr(capstone.arm_const, optype), optype) for optype in dir(capstone.arm_const) if optype.startswith('ARM_OP_'))
print("cs_version() = %r" % (cs_version(),))
for regnum in range(0, 16):
    # Tweak the source register
    code[2] = (code[2] & 0xF0) | regnum
    for i in mddef.disasm(bytes(code), 0x1000):
        dis_default = "%-6s%s" % (i.mnemonic, i.op_str)
    for i in mdnr.disasm(bytes(code), 0x1000):
        dis_noregname = "%-6s%s" % (i.mnemonic, i.op_str)
    print("Register %2i: default: %-20s noregname: %s" % (regnum, dis_default, dis_noregname))
Test results for 4.0.2
cs_version() = (4, 0, 1024)
Register 0: default: ldr r1, [r0] noregname: ldr r1, [r0]
Register 1: default: ldr r1, [r1] noregname: ldr r1, [r1]
Register 2: default: ldr r1, [r2] noregname: ldr r1, [r2]
Register 3: default: ldr r1, [r3] noregname: ldr r1, [r3]
Register 4: default: ldr r1, [r4] noregname: ldr r1, [r4]
Register 5: default: ldr r1, [r5] noregname: ldr r1, [r5]
Register 6: default: ldr r1, [r6] noregname: ldr r1, [r6]
Register 7: default: ldr r1, [r7] noregname: ldr r1, [r7]
Register 8: default: ldr r1, [r8] noregname: ldr r1, [r8]
Register 9: default: ldr r1, [sb] noregname: ldr r1, [r9]
Register 10: default: ldr r1, [sl] noregname: ldr r1, [r10]
Register 11: default: ldr r1, [fp] noregname: ldr r1, [r11]
Register 12: default: ldr r1, [ip] noregname: ldr r1, [r12]
Register 13: default: ldr r1, [sp] noregname: ldr r1, [sp]
Register 14: default: ldr r1, [lr] noregname: ldr r1, [lr]
Register 15: default: ldr r1, [pc] noregname: ldr r1, [pc]
Test results for 5.0.0
cs_version() = (5, 0, 1280)
Register 0: default: ldr r1, [r0] noregname: ldr r1, [r0]
Register 1: default: ldr r1, [r1] noregname: ldr r1, [r1]
Register 2: default: ldr r1, [r2] noregname: ldr r1, [r2]
Register 3: default: ldr r1, [r3] noregname: ldr r1, [r3]
Register 4: default: ldr r1, [r4] noregname: ldr r1, [r4]
Register 5: default: ldr r1, [r5] noregname: ldr r1, [r5]
Register 6: default: ldr r1, [r6] noregname: ldr r1, [r6]
Register 7: default: ldr r1, [r7] noregname: ldr r1, [r7]
Register 8: default: ldr r1, [r8] noregname: ldr r1, [r8]
Register 9: default: ldr r1, [sb] noregname: ldr r1, [r9]
Register 10: default: ldr r1, [sl] noregname: ldr r1, [r10]
Register 11: default: ldr r1, [fp] noregname: ldr r1, [r11]
Register 12: default: ldr r1, [ip] noregname: ldr r1, [r12]
Register 13: default: ldr r1, [sp] noregname: ldr r1, [r13]
Register 14: default: ldr r1, [lr] noregname: ldr r1, [r14]
Register 15: default: ldr r1, [pc] noregname: ldr r1, [pc]
Notice that the noregname case now uses register numbers throughout (apart from pc); this was what I tried to make more consistent with 4.0.x.
Test results for 5.0.1
cs_version() = (5, 0, 1280)
Register 0: default: ldr r1, [r0] noregname: ldr r1, [r0]
Register 1: default: ldr r1, [r1] noregname: ldr r1, [r1]
Register 2: default: ldr r1, [r2] noregname: ldr r1, [r2]
Register 3: default: ldr r1, [r3] noregname: ldr r1, [r3]
Register 4: default: ldr r1, [r4] noregname: ldr r1, [r4]
Register 5: default: ldr r1, [r5] noregname: ldr r1, [r5]
Register 6: default: ldr r1, [r6] noregname: ldr r1, [r6]
Register 7: default: ldr r1, [r7] noregname: ldr r1, [r7]
Register 8: default: ldr r1, [r8] noregname: ldr r1, [r8]
Register 9: default: ldr r1, [sb] noregname: ldr r1, [sb]
Register 10: default: ldr r1, [sl] noregname: ldr r1, [sl]
Register 11: default: ldr r1, [fp] noregname: ldr r1, [fp]
Register 12: default: ldr r1, [ip] noregname: ldr r1, [ip]
Register 13: default: ldr r1, [sp] noregname: ldr r1, [sp]
Register 14: default: ldr r1, [lr] noregname: ldr r1, [lr]
Register 15: default: ldr r1, [pc] noregname: ldr r1, [pc]
Note that the noregname case is exactly the same as the default.
Expected output
I had hoped that 5.0.1 would be closer to the 4.0.x version. It seems to have gotten worse.
Possible reason
I looked at the constants in the Python capstone/__init__.py for CS_OPT_SYNTAX and I see a possible problem?
On 5.0.0 the constants are:
# Capstone syntax value
CS_OPT_SYNTAX_DEFAULT = 0 # Default assembly syntax of all platforms (CS_OPT_SYNTAX)
CS_OPT_SYNTAX_INTEL = 1 # Intel X86 asm syntax - default syntax on X86 (CS_OPT_SYNTAX, CS_ARCH_X86)
CS_OPT_SYNTAX_ATT = 2 # ATT asm syntax (CS_OPT_SYNTAX, CS_ARCH_X86)
CS_OPT_SYNTAX_NOREGNAME = 3 # Asm syntax prints register name with only number - (CS_OPT_SYNTAX, CS_ARCH_PPC, CS_ARCH_ARM)
CS_OPT_SYNTAX_MASM = 4 # MASM syntax (CS_OPT_SYNTAX, CS_ARCH_X86)
CS_OPT_SYNTAX_MOTOROLA = 5 # MOS65XX use $ as hex prefix
On 5.0.1 the constants are:
# Capstone syntax value
CS_OPT_SYNTAX_DEFAULT = 1 << 1 # Default assembly syntax of all platforms (CS_OPT_SYNTAX)
CS_OPT_SYNTAX_INTEL = 1 << 2 # Intel X86 asm syntax - default syntax on X86 (CS_OPT_SYNTAX, CS_ARCH_X86)
CS_OPT_SYNTAX_ATT = 1 << 3 # ATT asm syntax (CS_OPT_SYNTAX, CS_ARCH_X86)
CS_OPT_SYNTAX_NOREGNAME = 1 << 4 # Asm syntax prints register name with only number - (CS_OPT_SYNTAX, CS_ARCH_PPC, CS_ARCH_ARM)
CS_OPT_SYNTAX_MASM = 1 << 5 # MASM syntax (CS_OPT_SYNTAX, CS_ARCH_X86)
CS_OPT_SYNTAX_MOTOROLA = 1 << 6 # MOS65XX use $ as hex prefix
CS_OPT_SYNTAX_CS_REG_ALIAS = 1 << 7 # Prints common register alias which are not defined in LLVM (ARM: r9 = sb etc.)
The new values may well be correct, but the syntax constants changed in the same release in which the output stopped working, which makes me think the two might be related.
If I look at the setter for syntax in 5.0.1, I see:
# syntax setter: modify assembly syntax.
@syntax.setter
def syntax(self, style):
    status = _cs.cs_option(self.csh, CS_OPT_SYNTAX, style)
    if status != CS_ERR_OK:
        raise CsError(status)
    # save syntax
    self._syntax = style
But for 'skipdata' I see it has this form:
# setter: modify skipdata status
@skipdata.setter
def skipdata(self, opt):
    if opt == False:
        status = _cs.cs_option(self.csh, CS_OPT_SKIPDATA, CS_OPT_OFF)
    else:
        status = _cs.cs_option(self.csh, CS_OPT_SKIPDATA, CS_OPT_ON)
    if status != CS_ERR_OK:
        raise CsError(status)
    # save this option
    self._skipdata = opt
i.e. it passes CS_OPT_ON or CS_OPT_OFF in the call to change the option, whilst the syntax setter passes the style value straight through. In the capstone.h file we see the actual definitions as:
/// Runtime option value (associated with option type above)
typedef enum cs_opt_value {
    CS_OPT_OFF = 0, ///< Turn OFF an option - default for CS_OPT_DETAIL, CS_OPT_SKIPDATA, CS_OPT_UNSIGNED.
    CS_OPT_ON = 1 << 0, ///< Turn ON an option (CS_OPT_DETAIL, CS_OPT_SKIPDATA).
    CS_OPT_SYNTAX_DEFAULT = 1 << 1, ///< Default asm syntax (CS_OPT_SYNTAX).
    CS_OPT_SYNTAX_INTEL = 1 << 2, ///< X86 Intel asm syntax - default on X86 (CS_OPT_SYNTAX).
    CS_OPT_SYNTAX_ATT = 1 << 3, ///< X86 ATT asm syntax (CS_OPT_SYNTAX).
    CS_OPT_SYNTAX_NOREGNAME = 1 << 4, ///< Prints register name with only number (CS_OPT_SYNTAX)
    CS_OPT_SYNTAX_MASM = 1 << 5, ///< X86 Intel Masm syntax (CS_OPT_SYNTAX).
    CS_OPT_SYNTAX_MOTOROLA = 1 << 6, ///< MOS65XX use $ as hex prefix
    CS_OPT_SYNTAX_CS_REG_ALIAS = 1 << 7, ///< Prints common register alias which are not defined in LLVM (ARM: r9 = sb etc.)
} cs_opt_value;
The values of CS_OPT_ON and CS_OPT_OFF are 1 and 0 respectively, and every syntax value is now a distinct power of two, which makes me think that this was intended to be an OR'd bitfield to control the flags.
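To make the bitfield reading concrete, here is a minimal sketch that copies the 5.0.1 values from the header quoted above into plain Python integers (it does not import capstone) and shows how such values could be OR'd together and decoded again. The `decode_flags` helper and the `OPT_VALUES_501` table are made up for this illustration, and whether the library really treats the option value as a bitfield is still only a guess.

```python
# Values copied from the 5.0.1 capstone.h enum quoted above.
# This is a standalone mirror for illustration, not the library itself.
OPT_VALUES_501 = {
    "CS_OPT_ON": 1 << 0,
    "CS_OPT_SYNTAX_DEFAULT": 1 << 1,
    "CS_OPT_SYNTAX_INTEL": 1 << 2,
    "CS_OPT_SYNTAX_ATT": 1 << 3,
    "CS_OPT_SYNTAX_NOREGNAME": 1 << 4,
    "CS_OPT_SYNTAX_MASM": 1 << 5,
    "CS_OPT_SYNTAX_MOTOROLA": 1 << 6,
    "CS_OPT_SYNTAX_CS_REG_ALIAS": 1 << 7,
}


def decode_flags(value):
    """Hypothetical helper: list the names of all flags whose bit is set."""
    return [name for name, bit in OPT_VALUES_501.items() if value & bit]


# Assumption: if the value is a bitfield, two syntax flags could be combined.
combined = (OPT_VALUES_501["CS_OPT_SYNTAX_NOREGNAME"]
            | OPT_VALUES_501["CS_OPT_SYNTAX_CS_REG_ALIAS"])
print("combined = 0x%02x -> %s" % (combined, decode_flags(combined)))
```

Because each constant occupies its own bit, any combination decodes back to exactly the flags that were OR'd in, which would not be possible with the sequential 0 to 5 values used in 5.0.0.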
But I'm guessing here... it seems odd that a patch version update would change the meaning of the constants - that might make things hard for compiled languages that expect to be able to dynamically link against newer minor versions without an ABI change? Again, I'm guessing that's the case.
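The compatibility worry can be illustrated with plain arithmetic, again without importing capstone. The sketch below takes the 5.0.0 syntax values listed earlier and reads each one as if it were a 5.0.1 bitfield. This assumes the bitfield interpretation is right, which is not confirmed; `SYNTAX_500`, `FLAGS_501` and `reinterpret` are names invented for this example.

```python
# 5.0.0 syntax values, copied from the Python constants quoted earlier.
SYNTAX_500 = {
    "CS_OPT_SYNTAX_DEFAULT": 0,
    "CS_OPT_SYNTAX_INTEL": 1,
    "CS_OPT_SYNTAX_ATT": 2,
    "CS_OPT_SYNTAX_NOREGNAME": 3,
    "CS_OPT_SYNTAX_MASM": 4,
    "CS_OPT_SYNTAX_MOTOROLA": 5,
}

# 5.0.1 values, copied from the capstone.h enum quoted earlier.
FLAGS_501 = {
    "CS_OPT_ON": 1 << 0,
    "CS_OPT_SYNTAX_DEFAULT": 1 << 1,
    "CS_OPT_SYNTAX_INTEL": 1 << 2,
    "CS_OPT_SYNTAX_ATT": 1 << 3,
    "CS_OPT_SYNTAX_NOREGNAME": 1 << 4,
    "CS_OPT_SYNTAX_MASM": 1 << 5,
    "CS_OPT_SYNTAX_MOTOROLA": 1 << 6,
    "CS_OPT_SYNTAX_CS_REG_ALIAS": 1 << 7,
}


def reinterpret(old_value):
    """Hypothetical helper: read a 5.0.0 value as a 5.0.1 bitfield."""
    return [name for name, bit in FLAGS_501.items() if old_value & bit]


for name, old_value in SYNTAX_500.items():
    print("%-24s = %d -> %s" % (name, old_value, reinterpret(old_value)))
```

Under that assumption, the old NOREGNAME value 3 would read as CS_OPT_ON plus CS_OPT_SYNTAX_DEFAULT, so a caller built against the 5.0.0 values would select something other than what it asked for without any error being raised.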