feat: add r2 extension#1172
Conversation
elicn
left a comment
There was a problem hiding this comment.
Thanks for the contribution! This functionality looks very cool.
I think this is a nice PoC that could be even cooler if it didn't take r2 data "as-is" and provided some post-processing, like turning all offsets into a vaddrs.
Anyway, I've made additional comments there.
| class R2: | ||
| def __init__(self, ql: Qiling): | ||
| super().__init__() | ||
| path = ql.path.encode() |
There was a problem hiding this comment.
Is there any adaptation we can do for "shellcode mode"?
There was a problem hiding this comment.
Good point, I remember r2 supports writing code to analyze?
There was a problem hiding this comment.
Yes, you could load r2 with an empty buffer but you'll have to specify -a and -b for architecture and bits.
Shellcode content can be inserted with wx.
There was a problem hiding this comment.
Thanks! I have implemented "shellcode mode" using command wx HEX.
But when I try to use the ctype API libr.r_io.r_io_write_at(self._r2c.contents.io, loadaddr, code, size) it complains ctypes.ArgumentError: argument 1: <class 'TypeError'>: expected LP_struct_r_io_t instance instead of LP_struct_r_io_t
There was a problem hiding this comment.
Also there is currently no public util to accurately map QL_ARCH enum to arch and bits.
|
|
||
|
|
||
| class R2: | ||
| def __init__(self, ql: Qiling): |
There was a problem hiding this comment.
Suggesting to allow the user to specify r2 evars to allow more analysis flexibility.
That could be done perhaps by accepting a dictionary of keys and values (e.g. {'anal.hasnext' : 'true', 'anal.depth' : 5} or any other reasonable format. Consider also allowing an "r2 init script" to let the user run a few r2 commands before starting the analysis.
There was a problem hiding this comment.
R2 has its own type of config variables, I think it's similar to qiling's profile, which is for different OS though. We can extend it later.
|
|
||
|
|
||
| @dataclass(unsafe_hash=True) | ||
| class Section: |
There was a problem hiding this comment.
Unless there is a fields post-processing or any additional functionality, these classes could be typing.NamedTuple.
There was a problem hiding this comment.
Yes. Now these classes have no special post-processing stuff. I use dataclass because it allows custom init function so I can pass all dict items got from r2 to construct the class without specifying fields. I can try using NamedTuple though.
Ref:
https://stackoverflow.com/questions/51671699/data-classes-vs-typing-namedtuple-primary-use-cases
https://stackoverflow.com/questions/17622419/creating-a-namedtuple-object-using-only-a-subset-of-arguments-passed
| "fuzzercorn>=0.0.1;platform_system=='Linux'" | ||
| ], | ||
| "SCA" : [ | ||
| "r2libr" |
There was a problem hiding this comment.
@elicn maybe we could add r2libr as an essential dep? My thought roughly is:
- First of all, unlike the previously discussed
udbserver,r2libris maintained by me, in other words, by Qiling team directly. r2librcould be tightly integrated into our loader.- The
r2librcould help us polish some old API likeset_apiwith the additional information. - There seems no use case that users really would like to use QIling without
r2libr?
There was a problem hiding this comment.
Also note, radare2 sometimes introduces break changes. To specify r2libr please use something like r2libr==5.7.0.
There was a problem hiding this comment.
Breaking changes should only happen when the major and minor numbers change. only patch releases are abi stable. so if you have r2libr-5.7.0 you can use r2-5.7.0, but also 5.7.4 or 5.7.8 without runtime problems. That relaxes a little the stress to maintain updates and also makes it easy to transition between abi-breaking versions
There was a problem hiding this comment.
IMHO, I would refrain from defining something as a prerequisit unless it is absolutley required; that is, the program cannot operate without it. Plugging r2 into Qiling sounds like a great idea, where Qiling can benefit a lot from r2 static analysis.
Integrating r2 static analysis into Qiling requires some thinking of how to do it right and consistent across all modules. I really don't want it to be integrated into some modules and not in others. Until r2 is integrated across all Qiling modules, I believe we can wait with this prerequisit.
There was a problem hiding this comment.
Yes, I understand your concern so let me illustrate my motivation a bit more:
- Radare2 helps reduce redundant code and improves robustness, e.g. ELF loader, PE loader, even disassembler etc.
- Indeed, I strongly agree with you that r2 should be, some sort of
everywherein Qiling. I hope to achieve this by using r2 to build a few basic utils or modules. An example I showed before is to identify function boundaries. Imagine you canset_apion a binary without any symbol tables, which is really useful and I don't think users will reject it.
Based on these two points, I think r2 will become a prerequisite someday but also I agree we should wait until we really have to do so.
- eliminate magic number of baseaddr and loadaddr - update example of shellcode mode
- Remove redundant __init__ - Abstract `cmdj` to parse json in only one place - avoid importing whole functool
| "fuzzercorn>=0.0.1;platform_system=='Linux'" | ||
| ], | ||
| "SCA" : [ | ||
| "r2libr" |
There was a problem hiding this comment.
IMHO, I would refrain from defining something as a prerequisit unless it is absolutley required; that is, the program cannot operate without it. Plugging r2 into Qiling sounds like a great idea, where Qiling can benefit a lot from r2 static analysis.
Integrating r2 static analysis into Qiling requires some thinking of how to do it right and consistent across all modules. I really don't want it to be integrated into some modules and not in others. Until r2 is integrated across all Qiling modules, I believe we can wait with this prerequisit.
- replace str enum with typing.Literal - avoid generic typing annotation for compability with Python 3.8
|
Have we reached an agreement about how to integrate r2? Or we need a PoC of using r2 everywhere in qiling at first? |
IMHO, r2 should be optional for now, and not a requirement (i.e. revert the last commit). |
|
I am trying to implement a loader of |
- use `s addr`, `p8 size` to get hex data - return bytes that can be used to write memory
r2 command `iIj` return JSON of binary info like baddr and bintype
As mentioned above, till Qiling becomes dependent of radare2 (by design, not opportunistically), radare2 should not appear as a required dependency. If someone wants to analyze a specific file format that its loading relies on radare2, they can install radare2 for that purpose (similar to EVM and its dependencies). |
I just had a talk with @chinggg and I plan to break this into two phases (also required by GSoC) Phase 1The main goal of this phase is to make r2 more smoothly used, including
All these things can really ease the pain of writing Qiling scripts so far. Expected date: July 24 Phase 2For this phase, I haven't decided the exact task but a few ideas here:
Expected date: Sept 4 |
|
Leveraging radare2 static analysis capabilities to enhance Qiling emulation is a great idea, especially strings and flags. I implemented a similar resolving capabilities to my enhanced All that being said, it is esential to keep in mind that radare2, as powerful as it might be, is a static analysis tool , whereas Qiling dynamically emulates code. Though some of radare2 analysis may be useful in a dynamic environment, its analysis may break or become irrelevant when dealing with dynamically loaded libraries, self modifying code, and basically everything that is not static. My understanding is that radare2 may be useful to extract debugging information and strings, but I am not sure about the rest. We'll still have to load dynamic libraries, resolve imports, etc. Like I mentioned in the past, |
- flag is a bookmark that associate a name with a given offset - memory address in qiling can be interpreted better
- set arch and bits for r2 asm only in shellcode mode
|
@wtdcode I have made a simple PoC of address to flag mapping (by calling |
| def enable_trace(self, mode='full'): | ||
| # simple map from addr to flag name, cannot resolve addresses in the middle | ||
| self.ql.loader.symsmap = {flag.offset: flag.name for flag in self.flags} | ||
| if mode == 'full': |
There was a problem hiding this comment.
This is just a PoC. The trace extension made by @elicn need a symsmap to resolve addresses, I have to made further modification on the trace extension.
elicn
left a comment
There was a problem hiding this comment.
Turning r2 into an integral part of core is a bad idea, in my opinion.
As an extention, r2 analysis should stay an external add-on feature that may [or may not] be used with Qiling.
|
@elicn I think this PR is fine to go and we would like another a few PRs about R2 application in Qiling internally. |
- "Qiling" is only used for type hint
|
Looks good to me. |
|
Looking forward to the following work! |
Checklist
Which kind of PR do you create?
Coding convention?
Extra tests?
Changelog?
Target branch?
One last thing
This PR introduces radare2 extension and adds an example
hello_r2.pyto show its basic usage.My main concerns are:
R2's membersdataclassvsnamedtupleFeel free to review it and give your advice.