Skip to content

Avoids leak in wasm details#1372

Merged
aquynh merged 5 commits intocapstone-engine:nextfrom
catenacyber:leakwasm
Feb 20, 2019
Merged

Avoids leak in wasm details#1372
aquynh merged 5 commits intocapstone-engine:nextfrom
catenacyber:leakwasm

Conversation

@catenacyber
Copy link
Copy Markdown
Contributor

Extending cs_detail in capstone.h

Another way would have been to change the prototype of cs_free
This change seemed to me less disruptive for existing programs (no need to change the code using capstone, just a bit more memory consumption)

This was found by oss-fuzz :
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=12952

(And this likely hides other bugs from fuzzing)

@aquynh
Copy link
Copy Markdown
Collaborator

aquynh commented Feb 7, 2019

this is nice, but i am wondering if we can fix it differently by having operands as a fixed size array, @Spikel?

@catenacyber
Copy link
Copy Markdown
Contributor Author

We have some line
MI->flat_insn->detail->wasm.operands = cs_mem_malloc(sizeof(cs_wasm_op) * (length + 2))
which makes fixed size impossible
This is (as I understand) a jump table of a switch.
Maybe it would be best to retain only memory address and size of this jump table...

There seem to me to be other problems with this code (lack of check before allocating much arbitrary memory size)

Comment thread include/capstone/capstone.h Outdated
cs_wasm wasm; ///< Web Assembly architecture
cs_mos65xx mos65xx; ///< MOS65XX architecture (including MOS6502)
};
cs_arch arch; ///Used architecture (for architecture-specific free)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does not look good, but sounds like we cannot avoid adding this field?

another solution is to have a new csh argument for cs_free()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed.
As a recap, I see three solutions :

  • this patch (but more memory consumption)
  • changing cs_free prototype (but changing interface is painful for people using this project)
  • changing how WASM handles this "switch" jump table (but I am not sure if this relevant for WASM)

Which is the best ?

Copy link
Copy Markdown
Collaborator

@aquynh aquynh Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we try best to avoid breaking compatibility, but in this case i think that is still better than having the extra arch field in cs_insn. finally, it was a mistake not to have csh arg in every API. we may afford changing API in the version 5, which will have major update in all archs.

if there is upper limit for brtable, we can avoid this mess by having fixed size array for operands. lets see what @Spikel has to say when he is back.

@catenacyber
Copy link
Copy Markdown
Contributor Author

So, oss-fuzz indeed found the bug for arbitrary size memory allocation :
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=12952

So I pushed another commit adding a safety check...

@aquynh
Copy link
Copy Markdown
Collaborator

aquynh commented Feb 15, 2019

nice work! i think @Spikel will get back from holidays next week to look at this bug.

@aquynh
Copy link
Copy Markdown
Collaborator

aquynh commented Feb 15, 2019

oh now this PR mixes with an old commit, from arch/TMS320C64x/TMS320C64xMapping.c.
can you try to sync your fork, and see if that goes away?

@catenacyber
Copy link
Copy Markdown
Contributor Author

I rebased to remove this other commit

@aquynh
Copy link
Copy Markdown
Collaborator

aquynh commented Feb 15, 2019

Another way to avoid breaking compatibility is to have a new API, lets say cs_free_ex(), which accepts csh as its first param, and promote it as future replacement for cs_free() in next few version.

@catenacyber
Copy link
Copy Markdown
Contributor Author

Newest version changes how WASM handles this switch table, this seems better for #1381 as well

@aquynh
Copy link
Copy Markdown
Collaborator

aquynh commented Feb 18, 2019

so the main change is that operands[] now has upper limit of 2?

@catenacyber
Copy link
Copy Markdown
Contributor Author

Yes :

  • no more allocs and frees : no need to change the API :-), no arbitrary size memory allocation
  • no overlong (>512) instruction printing
  • But the user has to read the variable uint32 targets himself...

@SpikeI
Copy link
Copy Markdown
Contributor

SpikeI commented Feb 20, 2019

Yes :

  • no more allocs and frees : no need to change the API :-), no arbitrary size memory allocation

  • no overlong (>512) instruction printing

  • But the user has to read the variable uint32 targets himself...

I want to try another solution, before I malloc the memory, I do some basic check, to make sure there are enough bytes to satisfied this length. then I will do malloc, if not, I will return false. what's more, I do not malloc such large memory space at one time, first I will malloc 1/4 of the needed space, and realloc twice. the first malloc make sure the space is the same as the binary size itself, If we have a big binary we also need a big memory space. I think it is reasonable.
The purpose of this plan is to avoid the user gives us a fake length, and we should do some thing to check the code if we can trust this length.
and I dont think there is a limit of br_table param length, so I cant use fixed size array. Or It may influence the user who use it in a right way.
that's my opinion, If you have a better solution. Im glad to accept it.

}

data = data + ((code[i] & 0x7f) << (i * 7));
data = data + (((uint64_t) code[i] & 0x7f) << (i * 7));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should fit in a separate pull req, so i can merge it independently?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed, do you want me to submit a separate pull request ?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, please

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or if @SpikeI agree with every changes here, i can merge all of them, without needing you to create a new PR.

@catenacyber
Copy link
Copy Markdown
Contributor Author

@Spikel this will not solve yet the overlong instruction printing from #1381 : capstone limits it to 512 bytes, and so we cannot print a WASM brtable with some hundreds of targets...

For the safety checks, you can look at 6e53ff1

This proposed solution looks to me more like switch instruction in other architectures : the jump table's address is given and the user has to decode all the targets himself. The difference with other architectures is that with WASM, targets are encoded with variable length...

@aquynh what do you think ?

@aquynh
Copy link
Copy Markdown
Collaborator

aquynh commented Feb 20, 2019

Wasm is really special, because it has this br_table instruction, that can be arbitrary long, thus does fit into the existing model.

@catenacyber i think your approach is reasonable. @SpikeI may come up with a similar fix.

Copy link
Copy Markdown
Contributor

@SpikeI SpikeI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Spikel this will not solve yet the overlong instruction printing from #1381 : capstone limits it to 512 bytes, and so we cannot print a WASM brtable with some hundreds of targets...

For the safety checks, you can look at 6e53ff1

This proposed solution looks to me more like switch instruction in other architectures : the jump table's address is given and the user has to decode all the targets himself. The difference with other architectures is that with WASM, targets are encoded with variable length...

@aquynh what do you think ?

I see, you are right. Its my fault to leave many memory mistakes.
the safety check is good, and the table we can not print. you are right.
the only inconvenient thing is that user should read varuint32 themself?

}

MI->wasm_data.brtable.target[i] = get_varuint32(code + tmp_len, code_len - tmp_len, &var_len);
get_varuint32(code + tmp_len, code_len - tmp_len, &var_len);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we do not keep this data?

@SpikeI
Copy link
Copy Markdown
Contributor

SpikeI commented Feb 20, 2019

Wasm is really special, because it has this br_table instruction, that can be arbitrary long, thus does fit into the existing model.

@catenacyber i think your approach is reasonable. @SpikeI may come up with a similar fix.

@catenacyber @aquynh
yes, I agree with this fix, please merge

@aquynh aquynh merged commit 0e51445 into capstone-engine:next Feb 20, 2019
@aquynh
Copy link
Copy Markdown
Collaborator

aquynh commented Feb 20, 2019

merged, thanks!

@catenacyber
Copy link
Copy Markdown
Contributor Author

Thanks @SpikeI for agreeing and congratulations for your work :-)

@riptl riptl mentioned this pull request Jul 22, 2022
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants