The repository consists of three datasets: the Unlabeled Real-world Smart Contract Dataset (URD), the Labeled Real-world Smart Contract Dataset (LRD), and the Publicly Available Smart Contract Vulnerability Dataset (PVD). They are used to train the machine learning models of sGuard+ and to evaluate its effectiveness and efficiency. Labels in these datasets cover the following five vulnerabilities:
- SWC-101: Integer Overflow and Underflow Vulnerability (IOU)
- SWC-104: Unchecked Call Return Value Vulnerability (UCR)
- SWC-106: Unprotected SELFDESTRUCT Instruction Vulnerability (USI)
- SWC-107: Reentrancy Vulnerability (REN)
- SWC-115: Authorization through Tx-origin Vulnerability (TXO)
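
When processing the labels programmatically, it can help to keep the SWC identifiers and their abbreviations in one place. The following is a minimal sketch, not part of the repository: the `Vulnerability` enum and the `from_swc` helper are hypothetical names, and the actual on-disk label format of the datasets may differ.

```python
from enum import Enum


class Vulnerability(Enum):
    """The five vulnerability classes, keyed by abbreviation (hypothetical helper)."""
    IOU = "SWC-101"  # Integer Overflow and Underflow
    UCR = "SWC-104"  # Unchecked Call Return Value
    USI = "SWC-106"  # Unprotected SELFDESTRUCT Instruction
    REN = "SWC-107"  # Reentrancy
    TXO = "SWC-115"  # Authorization through Tx-origin


def from_swc(swc_id: str) -> Vulnerability:
    """Look up a vulnerability class by its SWC identifier, ignoring case and padding."""
    # Enum lookup by value raises ValueError for identifiers outside the five classes.
    return Vulnerability(swc_id.strip().upper())
```

For example, `from_swc("swc-107")` resolves to `Vulnerability.REN`, while an identifier outside the five classes raises `ValueError`.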
https://github.com/renardbebe/Smart-Contract-Benchmark-Suites/tree/master/dataset/UR
We label URD by manually confirming the detection results of Slither, Securify (including v2.0), and Mythril.
We collect vulnerable smart contracts from the SWC Registry, CVE, and SmartBugs Curated as ground truth for the five vulnerabilities listed above.
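
One way to organize the tool reports before manual confirmation is to group them by contract and vulnerability class. The sketch below is an illustration under assumptions, not the procedure used to build the datasets: the `(tool, contract, vulnerability)` tuple format and the `build_review_queue` function are hypothetical, and the tools' real output formats would need to be parsed into this shape first.

```python
from collections import defaultdict


def build_review_queue(findings):
    """Group tool findings into candidates for manual confirmation.

    `findings` is an iterable of (tool, contract, vulnerability) tuples,
    a hypothetical normalized form of the detectors' reports.
    Returns a list of ((contract, vulnerability), tools) pairs, with
    candidates reported by more tools listed first.
    """
    candidates = defaultdict(set)
    for tool, contract, vulnerability in findings:
        candidates[(contract, vulnerability)].add(tool)
    # Sort by number of agreeing tools (descending), then by key for stable order.
    return sorted(candidates.items(), key=lambda item: (-len(item[1]), item[0]))


if __name__ == "__main__":
    reports = [
        ("slither", "A.sol", "REN"),
        ("mythril", "A.sol", "REN"),
        ("securify", "B.sol", "TXO"),
    ]
    for (contract, vulnerability), tools in build_review_queue(reports):
        print(contract, vulnerability, sorted(tools))
```

Ordering by the number of agreeing tools is only a prioritization choice; every candidate still needs a human decision before it becomes a label.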