Ethereum Smart Contract Vulnerability Detection and Machine Learning-Driven Solutions: A Systematic Literature Review


Abstract

Emerging technologies like smart contracts (SCs) and blockchain promise enhanced data security, yet Ethereum-based SCs remain vulnerable to malicious attacks. Machine learning (ML) methods offer a viable alternative to traditional vulnerability detection techniques, though current approaches often rely heavily on expert knowledge and focus narrowly on known vulnerabilities. This systematic literature review (SLR) examines 55 papers (2019–2024) to classify ML-driven solutions into three categories: classical models, deep learning, and ensemble models. Key contributions include:


1. Introduction

Smart contracts automate agreements via blockchain, reducing risks and costs while improving efficiency. However, their immutability and the lack of assessment standards make them prime targets for attackers. Traditional detection techniques (static and dynamic analysis, symbolic execution, formal verification, fuzz testing) depend on manually written rules and are inefficient at identifying novel vulnerabilities. ML models can improve detection speed and accuracy, yet systematic reviews focusing on ML-driven SC vulnerability detection remain scarce.

Key Contributions:


2. Preliminaries

2.1 Key Terminology

2.2 Vulnerability Types

| Vulnerability | Description |
|---------------|-------------|
| Reentrancy | Allows repeated function calls during execution, enabling fund theft. |
| Timestamp Dependency | Relies on block variables for critical operations or randomness. |
| Arithmetic Overflow | Mathematical results exceed storage capacity, causing unexpected behavior. |
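
To make the arithmetic overflow entry concrete, the following minimal sketch emulates fixed-width unsigned arithmetic in Python. It uses an 8-bit width for readability; the function names are hypothetical and chosen only for this illustration, and the code models the general wraparound behavior rather than any specific compiler or contract.

```python
UINT8_MAX = 2**8 - 1  # 255, the largest value an 8-bit unsigned integer can hold


def unchecked_add_uint8(a: int, b: int) -> int:
    """Add two uint8 values with silent wraparound (hypothetical helper)."""
    # Masking keeps only the low 8 bits, which is what fixed-width storage does.
    return (a + b) & UINT8_MAX


def checked_add_uint8(a: int, b: int) -> int:
    """Add two uint8 values, raising instead of wrapping (hypothetical helper)."""
    result = a + b
    if result > UINT8_MAX:
        raise OverflowError(f"{a} + {b} exceeds {UINT8_MAX}")
    return result


# 250 + 10 = 260 does not fit in 8 bits, so the unchecked result wraps to 4.
wrapped = unchecked_add_uint8(250, 10)
print(wrapped)
```

A balance that silently wraps from a large value to a small one (or the reverse) is the "unexpected behavior" the table refers to; the checked variant shows the usual mitigation of failing loudly instead.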

2.3 Machine Learning Techniques

2.4 Class Imbalance Solutions


3. Related Work

Comparative Analysis of Existing Surveys:
| Study | Focus Area | Year | Key Limitations |
|-------|-----------|------|----------------|
| [28] | SC Security | 2019 | Limited search domain clarity. |
| [35] | ML for SC Vulnerabilities | 2022 | Neglects class imbalance. |
| [38] | SC Platforms | 2023 | Lacks coverage of unknown vulnerabilities. |

Gaps Addressed by This SLR:


4. Methodology

4.1 Research Questions

  1. RQ1: Which ML techniques are used in vulnerability detection tools?
  2. RQ2: How do ML tools address class imbalance?
  3. RQ3: Which frameworks detect unknown vulnerabilities?

4.2 PRISMA-Based Selection


5. Taxonomy of ML-Driven Solutions

5.1 Machine Learning Models

Classical Models

Deep Learning

Ensemble Learning

5.2 Class Imbalance Solutions

5.3 Unknown Vulnerability Detection


6. Comparative Analysis

RQ1: ML Techniques in Vulnerability Detection

RQ2/RQ3: Class Imbalance & Unknown Vulnerabilities

| Framework | Class Imbalance Solution | Unknown Vulnerabilities Addressed |
|-----------------|---------------------------|-----------------------------------|
| MODNN [52] | Focal Loss | Yes |
| ContractWard [24] | SMOTE | No |
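
The two class imbalance techniques named in the table can be sketched in a few lines. The code below is an illustration of the general ideas only: it uses the standard binary focal loss formula and a simplified SMOTE-style interpolation (one nearest neighbor instead of a random choice among k), and it is not the configuration used by MODNN or ContractWard. The default `gamma` and `alpha` values, the function names, and the toy feature vectors are assumptions.

```python
import math
import random


def binary_focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Focal loss for one sample: -alpha_t * (1 - p_t)**gamma * log(p_t).

    p is the predicted probability of the positive (vulnerable) class,
    y is the true label (1 = vulnerable, 0 = safe).
    """
    eps = 1e-12  # guards against log(0)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))


def smote_like_sample(minority, rng):
    """Create one synthetic point between a minority sample and its nearest minority neighbor."""
    base = rng.choice(minority)
    neighbor = min(
        (m for m in minority if m is not base),
        key=lambda m: math.dist(base, m),
    )
    t = rng.random()  # position along the segment from base to neighbor
    return [b + t * (n - b) for b, n in zip(base, neighbor)]


# Focal loss: a confident correct prediction contributes far less than a missed one.
easy = binary_focal_loss(0.95, 1)
hard = binary_focal_loss(0.10, 1)

# SMOTE-style oversampling on hypothetical 2-D feature vectors of vulnerable contracts.
minority = [[1.0, 1.0], [1.2, 0.9], [3.0, 3.0]]
synthetic = smote_like_sample(minority, random.Random(0))
print(easy < hard, synthetic)
```

The two approaches act at different stages: focal loss reweights the training objective so that hard, rare examples dominate the gradient, while SMOTE-style oversampling changes the training data itself by adding interpolated minority samples.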



7. Discussion


8. Conclusions & Future Work

Key Takeaways:

Future Directions:


FAQs

Q1: Which ML model is best for detecting Reentrancy vulnerabilities?
A1: GNNs (e.g., DA-GNN [64]) excel by modeling control flow and data dependencies.
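
As a rough intuition for why graph models suit this task, the sketch below runs one round of mean-aggregation message passing over a toy control-flow graph. It is not the DA-GNN architecture and has no learned weights; the node names, the two hand-picked features, and the `message_passing_step` helper are all hypothetical.

```python
def message_passing_step(features, edges):
    """One round of mean aggregation: each node averages itself with its predecessors."""
    incoming = {node: [node] for node in features}
    for src, dst in edges:
        incoming[dst].append(src)
    updated = {}
    for node, sources in incoming.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[s][i] for s in sources) / len(sources) for i in range(dim)
        ]
    return updated


# Toy control-flow graph of a withdraw function.
# Feature vector per node: [makes_external_call, writes_state]
features = {
    "check_balance": [0.0, 0.0],
    "external_call": [1.0, 0.0],
    "update_balance": [0.0, 1.0],
}
edges = [("check_balance", "external_call"), ("external_call", "update_balance")]

updated = message_passing_step(features, edges)
print(updated["update_balance"])
```

After one step, the `update_balance` node carries a nonzero "external call" component, meaning its representation now encodes that a state write follows an external call. A trained GNN learns weighted versions of this kind of neighborhood information from data.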

Q2: How can class imbalance impact vulnerability detection?
A2: It biases models toward majority classes, increasing false negatives for rare vulnerabilities.
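
A small worked example shows how this bias can hide behind a good-looking accuracy figure. The dataset below is hypothetical (990 safe contracts and 10 vulnerable ones), and the "model" is a trivial baseline that always predicts the majority class.

```python
def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive samples that were predicted positive."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in predictions_for_positives) / len(predictions_for_positives)


# Hypothetical dataset: 0 = safe, 1 = vulnerable.
labels = [0] * 990 + [1] * 10
always_safe = [0] * len(labels)  # baseline that never flags anything

acc = accuracy(labels, always_safe)
rec = recall(labels, always_safe)
print(f"accuracy={acc:.2f} recall={rec:.2f}")
```

The baseline reaches 99% accuracy while missing every vulnerable contract, which is why recall or similar per-class metrics are more informative than accuracy on imbalanced data.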

Q3: Are unknown vulnerabilities detectable without predefined rules?
A3: Yes, via novelty detection (e.g., SAGP [71]) and anomaly-based approaches.
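
The general idea behind anomaly-based detection can be sketched as a distance check against known samples. This is a generic centroid-and-threshold detector, not the SAGP method; the feature vectors, the `margin` parameter, and both function names are assumptions made for illustration.

```python
import math


def fit_novelty_detector(known):
    """Return the centroid of known samples and the largest distance to it."""
    dim = len(known[0])
    centroid = [sum(x[i] for x in known) / len(known) for i in range(dim)]
    threshold = max(math.dist(x, centroid) for x in known)
    return centroid, threshold


def is_novel(sample, centroid, threshold, margin=1.5):
    """Flag a sample that lies well outside the region covered by known samples."""
    return math.dist(sample, centroid) > margin * threshold


# Hypothetical 2-D feature vectors extracted from contracts with known behavior.
known = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.15], [0.10, 0.10]]
centroid, threshold = fit_novelty_detector(known)

print(is_novel([0.12, 0.16], centroid, threshold))  # close to the known samples
print(is_novel([0.90, 0.80], centroid, threshold))  # far from anything seen before
```

No vulnerability-specific rule appears in the code: a contract is flagged only because its features differ from everything the detector was fitted on, which is what allows such approaches to surface previously unseen patterns.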
