Abstract
This paper presents an agent-based model of double-spending attacks in fast-payment scenarios, built on reinforcement learning. The attacker trains an agent to select an optimal helper node from which to propagate the attack transaction while evading the random-walk detection conducted by a supervisor. Our results show that when the supervisor walks only a few random steps, the attacker selects propagation nodes more freely, aiming to maximize the probability that the attack transaction is included in the next block. As the supervisor's walk steps increase, the attacker becomes more covert to avoid detection. With a further increase in walk steps, the attacker's strategy shifts back toward maximizing the likelihood that attack transactions are recorded. Experimental results show that when the supervisor checks only itself, the attacking quality of the helper node selected by the agent improves by 68% over a randomly selected node. When the supervisor randomly walks 3 steps, the agent's advantage shrinks by 62% relative to the self-checking case. As walk steps increase further, the advantage of our model approaches the self-checking situation; at that point, the attacking quality of double-spending is already negative, and the supervisor easily discovers the attack.
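The trade-off the abstract describes — a helper close to the victim raises the inclusion chance but a longer supervisor walk raises the detection chance — can be illustrated with a toy sketch. Everything below (the ring topology, the `helper_reward` shape, the penalty weight) is our own illustrative assumption, not the paper's actual model:

```python
import random
from collections import deque

def ring_graph(n):
    """Toy topology: node i connects to (i - 1) % n and (i + 1) % n."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def walk_hits(adj, start, target, steps, trials, rng):
    """Fraction of random walks of `steps` hops from `start` that visit `target`
    (an empirical estimate of the supervisor's detection probability)."""
    hits = 0
    for _ in range(trials):
        node = start
        visited = {node}
        for _ in range(steps):
            node = rng.choice(adj[node])
            visited.add(node)
        hits += target in visited
    return hits / trials

def bfs_dist(adj, a, b):
    """Hop distance between a and b via breadth-first search."""
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, d = queue.popleft()
        if node == b:
            return d
        for nb in adj[node]:
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, d + 1))
    return float("inf")

def helper_reward(adj, helper, victim, supervisor, steps, rng,
                  penalty=2.0, trials=400):
    """Toy reward: closeness to the victim proxies the inclusion probability,
    minus a penalty scaled by the chance the supervisor's walk sees the helper."""
    inclusion = 1.0 / (1 + bfs_dist(adj, helper, victim))
    detection = walk_hits(adj, supervisor, helper, steps, trials, rng)
    return inclusion - penalty * detection

rng = random.Random(0)
adj = ring_graph(12)
victim, supervisor = 6, 0
candidates = [n for n in adj if n not in (victim, supervisor)]
# Greedy stand-in for the trained agent: pick the highest-reward helper.
best = max(candidates,
           key=lambda h: helper_reward(adj, h, victim, supervisor,
                                       steps=2, rng=rng))
```

With a short walk (`steps=2`) the supervisor can only see nodes within two hops of itself, so the greedy choice lands near the victim; increasing `steps` spreads the detection penalty and pushes the choice away, mirroring the covertness shift reported in the abstract.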
| Original language | English |
|---|---|
| Article number | 111942 |
| Number of pages | 17 |
| Journal | Computer Networks |
| Volume | 275 |
| Early online date | 25 Dec 2025 |
| DOIs | |
| Publication status | E-pub ahead of print - 25 Dec 2025 |
Keywords
- Agent-based modeling
- Deep reinforcement learning
- Double-spending attacks