TY - GEN
T1 - Surrogate-guided adversarial attacks
T2 - 5th IEEE International Conference on Cyber Security and Resilience, CSR 2025
AU - Asimopoulos, Dimitrios Christos
AU - Radoglou-Grammatikis, Panagiotis
AU - Fouliras, Panagiotis
AU - Panitsidis, Konstandinos
AU - Efstathopoulos, Georgios
AU - Lagkas, Thomas
AU - Argyriou, Vasileios
AU - Kotsiuba, Igor
AU - Sarigiannidis, Panagiotis
PY - 2025/8/26
Y1 - 2025/8/26
N2 - Adversarial attacks pose significant threats to machine learning models, with white-box attacks such as the Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), and the Basic Iterative Method (BIM) achieving high success rates when model gradients are accessible. However, in real-world scenarios, direct access to model internals is often restricted, necessitating black-box attack strategies that typically suffer from lower effectiveness. In this work, we propose a novel approach to transform white-box attacks into black-box attacks by leveraging state-of-the-art surrogate models, including Multilayer Perceptrons (MLP) and XGBoost (XGB). Our method involves training a surrogate model to mimic the decision boundaries of an inaccessible target model using pseudo-labeling, thereby enabling the application of gradient-based white-box attacks in a black-box setting. We systematically compare our approach against conventional black-box attacks, such as Zeroth Order Optimization (ZOO), evaluating their effectiveness in terms of attack success rates, transferability, and computational efficiency. The results demonstrate that surrogate-assisted attacks perform as well as standard black-box methods, bridging the performance gap between white-box and black-box adversarial attacks. This study highlights the power of surrogate models in enhancing adversarial transferability and provides insights into the robustness of different machine learning architectures against adversarial threats.
AB - Adversarial attacks pose significant threats to machine learning models, with white-box attacks such as the Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), and the Basic Iterative Method (BIM) achieving high success rates when model gradients are accessible. However, in real-world scenarios, direct access to model internals is often restricted, necessitating black-box attack strategies that typically suffer from lower effectiveness. In this work, we propose a novel approach to transform white-box attacks into black-box attacks by leveraging state-of-the-art surrogate models, including Multilayer Perceptrons (MLP) and XGBoost (XGB). Our method involves training a surrogate model to mimic the decision boundaries of an inaccessible target model using pseudo-labeling, thereby enabling the application of gradient-based white-box attacks in a black-box setting. We systematically compare our approach against conventional black-box attacks, such as Zeroth Order Optimization (ZOO), evaluating their effectiveness in terms of attack success rates, transferability, and computational efficiency. The results demonstrate that surrogate-assisted attacks perform as well as standard black-box methods, bridging the performance gap between white-box and black-box adversarial attacks. This study highlights the power of surrogate models in enhancing adversarial transferability and provides insights into the robustness of different machine learning architectures against adversarial threats.
KW - Adversarial attacks
KW - Black-box
KW - evasion
KW - surrogate-model
KW - transferability
KW - white-box
U2 - 10.1109/CSR64739.2025.11130067
DO - 10.1109/CSR64739.2025.11130067
M3 - Conference contribution
AN - SCOPUS:105016105310
SN - 9798331535926
T3 - Proceedings of the 2025 IEEE International Conference on Cyber Security and Resilience, CSR 2025
SP - 950
EP - 956
BT - Proceedings of the 2025 IEEE International Conference on Cyber Security and Resilience, CSR 2025
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 4 August 2025 through 6 August 2025
ER -