Authors: Rance W. Whitfield 1 and Ross J. Anderson 2,*
JSA Vol. 4 (2025)
1 University of Bath, Bath, Somerset, U.K.
2 University of Cambridge, Cambridge, Cambridgeshire, U.K.
* Correspondence: ross.anderson@cl.cam.ac.uk
Received: 25 February 2024; Accepted: 10 May 2025; Published: 15 June 2025.
Abstract: The security of deployed neural networks is commonly addressed through data sanitization, robust training, and inference-time defenses. However, recent studies demonstrate that these approaches are insufficient against attacks introduced during the model compilation stage, where malicious logic can be embedded without modifying training data, model architecture, or learned parameters. Such compiler-stage backdoors fundamentally undermine traditional trust assumptions in machine learning pipelines. In this paper, we propose a verifiable and provenance-aware compilation framework for secure neural network deployment. The framework establishes end-to-end trust by linking deployed inference binaries to their source models through reproducible builds, intermediate-representation traceability, translation validation, and cryptographic attestation. We formalize the threat model for compiler-stage attacks, analyze trust gaps in modern machine learning compilers, and present a system architecture that enables post-hoc verification and auditability. Our approach addresses the root cause of compiler-inserted backdoors and provides a principled foundation for trustworthy machine learning systems in safety-critical and adversarial environments.
Keywords: Neural network security, ML compilers, Provenance, Supply-chain attacks, Translation validation, Trustworthy AI