AI Can Be Hacked With a Simple 'Typo' in Its Memory, New Study Claims
Researchers at George Mason University have unveiled a new attack, dubbed 'Oneflip,' that backdoors deep learning models by flipping a single bit of a model's weights in memory. The compromised model appears to operate normally while harboring a hidden backdoor: for instance, a self-driving car could be made to misread a stop sign as a green light when a specific trigger is present. Because the model's accuracy remains largely unaffected except when the trigger appears, the manipulation is nearly impossible to detect, even in models that otherwise function with high accuracy.

The bit flip itself is achieved through a known hardware vulnerability called 'Rowhammer,' in which rapidly and repeatedly accessing one row of DRAM can flip bits in physically adjacent rows. While the exploit currently demands considerable technical sophistication, the researchers warn of its potential misuse in high-stakes environments such as finance and healthcare, where AI systems are increasingly integrated. The study emphasizes the need for enhanced security measures at the hardware level, since traditional software defenses may not be sufficient to stop such subtle yet destructive attacks.
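To see why a single bit can matter so much, consider how weights are stored. The sketch below (not the study's actual method; `flip_bit` is a hypothetical helper) flips one bit of a float32 value's IEEE-754 representation, showing that toggling a high exponent bit turns an ordinary-looking weight into an enormous one, which is enough to swing a network's output on triggering inputs:

```python
import struct

def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0-31) of a float32's IEEE-754 bit pattern."""
    # Reinterpret the float's bytes as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    bits ^= 1 << bit  # toggle the chosen bit
    # Reinterpret the modified bits as a float again.
    (flipped,) = struct.unpack("<f", struct.pack("<I", bits))
    return flipped

# A plausible-looking neural-network weight.
w = 0.5
# Flipping the second-highest exponent bit (bit 30) explodes its magnitude:
print(flip_bit(w, 30))  # ~1.7e38
# The change is reversible; flipping the same bit again restores the weight.
print(flip_bit(flip_bit(w, 30), 30))  # 0.5
```

This is why an attacker who can flip exactly the right bit via Rowhammer does not need write access to the model file at all: the corruption happens directly in DRAM while the model is loaded.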