TESSER: Transfer-Enhancing Adversarial Attacks from Vision Transformers via Spectral and Semantic Regularization

Published on arXiv (arXiv:2505.19613), 2025

Core Insight

Adversarial perturbations fail to transfer when they are too specific to the surrogate model.

TESSER shows that transferable attacks must preserve shared semantic and spectral structure, not just maximize loss on a surrogate model.



Motivation

Adversarial transferability enables black-box attacks by allowing perturbations crafted on one model to generalize to others.

However, transferability remains limited across architecture families, in both directions:

  • Vision Transformers → CNNs
  • CNNs → Vision Transformers

Why?

Because standard attacks:

  • overfit to local, architecture-specific features
  • rely on high-frequency artifacts that do not generalize

What Enables Transferability?

Transferable perturbations must preserve:

  • Semantic alignment → focus on meaningful features
  • Spectral consistency → avoid high-frequency noise

Transferability emerges from shared structure across models.
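
One way to make "spectral consistency" concrete is to measure how much of a perturbation's energy sits at high spatial frequencies. The sketch below is an illustration only, not a metric taken from the paper: the names dft2 and high_freq_fraction and the cutoff parameter are assumptions, and a naive DFT is used so the example needs nothing beyond the standard library.

```python
import cmath
import math


def dft2(x):
    """Naive 2D discrete Fourier transform of a square list of lists."""
    n = len(x)
    out = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0j
            for i in range(n):
                for j in range(n):
                    angle = -2 * math.pi * (u * i + v * j) / n
                    s += x[i][j] * cmath.exp(1j * angle)
            out[u][v] = s
    return out


def high_freq_fraction(delta, cutoff):
    """Share of spectral energy above `cutoff` (a hypothetical threshold)."""
    n = len(delta)
    spec = dft2(delta)
    total = 0.0
    high = 0.0
    for u in range(n):
        for v in range(n):
            # Fold indices so that bin n-1 counts as frequency 1, and so on.
            fu, fv = min(u, n - u), min(v, n - v)
            energy = abs(spec[u][v]) ** 2
            total += energy
            if max(fu, fv) > cutoff:
                high += energy
    return high / total if total else 0.0
```

Under this measure, a checkerboard pattern that flips sign at every pixel places essentially all of its energy above a low cutoff, while a single slow cosine wave places essentially none there.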


Intuition

Feature-Sensitive Gradient Scaling (FSGS)
→ reallocates gradients toward semantically meaningful tokens
→ suppresses non-transferable low-level features
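
The idea of token-level gradient scaling can be sketched as follows. This summary does not give the FSGS formula, so everything here is an assumption for illustration: the per-token importance scores are supplied by the caller, the scaling is a simple linear map between the hypothetical bounds low and high, and the function name is invented.

```python
def scale_token_gradients(grads, importance, low=0.5, high=1.5):
    """Rescale each token's gradient by a factor derived from its importance.

    grads:      list of per-token gradient vectors (lists of floats)
    importance: one score per token; higher means more semantically relevant
    low, high:  hypothetical bounds on the scaling factor
    """
    if len(grads) != len(importance):
        raise ValueError("need exactly one importance score per token")
    lo, hi = min(importance), max(importance)
    span = hi - lo
    scaled = []
    for g, s in zip(grads, importance):
        # Normalize importance to [0, 1]; if all tokens tie, use the midpoint.
        w = (s - lo) / span if span > 0 else 0.5
        factor = low + (high - low) * w
        scaled.append([factor * x for x in g])
    return scaled
```

With the default bounds, the least important token has its gradient halved and the most important token has its gradient amplified by 1.5, so the update is shifted toward the tokens judged most meaningful.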

Spectral Smoothness Regularization (SSR)
→ suppresses high-frequency components
→ promotes smoother, more transferable perturbations
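
A smoothness penalty of this kind could look like the sketch below. The SSR formulation itself is not given in this summary, so the example swaps in a simple stand-in: the squared distance between a perturbation and a 3x3 box-blurred copy of itself. The function names and the choice of blur are assumptions.

```python
def box_blur(x):
    """3x3 mean filter with edge clamping, acting as a crude low-pass filter."""
    rows, cols = len(x), len(x[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii = min(max(i + di, 0), rows - 1)
                    jj = min(max(j + dj, 0), cols - 1)
                    acc += x[ii][jj]
            out[i][j] = acc / 9.0
    return out


def smoothness_penalty(delta):
    """Sum of squared differences between delta and its blurred copy.

    The blur is linear, so this penalty is quadratic in delta and therefore
    differentiable; it grows with the high-frequency content of delta.
    """
    blurred = box_blur(delta)
    return sum(
        (d - b) ** 2
        for row_d, row_b in zip(delta, blurred)
        for d, b in zip(row_d, row_b)
    )
```

A constant perturbation incurs no penalty under this definition, whereas a checkerboard pattern is penalized heavily, which is the qualitative behavior the two arrows above describe.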


Method Overview

TESSER is a unified framework combining:

  1. FSGS
    → controls where gradients act (semantic structure)

  2. SSR
    → controls how perturbations behave (frequency structure)

Together, they enforce perturbations that are:

  • semantically aligned
  • spectrally coherent
  • robust across architectures
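
For orientation, many transfer-based attacks build on an iterative sign-gradient update under an L-infinity budget. The sketch below shows that generic step only; whether TESSER uses exactly this update rule is an assumption, as are the names linf_step, step_size, and epsilon. In such a setup, the gradient passed in would be the place where rescaling (FSGS) and a smoothness term (SSR) take effect.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def linf_step(delta, grad, step_size, epsilon):
    """One sign-gradient ascent step, clipped to the L-infinity ball.

    delta:     current perturbation, as a flat list of floats
    grad:      gradient of the attack objective with respect to delta
    step_size: per-step change applied to each coordinate
    epsilon:   maximum allowed absolute value of any coordinate
    """
    return [
        max(-epsilon, min(epsilon, d + step_size * sign(g)))
        for d, g in zip(delta, grad)
    ]
```

Only the sign of each gradient coordinate is used, so the magnitude of the gradient affects the update only through any rescaling that changes which coordinates are positive, negative, or zero.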

Abstract

Adversarial transferability enables black-box attacks by allowing perturbations crafted on a surrogate model to generalize to unseen targets. However, transferability remains limited across architectures, particularly from Vision Transformers (ViTs) to Convolutional Neural Networks (CNNs), due to over-reliance on local, architecture-specific features and high-frequency, non-transferable artifacts.

We propose TESSER (Transfer-Enhancing Semantic and Spectral Regularization), a unified adversarial attack framework that improves transferability by jointly controlling gradient allocation and spectral structure. TESSER integrates Feature-Sensitive Gradient Scaling (FSGS), which rebalances gradients toward semantically meaningful features, and Spectral Smoothness Regularization (SSR), which suppresses high-frequency components to promote smoother and more transferable perturbations.

Together, these mechanisms produce perturbations that are semantically aligned and spectrally coherent, reducing overfitting to surrogate-specific patterns. Experiments on ImageNet across diverse architectures—including ViTs, CNNs, hybrid models, and adversarially trained networks—show that TESSER consistently outperforms state-of-the-art transfer-based attacks.

These results demonstrate that transferability is governed by shared semantic and spectral structure, rather than raw optimization strength.


Key Contributions

  • Identifies semantic misalignment and spectral artifacts as key barriers to transferability
  • Proposes FSGS, a token-level gradient modulation strategy for semantic alignment
  • Introduces SSR, a differentiable regularization for spectral coherence
  • Unifies semantic and spectral control into a single framework
  • Achieves state-of-the-art transferability across ViTs, CNNs, and hybrid models
  • Provides an analysis linking perturbation structure to transferability and cross-architecture generalization

Results

On ImageNet, TESSER consistently outperforms state-of-the-art transfer-based attacks across ViTs, CNNs, hybrid models, and adversarially trained networks.

The resulting perturbations are semantically aligned and spectrally smooth.
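
Transfer attacks are commonly scored by attack success rate on a target model that was not used to craft the perturbations. The sketch below shows that computation for an untargeted attack; the function name is hypothetical, and this summary does not state which exact metric the paper reports.

```python
def transfer_success_rate(true_labels, target_predictions):
    """Fraction of adversarial examples that the target model misclassifies.

    true_labels:        ground-truth class of each adversarial example
    target_predictions: the unseen target model's prediction for each one
    """
    if len(true_labels) != len(target_predictions):
        raise ValueError("labels and predictions must have the same length")
    if not true_labels:
        return 0.0
    fooled = sum(1 for y, p in zip(true_labels, target_predictions) if y != p)
    return fooled / len(true_labels)
```

Computing this rate separately for each target architecture makes it possible to compare how well perturbations crafted on one surrogate carry over to ViTs, CNNs, and hybrid models.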


Why This Matters

TESSER reframes adversarial attack design:

Strong transfer attacks are not those that simply maximize loss on the surrogate,
but those that preserve structure shared across models.

This has implications for:

  • black-box attack design
  • robustness evaluation
  • cross-architecture generalization

Broader Perspective

TESSER complements a broader research framework:

  • TESSER → exploits alignment to improve transferability
  • DRIFT → breaks gradient alignment to improve robustness
  • TriQDef → breaks structural alignment across quantization

Together:

Alignment → Transferability → Vulnerability
Disruption → Divergence → Robustness


Citation

@article{guesmi2025tesser,
  title     = {TESSER: Transfer-Enhancing Adversarial Attacks from Vision Transformers via Spectral and Semantic Regularization},
  author    = {Guesmi, Amira and Ouni, Bassem and Shafique, Muhammad},
  journal   = {arXiv},
  volume    = {abs/2505.19613},
  year      = {2025}
}