TESSER: Transfer-Enhancing Adversarial Attacks from Vision Transformers via Spectral and Semantic Regularization

Published on arXiv, 2025

Core Insight

Adversarial transferability fails when perturbations are too specific to the source model.

TESSER shows that transferable attacks must preserve shared semantic and spectral structure, not just maximize loss on a surrogate model.

Motivation

Adversarial transferability enables black-box attacks by allowing perturbations crafted on one model to generalize to others.

However, transferability is limited across architectures:

Vision Transformers → CNNs
CNNs → Vision Transformers

Why?

Because standard attacks:

overfit to local, architecture-specific features
rely on high-frequency artifacts that do not generalize

What Enables Transferability?

Transferable perturbations must preserve:

Semantic alignment → focus on meaningful features
Spectral consistency → avoid high-frequency noise

Transferability emerges from shared structure across models.

Intuition

Feature-Sensitive Gradient Scaling (FSGS)
→ reallocates gradients toward semantically meaningful tokens
→ suppresses non-transferable low-level features
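
As a rough illustration of the gradient reallocation described above, the sketch below rescales per-token gradients by a token importance score. The importance measure (the L2 norm of each token's feature vector), the `gamma` floor, and both function names are assumptions made for this example; the paper's actual FSGS formulation may differ.

```python
import math


def token_importance(features):
    """Relative importance per token: L2 norm of its feature vector, normalized to sum to 1.

    Using the feature norm as a proxy for semantic relevance is an assumption.
    """
    norms = [math.sqrt(sum(v * v for v in tok)) for tok in features]
    total = sum(norms) or 1.0  # avoid division by zero when all features are zero
    return [n / total for n in norms]


def scale_token_gradients(grads, features, gamma=0.5):
    """Scale each token's gradient by a factor in [gamma, 1].

    The most important token keeps its full gradient; less important tokens
    are damped toward gamma (a hypothetical hyperparameter).
    """
    weights = token_importance(features)
    peak = max(weights) or 1.0
    scaled = []
    for g_tok, w in zip(grads, weights):
        factor = gamma + (1.0 - gamma) * (w / peak)
        scaled.append([factor * g for g in g_tok])
    return scaled


# Three tokens with two feature dimensions each; the second token is uninformative.
features = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
grads = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
scaled = scale_token_gradients(grads, features, gamma=0.5)
```

In this toy setup, the token with the largest feature norm keeps its gradient unchanged, while the zero-feature token has its gradient halved.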

Spectral Smoothness Regularization (SSR)
→ suppresses high-frequency components
→ promotes smoother, more transferable perturbations
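
To make the idea of suppressing high-frequency components concrete, the following sketch low-pass filters a small 2D perturbation and measures how much energy the filter removes. The 3x3 binomial kernel (a common approximation to a Gaussian), the edge replication, and the residual-based penalty are assumptions chosen for illustration, not the regularizer defined in the paper.

```python
# 3x3 binomial kernel, a common small approximation to a Gaussian; entries sum to 16.
KERNEL = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]


def gaussian_blur(delta):
    """Low-pass filter a 2D perturbation (list of rows) with replicated edges."""
    h, w = len(delta), len(delta[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii = min(max(i + di, 0), h - 1)  # clamp row index at the border
                    jj = min(max(j + dj, 0), w - 1)  # clamp column index at the border
                    acc += KERNEL[di + 1][dj + 1] * delta[ii][jj]
            out[i][j] = acc / 16.0
    return out


def high_freq_penalty(delta):
    """Mean squared residual between the perturbation and its low-pass version.

    A smooth perturbation scores near zero; a noisy one scores high.
    """
    low = gaussian_blur(delta)
    h, w = len(delta), len(delta[0])
    total = sum((delta[i][j] - low[i][j]) ** 2 for i in range(h) for j in range(w))
    return total / (h * w)


flat = [[0.5] * 4 for _ in range(4)]  # constant perturbation, no high frequencies
checker = [[1.0 if (i + j) % 2 == 0 else -1.0 for j in range(4)] for i in range(4)]
```

Here `high_freq_penalty(flat)` is zero, while the checkerboard pattern, which is almost entirely high-frequency, receives a large penalty that drops sharply after one pass of `gaussian_blur`.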

Method Overview

TESSER is a unified framework combining:

FSGS
→ controls where gradients act (semantic structure)
SSR
→ controls how perturbations behave (frequency structure)

Together, they enforce perturbations that are:

semantically aligned
spectrally coherent
robust across architectures
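
The interplay of the two components can be sketched as a single toy update step on a 1D perturbation. This is an illustrative assumption, not the paper's algorithm: the per-element `weights` stand in for FSGS, a `[1, 2, 1] / 4` low-pass filter blended in with the hypothetical `mix` parameter stands in for SSR, and `alpha` and `eps` are the usual step size and L-infinity budget of iterative attacks.

```python
def smooth_1d(v):
    """[1, 2, 1] / 4 low-pass filter with replicated edges."""
    n = len(v)
    return [(v[max(i - 1, 0)] + 2 * v[i] + v[min(i + 1, n - 1)]) / 4.0
            for i in range(n)]


def attack_step(delta, grad, weights, alpha=0.01, eps=0.03, mix=0.5):
    """One hypothetical update: reweight, ascend, smooth, project."""
    # Where gradients act: per-element reweighting (stand-in for FSGS).
    scaled = [w * g for w, g in zip(weights, grad)]
    peak = max(abs(s) for s in scaled) or 1.0
    # Ascent step along the max-normalized, reweighted gradient.
    stepped = [d + alpha * s / peak for d, s in zip(delta, scaled)]
    # How the perturbation behaves: blend with a low-pass copy (stand-in for SSR).
    low = smooth_1d(stepped)
    blended = [(1.0 - mix) * s + mix * l for s, l in zip(stepped, low)]
    # Project back onto the L-infinity ball of radius eps.
    return [min(max(b, -eps), eps) for b in blended]


def total_variation(v):
    """Sum of absolute differences between neighbors; lower means smoother."""
    return sum(abs(a - b) for a, b in zip(v, v[1:]))


delta = [0.0, 0.0, 0.0, 0.0]
grad = [1.0, -1.0, 1.0, -1.0]  # oscillating, high-frequency gradient
rough = attack_step(delta, grad, [1.0] * 4, eps=1.0, mix=0.0)
smooth = attack_step(delta, grad, [1.0] * 4, eps=1.0, mix=1.0)
```

With smoothing enabled (`mix=1.0`), the resulting perturbation has lower total variation than the unsmoothed one, and setting a weight to zero prevents the corresponding element from being updated by the gradient step.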

Abstract

Adversarial transferability enables black-box attacks by allowing perturbations crafted on a surrogate model to generalize to unseen targets. However, transferability remains limited across architectures, particularly from Vision Transformers (ViTs) to Convolutional Neural Networks (CNNs), due to over-reliance on local, architecture-specific features and high-frequency, non-transferable artifacts.

We propose TESSER (Transfer-Enhancing Semantic and Spectral Regularization), a unified adversarial attack framework that improves transferability by jointly controlling gradient allocation and spectral structure. TESSER integrates Feature-Sensitive Gradient Scaling (FSGS), which rebalances gradients toward semantically meaningful features, and Spectral Smoothness Regularization (SSR), which suppresses high-frequency components to promote smoother and more transferable perturbations.

Together, these mechanisms produce perturbations that are semantically aligned and spectrally coherent, reducing overfitting to surrogate-specific patterns. Experiments on ImageNet across diverse architectures—including ViTs, CNNs, hybrid models, and adversarially trained networks—show that TESSER consistently outperforms state-of-the-art transfer-based attacks.

These results demonstrate that transferability is governed by shared semantic and spectral structure, rather than raw optimization strength.

Key Contributions

Identifies semantic misalignment and spectral artifacts as key barriers to transferability
Proposes FSGS, a token-level gradient modulation strategy for semantic alignment
Introduces SSR, a differentiable regularization for spectral coherence
Unifies semantic and spectral control into a single framework
Achieves state-of-the-art transferability across ViTs, CNNs, and hybrid models
Provides analysis linking structure → transferability → generalization

Results

On ImageNet, TESSER consistently outperforms state-of-the-art transfer-based attacks across ViTs, CNNs, hybrid models, and adversarially trained networks.

It produces perturbations that are semantically aligned and spectrally smooth.

Why This Matters

TESSER reframes adversarial attack design:

Strong attacks are not those that maximize loss on the surrogate model,
but those that preserve shared structure across models.

This has implications for:

black-box attack design
robustness evaluation
cross-architecture generalization

Broader Perspective

TESSER complements a broader research framework:

TESSER → exploits alignment to improve transferability
DRIFT → breaks gradient alignment to improve robustness
TriQDef → breaks structural alignment across quantization

Together:

Alignment → Transferability → Vulnerability
Disruption → Divergence → Robustness

Citation

@article{guesmi2025tesser,
  title     = {TESSER: Transfer-Enhancing Adversarial Attacks from Vision Transformers via Spectral and Semantic Regularization},
  author    = {Guesmi, Amira and Ouni, Bassem and Shafique, Muhammad},
  journal   = {arXiv},
  volume    = {abs/2505.19613},
  year      = {2025}
}


Amira Guesmi