Deep neural network-based long-term time series forecasting (LTSF) models deployed as black-box APIs are vulnerable to adversarial attacks. We present TSFAdv, a query-efficient adversarial attack framework that exploits frequency-domain vulnerabilities in black-box LTSF models. Our method combines frequency-domain sensitivity analysis with sensitivity-guided Natural Evolution Strategies (NES) to generate targeted perturbations with improved query efficiency. Evaluations across five benchmark datasets and seven state-of-the-art LTSF architectures show that, under a 200-query budget, TSFAdv achieves average Dynamic Time Warping (DTW) improvements of 21.91-85.00% and Slope Misalignment Error (SME) improvements of 15.04-61.97% over baseline black-box attacks.
Figure: Fundamental difference between classification and regression models. Classification models learn discrete decision boundaries between classes, while regression models fit smooth continuous hyperplanes through data space. This distinction makes adversarial attacks on time series forecasting inherently more challenging, as perturbations must significantly shift continuous trajectories rather than simply cross decision boundaries.
Conventional time-domain adversarial attacks fail to capture the fundamental nature of forecasting models, which maintain continuous temporal dependencies rather than discrete decision boundaries. Our empirical analysis reveals that while time-domain perturbations elicit homogeneous responses across architectures, frequency-domain perturbations expose architecture-specific spectral vulnerabilities that can be exploited for more efficient attacks.
| 🔍 Analysis Dimension | Aspect | 📉 Time-domain | 🌊 Frequency-domain |
|---|---|---|---|
| 📊 Response Pattern | Characteristics | ❌ Diffuse, localized | ✅ Sharp, architecture-specific |
| 📊 Response Pattern | Discriminative Power | ❌ Limited capability | ✅ High discriminative |
| 🎯 Attack Efficiency | Vulnerability Discovery | ❌ Homogeneous responses | ✅ Model-specific patterns |
| 🎯 Attack Efficiency | Exploitation Success | ❌ Inefficient targeting | ✅ Efficient targeting |
Figure: Model vulnerability analysis comparing time-domain vs. frequency-domain perturbations across LTSF architectures. Top row: Time-domain perturbations (Gaussian bump position vs. spread) show diffuse, homogeneous responses with limited discriminative power across models. Bottom row: Frequency-domain perturbations (sinusoidal phase vs. frequency) reveal sharp, architecture-specific vulnerability patterns with distinct peaks. Color intensity represents RMSE deviation magnitude, demonstrating that frequency-domain analysis exposes model-specific weaknesses invisible in time-domain analysis.
Figure: Frequency-domain sensitivity profiling results showing amplitude and phase vulnerability patterns across different LTSF models. Results demonstrate that each architecture exhibits unique spectral fingerprints that can be exploited for efficient adversarial manipulation, providing the foundation for TSFAdv's sensitivity-guided attack strategy.
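A minimal sketch of how such a per-frequency profile could be computed is given here. It assumes the black-box model is a callable mapping a 1-D input window to a forecast array, and it measures sensitivity as the RMSE deviation of the forecast when a single rFFT bin is scaled (amplitude) or rotated (phase). The names `sensitivity_profile`, `toy_model`, and `delta` are hypothetical, and the exact sensitivity definition used by TSFAdv may differ.

```python
import numpy as np

def sensitivity_profile(model, x, delta=0.1):
    """Per-bin RMSE deviation of the forecast under small amplitude/phase nudges."""
    base = model(x)
    spec = np.fft.rfft(x)
    n_bins = spec.shape[0]
    s_amp = np.zeros(n_bins)
    s_phase = np.zeros(n_bins)
    for k in range(n_bins):
        # Amplitude nudge: scale bin k by (1 + delta).
        amp = spec.copy()
        amp[k] *= 1.0 + delta
        y = model(np.fft.irfft(amp, n=len(x)))
        s_amp[k] = np.sqrt(np.mean((y - base) ** 2))
        # Phase nudge: rotate bin k by delta radians.
        ph = spec.copy()
        ph[k] *= np.exp(1j * delta)
        y = model(np.fft.irfft(ph, n=len(x)))
        s_phase[k] = np.sqrt(np.mean((y - base) ** 2))
    return s_amp, s_phase

def toy_model(x):
    """Stand-in forecaster (assumption): low-pass filter, then return the last 8 samples."""
    spec = np.fft.rfft(x)
    spec[4:] = 0.0
    return np.fft.irfft(spec, n=len(x))[-8:]

t = np.arange(64)
x = np.sin(2 * np.pi * 2 * t / 64) + np.sin(2 * np.pi * 10 * t / 64)
s_amp, s_phase = sensitivity_profile(toy_model, x)
```

For this toy model the profile is nonzero only at the low-frequency bin that carries energy (bin 2) and essentially zero at bin 10, which the low-pass stage discards. Profiling costs two model queries per bin plus one for the baseline, which is presumably why it is run offline.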
TSFAdv integrates three core innovations:
- Sensitivity Profiling: Quantifies spectral perturbation vulnerabilities through frequency-domain analysis
- Spectrally-Constrained Optimization: Parameterizes perturbations in frequency domain for efficient search
- Sensitivity-Aware NES: Incorporates frequency-domain priors into Natural Evolution Strategies
Figure: Complete TSFAdv attack framework pipeline showing three main stages: (1) Offline sensitivity profiling quantifies model vulnerability to amplitude and phase perturbations across frequencies; (2) Frequency-domain Natural Evolution Strategies estimate gradients through stochastic sampling in spectral space; (3) Sensitivity-weighted optimization generates targeted perturbations by weighting gradient estimates with spectral vulnerability maps. This frequency-guided approach enables query-efficient black-box attacks on LTSF models.
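The spectrally-constrained parameterization can be illustrated with a short sketch. It assumes the perturbation is described by a few low-frequency complex rFFT coefficients, mapped back to the time domain, and then clipped to an L-infinity budget `eps`. The function name `spectral_to_time`, the choice of norm, and the example coefficients are assumptions for illustration, not details confirmed by this README.

```python
import numpy as np

def spectral_to_time(coeffs, length, eps):
    """Map a short vector of complex rFFT coefficients to a bounded time-domain perturbation."""
    full = np.zeros(length // 2 + 1, dtype=complex)
    full[: len(coeffs)] = coeffs             # only the lowest bins are searched
    delta = np.fft.irfft(full, n=length)     # real-valued signal of the requested length
    return np.clip(delta, -eps, eps)         # enforce the (assumed) L-infinity budget

# Three complex coefficients describe a perturbation over 32 time steps.
coeffs = np.array([0.0, 2.0 + 1.0j, 0.5j])
delta = spectral_to_time(coeffs, length=32, eps=0.05)
```

The point of the design is the reduced search dimension: the optimizer adjusts a handful of spectral coefficients rather than every time step.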
- Frequency-Domain Sensitivity Analysis: Measures amplitude and phase perturbation sensitivities $S_{\text{amp}}(k)$ and $S_{\text{phase}}(k)$
- Sensitivity-Weighted Gradient Estimation: Modulates conventional NES with frequency priors: $\hat{g} = \frac{1}{m} \sum_{k=1}^{m} \left(\frac{\ell_k^{+} - \ell_k^{-}}{2\sigma}\right) (u_k \odot \tilde{w})$
- Trajectory-Level Evaluation: Uses Dynamic Time Warping (DTW) and Slope Misalignment Error (SME) for comprehensive attack assessment
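The sensitivity-weighted estimator above can be sketched directly from the formula. The sketch assumes that $\ell_k^{+}$ and $\ell_k^{-}$ are the attack losses at $\theta + \sigma u_k$ and $\theta - \sigma u_k$ (antithetic sampling), that $u_k$ is standard Gaussian, and that $\tilde{w}$ is a precomputed vector of normalized sensitivity weights. The names `nes_gradient`, `loss`, `theta`, and `w_tilde` are hypothetical.

```python
import numpy as np

def nes_gradient(loss, theta, w_tilde, sigma=0.1, m=20, rng=None):
    """Antithetic NES gradient estimate, weighted element-wise by w_tilde."""
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(theta, dtype=float)
    for _ in range(m):
        u = rng.standard_normal(theta.shape)
        l_plus = loss(theta + sigma * u)     # one black-box query
        l_minus = loss(theta - sigma * u)    # one black-box query
        grad += (l_plus - l_minus) / (2.0 * sigma) * (u * w_tilde)
    return grad / m

# Toy check with a linear loss, whose true gradient is `a`.
a = np.array([1.0, -2.0, 3.0, 0.5])
w = np.array([1.0, 0.5, 0.0, 1.0])           # zero weight masks the third coordinate
grad = nes_gradient(lambda th: float(a @ th), np.zeros(4), w,
                    m=2000, rng=np.random.default_rng(0))
```

In expectation the estimate equals the true gradient multiplied element-wise by $\tilde{w}$, so coordinates with zero sensitivity weight receive no update. The large `m` here only stabilizes the toy check; each estimate costs `2 * m` queries, so an actual attack under a tight budget would use far fewer samples.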
Figure: Comprehensive performance comparison of TSFAdv against baseline attacks across seven LTSF architectures and five datasets. Results show DTW and SME improvements over baseline methods, with TSFAdv consistently achieving 21.91-85.00% DTW improvements and 15.04-61.97% SME improvements under 200-query constraints, demonstrating the effectiveness of frequency-guided attacks across diverse model architectures.
| Metric | Improvement Range | Significance |
|---|---|---|
| 🎯 DTW Improvements | 21.91% - 85.00% | High statistical significance |
| 📈 SME Improvements | 15.04% - 61.97% | Consistent across datasets |
| ⚡ Query Efficiency | 70-80% effectiveness in 200 queries | 5x faster convergence |
| 🏗️ Architecture Coverage | 7 diverse LTSF models | Universal applicability |
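For readers unfamiliar with the DTW metric reported above, a minimal reference sketch follows. It assumes the classic dynamic-programming formulation with absolute difference as the local cost and no warping window; the evaluation code in this repository may use a different cost or constraint. SME is not sketched because its definition is not given in this README.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] matched again
                                 cost[i][j - 1],      # b[j-1] matched again
                                 cost[i - 1][j - 1])  # advance both series
    return cost[n][m]

shifted = dtw_distance([0, 1, 2, 3], [0, 0, 1, 2, 3])   # same shape, delayed by one step
offset = dtw_distance([0, 0, 0], [1, 1, 1])             # constant vertical offset
```

Because DTW tolerates temporal shifts but not changes in trajectory shape or level, a larger DTW between the clean and attacked forecasts indicates a more substantial distortion of the predicted trajectory.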
Perturbation Budget. TSFAdv consistently outperforms all baselines across the entire perturbation spectrum, with performance gaps widening as ε increases.
Figure: Perturbation budget analysis showing attack effectiveness as a function of perturbation magnitude ε. TSFAdv consistently outperforms all baseline methods across the entire perturbation spectrum, with performance gaps widening as ε increases. This demonstrates that frequency-domain attacks maintain effectiveness under both conservative and aggressive perturbation constraints, highlighting the robustness of spectral vulnerability exploitation.
Query Budget. TSFAdv achieves rapid early convergence (70-80% of maximum effectiveness within 200 queries) across architectures.
Figure: Query efficiency analysis showing attack success rate as a function of query budget. TSFAdv achieves rapid convergence, reaching 70-80% of maximum effectiveness within 200 queries, compared to baseline methods requiring 1000+ queries for similar performance. This demonstrates approximately 5x faster convergence, making TSFAdv practical for real-world scenarios where query access is limited or monitored.
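To make the query budget concrete, the sketch below wraps a black-box model in a counter that refuses calls once the budget is spent, and computes how many antithetic NES steps fit in a budget. It assumes each gradient estimate with `m` samples costs `2 * m` queries, one for each of the two loss evaluations per sample. `BudgetedOracle`, `nes_steps_within_budget`, and the value `m=10` are hypothetical.

```python
class BudgetedOracle:
    """Wraps a black-box model and enforces a hard cap on the number of queries."""

    def __init__(self, model, max_queries=200):
        self.model = model
        self.max_queries = max_queries
        self.used = 0

    def __call__(self, x):
        if self.used >= self.max_queries:
            raise RuntimeError("query budget exhausted")
        self.used += 1
        return self.model(x)

def nes_steps_within_budget(budget, m):
    """Number of full antithetic NES estimates (2 * m queries each) that fit in the budget."""
    return budget // (2 * m)

oracle = BudgetedOracle(lambda x: 2 * x, max_queries=3)
outputs = [oracle(i) for i in range(3)]       # uses the whole budget
steps = nes_steps_within_budget(200, m=10)    # 200 // 20
```

With 200 queries and 10 samples per estimate, only 10 gradient steps are available, which is why the quality of each estimate matters so much in this setting.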
Baseline approaches exhibit negligible correlation with sensitivity patterns, whereas TSFAdv demonstrates pronounced alignment with frequency-domain sensitivity profiles.
Figure: Frequency-domain sensitivity alignment analysis comparing TSFAdv with baseline approaches. Results show that baseline methods exhibit negligible correlation with model sensitivity patterns, while TSFAdv demonstrates pronounced alignment with frequency-domain sensitivity profiles. This validates the effectiveness of sensitivity-guided gradient estimation in exploiting architecture-specific spectral vulnerabilities.
Existing defenses exhibit limited effectiveness against frequency-domain attacks. Randomized smoothing counterproductively increases both DTW (+4.8%) and SME (+38.2%), as Gaussian noise amplifies spectral perturbations.
Figure: Defense mechanism evaluation against TSFAdv attacks. Results show that existing defenses exhibit limited effectiveness against frequency-domain attacks. Counterintuitively, randomized smoothing increases attack effectiveness (+4.8% DTW, +38.2% SME) as Gaussian noise amplifies frequency-domain perturbations. This highlights the need for specialized defense mechanisms specifically designed to protect against spectral adversarial attacks in time series forecasting systems.
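For context, randomized smoothing in a forecasting setting is usually implemented by averaging forecasts over Gaussian-noised copies of the input. The sketch below shows that generic form only; the noise level, sample count, and the names `smoothed_forecast` and `toy_model` are assumptions, and the defense configuration evaluated above is not specified in this README.

```python
import numpy as np

def smoothed_forecast(model, x, noise_std=0.1, n_samples=32, rng=None):
    """Average the model's forecast over Gaussian-perturbed copies of the input window."""
    rng = np.random.default_rng() if rng is None else rng
    preds = [model(x + rng.normal(0.0, noise_std, size=x.shape))
             for _ in range(n_samples)]
    return np.mean(preds, axis=0)

def toy_model(x):
    """Stand-in forecaster (assumption): doubles the last four input values."""
    return 2.0 * x[-4:]

x = np.linspace(0.0, 1.0, 16)
clean = toy_model(x)
no_noise = smoothed_forecast(toy_model, x, noise_std=0.0)
smoothed = smoothed_forecast(toy_model, x, noise_std=0.1,
                             rng=np.random.default_rng(0))
```

With `noise_std=0.0` the wrapper reduces to the undefended model, which gives a convenient baseline when measuring how a given noise level changes DTW and SME under attack.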
For more detailed visualizations and high-resolution figures, see the `pics/` folder, which contains:
- Extended experimental results and analysis plots
- High-resolution versions of all figures for publications
- Additional case studies and ablation study visualizations
# Environment Setup
conda create -n tsfadv python=3.11
conda activate tsfadv
pip install -r requirements.txt
# Full Evaluation
bash run_all.sh

- Datasets: ETTh1, ETTh2, ETTm1, ETTm2, Weather
- Target Models: DLinear, Autoformer, PatchTST, iTransformer, TimesNet, TimeMixer, TimeXer
- Evaluation Metrics: Dynamic Time Warping (DTW), Slope Misalignment Error (SME), MSE, MAE
We thank the following open-source projects for their valuable contributions:
This project is licensed under the MIT License - see the LICENSE file for details.