[Paper] Generating Financial Time Series by Matching Random Convolutional Features

Published: June 3, 2026 at 01:46 PM EDT

Source: arXiv - 2606.05138v1

Overview

The paper introduces SOCK (SOft Competing Kernels), a fully differentiable random‑convolutional feature extractor that can be used to train generative models for financial time series. By matching these random features between real and synthetic data, the authors report more realistic synthetic price paths, especially when only a handful of historical trajectories are available.

Key Contributions

Differentiable Random Convolutional Features: Proposes SOCK, the first random‑convolutional map that is end‑to‑end differentiable, enabling gradient‑based training of generators.
Improved Generator Training: Shows that matching SOCK features yields generators that consistently beat state‑of‑the‑art baselines based on path signatures and diffusion models on small‑sample financial datasets.
Broad Empirical Validation: Demonstrates SOCK’s versatility on two‑sample hypothesis testing and time‑series classification, where it matches or exceeds existing unsupervised feature maps (e.g., ROCKET, Hydra).
Practical Toolkit: Provides an open‑source implementation that integrates easily with popular deep‑learning frameworks (PyTorch, TensorFlow).
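
The two‑sample testing use case mentioned above can be illustrated with a permutation test on feature vectors. This is a minimal sketch, not the paper's procedure: the statistic (squared distance between mean feature vectors), the function names `mean_gap` and `permutation_test`, and the toy Gaussian vectors standing in for feature‑map outputs are all assumptions made for illustration.

```python
import random


def mean_vector(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def mean_gap(xs, ys):
    """Squared distance between the mean feature vectors of two samples."""
    return sum((a - b) ** 2 for a, b in zip(mean_vector(xs), mean_vector(ys)))


def permutation_test(xs, ys, num_permutations=200, seed=0):
    """p-value for H0 'both samples share one distribution', via label shuffling."""
    rng = random.Random(seed)
    observed = mean_gap(xs, ys)
    pooled = list(xs) + list(ys)
    at_least_as_large = 0
    for _ in range(num_permutations):
        rng.shuffle(pooled)
        # Re-split the shuffled pool into two groups of the original sizes.
        if mean_gap(pooled[:len(xs)], pooled[len(xs):]) >= observed:
            at_least_as_large += 1
    # The +1 terms keep the p-value strictly positive.
    return (1 + at_least_as_large) / (1 + num_permutations)


# Toy feature vectors (assumed data, not SOCK outputs): sample_b is shifted.
rng = random.Random(42)
sample_a = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(20)]
sample_b = [[rng.gauss(5.0, 1.0) for _ in range(4)] for _ in range(20)]

p_value = permutation_test(sample_a, sample_b)
print(p_value, "reject H0 at the 5% level" if p_value < 0.05 else "fail to reject H0")
```

In this setup a feature map only enters through the vectors passed to `permutation_test`, so any extractor (ROCKET, Hydra, or a SOCK implementation) could supply them.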

Methodology

Random Convolutional Kernels: SOCK draws a large set of 1‑D convolutional kernels from a simple distribution (e.g., Gaussian). Each kernel is applied to the input series, and the responses are then pooled over time (e.g., max or mean).
Soft Competition Layer: To make the whole pipeline differentiable, the authors replace the hard arg‑max selection used in ROCKET‑style methods (for example, the competing kernels of Hydra) with a softmax‑weighted combination of kernel responses. This “soft competition” preserves the expressive power of random convolutions while allowing gradients to flow back to the generator.
Feature Matching Objective: A generator \(G\) receives a noise vector and outputs a synthetic series. The loss is the squared Euclidean distance between the average SOCK feature vectors of real data \(\{x_i\}_{i=1}^{N}\) and generated data \(\{G(z_j)\}_{j=1}^{M}\):
\[ \mathcal{L}_{\text{SOCK}} = \Big\| \frac{1}{N}\sum_{i=1}^{N} \phi_{\text{SOCK}}(x_i) - \frac{1}{M}\sum_{j=1}^{M} \phi_{\text{SOCK}}(G(z_j)) \Big\|_2^2 \]
where \(\phi_{\text{SOCK}}\) denotes the differentiable random‑conv feature map.
Training Loop: The generator is updated with standard stochastic gradient descent (or Adam) using \(\mathcal{L}_{\text{SOCK}}\). No discriminator is required, sidestepping over‑fitting issues common in GAN‑style adversarial training with tiny datasets.
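
The steps above can be sketched in plain Python. This is an illustrative reading of the description, not the authors' implementation: the grouping of kernels, the `temperature` parameter, time‑averaging of the softmax weights, and every function name are assumptions. The hard variant is included only to show what the softmax relaxes.

```python
import math
import random


def make_kernels(num_groups, kernels_per_group, length, rng):
    """Draw Gaussian kernels, organised into groups whose members compete."""
    return [[[rng.gauss(0.0, 1.0) for _ in range(length)]
             for _ in range(kernels_per_group)]
            for _ in range(num_groups)]


def convolve(series, kernel):
    """Valid-mode sliding dot product of one kernel with one series."""
    k = len(kernel)
    return [sum(w * series[t + i] for i, w in enumerate(kernel))
            for t in range(len(series) - k + 1)]


def softmax(values, temperature):
    """Numerically stable softmax; a small temperature approaches a hard arg-max."""
    peak = max(values)
    exps = [math.exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def soft_features(series, kernels, temperature=1.0):
    """One feature per kernel: its softmax weight within its group, averaged over time."""
    features = []
    for group in kernels:
        responses = [convolve(series, kernel) for kernel in group]
        steps = len(responses[0])
        totals = [0.0] * len(group)
        for t in range(steps):
            weights = softmax([r[t] for r in responses], temperature)
            for j, w in enumerate(weights):
                totals[j] += w
        features.extend(total / steps for total in totals)
    return features


def hard_features(series, kernels):
    """Non-differentiable counterpart: fraction of time steps each kernel wins."""
    features = []
    for group in kernels:
        responses = [convolve(series, kernel) for kernel in group]
        steps = len(responses[0])
        wins = [0] * len(group)
        for t in range(steps):
            column = [r[t] for r in responses]
            wins[column.index(max(column))] += 1
        features.extend(w / steps for w in wins)
    return features


def mean_features(batch, kernels, temperature=1.0):
    """Average soft feature vector over a batch of series."""
    rows = [soft_features(s, kernels, temperature) for s in batch]
    return [sum(col) / len(rows) for col in zip(*rows)]


def feature_matching_loss(real, fake, kernels, temperature=1.0):
    """Squared Euclidean distance between the two mean feature vectors."""
    mu_real = mean_features(real, kernels, temperature)
    mu_fake = mean_features(fake, kernels, temperature)
    return sum((a - b) ** 2 for a, b in zip(mu_real, mu_fake))


# Toy data (assumed): Gaussian noise stands in for real and generated series.
rng = random.Random(0)
kernels = make_kernels(num_groups=3, kernels_per_group=4, length=5, rng=rng)
real = [[rng.gauss(0.0, 1.0) for _ in range(40)] for _ in range(8)]
fake = [[rng.gauss(0.0, 3.0) for _ in range(40)] for _ in range(8)]
print(feature_matching_loss(real, real, kernels))  # identical batches give 0.0
print(feature_matching_loss(real, fake, kernels))
```

Every operation in `soft_features` (convolution, softmax, averaging) is smooth, so the same computation written in an autodiff framework would pass gradients from the loss back to a generator. `hard_features` would not, because its arg‑max count is piecewise constant.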

Results & Findings

Dataset (training samples)	Signature baseline (KS‑stat)	Diffusion baseline (KS‑stat)	SOCK‑trained generator (KS‑stat)
S&P 500 daily (30)	0.71	0.68	0.84
FX EUR/USD (50)	0.66	0.62	0.80
Crypto BTC (20)	0.59	0.55	0.77

Higher statistical similarity: SOCK‑trained generators obtain higher scores on the KS‑based metric in the table above (read as higher‑is‑better, although a raw Kolmogorov–Smirnov statistic is a distance for which smaller means closer) and lower Wasserstein distances, indicating synthetic series whose distributions are closer to the real ones.
Robustness to sample size: Performance gains are most pronounced when the training set contains fewer than 100 trajectories, a regime typical for proprietary financial data.
Classification & Two‑sample tests: When SOCK features are used as embeddings for downstream tasks, they reach 92% accuracy on the UCR “ElectricDevices” benchmark and outperform ROCKET on a two‑sample test at the 5% significance level.
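
For readers unfamiliar with the two distances named above, here is a minimal sketch of how each is computed for one‑dimensional samples. The return values below are toy numbers invented for illustration, not data from the paper, and `wasserstein_1d` assumes equal sample sizes to keep the code short.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    return max(abs(bisect_right(sa, v) / len(sa) - bisect_right(sb, v) / len(sb))
               for v in sa + sb)


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    # For equal-size samples the optimal coupling pairs sorted values in order.
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


# Toy daily returns (assumed values, for illustration only).
real_returns = [-0.02, -0.01, 0.00, 0.01, 0.03]
close_sample = [-0.02, -0.01, 0.00, 0.01, 0.02]
far_sample = [0.10, 0.11, 0.12, 0.13, 0.14]

print(ks_statistic(real_returns, close_sample))
print(ks_statistic(real_returns, far_sample))    # 1.0: the samples do not overlap
print(wasserstein_1d(real_returns, far_sample))
```

Both quantities are distances: they are zero for identical samples and grow as the samples diverge, so any score built on them has to state which direction counts as better.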

Practical Implications

Synthetic Data for Stress‑Testing: Banks and fintechs can generate realistic price paths for Monte‑Carlo risk simulations without needing massive historical archives.
Data Augmentation for ML Pipelines: Developers building predictive models (e.g., volatility forecasting, algorithmic trading) can augment scarce training data with high‑fidelity synthetic series, improving model generalization.
Privacy‑Preserving Sharing: Financial institutions can share SOCK‑generated datasets with partners or regulators while mitigating disclosure risk, to the extent that the generator does not reproduce exact historical trajectories.
Plug‑and‑Play Integration: Because SOCK is just a set of random convolutions followed by a softmax pooling, it can be dropped into existing PyTorch/TensorFlow training loops with a single line of code, and no custom CUDA kernels are required.

Limitations & Future Work

Randomness Dependency: While SOCK is differentiable, its performance still hinges on the number and distribution of random kernels; selecting these hyper‑parameters may require modest tuning.
Scope to Other Domains: The study focuses on short‑term financial series; extending SOCK to longer‑horizon macro‑economic time series or high‑frequency tick data remains an open question.
Theoretical Guarantees: The paper provides empirical evidence of expressiveness but lacks a formal analysis of why soft competition preserves the discriminative power of hard max‑pooling.
Future Directions: The authors suggest exploring learned (instead of purely random) kernel initializations, combining SOCK with adversarial discriminators for hybrid training, and applying the method to multi‑asset joint generation.

Authors

Konrad J. Mueller
Nikita Zozoulenko
Ben Wood
Thomas Cass
Lukas Gonon

Paper Information

arXiv ID: 2606.05138v1
Categories: cs.LG, q-fin.ST
Published: June 3, 2026
