DDSP Guitar Amp

Abstract

Neural network models for guitar amplifier emulation, while effective, often incur a high computational cost and lack interpretability. Drawing ideas from physical amplifier design, this paper aims to address these issues with a new differentiable digital signal processing (DDSP)-based model, called "DDSP guitar amp," that models the four components of a guitar amp (i.e., preamp, tone stack, power amp, and output transformer) using specific DSP-inspired designs. With a set of time- and frequency-domain metrics, we demonstrate that DDSP guitar amp achieves performance comparable to that of black-box baselines while requiring less than 10% of the computational operations per audio sample, and therefore holds greater potential for use in real-time applications.

Motivation

Neural networks are good at modeling guitar amplifiers. However, they are also hard to interpret and computationally heavy. Can we develop interpretable models with low computational requirements?

Motivation-2

The diagram below illustrates the amplifier architecture that our model assumes. Because each part of the model mirrors a component of a guitar amp, every component of the model is interpretable.

Model Architecture

The overall model architecture of DDSP Guitar Amp. The control parameters \(\phi\) taken by the model are the settings of five knobs: "gain," "bass," "mids," "treble," and "master" volume.

Detail Components

Details of DDSP Guitar Amp. Letters in the squares denote block types: \(\texttt{F}\): filter, \(\texttt{G}\): gain, \(\texttt{NL}\): nonlinear function, \(\texttt{L}\): low-shelf filter, \(\texttt{P}\): peak filter, \(\texttt{H}\): high-shelf filter, \(\texttt{M}\): master, \(\texttt{PS}\): phase splitter, \(\texttt{PI}\): phase inversion. Orange blocks: multiplications by parameters estimated by the Knob Controller. Pink blocks: designed operations with parameters predicted by the Knob Controller. Green blocks: fixed operations without learnable parameters. Best viewed in color.
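To make the block types concrete, the following is a minimal sketch of a simplified chain G -> NL -> L -> P -> H -> M. It is not the paper's implementation and rests on several assumptions: tanh stands in for the learned nonlinear function, the filters are standard biquads built from the widely used Audio EQ Cookbook formulas, and the function names, corner frequencies, and Q value are hypothetical. In the actual model the block parameters come from the Knob Controller; here they are passed in by hand.

```python
import math

def biquad_coeffs(kind, f0, gain_db, q, fs):
    """Audio EQ Cookbook biquad coefficients (b, a), normalized so a[0] == 1."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    c = math.cos(w0)
    alpha = math.sin(w0) / (2.0 * q)
    s = 2.0 * math.sqrt(A) * alpha
    if kind == "peak":
        b = [1 + alpha * A, -2 * c, 1 - alpha * A]
        a = [1 + alpha / A, -2 * c, 1 - alpha / A]
    elif kind == "low_shelf":
        b = [A * ((A + 1) - (A - 1) * c + s),
             2 * A * ((A - 1) - (A + 1) * c),
             A * ((A + 1) - (A - 1) * c - s)]
        a = [(A + 1) + (A - 1) * c + s,
             -2 * ((A - 1) + (A + 1) * c),
             (A + 1) + (A - 1) * c - s]
    elif kind == "high_shelf":
        b = [A * ((A + 1) + (A - 1) * c + s),
             -2 * A * ((A - 1) + (A + 1) * c),
             A * ((A + 1) + (A - 1) * c - s)]
        a = [(A + 1) - (A - 1) * c + s,
             2 * ((A - 1) - (A + 1) * c),
             (A + 1) - (A - 1) * c - s]
    else:
        raise ValueError(f"unknown filter kind: {kind}")
    return [v / a[0] for v in b], [v / a[0] for v in a]

def biquad(x, b, a):
    """Filter a list of samples with a direct-form I biquad (a[0] == 1)."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def amp_sketch(x, gain, bass_db, mid_db, treble_db, master, fs=44100):
    """Simplified chain: G -> NL (tanh, an assumption) -> L -> P -> H -> M."""
    h = [math.tanh(gain * v) for v in x]
    # Hypothetical corner/center frequencies; not taken from the paper.
    stack = (("low_shelf", 120.0, bass_db),
             ("peak", 800.0, mid_db),
             ("high_shelf", 4000.0, treble_db))
    for kind, f0, g_db in stack:
        b, a = biquad_coeffs(kind, f0, g_db, 0.707, fs)
        h = biquad(h, b, a)
    return [master * v for v in h]
```

With all three tone controls at 0 dB the tone stack passes the signal through unchanged, so the output reduces to `master * tanh(gain * x)` up to floating-point rounding. That makes the flat setting a convenient sanity check before changing any knob.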

Result

We evaluate our model progressively, considering increasingly comprehensive designs of the guitar amp.

Case Study 1

Is our assumption about the GRU and the output transformer reasonable?

As seen in the figure, our trained output transformer stage behaves very similarly to the standard magnetic hysteresis curve of a physical output transformer. This suggests that using a GRU to model hysteresis is a reasonable assumption, and that adding further control to the GRU has considerable potential for modeling more complex transformer behavior. One gap remains, for example: our distortion curve is narrower than that of the physical transformer.

Case Study 2

Small Concat-GRU vs. Big Concat-GRU vs. DDSP Guitar Amp

Target Audio

Small Concat-GRU

Big Concat-GRU

DDSP Guitar Amp

The figures above illustrate the output differences among the three models. In each graph, the blue line represents the target waveform, while the orange line shows the model's output. The most significant differences are observed in the transient parts of the audio.

1. Small Concat-GRU: This model struggles to reproduce the transients accurately, resulting in audio that lacks micro-dynamics and sounds less lively.
2. Big Concat-GRU: Among the three models, this one produces transients most similar to the target. However, there are still noticeable discrepancies.
3. DDSP Guitar Amp: This model tends to slightly overemphasize the transient parts, leading to a somewhat exaggerated attack in the audio.

Overall, while each model has its strengths, none of them captures the transients perfectly. When comparing the audio outputs to the target, there are still noticeable differences in texture and overall sound quality. This suggests that further improvements in modeling transient behavior could significantly enhance the performance of these guitar amplifier emulations.

Case Study 3

DDSP Guitar Amp vs. Small Concat-GRU 8 (Similar Computational Cost)

Small Concat-GRU 8

DDSP Guitar Amp

Comparing the Small Concat-GRU and DDSP Guitar Amp models reveals two key differences in their audio output characteristics:

Small Concat-GRU:
1. Initial Spike: There is a significant spike at the beginning of the audio segment, with an amplitude too high to be ignored.
2. DC Offset: A clear offset is visible between the input (blue) and output (orange) waveforms, particularly noticeable at the start and end of the audio segment.

DDSP Guitar Amp:
1. Initial Spike: While still present, the spike at the beginning of the audio is much smaller in amplitude than that of the Small Concat-GRU model.
2. DC Offset: The offset between the input and output waveforms is negligible.

These observations have two implications for the Small Concat-GRU model:
1. DC Bias: The presence of a DC offset indicates a DC bias issue, which may be attributed to implicit bias terms within the model's architecture.
2. Headroom Reduction: The large initial spike could reduce the headroom of the audio signal, although further research is needed to confirm this effect.

Note that our current study did not include specific experiments to quantify these effects. Further research is necessary to fully understand these characteristics and their impact on overall audio quality and performance.
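As a starting point for such a follow-up, the two effects could be measured roughly as sketched below. This is not part of the study: the function names and the `head` window length are hypothetical, the DC offset is approximated by the mean of the waveform, and the spike is expressed as the ratio between the peak in the first few samples and the peak in the rest of the segment.

```python
def dc_offset(signal):
    """Approximate the DC offset as the mean sample value."""
    return sum(signal) / len(signal)

def initial_spike_ratio(signal, head=256):
    """Peak magnitude in the first `head` samples relative to the remainder.

    Values well above 1 suggest a start-up spike. `head` is a hypothetical
    window length and should be chosen to cover the model's warm-up period.
    """
    head_peak = max(abs(v) for v in signal[:head])
    rest_peak = max(abs(v) for v in signal[head:])
    return head_peak / rest_peak if rest_peak > 0 else float("inf")
```

Applying both functions to the difference between a model's output and the target, rather than to the output alone, would separate model-induced offset from any offset already present in the recording.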

Audio Example

The following section presents audio examples. The audio files come from the test dataset, so their content was not seen during training. The audio is heavily distorted, so please turn the volume down first and then adjust it carefully.