The phase vocoder (PV) is a widely used technique for processing audio signals. It employs a short-time Fourier transform (STFT) analysis-modify-synthesis loop and is typically used for time-scaling of signals by using different time steps for STFT analysis and synthesis. The main challenge when the PV is used for this purpose is the correction of the STFT phase. In this paper, we introduce a novel method for phase correction based on phase gradient estimation and its integration. The method requires neither explicit peak picking and tracking nor detection and separate treatment of transients. Nevertheless, it does not suffer from the typical phase vocoder artifacts, even for extreme time-stretching factors.
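For orientation, the analysis-modify-synthesis loop mentioned above can be sketched as follows. This is a minimal NumPy sketch of a textbook phase vocoder that propagates phase per frequency bin; it is not the proposed phase-gradient method, and it makes no attempt to handle transients. The function names (`stft`, `istft`, `pv_stretch`) and the default window and hop sizes are illustrative assumptions, not taken from the paper or the released source code.

```python
import numpy as np

def stft(x, win, hop):
    """Windowed FFT frames of x taken every `hop` samples."""
    n = len(win)
    n_frames = 1 + (len(x) - n) // hop
    return np.array([np.fft.rfft(win * x[i * hop:i * hop + n])
                     for i in range(n_frames)])

def istft(frames, win, hop):
    """Weighted overlap-add synthesis with squared-window normalisation."""
    n = len(win)
    out = np.zeros(hop * (len(frames) - 1) + n)
    norm = np.zeros_like(out)
    for i, spec in enumerate(frames):
        out[i * hop:i * hop + n] += win * np.fft.irfft(spec, n)
        norm[i * hop:i * hop + n] += win ** 2
    # Floor the normaliser so the nearly-zero window edges do not blow up.
    return out / np.maximum(norm, 1e-3)

def pv_stretch(x, factor, n_fft=2048, hop_syn=512):
    """Time-stretch x by `factor` (>1 is slower) with a textbook phase vocoder."""
    hop_ana = int(round(hop_syn / factor))  # analysis step differs from synthesis step
    win = np.hanning(n_fft)
    spec = stft(x, win, hop_ana)
    mag, phase = np.abs(spec), np.angle(spec)

    # Centre frequency of each bin in radians per sample.
    omega = 2 * np.pi * np.arange(n_fft // 2 + 1) / n_fft

    out_phase = np.empty_like(phase)
    out_phase[0] = phase[0]
    for t in range(1, len(spec)):
        # Deviation of the measured phase advance from the bin-centre advance.
        dphi = phase[t] - phase[t - 1] - omega * hop_ana
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))  # wrap to [-pi, pi]
        inst_freq = omega + dphi / hop_ana                # estimated instantaneous frequency
        # Phase correction: advance the phase over the synthesis step instead.
        out_phase[t] = out_phase[t - 1] + inst_freq * hop_syn

    return istft(mag * np.exp(1j * out_phase), win, hop_syn)
```

Calling `pv_stretch(x, 2.0)` on a mono signal `x` (a 1-D float array longer than `n_fft`) returns a signal roughly twice as long with the pitch unchanged. Because each bin's phase is advanced independently, the relationship between neighbouring bins is not preserved, which is one source of the typical phase vocoder artifacts that the phase correction step has to address.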
The preprint is available here.
The following archive, pvdoneright.zip [1.4 MB], contains a win32 executable, pvdoneright.exe, an audio player that changes the playback tempo on the fly in the range 10%-1000% using the proposed algorithm. It now also includes pitch shifting. Please read the README before running the demo.
The source code is now available here.
Sound examples used in the listening test are available in Table 1. The following algorithms and software were used in the comparison:
Playback can be started by selecting one of the table cells (the cells turn yellow when the cursor hovers over them). All the files were compressed by oggenc from vorbis-tools 1.4.0 with the default settings. Your browser must support the HTML5 audio player and must be able to decode Ogg files. Alternatively, the file path is shown below the player, and the file can be downloaded via Save Link As ...
Table 1: Time-stretched examples (stretching factors 2.0 and 1.5)

Original | Prop. (2.0) | PV (2.0) | EL (2.0) | ME (2.0) | IR (2.0) | Prop. (1.5) | PV (1.5) | EL (1.5) | ME (1.5) | IR (1.5) |
---|---|---|---|---|---|---|---|---|---|---|
CastViolin | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] |
DrumSolo | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] |
Latino | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] |
Musetta | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] |
March | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] |
PeterGabriel | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] |
EddieRabbit | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] | [play] |
The following table presents additional sound examples for an extreme stretching factor of 4:
Table 2: Extreme examples (stretching factor 4.0)

Original | Prop. | PV | EL | ME | IR |
---|---|---|---|---|---|
CastViolin | [play] | [play] | [play] | [play] | [play] |
DrumSolo | [play] | [play] | [play] | [play] | [play] |
Latino | [play] | [play] | [play] | [play] | [play] |
Musetta | [play] | [play] | [play] | [play] | [play] |
March | [play] | [play] | [play] | [play] | [play] |
PeterGabriel | [play] | [play] | [play] | [play] | [play] |
EddieRabbit | [play] | [play] | [play] | [play] | [play] |