Abstract:

The constant center frequency to bandwidth ratio (Q-factor) of wavelet transforms provides a very natural representation for audio data. So far, invertible wavelet transforms have required either non-uniform decimation, leading to irregular data structures that are cumbersome to work with, or excessively high oversampling with unacceptable computational overhead. Here, we present a novel decimation strategy for wavelet transforms that leads to invertible representations, i.e., representations that allow perfect reconstruction, with minimal oversampling and uniform decimation. Numerical evidence shows that the resulting representation is highly stable in the sense of frame theory. The obtained wavelet coefficients can be stored in a time-frequency matrix with a natural interpretation of columns as time frames and rows as frequency channels. In a special case, the employed decimation strategy corresponds to a time-frequency lattice related to certain one-dimensional low-discrepancy sequences. The matrix structure of the wavelet coefficients immediately grants access to a large number of algorithms that have been successfully used in time-frequency audio processing, but could not previously be used jointly with invertible wavelet transforms. We demonstrate the application of our method in processing based on non-negative matrix factorization and phaseless reconstruction.
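
To make the mention of one-dimensional low-discrepancy sequences more concrete, the following minimal sketch generates a Kronecker sequence, i.e. the fractional parts of n * alpha for an irrational alpha. It is only an illustration of such a sequence and not the decimation scheme of the paper: the choice of the golden-ratio conjugate for alpha and the function names `kronecker_sequence` and `max_gap` are assumptions made for this example.

```python
import math

# Assumption for illustration: alpha is the golden-ratio conjugate,
# a common choice for a well-distributed Kronecker sequence.
GOLDEN_CONJUGATE = (math.sqrt(5) - 1) / 2


def kronecker_sequence(n_points, alpha=GOLDEN_CONJUGATE):
    """Return the fractional parts of n * alpha for n = 0, ..., n_points - 1."""
    return [(n * alpha) % 1.0 for n in range(n_points)]


def max_gap(points):
    """Largest gap between neighbouring points on the unit circle [0, 1)."""
    s = sorted(points)
    gaps = [b - a for a, b in zip(s, s[1:])]
    gaps.append(1.0 - s[-1] + s[0])  # wrap-around gap
    return max(gaps)


# Hypothetical use: one fractional time offset per frequency channel.
offsets = kronecker_sequence(16)
largest = max_gap(offsets)
```

With 16 points, a perfectly uniform grid would have gaps of 1/16; the Kronecker points stay close to that without ever forming a regular grid, which is the property that makes such sequences attractive for spreading sampling positions evenly.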

Sound examples - Experiment 1 (Non-negative matrix factorization): The playback can be started by selecting one of the table cells (the cells turn yellow when the cursor hovers over them). Your browser must support the HTML5 audio player. Alternatively, the file path is shown below the player, and the file can be downloaded via Save Link As ...


Non-negative matrix factorization signals:
original lead accompaniment denoised upmix comp 1 comp 2 comp 3 comp 4 comp 5 comp 6 comp 7 comp 8 comp 9 comp 10

Sound examples - Experiment 2 (Phase retrieval): The playback can be started by selecting one of the table cells (the cells turn yellow when the cursor hovers over them). Your browser must support the HTML5 audio player. Alternatively, the file path is shown below the player, and the file can be downloaded via Save Link As ...


Columns: Original | Redundancy 10 | Redundancy 5 | Redundancy 3 | Redundancy 1.8 (not reported in the paper)
Classical wavelets:
01_sine_ORG 01_sine_red_10_FGLIM 01_sine_red_5_FGLIM 01_sine_red_3_FGLIM 01_sine_red_1.8_FGLIM
02_pink_ORG 02_pink_red_10_FGLIM 02_pink_red_5_FGLIM 02_pink_red_3_FGLIM 02_pink_red_1.8_FGLIM
04_elgong_ORG 04_elgong_red_10_FGLIM 04_elgong_red_5_FGLIM 04_elgong_red_3_FGLIM 04_elgong_red_1.8_FGLIM
14_oboe_ORG 14_oboe_red_10_FGLIM 14_oboe_red_5_FGLIM 14_oboe_red_3_FGLIM 14_oboe_red_1.8_FGLIM
15_coranglais_ORG 15_coranglais_red_10_FGLIM 15_coranglais_red_5_FGLIM 15_coranglais_red_3_FGLIM 15_coranglais_red_1.8_FGLIM
16_clarinet_ORG 16_clarinet_red_10_FGLIM 16_clarinet_red_5_FGLIM 16_clarinet_red_3_FGLIM 16_clarinet_red_1.8_FGLIM
27_castanets_ORG 27_castanets_red_10_FGLIM 27_castanets_red_5_FGLIM 27_castanets_red_3_FGLIM 27_castanets_red_1.8_FGLIM
39_grandpiano_ORG 39_grandpiano_red_10_FGLIM 39_grandpiano_red_5_FGLIM 39_grandpiano_red_3_FGLIM 39_grandpiano_red_1.8_FGLIM
49_femaleeng_ORG 49_femaleeng_red_10_FGLIM 49_femaleeng_red_5_FGLIM 49_femaleeng_red_3_FGLIM 49_femaleeng_red_1.8_FGLIM
50_maleeng_ORG 50_maleeng_red_10_FGLIM 50_maleeng_red_5_FGLIM 50_maleeng_red_3_FGLIM 50_maleeng_red_1.8_FGLIM
51_femalefra_ORG 51_femalefra_red_10_FGLIM 51_femalefra_red_5_FGLIM 51_femalefra_red_3_FGLIM 51_femalefra_red_1.8_FGLIM
52_malefra_ORG 52_malefra_red_10_FGLIM 52_malefra_red_5_FGLIM 52_malefra_red_3_FGLIM 52_malefra_red_1.8_FGLIM
53_femaleger_ORG 53_femaleger_red_10_FGLIM 53_femaleger_red_5_FGLIM 53_femaleger_red_3_FGLIM 53_femaleger_red_1.8_FGLIM
54_maleger_ORG 54_maleger_red_10_FGLIM 54_maleger_red_5_FGLIM 54_maleger_red_3_FGLIM 54_maleger_red_1.8_FGLIM
70_eddierabbit_ORG 70_eddierabbit_red_10_FGLIM 70_eddierabbit_red_5_FGLIM 70_eddierabbit_red_3_FGLIM 70_eddierabbit_red_1.8_FGLIM
STFT:
01_sine_ORG 01_sine_red_10_FGLIM 01_sine_red_5_FGLIM 01_sine_red_3_FGLIM 01_sine_red_1.8_FGLIM
02_pink_ORG 02_pink_red_10_FGLIM 02_pink_red_5_FGLIM 02_pink_red_3_FGLIM 02_pink_red_1.8_FGLIM
04_elgong_ORG 04_elgong_red_10_FGLIM 04_elgong_red_5_FGLIM 04_elgong_red_3_FGLIM 04_elgong_red_1.8_FGLIM
14_oboe_ORG 14_oboe_red_10_FGLIM 14_oboe_red_5_FGLIM 14_oboe_red_3_FGLIM 14_oboe_red_1.8_FGLIM
15_coranglais_ORG 15_coranglais_red_10_FGLIM 15_coranglais_red_5_FGLIM 15_coranglais_red_3_FGLIM 15_coranglais_red_1.8_FGLIM
16_clarinet_ORG 16_clarinet_red_10_FGLIM 16_clarinet_red_5_FGLIM 16_clarinet_red_3_FGLIM 16_clarinet_red_1.8_FGLIM
27_castanets_ORG 27_castanets_red_10_FGLIM 27_castanets_red_5_FGLIM 27_castanets_red_3_FGLIM 27_castanets_red_1.8_FGLIM
39_grandpiano_ORG 39_grandpiano_red_10_FGLIM 39_grandpiano_red_5_FGLIM 39_grandpiano_red_3_FGLIM 39_grandpiano_red_1.8_FGLIM
49_femaleeng_ORG 49_femaleeng_red_10_FGLIM 49_femaleeng_red_5_FGLIM 49_femaleeng_red_3_FGLIM 49_femaleeng_red_1.8_FGLIM
50_maleeng_ORG 50_maleeng_red_10_FGLIM 50_maleeng_red_5_FGLIM 50_maleeng_red_3_FGLIM 50_maleeng_red_1.8_FGLIM
51_femalefra_ORG 51_femalefra_red_10_FGLIM 51_femalefra_red_5_FGLIM 51_femalefra_red_3_FGLIM 51_femalefra_red_1.8_FGLIM
52_malefra_ORG 52_malefra_red_10_FGLIM 52_malefra_red_5_FGLIM 52_malefra_red_3_FGLIM 52_malefra_red_1.8_FGLIM
53_femaleger_ORG 53_femaleger_red_10_FGLIM 53_femaleger_red_5_FGLIM 53_femaleger_red_3_FGLIM 53_femaleger_red_1.8_FGLIM
54_maleger_ORG 54_maleger_red_10_FGLIM 54_maleger_red_5_FGLIM 54_maleger_red_3_FGLIM 54_maleger_red_1.8_FGLIM
70_eddierabbit_ORG 70_eddierabbit_red_10_FGLIM 70_eddierabbit_red_5_FGLIM 70_eddierabbit_red_3_FGLIM 70_eddierabbit_red_1.8_FGLIM
Proposed wavelets:
01_sine_ORG 01_sine_red_10_FGLIM 01_sine_red_5_FGLIM 01_sine_red_3_FGLIM 01_sine_red_1.8_FGLIM
02_pink_ORG 02_pink_red_10_FGLIM 02_pink_red_5_FGLIM 02_pink_red_3_FGLIM 02_pink_red_1.8_FGLIM
04_elgong_ORG 04_elgong_red_10_FGLIM 04_elgong_red_5_FGLIM 04_elgong_red_3_FGLIM 04_elgong_red_1.8_FGLIM
14_oboe_ORG 14_oboe_red_10_FGLIM 14_oboe_red_5_FGLIM 14_oboe_red_3_FGLIM 14_oboe_red_1.8_FGLIM
15_coranglais_ORG 15_coranglais_red_10_FGLIM 15_coranglais_red_5_FGLIM 15_coranglais_red_3_FGLIM 15_coranglais_red_1.8_FGLIM
16_clarinet_ORG 16_clarinet_red_10_FGLIM 16_clarinet_red_5_FGLIM 16_clarinet_red_3_FGLIM 16_clarinet_red_1.8_FGLIM
27_castanets_ORG 27_castanets_red_10_FGLIM 27_castanets_red_5_FGLIM 27_castanets_red_3_FGLIM 27_castanets_red_1.8_FGLIM
39_grandpiano_ORG 39_grandpiano_red_10_FGLIM 39_grandpiano_red_5_FGLIM 39_grandpiano_red_3_FGLIM 39_grandpiano_red_1.8_FGLIM
49_femaleeng_ORG 49_femaleeng_red_10_FGLIM 49_femaleeng_red_5_FGLIM 49_femaleeng_red_3_FGLIM 49_femaleeng_red_1.8_FGLIM
50_maleeng_ORG 50_maleeng_red_10_FGLIM 50_maleeng_red_5_FGLIM 50_maleeng_red_3_FGLIM 50_maleeng_red_1.8_FGLIM
51_femalefra_ORG 51_femalefra_red_10_FGLIM 51_femalefra_red_5_FGLIM 51_femalefra_red_3_FGLIM 51_femalefra_red_1.8_FGLIM
52_malefra_ORG 52_malefra_red_10_FGLIM 52_malefra_red_5_FGLIM 52_malefra_red_3_FGLIM 52_malefra_red_1.8_FGLIM
53_femaleger_ORG 53_femaleger_red_10_FGLIM 53_femaleger_red_5_FGLIM 53_femaleger_red_3_FGLIM 53_femaleger_red_1.8_FGLIM
54_maleger_ORG 54_maleger_red_10_FGLIM 54_maleger_red_5_FGLIM 54_maleger_red_3_FGLIM 54_maleger_red_1.8_FGLIM
70_eddierabbit_ORG 70_eddierabbit_red_10_FGLIM 70_eddierabbit_red_5_FGLIM 70_eddierabbit_red_3_FGLIM 70_eddierabbit_red_1.8_FGLIM
Fig. 2: Whisker plots showing the minimal, median, and maximal spectral SNR after phaseless reconstruction for 15 signals and three transforms, namely, the classical wavelet transform, the proposed wavelet transform with Kronecker sequence based decimation, and the STFT. The different oversampling rates are arranged vertically. A wavelet system is used as a reference.
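
As a rough illustration of how the quantities summarized in such a whisker plot could be computed, the sketch below evaluates a spectral SNR in dB and then takes the minimum, median, and maximum over a set of signals. The exact definition used for the figure is not given on this page; the formula here (energy of the target magnitudes divided by the energy of the magnitude error) is an assumption, as are the function names `spectral_snr_db` and `whisker_summary`.

```python
import math
import statistics


def spectral_snr_db(target_mag, recon_mag):
    """Spectral SNR in dB between two equally long lists of coefficient magnitudes.

    Assumed definition: 20 * log10(||target|| / ||target - recon||).
    """
    signal = math.sqrt(sum(t * t for t in target_mag))
    error = math.sqrt(sum((t - r) ** 2 for t, r in zip(target_mag, recon_mag)))
    if error == 0.0:
        return math.inf  # perfect match of the magnitudes
    return 20.0 * math.log10(signal / error)


def whisker_summary(values):
    """Return (minimum, median, maximum), the three values named in the caption."""
    return min(values), statistics.median(values), max(values)


# Toy magnitudes, not data from the paper.
snr = spectral_snr_db([1.0, 1.0, 1.0, 1.0], [0.9, 0.9, 0.9, 0.9])
summary = whisker_summary([3.0, 1.0, 2.0])
```

In practice, one SNR value would be computed per signal, transform, and redundancy, and each whisker would summarize the 15 per-signal values for one transform at one redundancy.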