1
3 RTG 2224 PhD projects R3 Nonnegative Matrix and Tensor Factorizations – Theory and Selected Applications Blind Source Separation in Polyphonic Music Recordings Pascal Fernsel, Peter Maass S¨oren Schulze, Emily J. King Nonnegative Matrix Factorization (NMF) Classical NMF Problem For a given matrix X R N ×T 0 , find matrices B R N ×K 0 and C R K×T 0 with K min{N,T }, s.t. X BC . Task Areas Used e.g. for dimension reduction, data compression, ba- sis learning and source separation. Optimization Theory Review and extension of the majorize-minimization prin- ciple for generalized NMF models. 1 min B ,C 0 D NMF (X , BC )+ L X =1 γ P (B , C ) Applications MALDI Imaging Development of supervised NMF models for extracting tumor-specific spectral patterns. 2 Testing Training Feature Extraction and Classification Train Class Labels Train Data Test Data Pseudo Spectra Supervised NMF Basis Map Test Feature Data Classifier Classifier Parameters Test Prediction Dynamic Computerized Tomography Reconstruction and low-rank decomposition for dynamic inverse problems via NMF. 3 X Reconstruction Time steps Intensity C Temporal Features Spatial Features B Intensity Time steps Clustering Development of an orthogonal NMF model with total variation regularization to enforce spatial coherence. Find connections between regularized orthogonal NMF and generalized K-means models. Audio Source Separation We want to identify and separate the contributions of differ- ent instruments in a music recording. 4 2 4 2 8 Violin Recorder Figure 1: Musical score for a piece by Mozart This is much easier if the tones from one instrument fol- low the same pitch-invariant pattern. We developed a pitch- invariant and frequency-uniform time-frequency representa- tion that combines properties from the mel spectrogram and the constant-Q transform. 4 However, in order to improve the resolution, we rely on a sparse pursuit method. 5 Sparse Pursuit We use an algorithm to represent a discrete mixture via shifted continuous patterns from a parametric model. Discrete mixture Matched mixture Pattern 1 Pattern 1, adjusted Pattern 2 Pattern 2, adjusted Input Output Figure 2: Test example for the sparse pursuit algorithm With this, we can recover the spectra of the individual instru- ments. 0 2 4 6 8 0.1 1 10 Time [s] Frequency [kHz] Original recording 0 2 4 6 8 Time [s] Separated recorder track 0 2 4 6 8 Time [s] Separated violin track Figure 3: Separation result for the above music piece In blind separation, the sounds of the instruments are not known a-priori, so we train a dictionary containing the rela- tive amplitudes of the harmonics. 1 P. Fernsel, P. Maass: A survey on surrogate approaches to non-negative matrix factorization. Vietnam Journal of Mathematics, 46 (2018), pp. 987-1021. 2 J. Leuschner, M. Schmidt, P. Fernsel, D. Lachmund, T. Boskamp, P. Maass: Supervised non-negative matrix factorization methods for MALDI imaging applications. Bioinformatics, 35 (2018), pp. 1940-1947. 3 S. Arridge, P. Fernsel, A. Hauptmann: Joint reconstruction and low-rank decomposition for dynamic inverse problems. arXiv:2005.14042, 2020. 4 S. Schulze, E. J. King: A frequency-uniform and pitch-invariant time-frequency representation. PAMM 2019. 5 S. Schulze, E. J. King: Sparse pursuit and dictionary learning for blind source separation in polyphonic music recordings. Submitted. arXiv:1806.00273, 2020.

Violin - math.uni-bremen.de

  • Upload
    others

  • View
    11

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Violin - math.uni-bremen.de

3

RTG 2224

PhD projects R3Nonnegative Matrix and Tensor Factorizations – Theory and Selected ApplicationsBlind Source Separation in Polyphonic Music RecordingsPascal Fernsel, Peter Maass Soren Schulze, Emily J. King

Nonnegative Matrix Factorization (NMF)

Classical NMF Problem

� For a given matrix X ∈ RN×T≥0 , find matrices B ∈ RN×K

≥0and C ∈ RK×T

≥0 with K � min{N, T}, s.t. X ≈ BC .

Task Areas

� Used e.g. for dimension reduction, data compression, ba-sis learning and source separation.

Optimization Theory

� Review and extension of the majorize-minimization prin-ciple for generalized NMF models.1

minB ,C≥0

DNMF(X ,BC ) +L∑

`=1

γ`P`(B ,C )

Applications

MALDI Imaging

� Development of supervised NMF models for extractingtumor-specific spectral patterns.2

Testing

Training

Feature Extraction and Classi�cation

Train ClassLabels

Train Data

Test Data

PseudoSpectra

SupervisedNMF

Basis Map

Test FeatureData

Classi�er

Classi�erParameters

TestPrediction

Dynamic Computerized Tomography

� Reconstruction and low-rank decomposition for dynamicinverse problems via NMF.3

X

1

Reconstruction

Time steps

Inte

nsity

C

1

Temporal FeaturesSpatial FeaturesB

1

Inte

nsity

Time steps

Clustering

� Development of an orthogonal NMF model with totalvariation regularization to enforce spatial coherence.

� Find connections between regularized orthogonal NMFand generalized K-means models.

Audio Source Separation

We want to identify and separate the contributions of differ-ent instruments in a music recording.

� �

����

���

��

����

� ����

���42

42�� ��8

� �Violin

Recorder ���

�� ����

���� ��

Figure 1: Musical score for a piece by Mozart

This is much easier if the tones from one instrument fol-low the same pitch-invariant pattern. We developed a pitch-invariant and frequency-uniform time-frequency representa-tion that combines properties from the mel spectrogram andthe constant-Q transform.4 However, in order to improve theresolution, we rely on a sparse pursuit method.5

Sparse Pursuit

We use an algorithm to represent a discrete mixture viashifted continuous patterns from a parametric model.

Discrete mixture Matched mixture

Pattern 1 Pattern 1, adjusted

Pattern 2 Pattern 2, adjusted

Input Output

Figure 2: Test example for the sparse pursuit algorithm

With this, we can recover the spectra of the individual instru-ments.

0 2 4 6 8

0.1

1

10

Time [s]

Frequency

[kHz]

Original recording

0 2 4 6 8

Time [s]

Separated recorder track

0 2 4 6 8

Time [s]

Separated violin track

Figure 3: Separation result for the above music piece

In blind separation, the sounds of the instruments are notknown a-priori, so we train a dictionary containing the rela-tive amplitudes of the harmonics.

1P. Fernsel, P. Maass: A survey on surrogate approaches to non-negative matrix factorization. Vietnam Journal of Mathematics, 46 (2018), pp. 987-1021.2J. Leuschner, M. Schmidt, P. Fernsel, D. Lachmund, T. Boskamp, P. Maass: Supervised non-negative matrix factorization methods for MALDI imaging applications. Bioinformatics, 35 (2018), pp. 1940-1947.3S. Arridge, P. Fernsel, A. Hauptmann: Joint reconstruction and low-rank decomposition for dynamic inverse problems. arXiv:2005.14042, 2020.4S. Schulze, E. J. King: A frequency-uniform and pitch-invariant time-frequency representation. PAMM 2019.5S. Schulze, E. J. King: Sparse pursuit and dictionary learning for blind source separation in polyphonic music recordings. Submitted. arXiv:1806.00273, 2020.