Samples for Latent Size Ablation

Introduction

In the following samples, we translate a simple MIDI clip into the Beethoven and Bach domains.

We perform an ablation study on the dimensionality of the latent encoding. As can be heard, a model with a latent dimensionality of 64 tends to reconstruct its input (unwanted memorization).
A model with a latent dimensionality of 8 performs well. A model with a latent dimensionality of 4 is more creative and less closely tied to the input MIDI, but it also suffers from reduced quality.
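
The bottleneck effect described above can be illustrated with a toy sketch. This is not the model used for these samples: it assumes a linear autoencoder (PCA via SVD) as a stand-in, applied to random 64-dimensional data, and the function name `reconstruction_error` and the data shape are hypothetical. It only shows that reconstruction error shrinks as the latent size grows, and that a latent size equal to the input dimensionality lets the model copy its input.

```python
import numpy as np


def reconstruction_error(data, latent_size):
    """Mean squared error after a linear encode/decode through `latent_size` dims.

    Toy stand-in for an autoencoder bottleneck (assumption: PCA, not the
    network used for the audio samples).
    """
    centered = data - data.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:latent_size]
    codes = centered @ basis.T   # encode into the latent space
    recon = codes @ basis        # decode back to the input space
    return float(np.mean((centered - recon) ** 2))


rng = np.random.default_rng(0)
data = rng.normal(size=(256, 64))  # 256 hypothetical frames, 64 features each

# Same latent sizes as in the ablation: 4, 8 and 64.
errors = {k: reconstruction_error(data, k) for k in (4, 8, 64)}
for k, err in errors.items():
    print(f"latent size {k:>2}: reconstruction MSE = {err:.6f}")
```

In this toy setting the latent size of 64 matches the input dimensionality, so the reconstruction is essentially exact, which mirrors the memorization heard in the samples. The smaller bottlenecks are forced to discard information.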

Audio Source

Musical Translation

Latent Size Beethoven Bach
4
8
64