Rewrite audio waveform generation and display code
The code currently used to generate and display audio waveforms is not optimal in terms of speed, memory usage, or accuracy. We could improve this by rewriting it, possibly using another library such as the BBC's audiowaveform: https://github.com/bbc/audiowaveform
For reference, the code currently generating audio waveforms works like this:
- We call the MLT function mlt_audio_calculate_frame_samples on each frame of the source video and build an audio level array (QVector<uint8_t> mltLevels): https://invent.kde.org/multimedia/kdenlive/-/blob/master/src/jobs/audiolevelstask.cpp#L178 (see the first sketch after this list)
- We then save the resulting array as a PNG file, using one row per audio channel and encoding the level values as the RGBA components of each pixel (second sketch below).
- Finally, to display the waveform, we load the PNG file back into a QVector<uint8_t> container and use simple Qt code to draw the wave (third sketch below): https://invent.kde.org/multimedia/kdenlive/-/blob/master/src/timeline2/view/qml/timelineitems.cpp#L176
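For context, here is a minimal sketch of the first step, assuming the mlt++ bindings and a 16-bit sample format. It is a simplification of what audiolevelstask.cpp does; computeLevels is an illustrative name, not Kdenlive's actual API:

```cpp
#include <mlt++/Mlt.h>
#include <QScopedPointer>
#include <QVector>
#include <cstdlib>

// Sketch: walk the producer frame by frame, request each frame's audio
// from MLT, and keep one 8-bit peak level per channel per frame.
// Simplified from audiolevelstask.cpp; error handling is omitted.
QVector<uint8_t> computeLevels(Mlt::Producer &producer, int channels)
{
    QVector<uint8_t> mltLevels;
    const int length = producer.get_length();
    for (int i = 0; i < length; ++i) {
        producer.seek(i);
        QScopedPointer<Mlt::Frame> frame(producer.get_frame());
        if (!frame || !frame->is_valid())
            continue;
        // Ask MLT how many samples belong to the frame at this position.
        int samples = mlt_audio_calculate_frame_samples(
            float(producer.get_fps()), 48000, i);
        mlt_audio_format format = mlt_audio_s16;
        int frequency = 48000;
        int chans = channels;
        const int16_t *data = static_cast<const int16_t *>(
            frame->get_audio(format, frequency, chans, samples));
        if (!data)
            continue;
        for (int c = 0; c < chans; ++c) {
            int peak = 0;
            for (int s = 0; s < samples; ++s)
                peak = qMax(peak, std::abs(int(data[s * chans + c])));
            // Scale the 16-bit peak into the 0..255 range stored in the PNG.
            mltLevels << uint8_t(qMin(255, peak * 255 / 32768));
        }
    }
    return mltLevels;
}
```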
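The PNG encoding step could look roughly like this. A minimal sketch assuming the levels are laid out channel by channel (the real code's layout may differ); saveLevelsAsPng is an illustrative name:

```cpp
#include <QImage>
#include <QString>
#include <QVector>

// Sketch: encode per-channel audio levels into a PNG, one image row per
// channel, packing four consecutive 8-bit levels into the RGBA bytes of
// each pixel. Names and layout are illustrative, not Kdenlive's exact code.
bool saveLevelsAsPng(const QVector<uint8_t> &levels, int channels,
                     const QString &fileName)
{
    const int perChannel = int(levels.size()) / channels;
    // Four levels fit in one RGBA pixel, so round the width up.
    const int width = (perChannel + 3) / 4;
    QImage img(width, channels, QImage::Format_ARGB32);
    img.fill(0);
    for (int c = 0; c < channels; ++c) {
        QRgb *row = reinterpret_cast<QRgb *>(img.scanLine(c));
        for (int x = 0; x < width; ++x) {
            // Fetch the i-th level of this pixel, or 0 past the end.
            auto level = [&](int i) -> uint8_t {
                const int idx = x * 4 + i;
                return idx < perChannel ? levels.at(c * perChannel + idx) : 0;
            };
            row[x] = qRgba(level(0), level(1), level(2), level(3));
        }
    }
    return img.save(fileName, QByteArrayLiteral("PNG").constData());
}
```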
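And the display step, loading the PNG back and painting a simple symmetric waveform. Again a simplified sketch with illustrative names, not the actual timelineitems.cpp code:

```cpp
#include <QImage>
#include <QPainter>
#include <QVector>

// Sketch: decode the PNG back into a flat 8-bit level array.
QVector<uint8_t> loadLevels(const QString &fileName, int channels)
{
    QVector<uint8_t> levels;
    QImage img(fileName);
    if (img.isNull())
        return levels;
    img = img.convertToFormat(QImage::Format_ARGB32);
    for (int c = 0; c < channels && c < img.height(); ++c) {
        const QRgb *row = reinterpret_cast<const QRgb *>(img.constScanLine(c));
        for (int x = 0; x < img.width(); ++x)
            levels << qRed(row[x]) << qGreen(row[x])
                   << qBlue(row[x]) << qAlpha(row[x]);
    }
    return levels;
}

// Sketch: paint one vertical bar per level, mirrored around the mid line.
void drawWaveform(QPainter *painter, const QRect &rect,
                  const QVector<uint8_t> &levels)
{
    if (levels.isEmpty())
        return;
    painter->setPen(QColor(80, 160, 255));
    const double xStep = double(rect.width()) / levels.size();
    const int mid = rect.center().y();
    for (int i = 0; i < levels.size(); ++i) {
        // Map the 0..255 level to a bar height within half the rect.
        const int h = levels.at(i) * rect.height() / 2 / 255;
        const int x = rect.x() + int(i * xStep);
        painter->drawLine(x, mid - h, x, mid + h);
    }
}
```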
The current code uses MLT's audio data, but it shouldn't be a problem if the new solution creates audio thumbnails using a different backend. Most of the thumbnails we create come from video/audio files that FFmpeg can open, and for playlist clips we could export the audio to a standard audio format before generating the thumbnails (see the sketch below).
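As a rough illustration of that fallback path, here is a minimal sketch that shells out to the ffmpeg CLI to export a clip's audio to WAV before thumbnail generation. It assumes an ffmpeg binary is on the PATH; the function name is hypothetical, not existing Kdenlive code:

```cpp
#include <QProcess>
#include <QString>
#include <QStringList>

// Sketch: export a clip's audio to a standard WAV file with ffmpeg, so a
// different backend (e.g. audiowaveform) can build the waveform from it.
bool exportAudioToWav(const QString &source, const QString &target)
{
    QProcess proc;
    proc.start(QStringLiteral("ffmpeg"),
               {QStringLiteral("-y"),          // overwrite existing output
                QStringLiteral("-i"), source,  // input clip or playlist render
                QStringLiteral("-vn"),         // drop the video stream
                QStringLiteral("-acodec"), QStringLiteral("pcm_s16le"),
                target});
    if (!proc.waitForStarted())
        return false;
    proc.waitForFinished(-1);
    return proc.exitStatus() == QProcess::NormalExit && proc.exitCode() == 0;
}
```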
This is how waveforms currently look in the timeline: