Étienne André requested to merge etiandre/kdenlive-audiowaveform:work/audiowaveform into master Dec 11, 2024

Rewrite Audio Waveform generation + drawing

Addresses #1888.

This work is done as part of contract work with KDE e.V.

Summary

Waveform generation performance improvements

The original MLT method has been improved, and a faster libav*-based one is used when possible. The original method used the "audiolevel" MLT filter, which only supported one point per frame and was quite slow.

Time taken to generate audio levels, measured in release builds on my machine (AMD Ryzen 7 3700U with SSD):

original MLT is before these changes, with one point per frame;
new MLT is the improved method, with 5 points per frame;
new libav is the method using libav directly, with 5 points per frame.

file	new libav (s)	new MLT (s)	original MLT (s)
1h 20min of stereo uncompressed WAV	2.477	5.187	8.131
1h 20min of stereo max-compressed FLAC	6.818	9.03	12.513
26min of OPUS audio in a MKV video file	4.774	6.86	8.614

Better waveform resolution

This merge request brings better waveform temporal resolution (1 -> N points per frame) and better vertical resolution (256 -> 65,536 levels).
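To illustrate what the extra vertical resolution buys, here is a minimal sketch that quantizes a normalized peak amplitude to 8-bit and 16-bit levels. The helper `quantizePeak` is hypothetical and is not taken from the Kdenlive code; it only assumes that levels are stored as unsigned integers spanning the full range of their type.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not from Kdenlive): quantize a normalized peak
// amplitude in [0, 1] to the full range of an unsigned level type.
template <typename Level>
Level quantizePeak(double amplitude)
{
    assert(amplitude >= 0.0 && amplitude <= 1.0);
    constexpr double maxLevel = std::numeric_limits<Level>::max();
    return static_cast<Level>(std::lround(amplitude * maxLevel));
}
```

With this sketch, a quiet peak at amplitude 0.001 (about -60 dBFS) rounds to 0 as a `uint8_t` level but to 66 as a `uint16_t` level, so quiet passages that would be drawn flat with 8-bit levels keep some visible detail.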

Before changes (Kdenlive 24.08.3):

After changes:

Fix incorrect waveform drawing function

When drawing the waveform, the current implementation samples the audio levels only at the points corresponding to the pixels being drawn, so any levels that fall between two pixels are ignored. This is incorrect and results in distorted waveforms, missing peaks, and visual artifacts. The new implementation uses a slower but correct max-based resampling method.
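The difference between the two approaches can be sketched as follows. This is an illustrative sketch under simplifying assumptions (a single channel, integer pixel boundaries), not the code in timelinewaveform.cpp; the function names `pointSample` and `maxResample` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Point sampling: each pixel takes the single level found at its start
// position, so peaks located between two pixels are silently dropped.
std::vector<uint16_t> pointSample(const std::vector<uint16_t> &levels, std::size_t pixels)
{
    std::vector<uint16_t> out(pixels, 0);
    if (levels.empty()) {
        return out;
    }
    for (std::size_t px = 0; px < pixels; ++px) {
        out[px] = levels[px * levels.size() / pixels];
    }
    return out;
}

// Max-based resampling: each pixel shows the maximum of all levels it covers.
std::vector<uint16_t> maxResample(const std::vector<uint16_t> &levels, std::size_t pixels)
{
    std::vector<uint16_t> out(pixels, 0);
    if (levels.empty()) {
        return out;
    }
    for (std::size_t px = 0; px < pixels; ++px) {
        const std::size_t first = px * levels.size() / pixels;
        std::size_t last = (px + 1) * levels.size() / pixels;
        // When the waveform is stretched, a pixel may cover less than one
        // level; make sure it still reads at least one.
        last = std::max(last, first + 1);
        out[px] = *std::max_element(levels.begin() + first, levels.begin() + last);
    }
    return out;
}
```

For the levels `{0, 0, 9, 0, 0, 0, 7, 0}` drawn on 2 pixels, `pointSample` returns `{0, 0}` and misses both peaks, while `maxResample` returns `{9, 7}`. The max-based version reads every level instead of one per pixel, which is why it is slower.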

Before changes (Kdenlive 24.08.3):

After changes:

Reference (Audacity):

Stretching a waveform, before changes: stretch-before

Stretching a waveform, after changes: stretch-after

Detailed changes

Change audiolevels sample format to uint16_t for increased precision
Add support for N points per frame, currently set to 5
projectclip.cpp:
- Use TimelineWaveform to render the audio clip thumbnails
- do not store audiolevels in object
- generate larger thumbnails
Change audio max property key from "kdenlive:audio_max%1" to "_kdenlive:audio_max%1"
audiolevelstask.cpp: major refactor
- Replace audiolevels PNG (de)serializer with a dumber one
- add fast libav-based generation
- simplify and improve MLT-based generation
  - disable caching
  - disable resampling
  - remove useless audiochannels filter
  - remove redundant stream selection
  - replace audiolevel filter with direct levels calculation
  - add generic computePeaks() function
kdenliveclipmonitor.qml: fix incorrect clip duration passed to waveform renderer
timelineitems.cpp: separate items into their own source/header files (timelineplayhead, timelinerecwaveform, timelinetriangle, timelinewaveform)
timelinewaveform.cpp: simplify and improve waveform rendering
- add support for N points per frame
- add support for fractional in and out points
- replace incorrect sampling drawing function with correct max-based one
- remove unused properties
- add even and odd color properties
Add tests
- Add tests for computePeaks and both generation methods
- Add audio test files + script to generate them
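As a rough illustration of the "N points per frame" and "direct levels calculation" items above, the sketch below computes peak levels from decoded samples. It is not the `computePeaks()` function from audiolevelstask.cpp, whose signature is not shown here. The name `computePeaksSketch`, its parameters, the mono signed 16-bit input, and the scaling of magnitudes to the `uint16_t` range are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical sketch (not the real computePeaks()): reduce mono signed
// 16-bit samples to pointsPerFrame peak levels per video frame.
std::vector<uint16_t> computePeaksSketch(const std::vector<int16_t> &samples,
                                         std::size_t samplesPerFrame,
                                         std::size_t pointsPerFrame)
{
    assert(samplesPerFrame > 0 && pointsPerFrame > 0);
    // Round up so that a trailing partial frame still gets its points.
    const std::size_t frames = (samples.size() + samplesPerFrame - 1) / samplesPerFrame;
    std::vector<uint16_t> peaks(frames * pointsPerFrame, 0);
    for (std::size_t i = 0; i < samples.size(); ++i) {
        const std::size_t frame = i / samplesPerFrame;
        // Position of this sample inside its frame, mapped to a point index.
        const std::size_t point = (i % samplesPerFrame) * pointsPerFrame / samplesPerFrame;
        // Assumed scaling: |sample| is at most 32768, so doubling it can
        // reach 65536; clamp to the uint16_t maximum.
        const int magnitude = std::abs(static_cast<int>(samples[i]));
        const uint16_t level = static_cast<uint16_t>(std::min(2 * magnitude, 65535));
        uint16_t &slot = peaks[frame * pointsPerFrame + point];
        slot = std::max(slot, level);
    }
    return peaks;
}
```

As an assumed example, 48 kHz audio in a 25 fps project gives 1920 samples per frame, so 5 points per frame means each point holds the peak of 384 samples.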

Edited Dec 14, 2024 by Étienne André

audio waveform (audiolevels) rewrite