Commit d6b28a9b authored by Michael Pyne's avatar Michael Pyne
Browse files

tag_scan: Fix painful rescan of music metadata on startup.

For the longest time, JuK has suffered from a problem where its intended
behavior to load music metadata from a cached database, instead of
re-scanning every individual track on startup, was not working right.

There has been debugging lines in JuK since all the way back to 2013
trying to trace what area of startup sequence was taking up all the
time, but to no avail in helping me fix the bug.

The Problem
===========

Recently I took a different approach, of adding a debug-only crash
whenever we would load a music track tag the "slow" way, and long story
short there were two bugs that each would cause slowdown:

1. Playlists aside from the CollectionList would cause every music track
   in that playlist to be re-scanned. What this means is that every
   though the music in the CollectionList would be loaded quickly, if
   you had that same music track in a separate Playlist, that music
   track would reload the same tags from disk rather than copying from
   the existing CollectionList item.  This was especially bad for users
   of the old "tree mode" view, since every individual artist *and*
   album were rendered as individual playlists, which would therefore
   each re-scan the music over and over again.
2. JuK supports a "folder scan" feature, and in fact really wants the
   user to have at least one folder assigned. Any music identified in
   this folder is added to the CollectionList automatically on startup
   and, you guessed it, causes the music track information to be loaded
   from disk, even if the music was already in the CollectionList! :(

The net effect is that most music would be re-scanned on startup unless
you were a user who used CollectionList exclusively, and had most of
your music not visible to the folder scanner.

The Solution
============

Due to how painful this problem has been, I had ended up adding a
threaded solution for the folder scan process. This didn't help make
things any faster but at least the GUI wasn't frozen. But now that the
threading code is present I judged it would be easier and safer to make
the central object holding track metadata (CollectionList's m_itemsDict)
available in thread-safe fashion.

This then permitted me to check for whether a track has already been
loaded when performing folder scan, and to check whether a track has
already been loaded when creating a new (non-CollectionList) Playlist.
In either event if the track already exists, then we copy the FileHandle
rather than create a new one.

The combination speeds up loading significantly, taking anywhere from
60% to 70% off of the total time to load on my system, with mostly a
CollectionList under folder scan and few additional playlists. In this
configuration I go from about 5.4 seconds to 1.5 seconds with cold
caches. The difference should be even more stark on systems where disk
I/O is expensive, or where there are a great number of tracks in
playlists outside of the CollectionList.

I consider this a bugfix (and there are even multiple bug reports) so I
will backport shortly.

CHANGELOG:Reduce startup time by 60-70% or more.
BUG:428772
BUG:317666
FIXED-IN:21.04
parent 36581b2b
Pipeline #55528 passed with stage
in 11 minutes and 1 second
......@@ -24,18 +24,20 @@
#include <kdirwatch.h>
#include <KLocalizedString>
#include <QList>
#include <QDragMoveEvent>
#include <QDropEvent>
#include <QApplication>
#include <QTimer>
#include <QTime>
#include <QMenu>
#include <QClipboard>
#include <QDragMoveEvent>
#include <QDropEvent>
#include <QElapsedTimer>
#include <QFileInfo>
#include <QHeaderView>
#include <QList>
#include <QMenu>
#include <QReadLocker>
#include <QSaveFile>
#include <QElapsedTimer>
#include <QTime>
#include <QTimer>
#include <QWriteLocker>
#include "playlistcollection.h"
#include "stringshare.h"
......@@ -84,6 +86,8 @@ void CollectionList::loadNextBatchCachedItems()
Cache *cache = Cache::instance();
bool done = false;
QReadLocker lock(&m_itemsDictLock);
for(int i = 0; i < 20; ++i) {
FileHandle cachedItem(cache->loadNextCachedItem());
......@@ -94,7 +98,10 @@ void CollectionList::loadNextBatchCachedItems()
// This may have already been created via a loaded playlist.
if(!m_itemsDict.contains(cachedItem.absFilePath())) {
lock.unlock();
CollectionListItem *newItem = new CollectionListItem(this, cachedItem);
lock.relock();
setupItem(newItem);
}
}
......@@ -149,7 +156,7 @@ CollectionListItem *CollectionList::createItem(const FileHandle &file, QTreeWidg
// It's probably possible to optimize the line below away, but, well, right
// now it's more important to not load duplicate items.
if(m_itemsDict.contains(file.absFilePath()))
if(hasItem(file.absFilePath()))
return nullptr;
CollectionListItem *item = new CollectionListItem(this, file);
......@@ -246,10 +253,14 @@ void CollectionList::saveItemsToCache() const
QDataStream s(&data, QIODevice::WriteOnly);
s.setVersion(QDataStream::Qt_4_3);
QHash<QString, CollectionListItem *>::const_iterator it;
for(it = m_itemsDict.begin(); it != m_itemsDict.end(); ++it) {
s << it.key();
s << (*it)->file();
{ // locked scope
QHash<QString, CollectionListItem *>::const_iterator it;
QWriteLocker lock(&m_itemsDictLock);
for(it = m_itemsDict.begin(); it != m_itemsDict.end(); ++it) {
s << it.key();
s << (*it)->file();
}
}
QDataStream fs(&f);
......@@ -288,12 +299,13 @@ void CollectionList::slotCheckCache()
qCDebug(JUK_LOG) << "Starting to check cached items for consistency";
stopwatch.start();
int i = 0;
foreach(CollectionListItem *item, m_itemsDict) {
if(!item->checkCurrent())
invalidItems.append(item);
if(++i == (m_itemsDict.size() / 2))
qCDebug(JUK_LOG) << "Checkpoint";
{ // locked scope
QWriteLocker lock(&m_itemsDictLock);
for(auto item : qAsConst(m_itemsDict)) {
if(!item->checkCurrent())
invalidItems.append(item);
}
}
clearItems(invalidItems);
......@@ -303,13 +315,15 @@ void CollectionList::slotCheckCache()
void CollectionList::slotRemoveItem(const QString &file)
{
QWriteLocker lock(&m_itemsDictLock);
delete m_itemsDict[file];
}
void CollectionList::slotRefreshItem(const QString &file)
{
if(m_itemsDict[file])
m_itemsDict[file]->refresh();
auto item = lookup(file);
if(item)
item->refresh();
}
////////////////////////////////////////////////////////////////////////////////
......@@ -370,6 +384,30 @@ void CollectionList::dragMoveEvent(QDragMoveEvent *e)
e->setAccepted(false);
}
void CollectionList::addToDict(const QString &file, CollectionListItem *item)
{
QWriteLocker lock(&m_itemsDictLock);
m_itemsDict.insert(file, item);
}
void CollectionList::removeFromDict(const QString &file)
{
QWriteLocker lock(&m_itemsDictLock);
m_itemsDict.remove(file);
}
bool CollectionList::hasItem(const QString &file) const
{
QReadLocker lock(&m_itemsDictLock);
return m_itemsDict.contains(file);
}
CollectionListItem *CollectionList::lookup(const QString &file) const
{
QReadLocker lock(&m_itemsDictLock);
return m_itemsDict.value(file, nullptr);
}
QString CollectionList::addStringToDict(const QString &value, int column)
{
if(column > m_columnTags.count() || value.trimmed().isEmpty())
......@@ -410,11 +448,6 @@ QStringList CollectionList::uniqueSet(UniqueSetType t) const
return m_columnTags[column]->keys();
}
CollectionListItem *CollectionList::lookup(const QString &file) const
{
return m_itemsDict.value(file, nullptr);
}
void CollectionList::removeStringFromDict(const QString &value, int column)
{
if(column > m_columnTags.count() || value.trimmed().isEmpty())
......
......@@ -19,6 +19,7 @@
#include <QHash>
#include <QVector>
#include <QReadWriteLock>
#include "playlist.h"
#include "playlistitem.h"
......@@ -144,8 +145,8 @@ protected:
// These methods are used by CollectionListItem, which is a friend class.
void addToDict(const QString &file, CollectionListItem *item) { m_itemsDict.insert(file, item); }
void removeFromDict(const QString &file) { m_itemsDict.remove(file); }
void addToDict(const QString &file, CollectionListItem *item);
void removeFromDict(const QString &file);
// These methods are also used by CollectionListItem, to manage the
// strings used in generating the unique sets and tree view mode playlists.
......@@ -156,7 +157,7 @@ protected:
void addWatched(const QString &file);
void removeWatched(const QString &file);
virtual bool hasItem(const QString &file) const override { return m_itemsDict.contains(file); }
virtual bool hasItem(const QString &file) const override;
signals:
void signalCollectionChanged();
......@@ -203,6 +204,7 @@ private:
static CollectionList *m_list;
QHash<QString, CollectionListItem *> m_itemsDict;
mutable QReadWriteLock m_itemsDictLock;
KDirWatch *m_dirWatch;
TagCountDicts m_columnTags;
};
......
......@@ -21,6 +21,7 @@
#include <QMutexLocker>
#include "mediafiles.h"
#include "collectionlist.h"
// Classifies files into types for potential loading purposes.
enum class MediaFileType {
......@@ -118,8 +119,15 @@ FileHandle loadMediaFile(const QString &fileName)
static QMutex fhLock;
QMutexLocker locker(&fhLock);
FileHandle loadedMetadata(fileName);
(void) loadedMetadata.tag(); // Ensure tag is read
const auto item = CollectionList::instance()->lookup(fileName);
if(item) {
// Don't re-load a file if it's already loaded once
return FileHandle(item->file());
}
else {
FileHandle loadedMetadata(fileName);
(void) loadedMetadata.tag(); // Ensure tag is read
return loadedMetadata;
return loadedMetadata;
}
}
......@@ -116,8 +116,9 @@ void FileHandle::setFile(const QString &path)
Tag *FileHandle::tag() const
{
if(Q_UNLIKELY(!d->tag))
if(Q_UNLIKELY(!d->tag)) {
d->tag.reset(new Tag(d->absFilePath));
}
return d->tag.data();
}
......@@ -165,6 +166,10 @@ void FileHandle::read(CacheDataStream &s)
switch(s.cacheVersion()) {
case 1:
default:
if(d->tag) {
qCWarning(JUK_LOG) << "We already read in tag for d->absFilePath!";
}
if(!d->tag)
d->tag.reset(new Tag(d->absFilePath, true));
......
......@@ -1084,6 +1084,7 @@ void Playlist::read(QDataStream &s)
s >> files;
QTreeWidgetItem *after = 0;
const CollectionList *clInst = CollectionList::instance();
m_blockDataChanged = true;
......@@ -1091,7 +1092,15 @@ void Playlist::read(QDataStream &s)
if(file.isEmpty())
throw BICStreamException();
after = createItem(FileHandle(file), after);
const auto &cItem = clInst->lookup(file);
if(cItem) {
// Reuse FileHandle so the playlist doesn't force TagLib to read it
// from disk
after = createItem(cItem->file(), after);
}
else {
after = createItem(FileHandle(file), after);
}
}
m_blockDataChanged = false;
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment