A brief history of Dolphin’s performance and memory usage

Since I took over Dolphin’s maintainership almost a year and a half ago, we have not only fixed lots of bugs and improved many details which make Dolphin more pleasant to use, but also made quite a few changes under the hood to make Dolphin faster and less memory-hungry. The other day, I compiled a few different Dolphin versions and ran some benchmarks to see how much has changed during the past 18 months.

The most straightforward way to get an idea of an application’s performance is to see how it can handle a large amount of data, so I created 100,000 empty files in a new directory using

touch a{1..100000}.txt

I then built the tags v4.9.0, v.4.10.0, v.4.11.0, and the KDE/4.12 branch (to be precise, commit f863968b7) of the kde-baseapps repository in release mode, and checked how long it takes to open the directory with my rather old computer (Athlon 64 X2 5000+ CPU) in all view modes: Icons, Compact, and Details. For all my tests, I used the same recent KDE/4.12 checkout of kdelibs (commit 3028cc4b6). I opened the directory in Dolphin once and then closed the application before starting the measurements, to ensure that the files are cached by the kernel, and that slow disk reads with non-reproducible access times do not add noise to the results.

The time required to load the directory is measured and written to the shell that Dolphin is run from with a simple patch. After the initial loading, I pressed F5 eight times and calculated the mean value of all results to reduce random fluctuations. Here are the results:

performanceMoreover, I measured with KSysGuard how much memory the Dolphin process uses. The memory usage fluctuates less than the time which is required for loading the directory, but still, I started Dolphin five times and calculated the mean value.

memoryI hope that you like these results as much as I do! All Dolphin users should see similar relative savings, but the absolute values might obviously be different. E.g., the memory usage will be much lower on a 32-bit system because all pointers need only 4 bytes instead of 8.

I will tell you in a minute how these savings were achieved, but let me first make some remarks, which will also answer the questions that some readers might already have.

  • The timer whose measurements the first plot is based on is started just before the KDirLister’s openUrl(const KUrl&) method is called, and stopped when the files are shown on the screen. In v4.9.0 and v4.10.0, the CPU and disk were still busy for quite some time after that, because Dolphin versions before 4.11.0 loaded the icons for all files in the background (some details can be found in a recent blog post). In v4.10.0, it took more than two minutes on my system until this process was completed. During these two minutes, the memory consumption grew very slowly, but constantly, until it reached the final value which is shown in the second plot. In v4.9.0, that took even much longer due to some inefficiencies which were fixed before v4.10.0 (for example this one). Therefore, it was not really possible to perform detailed memory usage measurements for v4.9.0. I just took a single data point and found that there is not a huge difference between v4.9.0 and v4.10.0.
  • The reason why Details View often requires more memory than the other view modes is that the correct handling of expanded folders requires some additional data, and earlier Dolphin versions initialized the corresponding data structures even if it was not clear yet if these data would be needed or not. This is also the reason why Details View was the slowest view mode in v4.9.0 and v4.10.0.
  • In Icons and Compact View, more time is spent on the item layout because the height of a row (in Icons View) or the width of a column (in Compact View) depends on the file names. In Icons View, we calculate how many lines of text will be shown under each icon, and in Compact View, we calculate the width of the text (which is slightly cheaper because no line breaks have to be taken into account). This information is needed to initialize the scroll bar correctly. We might consider using guessed values for the files far from the ones which are visible on the screen, but ensuring that this works correctly in all cases when scrolling in the view, and no subtle bugs occur when items which had their on-screen size guessed suddenly become visible, and the maximum value of the scroll bar needs to be updated, is not trivial.

Now I will summarize some of the main ideas that were used to achieve these results. Maybe some of them are also interesting for other applications and libraries.

Dolphin 2.2/KDE SC 4.10.0

  • Emmanuel Pescosta speeded up sorting by modifying our Merge Sort implementation such that it makes use of multiple CPU cores.
  • He also modified a function that is responsible for loading icons: all recently loaded icons are now cached, which reduces the CPU and memory usage if there are many items shown on the screen which share the same icon.

Dolphin 4.11.0

Dolphin 4.12.0 (not yet released)

  • The view receives all data for each item from the class KFileItemModel in a QHash<QByteArray, QVariant>. The QByteArray key is something like “text”, “size”, “iconName”, “iconPixmap”, etc., and QVariant is a very flexible data type that can wrap, e.g., a QString or a QPixmap. In earlier Dolphin versions, KFileItemModel initialized this QHash for every item in the current directory before anything was shown in the view. This took quite a bit of time and memory – there is a Qt Quarterly article which visualizes the overhead that QHash adds to the actual data that it stores. In most cases, we now initialize this QHash lazily, i.e., only just before the item that it belongs to is really shown on the screen. The corresponding patch could only work because some code that relied on the QHash always being initialized had also been modified (1, 2, 3).
  • Even if we have to initialize the QHash for an item, we do not store “default” values any more. One example: the QHash stores, among other things, the information if the item is “expandable”, i.e., if it is a non-empty folder that can be expanded in Details View. However, if we have a
    QHash<QByteArray, QVariant> hash

    that does not contain the key “isExpandable”, then

    hash.value("isExpandable").toBool()

    will return false. Therefore, it is not necessary to store the value “false” at all.

    Similarly, storing other kinds of data in the QHash which are equal to the default values returned by QHash::value(key) is not necessary.

  • Sorting the files “naturally”, i.e., sorting them such that the order in the view is
    file1.txt
    file2.txt
    ...
    file9.txt
    file10.txt

    rather than

    file1.txt
    file10.txt
    file2.txt
    ...

    is quite time-consuming because the string comparison function

    KStringHandler::naturalCompare(const QString &a, const QString &b)

    essentially has to chop the two strings a and b into little “number” and “text” chunks, which are then compared in a locale-aware way.

    Qt 5 provides a class QCollator (public API since Qt 5.2 thanks to the awesome work of Aleix Pol), which can be used to pre-compute a “sort key” for every QString in advance. Comparing two “sort keys” is much faster than comparing the QStrings in a natural and locale-aware fashion. This means that O(N log(N)) expensive comparisons are replaced by O(N) “sort key” calculations, which might be expensive, and O(N log(N)) cheap comparisons.

    However, it turned out that we can improve the sorting performance even while we are still using Qt 4.8. The key observations are:

    1. Sorting the items with QString’s operator “<” as the comparison function is very fast.
    2. After this fast sorting, the order of the items is not totally correct, but most of the time, the order is quite close to the correct one.
    3. The number of comparisons that our Merge Sort implementation needs is greatly reduced if the order of the items is “mostly correct”.

    Therefore, we now sort the items twice: first using QString’s operator “<“, and then using KStringHandler::naturalCompare. This simple change reduced the time required for sorting by 63% in my tests.

    Some readers will probably have noted that there is a sorting algorithm which is designed to make use of input data which are “mostly sorted” already, and which might therefore improve the performance even further: Timsort, which is used, e.g, as the default sort algorithm in Python (1, 2). Maybe we will use it in future Dolphin releases.

  • In the class KItemListViewLayouter, which is responsible for the layout of the items in the view, we calculated and stored some redundant data for all files in the current directory: both the size of an item (i.e., how much space the icon and text need on the screen, that is 2 numbers for width and height), and the rectangle that contains the item (storing a rectangle requires 4 numbers). Before an item was shown on the screen, the rectangle was adjusted according to the current value of the scroll bar(s). Now we only store the size and the position of the top-left corner of the rectangle. The final rectangle can easily be constructed from these once it is necessary.

Outlook

I still have some ideas for improvements inside Dolphin, but as Mark Gaiser pointed out in his comment to a review request, the major remaining consumer of huge amounts of memory is the class UDSEntry from kdelibs. I assume that few people know this class, but it’s being used indirectly by every application that uses the KIO framework to access directory contents either directly or indirectly (e.g., via the “Open/Save File” dialog).

UDSEntry is used to transmit all kinds of information about files from kioslaves (which are the helper applications that read files from disk, or access remote files via sftp, smb, etc.) to the applications. A first optimization which is already in the master branch of kdelibs is to make use of Milian Wolff’s implicit sharing trick once more. The idea is that the user name and the group are very often the same for many files in a directory. Therefore, the function that receives the UDSEntries from the kioslave now checks if, e.g., the user name for a file is the same as the one for the previous file. If that is the case, the same implicitly shared QString is used for both. If all files in a directory belong to the same user, the same implicitly shared QString will be used for all.

In the benchmarks I discussed at the beginning of this post, there were 100,000 copies of my user name, and 100,000 copies of the group name “users” in memory – it should be obvious that this is not really efficient.

I will discuss further possibilities to optimize UDSEntry in a future blog post. For the time being, here are the raw data for the plots above, including the benchmark results for the master branches of kde-baseapps and kdelibs, which include the first UDSEntry optimization:results

Conclusion

I hope that I made clear what the key ideas are, and that these might be useful also for some readers of this post:

  • Do not determine large amounts of data in advance if it is likely that most of these data will never be needed.
  • Do not store redundant data: think twice if everything that is stored in memory is really needed, or if there is another solution that can achieve the same with less memory.
  • Make use of implicit sharing: if there are many QStrings, QByteArrays, or other implicitly shared data in an application, many of which are likely to be equal, check if they can share their memory.
  • If some calculation takes very long, try to find a more efficient algorithm.

Finally, I would like to thank everyone who helped to make these improvements possible. In particular, I would like to mention Mark Gaiser, who loves to stress-test file managers with half a million files and share his findings on our mailing list. Without his challenges, some of the work presented here might not have happened.

About these ads

19 thoughts on “A brief history of Dolphin’s performance and memory usage

  1. Anon

    Wow – awesome work, and an awesome blog post to match! This kind of unglamorous optimisation work is very much appreciated – thanks!

    1. Danny

      Wow indeed. When I first started using openSuse, and thus Linux as well the first of this year, I decided to go with Gnome because I like its Mac like interface much better. I use both Macs and Windows computers at work and at home since the late 80’s and have always like the Mac interface much better. However, after having issues with Gnome running KDE and QT based apps I gave KDE a try as my openSuse GUI. First of all….I did NOT find the issues with Gnome apps running under KDE as I did KDE apps under Gnome. HOWEVER, I did notice that KDE booted up and was a bit slower than GNOME in 12.3 as well as Dolphin performance. This seemed to improve in 13.1. I can’t wait to see these further improvements in 13.3 next year. Although I STILL prefer Gnome from a look and feel standpoint, I have grown to like the stability and improving speed of KDE and Dolphin. It just feels better engineered. Keep up the good work to all involved with KDE and Dolphin.

  2. Visitor

    Hi!
    It’s very nice to see Dolphin improving over the time, and its developers taking this performance aspect seriously! Congrats!
    But IMO when it comes to performance issues regarding a file manager, I think most people is more concerned about the time it takes for Dolphin (or the same way, for to the {open,save}-dialog) to show up when required (i.e. startup time).
    Sadly on this subjective metric I can hardly say that Dolphin rocks. I expect my file manager to appear instantaneously when requested (in the KDE3 days this was achieved by a always-hiding instance of konqueror running under the hood and ready to pop-up on demand, don’t know if still applies in 4, doesn’t look so…), but it’s clearly not the case on my machines, and it’s even worse for the time it takes to show the open/save dialog (what is actually a productivity killer).
    Do you plan to look into this next ?

    1. freininghaus Post author

      Several years ago, all Dolphin windows shared the same process. Opening a new window was very fast then because the existing process could just be reused. Peter changed this for KDE SC 4.7 and made every window use its own process:

      http://quickgit.kde.org/?p=kde-baseapps.git&a=commit&h=fff7573ebb910712ad97951bf1762e6a7bb0bdc7

      This was done mostly because Dolphin was using some buggy libraries at the time, which caused frequent crashes. Every crash of the ‘dolphin’ process would make all windows disappear. Moreover, there were problems when the KIO framework, which is used for the interaction with files and folders, showed a dialog like “File exists already – what do you want to do?”. These dialogs are application-modal, such that they froze all Dolphin windows.

      The “Dolphin crashes frequently” issue is mostly resolved now. If anyone volunteers to make the KIO dialogs non-modal, then I don’t see why we couldn’t go back to the “single process” model.

      There were some discussions about this issue on the mailing list recently:

      http://lists.kde.org/?t=137537235900004&r=1&w=2

      I can’t really say much about the “File” dialog in other KDE applications. There might be ways to make it load faster, but I’m far too busy with Dolphin to look into that.

      > Do you plan to look into this next?

      The great thing about free software is that *everyone* can help to improve it!

      1. Dawit A.

        > This was done mostly because Dolphin was using some buggy libraries at the
        > time, which caused frequent crashes. Every crash of the ‘dolphin’ process
        > would make all windows disappear. Moreover, there were problems when the
        > KIO framework, which is used for the interaction with files and folders, showed a
        > dialog like “File exists already – what do you want to do?”. These dialogs are ?
        > application-modal, such that they froze all Dolphin windows.
        >
        > The “Dolphin crashes frequently” issue is mostly resolved now. If anyone
        > volunteers to make the KIO dialogs non-modal, then I don’t see why we couldn’t
        > go back to the “single process” model.

        Well. It is easy enough to change these dialogs to be WindowModal dialogs because we can now fix it in one single location, KIO::JobUiDelegate::requestMessageBox.

        The problem I have with such change is how we are supposed to deal with a scenario where the user, say by accident, takes the same action from two Dolphin windows that result in the “File exists already – what do you want to do?” prompt?

        1. freininghaus Post author

          Thanks for your comment Dawit – I didn’t know that it’s so simple to change it!

          About the “same action from two windows” issue: my first naive idea would be to show one “File exists” dialog for each window. If the user takes an action in one of the dialogs that results in the original file being moved or renamed, then taking an action in the second dialog would probably just result in KIO noticing that the original file is gone, and show an error message, right? This is just what happens now if you try to rename or move the same file from two different Dolphin or Konqueror windows, which live in separate processes.

          Am I overlooking something? Do you expect any problems in this use case because both KIO jobs and “File exists” dialogs belong to the same process?

          BTW, I’m not even sure if the dialogs should be modal at all. Considering that KIO’s jobs are asynchronous, it’s not even guaranteed that the window still shows the folder where the action took place, so I’m not sure if it makes much sense to freeze that window while the dialog is shown.

  3. Martin Gräßlin

    I really like how you documented the exact steps to do the benchmark. Given the very good description it would be possible to reproduce the results (assuming one has the same hardware). If just everybody would do it in such a nice way (/me looks at certain web portals specialised on putting out benchmarks).

    1. freininghaus Post author

      To be honest, my description of the steps is still not complete ;-). Other things that the results might depend on to some extent are:

      (a) The full path to the test folder, because it affects the length of the KUrls which are stored in memory and then hashed with qHash(const KUrl&) in some places.

      (b) The user name (as I said, all kdelibs/KIO versions except the current master and frameworks branches will store the user name N times in memory when reading a directory with N files).

      (c) The order in which the files appear on disk. Perhaps surprisingly, “touch a{1..100000}.txt” seems to create the files in random order.

      (d) Probably many more things.

      Therefore, I think that one can only compare benchmark results if they have been obtained on the same machine, and with exactly the same test data.

  4. Mark Gaiser

    I really like blog :)
    Yes, i like to give dolphin a really hard time by feeding it a folder that contains 100.000, 500.000 or sometimes even a million files. It’s amazing where profiling brings you with folders that large,

    This actually motivates me to do some more UDSEntry experimental work. We’ll see where that ends up.

    1. freininghaus Post author

      > I really like ths blog :)

      I’m glad to hear that :)

      > This actually motivates me to do some more UDSEntry experimental work. We’ll see where that ends up.

      I’m currently planning to split the latest version of

      https://git.reviewboard.kde.org/r/113355/

      into independent parts and create review requests for KIO once the repositories are split in the frameworks branch. If you can make UDSEntry even more efficient, I’d like to hear about it!

      1. Mark Gaiser

        I probably can’t in it’s current structure since you’ve done an outstanding job in getting it as optimized as possible.

        Oh well, looking around for possible different ways of storing the same data is what i’m doing now. Don’t count on anything though, it’s very hard to beat the current optimized UDSEntry with your patches applied.

  5. trapanator

    Awesome blog post, one of the most beautiful I’ve read in these times.

    You are doing an amazing job improving Dolphin, the most important KDE application.

  6. Peppe

    Your work is really astonishing!
    Thank you guys for making Dolphin such a beautiful app and for sharing your ideas!

    I always get inspired by your methods :)

  7. Jon Mease

    Great work and really informative benchmarks. Thanks for taking the time to benchmark and explain these low-level improvements. It’s great to see how core applications like dolphin getting better and better!

  8. emmanuel

    It’s awesome to see how Dolphin has evolved over time. :)

    Thanks for this great blog post and for the great work you’re doing as maintainer and developer.

  9. Mike Linksvayer

    Just wanted to say thank you. I enjoyed reading this post, enjoy KDE subjectively being more performant on my old hardware over the last year, and enjoy seeing my subjective experience correlated with numbers. :-)

Comments are closed.