No more “unknown” icons

UnknownIcons

In recent versions of Dolphin, the view sometimes looked like this just after entering a directory.

Some of the files and sub-directories have “unknown” icons, which are replaced by the correct icons later.

This will not happen any more in Dolphin 4.11.

Why did we show you “unknown” icons at all in the first place?

Dolphin can show you a lot of information about your files. Some of this information can be obtained quite easily, such as the name, the file size, and the modification time.

However, some other properties of a file or directory might be more expensive to determine. Examples are:

  • The type of a file: if it cannot be determined unambiguously from the file name, we have no choice but to read the contents of the file.
  • The icon that has to be shown for a file depends on the type. A directory icon, on the other hand, can be chosen by the user in the “Properties” dialog. The chosen custom icon is then stored in a file named “.directory” inside the directory, so determining the icon for a directory always requires a disk access (because we don’t know in advance if there is a .directory file or not).
  • If previews are enabled, the preview for each item has to be retrieved from the .thumbnails directory in your home folder. If no cached preview can be found there, it gets worse, and the (possibly large) file must be read from disk to generate a new preview image.
  • Counting the number of files inside a directory also requires a disk access. It is shown, e.g., in the “Size” column in Details View.
  • Meta data provided by Nepomuk.

To make things more complicated, some of these properties can have “preliminary” and “final” values. For example, the preliminary icon for any directory is the generic “folder” icon, but the final icon is only known after checking the contents of the corresponding .directory file.

We obviously cannot determine the final values of all these properties synchronously, i.e., while blocking the application’s main thread, for all files and directories. In large directories, it might take a very long time until this process is finished. Therefore, we have to find a compromise between “show the items in the view as quickly as possible” and “only show information once its final state is known”.

What happened in earlier Dolphin versions

We have a class called KFileItemModelRolesUpdater (.h, .cpp) which is responsible for determining the data that are possibly expensive to retrieve.

When you enter a directory, it tries to determine the final icons for the visible items for up to 200 milliseconds. It then returns to the event loop, such that the view can be painted, mouse events can be processed, etc. The remaining icons and other data are determined asynchronously – first for the visible items in ascending order, then for all invisible ones in random order. We still do it in the main thread, but only for one item at a time, such that it never blocks for too long until the next events can be processed. Moreover, the asynchronous processing of the remaining items is suspended while the view is scrolled. This ensures that scrolling feels smooth.
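
The general pattern behind this – do a limited amount of work, then return to the event loop and continue later – could be sketched roughly like this. This is only an illustration with invented names (m_items, m_nextItem, resolveFinalIcon), not Dolphin’s actual code:

void RolesUpdaterSketch::resolveSomeItems()
{
    QElapsedTimer timer;
    timer.start();

    // Determine final icons until the time budget is used up.
    while (m_nextItem < m_items.count() && timer.elapsed() < 200) {
        resolveFinalIcon(m_items.at(m_nextItem));
        ++m_nextItem;
    }

    // If there is work left, continue after the event loop has had a chance
    // to paint the view and to process input events.
    if (m_nextItem < m_items.count()) {
        QTimer::singleShot(0, this, SLOT(resolveSomeItems()));
    }
}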

This approach has two downsides:

  1. Some “unknown” icons can remain if 200 milliseconds are not sufficient to determine the final icons for all visible items. This can happen easily if there are many directories in the view, whose final icons can only be determined by accessing the contents of the directory.
  2. It wastes resources when viewing large directories. Loading previews for thousands of images (I know from user feedback at bugs.kde.org and forum.kde.org that it is not uncommon to have hundreds of thousands of images in one directory) can easily make the hard drive and the CPU busy for a few minutes, even though it is very unlikely that the user will ever scroll through all these images in the view. Moreover, the previews need a lot of memory.

Improvements in Dolphin 4.11

KFileItemModelRolesUpdater still only tries to load the final icons for 200 milliseconds. However, just before an item is shown on the screen, Dolphin will check if its final icon has been determined already. If that is not the case, it will use the “preliminary” icon, which can be determined very fast. In many cases, the preliminary icons are actually equal to the final ones. Notable exceptions are directories with custom icons, and most of the files in /usr/bin.
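
Conceptually, the decision that is made just before an item is painted could look like the following sketch. The member m_resolvedIcons, the function, and the caching scheme are made up for illustration and are not the actual implementation:

QString iconNameForItem(const KFileItem& item) const
{
    // Use the final icon if it has been determined already.
    const QString finalIcon = m_resolvedIcons.value(item.url().url());
    if (!finalIcon.isEmpty()) {
        return finalIcon;
    }

    // Otherwise, fall back to the preliminary icon, which can be determined
    // very quickly (e.g., from the file name alone, without any disk access).
    return item.iconName();
}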

Some changes in kdelibs were required to make this possible, see this discussion on the mailing list for details about this, and how “Poor Man’s Profiling” helped to solve the problem. Many thanks to David Faure for his help!

Thanks to this modification, Dolphin will never show you “unknown” icons again for items which do have a valid icon.

Moreover, Dolphin will not try to load icons/previews/etc. for all items in the directory asynchronously any more. It will just do that for the items which are currently visible, and those which can be reached quickly by scrolling up or down a few pages or to the top or bottom of the view (up to 500 at a time).

This greatly reduces the amount of memory and CPU cycles that we use to load and show a directory.
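
To make the “only load data for items close to the viewport” idea a bit more concrete, here is a rough sketch of how such an index range could be computed. The function and its details are hypothetical and simplified (the real code also considers the top and bottom of the view), but the 500-item limit corresponds to what was described above:

// Determine the range of item indices for which icons/previews should be
// loaded asynchronously: the visible items plus a limited number of items
// above and below them (at most maxResolvedItems in total).
QPair<int, int> indexRangeToResolve(int firstVisible, int lastVisible, int itemCount)
{
    const int maxResolvedItems = 500;
    const int visibleCount = lastVisible - firstVisible + 1;
    const int extra = qMax(0, (maxResolvedItems - visibleCount) / 2);

    const int first = qMax(0, firstVisible - extra);
    const int last = qMin(itemCount - 1, lastVisible + extra);
    return qMakePair(first, last);
}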

Some more information about the code changes that were required for this can be found in the corresponding review requests: 1, 2, 3, 4, 5, 6.

Some thoughts on algorithmic complexity, part 2: Be careful if you insert into or remove from a std::vector, QVector, or QList in a loop

This is the second part of a series of blog posts. Like in part 1, most or all of what I write here will be obvious to many readers. However, I have seen problems which are equivalent to those that I show here more than once in widely-used code, so I think it makes sense to make people aware of these issues and possible solutions.

If you are familiar with containers like std::vector already, feel free to skip the introduction.

Introduction

Different kinds of data structures, also called containers, are a core part of most applications. Depending on the programming language, various containers are provided by the language itself, by libraries which are shipped with the compiler (like the STL in the case of C++), or by libraries like Qt. Containers like vectors, linked lists, sets, hashes, etc., have quite different properties. Each container has advantages in some situations and disadvantages in others. If you want to know more about these, you might want to have a look at the relevant page in the Qt docs or the “Containers” page at cppreference.com for a general overview, or a Qt Quarterly article and an excellent article by Marc Mutz for some very interesting details.

QVector, QList, and std::vector

In this post, I will focus on containers which store their elements in one contiguous block of memory. These have the following useful properties:

  • The items have a well-defined order (this is not the case, e.g., in a QSet).
  • Each item has an integer index: the item stored at the beginning of the block has the index 0, the next one has the index 1, and so on. Moreover, accessing an item with a certain index is very fast, no matter how many items there are. It takes O(1), or constant, time, in contrast to linked lists, which also have a well-defined order, but require O(N) time to access a particular element.
  • Appending items at the end also needs constant time. The containers reserve some free space at the end to make sure that appending many items one by one will not require an expensive memory reallocation every time. Removing the last item also needs constant time.

However, inserting or removing items at a random position takes O(N) time because all items after the inserted/removed item (on average, that is N/2 items) have to be moved.
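
The following small snippet illustrates these costs for a QVector (a std::vector behaves the same way in this respect); the sizes and indices are arbitrary:

QVector<int> v;

for (int i = 0; i < 1000; ++i) {
    v.append(i);          // amortized O(1): reserved space at the end is used
}

const int value = v.at(500);  // O(1): direct access by index
v.pop_back();                 // O(1): removing the last item moves nothing

v.insert(0, -1);              // O(N): all existing items have to be moved back
v.remove(500);                // O(N): all items behind index 500 have to be moved forward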

Special case: QList

One important difference between QList and the other two containers is that it does not store the items themselves in the contiguous block of memory if an item requires more space than a pointer. In that case, it stores pointers to the actual items in the contiguous block. Moreover, it never uses less than the size of a pointer (4 or 8 bytes on a 32-bit or 64-bit platform, respectively) for an item. These properties can be beneficial in some situations (for example, they enable QLists with different data types to share much of their code in the binary, and they make moving large items inside the list faster), but storing many bools in a QList might waste a lot of memory.

Another difference is that QList can even add items at the beginning or remove the first item in O(1) time because it also reserves some memory at the beginning. When inserting/removing at a random position, it checks whether there are fewer items to the left or to the right and moves the smaller part of the memory block. Therefore, only N/4 items have to be moved on average, but that is still O(N).

Removing many items can be very slow if implemented incorrectly

Let’s assume that we have stored some items of a type T in a QVector, and we have to remove all items for which a function filter(T) returns true.

Simple example: remove all odd numbers from a QVector with many numbers.

remove
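
In code, the task could be set up like this (just a sketch – the vector size and the isOdd helper are arbitrary choices for illustration):

bool isOdd(int i)
{
    return i % 2 != 0;
}

QVector<int> numbers;
for (int i = 1; i <= 100000; ++i) {
    numbers.append(i);
}

// Goal: remove all items for which isOdd(...) returns true,
// i.e., keep only the even numbers.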

The bad solution

A possible solution would be to go through all numbers and remove any odd number that we find:

template<class Container, class Filter>
void remove1(Container& c, Filter filter)
{
    auto it = c.begin();

    while (it != c.end()) {
        if (filter(*it)) {
            it = c.erase(it);
        } else {
            ++it;
        }
    }
}

To test the efficiency of the function, I wrote a small benchmark that creates a sorted QVector containing the integers between 1 and N, and then removes either the first half or the second half of these numbers, or all odd numbers. The results are shown in this plot (the y-axis shows the time in milliseconds):

plot1

It turns out that the function is quite inefficient and has O(N^2) complexity in all test cases. The reason should be clear: every time an item is removed, all items after that one are moved one position to the front. In other words, O(N) items are moved O(N) times, so we get O(N^2) behaviour. What actually happens is shown here:

removeBad

Note that we could fix the “remove many items from the end” case by going through the items in reverse order (remember that removing the last item is fast). However, this would not help much in the other cases.
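
For completeness, such a reverse-order loop could be sketched like this. It makes removing a block of items at the end O(N) in total, but removing, e.g., every other item still moves O(N) items O(N) times:

template<class Container, class Filter>
void removeReverse(Container& c, Filter filter)
{
    for (int i = static_cast<int>(c.size()) - 1; i >= 0; --i) {
        if (filter(c[i])) {
            // If this is the last remaining item, nothing has to be moved.
            c.erase(c.begin() + i);
        }
    }
}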

Good solution 1

Rather than moving all following items one position to the front when an item is removed, we could just replace it with the last item. This guarantees that removing a single item will always cause at most one other item to be moved:

template<class Container, class Filter>
void remove2(Container& c, Filter filter)
{
    auto it = c.begin();
    auto end = c.end();

    while (it != end) {
        if (filter(*it)) {
            --end;
            *it = std::move(*end);
            c.erase(end);
        } else {
            ++it;
        }
    }
}

The new function has linear complexity for all types of test cases:

plot2

The obvious disadvantage is that the relative order of the items is not preserved. However, if the order does not matter, then this solution might be worth considering, in particular if only a few items have to be removed (note that removing a single item is always O(1) with this solution).

Good solution 2

We can achieve linear complexity and still preserve the sort order. The solution is to make sure that every item is moved at most once: we move an item to the front only when its final position is known.

removeGood

Implementing this idea is quite straightforward, but it’s not even necessary, because this is just what std::remove_if does.

template<class Container, class Filter>
void remove3(Container& c, Filter filter)
{
    const auto oldEnd = c.end();
    const auto newEnd = std::remove_if(c.begin(), oldEnd, filter);
    c.erase(newEnd, oldEnd);
}
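
Putting things together, removing the odd numbers from the example vector could then look like this (assuming the numbers vector and the isOdd helper from the setup sketch above):

remove3(numbers, isOdd);

// Equivalently, the well-known erase-remove idiom can be written inline:
numbers.erase(std::remove_if(numbers.begin(), numbers.end(), isOdd),
              numbers.end());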

In our test cases, this is faster than the “replace by the last item” solution by a factor of 2:

plot3

Final remarks

  • If you insert many items into an existing std::vector, QVector, or QList, similar considerations apply (see the sketch after this list).
  • Such problems really do exist in the software you use, even though they might not cause any serious trouble as long as you don’t let applications handle a large data set that triggers the worst case behaviour. I’ve fixed a few such problems in Dolphin recently, and I will tell you about more problems of this kind in the next post.
  • Thanks for all the interesting feedback that I got after my last post! In particular, I would like to mention Emilio Coppa from the University of Rome here. He works on aprof, a Valgrind-based tool that can help to find inefficient functions like those that I presented here. Just run your program inside this tool and make sure that all interesting functions are fed with various amounts of input data. I ran my benchmark in aprof (I just made it consider a larger set of different vector sizes to make the output look nicer and decreased the overall size of the vectors to reduce the runtime on Valgrind’s simulated CPU). Here is the visualization of the output for the function remove1:
    aprof-remove1
    And the same for remove2:
    aprof-remove2
    Note that you don’t have to write a benchmark at all to get such results with aprof. You can just run your application normally, make sure that it handles inputs with different sizes, and then you can spot the inefficient functions quite quickly (provided that you trigger the inefficient behavior of these functions). If you want to know how the tool works, have a look at this paper. It found a bug in the word frequency counter “wf” that was basically just the strlen issue that I discussed in my last post.
  • Just to prevent misunderstandings: I’m not saying that algorithms with O(N^2) worst-case complexity are always evil. If it is guaranteed that N will never get really large or that the input data will never trigger a bad worst-case behaviour, then there may be reasons why such an algorithm might be preferable.
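
Regarding the insertions mentioned in the first remark: inserting M new items one by one at their final positions in a sorted vector moves O(N) items per insertion, just like in the removal example. One way to avoid this is to append everything first and merge once. This is only a sketch to illustrate the idea; the function name is mine:

// Insert the (unsorted) newItems into the sorted vector c, keeping c sorted.
// Appending and merging needs O(N + M log M) time instead of O(N * M).
void insertManySorted(QVector<int>& c, const QVector<int>& newItems)
{
    const int oldSize = c.size();
    c += newItems;                                               // append everything: O(M)
    std::sort(c.begin() + oldSize, c.end());                     // sort only the new part: O(M log M)
    std::inplace_merge(c.begin(), c.begin() + oldSize, c.end()); // merge the two sorted parts: O(N + M)
}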

Dolphin bugs fixed in June 2013

The last bug fix update for Dolphin 2.2 has been shipped recently. Here is a summary of the changes that you can enjoy if you upgrade:

  • Bug 320823: Rename the right item when items are added to the view while renaming. See git commit 4de9a233, review request 110909.
  • Bug 319912: Do not rename files unexpectedly when changing the URL while renaming. See git commit 5948dc0f, review request 110908.
  • Bug 310705, bug 312014: Ensure that switching to “Sort by Type” takes effect without reloading the view. See git commit 735c046d, review request 111004.
  • Bug 320897: Prevent the rubber band selection rectangle from getting a width or height of zero pixels. See git commit 22d93e8f, review request 111144.
  • Bug 321234: Allow renaming multiple files without providing one or more ‘#’ as a placeholder for a number if all extensions are different. This was an “accidental” feature that we were not aware of at all until a user reported that it had been broken by a recent bug fix. See git commit 6f2e1403, review request 111079.
  • Bug 316209: Prevent possible infinite recursion in the ViewProperties constructor, which caused crashes on MacOS. See git commit 04229f8f, review request 111182.
  • Bug 307336: When dragging a file from a search result to KMail, use the correct URL. See git commit c847aacf, review request 111209.
  • Bug 321198: Make searching recent files work correctly. See git commit 25a2255e.

Moreover, Dawit Alemayehu, who has fixed an amazing number of kdelibs bugs that affect file management, made two more commits last month which:

These improvements were contributed and/or tested by Dawit Alemayehu, Emmanuel Pescosta, Kurt Hindenburg, and Vishesh Handa. Thanks for your help! If I forgot anyone, please let me know.

Unfortunately, I’m unable to attend Akademy, but I hope that everyone who is in Bilbao has a good time there!

Dolphin bugs fixed in May 2013

Again, we have a new bug fix release. Here are the bug fixes that Dolphin users who upgrade can benefit from:

  • Bug 304918: Update the ‘Recently Accessed’ and ‘Search For’ sections of the Places Panel when Nepomuk is started or stopped. See git commit 24609e78, review request 110323.
  • Bug 255819: Ensure that manual changes of the icon for the ‘Sort By’ toolbar button are remembered. See git commit f12345a5, review request 109966.
  • Bug 319660: When dropping the trash in a directory, create a link, rather than copying the trash contents to that directory. See git commit 6072f58c.
  • Bug 319747: Fix crash when copying many files. See git commit 1b9726e2.
  • Bug 315796: Make it possible to change the parameters of ‘read only’ search queries. See git commit 2c68fe33, review request 110324.
  • Bug 309740: Prevent files in expanded folders from being shown with a size of 16 EiB while they are being changed. See git commit 56cf52db, review request 110428.
  • Bug 299675: Fix problems when moving files out of or into expanded folders in Details View. See git commit a79c3a39, review request 110401.
  • Bug 319336: Fix a crash that could happen when editing the tags of a file. See git commit 2080bc1d.

Thanks to Daniel Faust, David Faure, and Vishesh Handa!

Some Thoughts On Algorithmic Complexity, Part 1

When people develop and maintain software, they usually try to make sure that their code not only works well, but also that it runs fast. There are many different ways to make code perform better. It can be as simple as choosing reasonable compiler options, or as hard as rewriting some code in assembly.

If you see some code that takes a long time to run, you should not start optimizing by looking into the little details, though. The first questions that you should ask are: are we actually using the right algorithm for the job? Are there any design mistakes that make things much worse than they need to be? This is especially important if you don’t know in advance how much input the code will have to handle. It should be obvious that it makes a difference if a list that needs to be sorted contains 1000 or 1 million items. However, how big that difference actually is depends on the choice of the algorithm and its correct implementation.

If you are not interested in mathematics or theoretical computer science, feel free to skip the next sections. Maybe the example programs at the end of the post will be interesting for you nonetheless, at least if you are fairly familiar with programming.

Big O notation

The exact time f(N), measured in milliseconds, CPU cycles, or some other suitable unit, that an algorithm requires if it is given an input of size N (N could be the length of a list or the length of a string, for example) is often less interesting than the growth rate, i.e., how fast that time grows if N becomes larger and larger. Is it constant, or does it increase logarithmically, linearly, or even worse?

To classify the so-called run-time complexity of an algorithm, one often uses the big-O notation. It can be used to provide an upper limit for f(N) in a rather simple way, even if writing down f(N) exactly would yield a very complex expression. The idea is to use a simple function g(N), and say that

f(N) = O(g(N))

if there is a constant c such that

f(N) ≤ c · g(N)

for any N, i.e., if this constant times g(N) is an upper bound for f(N). Simple example: if

f(N) = 4N + log2(N) + 2,

we could write f(N) = O(N) because the linear term 4N is the one which grows fastest, such that the total value f(N) is always smaller than some constant times N.

Typical orders (of growth) that often occur in algorithms in the software that we use include O(1), O(log N), O(N), and O(N log N).

I’ll give some examples for situations where these occur:

O(1) constant: a simple example in the Dolphin context would be showing the files and folders on the screen. Even if you have 100,000 items in a directory, only a small, constant number of them can be shown at the same time, so the effort required to draw them should not depend on N at all.
O(log N) logarithmic: the classical example is binary search. If you are given a sorted list of N items, and you have to find a particular item, you might start by looking at the element in the middle. This tells you whether you have to look in the first or in the second half of the list. It is easy to show that applying this procedure repeatedly will yield the right item in log2(N) steps (see the sketch below).
O(N) linear: a Dolphin example would be loading the data for all files in a directory (like the name, size, and other things that we might want to show in the view).
O(N log N): if a good sorting algorithm is used to sort a list of N items, this is the typical number of comparisons of two items that are required, and hence, also a measure of the time that will be needed to sort the list.
O(N^2) quadratic: this is something that you really want to avoid, unless you can guarantee that you will never have to handle a really large number of items, or there really is no better solution for your problem. If you are not sure whether a better solution exists, you should look for one. Really.
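
As an illustration of the logarithmic case, here is a sketch of a binary search in a sorted QVector. The function is written out only to show the idea – in real code, one would rather use std::lower_bound or qBinaryFind:

// Returns the index of 'value' in the sorted vector 'v', or -1 if it is not found.
// Every step halves the remaining search range, so at most about log2(N) steps are needed.
int binarySearch(const QVector<int>& v, int value)
{
    int first = 0;
    int last = v.size() - 1;

    while (first <= last) {
        const int middle = first + (last - first) / 2;
        if (v.at(middle) == value) {
            return middle;
        } else if (v.at(middle) < value) {
            first = middle + 1;   // continue searching in the second half
        } else {
            last = middle - 1;    // continue searching in the first half
        }
    }
    return -1;
}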

Best case, average case, and worst case

Things are often a bit more complicated: the run-time complexity of an algorithm can depend on the actual input data. For example, certain sorting algorithms might work better if the input data are (almost) sorted, and worse for certain other kinds of data.

Therefore, one often states the best case, average case, and worst case run-time complexity of an algorithm. Which of these is most important depends strongly on what you know in advance about the input data. If they are always in a certain form, an algorithm that has a good best-case performance just for that kind of data might be preferable. On the other hand, if the input data might come from an untrusted source, ensuring a reasonable worst-case performance should have the highest priority if you want to prevent you and your users from becoming the target of algorithmic complexity attacks.

Why am I telling you all this?

This is an obvious question that you might ask now. Everyone who develops software should be familiar with those things already, right? Well, it turns out that a lot of real-world code does have a worst-case (or, sometimes, even average-case) run-time complexity that is worse than it needs to be, either because the wrong algorithm has been chosen, or because a tiny mistake in the implementation makes things go terribly wrong if a certain kind of input is fed into the algorithm. This is why I thought it might be a good idea to write a series of blog posts to raise people’s awareness of these issues.

Example

Let’s start with a simple example, and assume that we have to write a function that replaces all occurrences of a certain character in a plain C string by another character. E.g., replace ‘u’ by ‘o’ in the string ‘algurithm’ to obtain the right spelling ‘algorithm’. Let us also ignore that we don’t actually use plain null-terminated C strings much in KDE/Qt code, and that solutions for this problem exist already.

Any algorithm to solve this problem will have to go through all N characters of the string once, and look at and possibly replace each character. So we would expect that the run-time complexity is O(N).

This is a possible solution that we could come up with:

void replace1(char* text, char oldChar, char newChar)
{
    for (unsigned int i = 0; i < strlen(text); ++i) {
        if (text[i] == oldChar) {
            text[i] = newChar;
        }
    }
}

Some will already see the flaw in this implementation, but I assume that many would just verify that this function works nicely for a few examples and then happily use it. It might even work OK in real world applications for some time, that is, until someone uses this function to handle a very long string.

A small benchmark that tests this function for strings with lengths between 100,000 and 800,000 characters can be found in my personal KDE scratch repository. The following plot shows the time in milliseconds as function of the input size (plots were generated with qtestlib-tools).

1

When you see this plot, you immediately realize that the time depends quadratically on the input size. So our algorithm has a run-time complexity of O(N^2), rather than the expected O(N). This results in an unacceptable run time of 8 seconds just for going through 800,000 characters and replacing some of them! If you are doing something like this in the main thread of an application, the GUI will be completely frozen during that time.

What went wrong?

The problem is the condition ‘i < strlen(text)’ in the for loop. This condition is evaluated before each iteration of the loop, i.e., N times. Moreover, strlen(text) must go through the entire string, because it looks for the terminating null character, which defines the end of the string. This means that the O(N) function strlen(const char*) is executed N times. Now it is obvious why our function is O(N^2).

Correcting the mistake

Here is a better version of the function, and benchmark results that show that it needs only linear time. Note that the time to handle 800,000 characters has been reduced from 8 seconds to 1.5 milliseconds:

void replace2(char* text, char oldChar, char newChar)
{
    const unsigned int length = strlen(text);
    for (unsigned int i = 0; i < length; ++i) {
        if (text[i] == oldChar) {
            text[i] = newChar;
        }
    }
}

2

What about the compiler?

Maybe some readers will now ask if an optimizing compiler shouldn’t be able to see that calculating the length of the string repeatedly is unnecessary, and thus optimize the first version of our code such that it becomes O(N). It turns out that such an optimization is impossible because using our function to replace any character by the terminating null character ‘\0’ will actually change the length of the string inside the loop.
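
To see why, consider what happens if replace1 is used to insert a null character (a tiny, purely illustrative example):

char text[] = "algorithms are fun";

// Replacing ' ' by '\0' cuts the string off at the first space, so
// strlen(text) drops from 18 to 10 while the loop is running. The compiler
// therefore cannot simply move the strlen() call out of the loop.
replace1(text, ' ', '\0');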

Do professional developers really write code which has such problems?

Yes, they do. According to an article that I once read on the web (I cannot find it at the moment, unfortunately – if you do, please add a comment!), the “strlen in for loop condition” issue once caused huge problems in the computer system of a bank. Moreover, there is still a lot of code out there which has similar problems, even if they are sometimes less obvious. In the next posts in this series, I’ll tell you about some further examples, which will be fixed in KDE 4.11.

Dolphin bugs fixed in April 2013

Here are the bugs which were fixed in Dolphin for last week’s release:

  • Aurélien’s efforts to make our hover animation as good as possible were not over yet: one little bug remained after his recent fixes. For details, see git commit c3a7ba4e, review request 109960.
  • Bug 316129: When entering a directory, load the icons for the visible items first, rather than loading all icons in random order. See git commit 52a38ee9, review request 109843.
  • Bug 317827: Fix possible crash when filtering items. See git commit 34d0ad72.
  • Bug 316285: Fix crash in the accessibility code. See git commit f8bf2577.
  • Bug 317772: Disable the non-working Find/Replace actions when renaming inline. See git commit a16562cd.
  • Bug 193298: Make sure that the correct free space for the current device is shown when viewing a folder with a name whose beginning coincides with a mount point. See git commit 4f2ecbab, review request 110225.
  • Bug 318942: When renaming multiple files, make sure that the “Rename” button is enabled if and only if there is exactly one connected sequence of ‘#’ symbols. See git commit cbb2a4cf, review request 110223.
  • Bug 305734: Fix the problem that selected hidden items are hard to see with certain color schemes. See git commit 572513a1, review request 110164.
  • Bug 302037: Prevent the view URL from changing unexpectedly when the Terminal Panel is enabled and the current path is a symbolic link to a directory. See git commit 2e123de9, review request 110233.

Thanks for providing patches, helpful analysis and testing proposed fixes: Aurélien Gâteau, David Faure, Emmanuel Pescosta, and Jekyll Wu.