While refactoring KSelectionProxyModel I’ve been trying to address the problem that it’s really hard to implement QAbstractProxyModel.
There’s just so much API to implement. The entire QAbstractItemModel API needs to be implemented along with listener API for the source model (eg, sourceRowsAboutToBeInserted/sourceRowsInserted) along with mapToSource and mapFromSource for the proxy model and the internal mapping structures necessary for that.
The requirement of an internal mapping structure is that it must be possible to map objects representing QModelIndexes in the sourceModel to QModelIndexes in the proxyModel itself and vice-versa. In the refactor-in-progress KSelectionProxyModel I was until recently using a QHash to map parent indexes in the source model to parent indexes in the proxy model. The QHash gives O(1) time complexity for lookups, so mapping from the source model to the proxy is fast.
However, mapping from the proxy model to the source model required a call to QHash::key() to get the key given a particular value. That is a O(N) call. Initially I considered adding another mapping in the other direction to the private class, and maintaining both within the proxy implementation. That would be error prone however and not reusable.
The reason for this refactoring work is to make reusable components to use to implement proxy models with. So I needed a class to implement an efficient mapping in both directions. I implemented KBiHash which provided an opportunity to try out the massif-visualiser Milian has been writing and the QTestLib-Tools.
KBiHash maintains two symmetric QHash objects
KBiHash maintains internally a QHash<T, U> and a QHash<U, T> and provides iterators for both “sides” of the mapping along with API for inserting, removing, updating and querying from both sides. The fact that it is also quite similar in structure to a QSet<QPair<T, U> > is not lost on me, so it also provides some set-like API such as intersect(), subtract() and union(). Because the implementation is a bit naïve though (it’s a quick hack) a few nice-to-haves are a bit restricted, such as operator returning an always const reference. Allowing the return value there to be modified would make the internal QHashes out of sync. I suspect though that a real proper implementation might be able to solve that, and be a lot faster, but for now the main benefit of this class is the convenient API for managing the two way mapping. It should be useful in QAbstractItemModel implementations too where you need to map from a void * pointer to something else and vice-versa.
Benchmark results comparing KBiHash with QHash are almost entirely unsurprising. All benchmarks were executed ten times and the results averaged. The results before averaging are here. In the plots below N refers to number of instructions executed as reported by callgrind and size is the size of the container under test. Note also that the x axis is logarithmic.
Insert grows linearly, but is usually 100% larger in the KBiHash case. This is because both the left and right side of the mapping need to be hashed on insert. It could be larger by a different percentage depending on the complexity of the hash function for each type if two different types are in the KBiHash.
Key lookup is comparable as expected. The small relative difference may be caused by the fact the the hash is nested in another class.
Reverse lookup cost grows linearly for calls to QHash::key(value). Reverse lookup for KBiHash is still O(1) as it is still just a hash lookup.
Cost for removing items by key is also as expected. The KBiHash case is more expensive because the item must be removed from both internal hashes
Results for removing values from the KBiHash are also comparable to removal of keys in the KBiHash. Removing values from the QHash is slower for the same reason as reverse lookup.
Updating a key is constant for both QHash and KBiHash.
Changes to values are constant in both QHash and KBiHash
The memory usage of KBiHash is also greater than QHash, as expected when maintaining two hashes. The peak and average memory consumption in the KBiHash case is almost twice as large as in the QHash case.
Memory profile of KBiHash
Memory profile of QHash
The case against boost::bimap
An important part of being a software developer is reinventing what already exists in the standard template library or boost. We all do it I think.
boost already has a class, boost::bimap, which implements a forward and backward mapping based on std::map API. There are a few reasons I didn’t use it here.
Firstly, writing KBiHash was a good learning experience. It’s the sort of thing which has lots of hidden corners that you don’t expect. For example, when inserting a new pair into the mapping, it is necessary to explicitly remove entries in the opposite map. I probably would not have remembered to do that each time if I was simply maintaining both mappings manually, but it’s good to learn.
Secondly, boost::bimap is semantically slightly different to what I need. When inserting into the bimap and the left or right value is already there, it is not overwritten, but the operation is ignored. If using boost::bimap I’d end up using a lot of remove followed by insert anyway to work around that.
Thirdly, as std::map maintains sort order of keys, it may be slower than QHash. The average for key lookup and insertion in a map is O(log n), whereas for QHash it is O(1). It would be nice to include it in the benchmarks here, which brings me to the last point.
The API of KBiHash is more Qt-like. I couldn’t figure out how to use boost::bimap. Even though there’s a long tutorial for it, I couldn’t figure out how to compile the benchmarks. My attempt is ifdef’d out in the benchmarks file though, so if anyone else can figure it out I’ll add it to the benchmark plots.