I've blogged about some of the more prominent changes in this new Nepomuk release. I thought it would be a good idea to document all the changes that Nepomuk has gone through, thanks to Blue Systems!
As the release announcement says, the file indexer has undergone the most changes.
New Double Queue Architecture
We've split the indexer's work into two parts: basic indexing and full file indexing. Basic indexing quickly records the basic information about a file, such as its filename and mimetype. This allows us to always answer at least simple queries. The second queue, which only runs when the user is idle, extracts the full information about the file.
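To make the two-queue idea concrete, here is a minimal Python sketch. It is not the actual Nepomuk code (which is C++); the class name `TwoQueueIndexer`, the `extract_full` callback and the `user_is_idle` check are all made up for illustration.

```python
import mimetypes
import os
from collections import deque


class TwoQueueIndexer:
    """Toy model: a cheap basic queue plus a full queue that runs only when idle."""

    def __init__(self, extract_full):
        self.basic_queue = deque()
        self.full_queue = deque()
        self.index = {}  # path -> dict of known properties
        self.extract_full = extract_full  # hypothetical expensive extractor

    def add_file(self, path):
        self.basic_queue.append(path)

    def run_basic(self):
        """Cheap pass: store filename and mimetype, then queue the file for the full pass."""
        while self.basic_queue:
            path = self.basic_queue.popleft()
            mimetype, _ = mimetypes.guess_type(path)
            self.index[path] = {
                "filename": os.path.basename(path),
                "mimetype": mimetype,
                "fully_indexed": False,
            }
            self.full_queue.append(path)

    def run_full(self, user_is_idle):
        """Expensive pass: stops as soon as the user is no longer idle."""
        while self.full_queue and user_is_idle():
            path = self.full_queue.popleft()
            self.index[path].update(self.extract_full(path))
            self.index[path]["fully_indexed"] = True


indexer = TwoQueueIndexer(extract_full=lambda path: {"word_count": 0})
indexer.add_file("/home/user/notes.txt")
indexer.run_basic()
indexer.run_full(user_is_idle=lambda: False)  # user is busy, so nothing is extracted
```

Even though the full pass never ran here, the file can already be found by its filename, which is the whole point of the split.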
New File Indexer
We've had some problems with Strigi earlier. With 4.10, we have finally decided to release our own solution. Our solution is arguably technologically inferior, but it's more maintainable and, for now, provides a better user experience.
One of the advantages of the new file indexing architecture is that mimetypes play a central role. All of the file indexing plugins use mimetypes to identify which types of files they can index. Because of this, we decided to let the user control which types of files are indexed.
By default, source code is no longer indexed. Common types such as documents, images, audio and videos are.
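Roughly, mimetype-based plugin selection with a user-controlled exclusion list could look like the sketch below. The plugin names, the mimetype lists and the `EXCLUDED_MIMETYPES` setting are invented for this example and do not reflect the real plugin set or configuration keys.

```python
# Hypothetical plugins: each one declares the mimetypes it can index.
PLUGINS = {
    "text_extractor": ["text/plain", "application/pdf"],
    "image_extractor": ["image/jpeg", "image/png"],
    "audio_extractor": ["audio/mpeg", "audio/flac"],
    "source_extractor": ["text/x-c++src", "text/x-python"],
}

# Assumed user setting: mimetypes that should not be indexed at all.
# Source code is excluded here to mirror the new default.
EXCLUDED_MIMETYPES = {"text/x-c++src", "text/x-python"}


def plugins_for(mimetype, excluded=EXCLUDED_MIMETYPES):
    """Return the names of the plugins that should index a file of this mimetype."""
    if mimetype in excluded:
        return []
    return [name for name, types in PLUGINS.items() if mimetype in types]
```

With these made-up tables, `plugins_for("image/png")` selects only the image plugin, while `plugins_for("text/x-python")` selects nothing until the user removes that mimetype from the exclusion list.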
Until the 4.9 release, the kioslave code hadn't changed much. With 4.9.1, we managed to optimize some of the code. The 4.10 release, however, takes this to an entirely different level.
The 'nepomuksearch' kioslave could initially show both non-file and file data. This meant that it would also occasionally show contacts, albums and other details. Selecting any of those would result in another search for resources related to that contact. For this release, we decided to optimize for the most common use case of listing files.
The 'nepomuksearch' kioslave, and all other Nepomuk kioslaves, no longer show any result which does not have a URL. This, coupled with a LOT of other optimizations, has yielded a super fast kioslave which can display thousands of results in under a second.
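The URL rule itself is simple. As a small illustration (the `Resource` type and its fields are invented here, not the real Nepomuk classes):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    """Hypothetical search result; only files carry a URL."""
    label: str
    url: Optional[str] = None


def listable(results):
    """Drop every result that has no URL, keeping the original order."""
    return [r for r in results if r.url]


results = [
    Resource("report.odt", "file:///home/user/report.odt"),
    Resource("Alice (contact)"),
    Resource("song.mp3", "file:///home/user/song.mp3"),
]
shown = listable(results)
```

Here the contact is dropped and only the two files are listed.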
There is also some interesting userbase documentation about custom queries on the nepomuksearch kioslave.
As previously stated, we are also introducing a new tagging kioslave. This slave allows you to easily manage your Nepomuk tags, and browse files based on the different tags they have.
One of the largest parts of the Dolphin Information Panel was the KFileMetadataWidget, which was provided by kdelibs/kio. This widget was one of the last parts of Dolphin that still used Nepomuk1. Since kdelibs was frozen, we couldn't port it to Nepomuk2. Thus emerged the Nepomuk2::FileMetadataWidget in nepomuk-widgets.
The KFileMetadataWidget historically fetched all the data in another process. This was done because Strigi was a little unreliable. With KDE Workspaces 4.10, we are no longer using Strigi in Nepomuk. This means the widget now uses the nepomukindexer to extract the data. It also no longer uses this multi-process architecture when loading the Nepomuk data. This results in a massive performance improvement because we can rely on the Nepomuk cache in Dolphin, instead of recreating it each time.
In terms of appearance, the widget has become a little more uniform, and by default only shows the properties that really matter.
Improved Removable Media Handling
Nepomuk has supported indexing of removable media for quite some time. However, it didn't always work that well. From a design point of view, the solution was great and extremely robust. This, however, came at a steep cost for the rest of Nepomuk. Every other query was affected by these features, and not in a small way. For some simple tests of basic indexing, it made a difference of around 20%.
With this new release, we have gone to a simpler solution which has a lighter performance cost. We have also removed the "Automatic Invalid File Metadata Cleaner" which removed the metadata for any file it could not access. The client code now always checks if the file can be accessed before displaying it to the user.
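In other words, the check moves from a background cleaner to the moment of display. A rough Python sketch of that client-side check follows; the function name is invented, and the real clients are C++ and may test accessibility differently.

```python
import os
import tempfile


def accessible_only(paths):
    """Keep only the paths that exist and are readable right now."""
    return [p for p in paths if os.access(p, os.R_OK)]


with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "photo.jpg")
    open(present, "w").close()  # stands in for a file on a mounted device
    missing = os.path.join(tmp, "on-unplugged-usb-stick.jpg")  # never created
    visible = accessible_only([present, missing])
```

The metadata for the missing file stays in the database; it is simply not shown until the file becomes reachable again.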
Nepomuk Backup Changes
With KDE Workspaces 4.6, my Google Summer of Code project, Nepomuk Backup, was finally merged. It was a very ambitious project which attempted to synchronize, back up and restore data in a non-destructive manner. In the end, it was just a little bit too complex. Large parts of the synchronization code eventually migrated into the data feeding code, which is now used by anyone pushing data into Nepomuk. So, it wasn't a complete loss.
With this new release, I finally got around to throwing away most of the complex code, and implementing a very simple and reliable backup solution. This new method does not require a separate service to be running, and therefore consumes less memory. Additionally, we also have some basic unit tests to ensure that the backups are restored properly!
Please keep in mind that this only backs up the non-destructible data. This does not include the file or email index information. If you want that to be backed up, you're better off just making a copy of the database file.
The Nepomuk Cleaner originated from a series of scripts I was writing to clear up my own database. It eventually occurred to me that other people might suffer from the same problem. The scripts were eventually combined into a cohesive form, and released. The application is very simple right now, but that will change in future releases. I even contemplated not releasing it for 4.10, but it clearly provides some value, even if it doesn't look that great.
Surprisingly, I didn't want to include many new features in this release. I was trying to focus more on stabilization. Over the last 6 months, a total of 246 bugs have been resolved, out of which 188 were reported within the last 6 months. This seems like a good improvement to me.
Apart from these simple changes, there have been a number of optimizations all across Nepomuk and Soprano. Nepomuk should be running faster and better than ever before. In some cases we have even seen a performance increase of over 200%.
Anyway, Enjoy the new release! :)