I read and write lots of C code, and I find the cscope tool to be invaluable in searching and navigating code bases. Recently I took over maintership of the xcscope.el Emacs interface to cscope. There are a surprising number of different Emacs interfaces to cscope, and this one seems to be the most mature and full-featured (and I made it much nicer).
One feature that some other interfaces have (ascope for instance) is that instead of running a new cscope process for each query, they leave the process running, and reuse it for each query. This keeps the database in memory, and doesn't waste cycles reloading it every time. This is the major feature of these interfaces, and glorious performance benefits are claimed.
Currently xcscope
does not do this, and I sometimes consider implementing
this feature. It's going to be a pain to do, so I decided to run some tests to
see if the performance benefits really are worth it.
Benchmark machine
All tests were run on my relatively quick server. It has a quad-core Ivy bridge Core-i5 CPU, 4GB of RAM and a non-SSD hard disk.
Test description
The code base under test is the linux kernel. This should be near the upper bound of what most people would be indexing. Sure, larger projects exist, but you're generally working on a contained piece, rather than the whole thing at once (this is true of the kernel too, actually).
I perform multiple discrete cscope operations using the command-line tool. Each
operation starts a new cscope
process, which reloads the database. I.e. I
perform the operation that's supposedly slow every time.
I measure how long it takes to build the search database, then to re-build it,
then to re-build it after touch
-ing a file. Then I measure how long it takes
to run a search, then to re-run it, then to re-run it after touch
-ing a file.
I do all this with the default settings, then again with settings more appropriate for a kernel:
- kernel mode:
-k
option. Doesn't look in/usr/include
- inverted-index mode:
-q
option. Builds an extra index for faster searches
Each search is also run with the -d
option. This only runs the search; it does
not also update the database with each search. By default, cscope does
update the database with every search.
Specifically, I get the list of files with
cscope-indexer -l -r
I build an index with
cscope -b
If I'm indexing in kernel mode and I'm building an inverted index, I also pass
in -q -k
. The test search looks for all uses of the main
symbol:
cscope -L0 main
Once again, if I'm indexing in kernel mode and I'm building an inverted index, I
also pass in -q -k
. When I want to touch an arbitrary file, I do
touch include/drm/drm_edid.h
There's no significance to this file. It's just anything that's in the index.
As one can imagine, the disk cache plays a very significant role here, and subsequent runs of the same command complete faster than the first. For this reason all tests are run with both a cold cache (by dumping the disk cache prior to the test) and a warm cache (not dumping the cache, and pre-running the operation a few times before timing). I also ran these tests on an actual hard disk, and also on a tmpfs ramdisk.
All timings were performed multiple times, with the initial few values and the outliers thrown out. The exact script used to collect the data is described and available in the next post.
Results
All timings in seconds.
Cold disk cache
Normal mode/ext3 | Kernel mode/ext3 | Normal mode/tmpfs | Kernel mode/tmpfs | |
---|---|---|---|---|
Initial database build | 45.9 | 80.2 | 14.0 | 44.2 |
Database re-build after touching a file | 10.4 | 48.9 | 3.2 | 30.1 |
Initial search | 7.5 | 3.0 | 0.8 | 31.2 |
Re-search after touching a file | 12.7 | 43.7 | 3.5 | 32.1 |
Initial no-db-update search | 5.3 | 0.8 | 0.8 | 0.8 |
No-db-update re-search after touching a file | 5.1 | 0.8 | 0.7 | 0.8 |
Warm disk cache
Normal mode/ext3 | Kernel mode/ext3 | Normal mode/tmpfs | Kernel mode/tmpfs | |
---|---|---|---|---|
Initial database build | 13.8 | 49.6 | 12.9 | 44.4 |
Database re-build after touching a file | 3.5 | 35.5 | 2.7 | 30.8 |
Initial search | 0.8 | 0.1 | 0.8 | 30.8 |
Re-search after touching a file | 4.0 | 33.5 | 3.5 | 31.9 |
Initial no-db-update search | 0.7 | 0.0 | 0.7 | 0.7 |
No-db-update re-search after touching a file | 0.7 | 0.0 | 0.7 | 0.7 |
Conclusions
I've known about the cscope inverted index for a while, but never actually tried to use it. Looks like it works as advertised: takes significantly longer to build, but once built the speedup it provides is substantial. It would appear that the main benefit of the inverted index is that less data needs to be read from disk and not that less searching is required. At least on this particular test machine the inverted index has no upside if the data is all in RAM already (tmpfs). On a slower box maybe we'd see the search times become significant, but not here.
It's extremely clear that the overhead of just loading the database is
immaterial. It's effectively instant to load the database and then to run a
search in an inverted index with a warm cache. It's a bit slower without an
inverted index, but all the time there is spent searching, not loading the index
into memory. I know this because I get the same no-inverted-index search timings
with the cscope interactive tool, which loads the database just once. The only
way keeping the cscope
process running is advantageous is if this makes it
more likely the caches stay warm. This is difficult to test, but I doubt it's
true. If I run repeated queries even with a new process every time, the data
stays cached, and things run quickly. What I think is much more likely is that
the people who wrote cscope interfaces such as ascope
only used interfaces
such as xcscope
without the -d
option. I.e. they were updating the database
with every query, which clearly can be slow with a large codebase. Then they
were not doing this with their persistent cscope
sessions, and attributing
the performance gains to not loading the database rather than rebuilding the
index too often. In any case, I think it's pretty clear that this feature is not
worth the work, so I'm keeping xcscope
as is.