I'm not sure what to do with this or if it even qualifies as a bug but here we go...

I have been backing up my notebook with obnam for a while now, so I have about 10 clean generations, and the repo is 118G. One of the files in the backup is a 32G VM file that changes slightly between backups, but not during backups. Until a few days ago this worked really well and fast.

Then I tried to do a partial backup via a very slow line, with snapshots every 5MB. This was interrupted (ctrl+c) after a few hours.

The next day I continued the backup using the local network. I forgot to change the snapshot interval, so it was still 5MB. I expected the backup to finish within a few minutes as it did in the past. Instead, after crawling through the first couple of GB of the VM file at 50MB/s, the speed dropped below 1MB/s, with a huge amount of IO on the local machine. I haven't managed to complete a backup since.

Nearly everything I try to do with the repo now takes a lot of time. Listing the huge number of generations takes over an hour (it lists at a rate of about 1 generation every 10 seconds or so). Listing only the genids is still very quick.

I'm not sure what goes wrong here. Is the massive number of generations killing the performance?

-- weinzwang


The number of generations shouldn't impact the speed of backups. It does have an impact on the speed of removing generations, so if your backup stalls at the end when it removes checkpoints, then you can avoid that with the --leave-checkpoints option.

Could you check logs to see where time is spent? Or even run with profiling enabled when you list generations? OBNAM_PROFILE=obnam.prof obnam generations should do that. Then use a Python script like the following to get it in cleartext:

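import pstats
import sys

# Expect the profile file and, optionally, a sort order.
if len(sys.argv) not in [2, 3]:
    sys.stderr.write('Usage: viewprof foo.prof [sort-order]\n')
    sys.exit(1)

# Default to sorting by cumulative time.
if len(sys.argv) == 3:
    order = sys.argv[2]
else:
    order = 'cumulative'

# Load the profile, drop directory prefixes, and print the reports.
p = pstats.Stats(sys.argv[1])
p.strip_dirs()
p.sort_stats(order)
p.print_stats()
p.print_callees()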
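
The script above prints to stdout, so you would normally redirect it into a file. If that is awkward, here is a rough sketch of a variant that writes the report to a file directly. The function name write_report and the file names are just illustrative choices, not anything obnam provides, and you may need to run it with the same Python version that produced the profile:

```python
import pstats


def write_report(prof_path, out_path, order='cumulative'):
    """Write sorted profile statistics from prof_path to a text file.

    write_report is a hypothetical helper name, not part of obnam.
    """
    with open(out_path, 'w') as out:
        # pstats.Stats takes a stream keyword argument; the reports
        # are written there instead of to stdout.
        p = pstats.Stats(prof_path, stream=out)
        p.strip_dirs()
        p.sort_stats(order)
        p.print_stats()
        p.print_callees()
```

For example, write_report('obnam.prof', 'obnam-prof.txt') should leave a text file that can be attached to a mail.
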

Then mail me the output, and I'll have a look. Thanks. --liw


Since this problematic repository doesn't work with recent versions of obnam due to on-disk format changes, I can't reproduce this anymore. I haven't encountered this effect with any other repositories I have, so I think the bug can be closed. --weinzwang

--

Ack, closing bug. done --liw