Introduction
Obnam has a number of options for performance tuning. See the manual page for all the details. Below is an adapted excerpt of e-mails written by Lionel Bouton on how to test various values in order to find a good set for your situation. See the list archive for the e-mails: first and second.
Measurements
Tuning lru-size and/or upload-queue-size can make a significant difference in performance.
Here are some test results for this situation:
* Data to back up stored on a btrfs volume on an SSD: ~155000 files, 3.66 GiB.
* Local system: 64-bit Linux, Python 2.7.5, OpenSSH 6.6p1 with HPN patches.
* Local CPU: Intel(R) Core(TM) i5-3317U CPU @ 1.70GHz (mostly idle).
* Remote system: 64-bit Linux, OpenSSH 6.6p1 with HPN patches, repository data on ext4 on a standard 7200 rpm SATA disk, large memory (everything should fit in memory; only writes should hit disk).
* Very minimal changes to the backed-up data during the tests, so successive backups only check for differences and transfer nearly no content.
* Backup over WiFi (~1 ms RTT, max speed over sftp ~3 MB/s).
I use this command line without any configuration file:
    obnam -r sftp://obnam@SERVER/~/repo --compress-with=deflate \
        --client-name=CLIENT backup DIR
During testing I added --lru-size <l> --upload-queue-size <q> with different <l> and <q> values.
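
For reference, a full invocation with both tuning options spelled out looks like this (using, as an example, the values that turn out best below; SERVER, CLIENT and DIR are the same placeholders as above):

    obnam -r sftp://obnam@SERVER/~/repo --compress-with=deflate \
        --client-name=CLIENT --lru-size=1024 --upload-queue-size=512 \
        backup DIR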
The resident memory of the Obnam process grows steadily (probably filling caches) until it hits a fairly stable ceiling (cache full, or nothing new to put in the cache) during the backup. It rises again rapidly at the very end (during commits/unlock/...). The value reported below is obtained either from the RES column reported by the htop utility or from the RSS column reported by "ps aux", and is the maximum witnessed near the end of the backup.
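
If you want to reproduce the memory measurement, a minimal sketch is to poll the backup's RSS once a second and keep the maximum. This is not the exact procedure used for the table below, and the pgrep pattern is an assumption about how the process shows up in the process list; adjust it if it does not match on your system.

    # Poll the running backup once per second and keep the largest RSS
    # (resident set size, in KiB) seen -- roughly what "ps aux"/htop show.
    max=0
    while pid=$(pgrep -f -o 'obnam.*backup'); do
        rss=$(ps -o rss= -p "$pid" | awk '{print $1}')
        if [ -n "$rss" ] && [ "$rss" -gt "$max" ]; then
            max=$rss
        fi
        sleep 1
    done
    echo "peak RSS: ${max} KiB"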
Each combination was tested at least twice unless it was considered not interesting after the first run. Timing seems consistent enough given the systems involved (the system hosting the repository is often busy) and memory usage is very consistent across runs.
Default values, as fetched from __init__.py, are: l=256, q=128.
| Conditions          | Time            | Memory   | Number of runs |
|---------------------|-----------------|----------|----------------|
| default values      | 22m21s - 24m51s | ~260M    | 2              |
| l=10000, q=default  | 13m45s - 15m03s | ~332M    | 2              |
| l=default, q=250    | 08m23s - 10m29s | ~278M    | 5              |
| l=default, q=350    | 02m42s - 02m49s | 272-276M | 2              |
| l=default, q=400    | 02m13s - 02m18s | 268-272M | 3              |
| l=default, q=500    | 02m10s - 02m16s | 267-272M | 3              |
| l=default, q=512    | 02m13s - 02m14s | 265-269M | 2              |
| l=512, q=512        | 01m55s - 02m06s | 322-326M | 3              |
| l=768, q=512        | 01m55s - 01m58s | 397-418M | 3              |
| l=1024, q=512       | 01m53s - 01m55s | 403-418M | 3              |
| l=2048, q=512       | 01m55s - 01m59s | 408-410M | 3              |
| l=4096, q=512       | ~01m58s         | ~419M    | 1              |
| l=default, q=600    | 02m14s - 02m26s | 269-272M | 4              |
| l=default, q=750    | 02m13s - 02m15s | 266-272M | 2              |
| l=default, q=1000   | 02m19s - 02m20s | ~266M    | 2              |
| l=default, q=10000  | 02m23s - 02m35s | ~266M    | 2              |
So in my configuration, when nearly no data changes between backups, --lru-size=1024 --upload-queue-size=512 is at least 11x faster than the default configuration.
Discussion
--upload-queue-size seems to have the greatest effect, without any adverse effect (memory usage remains at the same level).
For a little extra boost with a small impact on memory usage, I can increase --lru-size to 1024.
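
If you do not want to pass these options on every invocation, Obnam can also read them from an INI-style configuration file (for instance ~/.obnam.conf, one of the default locations). A minimal sketch, assuming the usual [config] section and that the option names match the long command-line options:

    # Settings equivalent to the command line above, plus the tuned values
    # (SERVER, CLIENT and the repository path are the same placeholders).
    [config]
    repository = sftp://obnam@SERVER/~/repo
    client-name = CLIENT
    compress-with = deflate
    lru-size = 1024
    upload-queue-size = 512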
Note that Obnam was using 100% of the CPU for most of the time in the fastest configuration; replacing --verbose with --quiet didn't change the running time.
Please note that the ideal settings for my backup configuration might differ from the ones for yours. You might get even better results after tuning on your own.
These parameters behave nicely for tuning: upload-queue-size doesn't seem to have much of a drawback, if any, when increased (it begins to show signs of slowing Obnam down at 10000 here, but that might just be the performance variance inherent in my configuration), and increasing lru-size only increases memory usage a bit without slowing things down noticeably once the ideal spot has been reached.
A good rule of thumb seems to be: try increasing one of these parameters by 2x or 4x, and keep going until performance stops improving.
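
As a sketch of that approach (run from an interactive shell; SERVER, CLIENT and DIR are the same placeholders as above), something like the following times a backup for each candidate queue size so you can see where the improvement levels off:

    # Try doubling upload-queue-size until the backup time stops improving.
    for q in 128 256 512 1024 2048; do
        echo "== upload-queue-size=$q =="
        time obnam -r sftp://obnam@SERVER/~/repo --compress-with=deflate \
            --client-name=CLIENT --upload-queue-size="$q" backup DIR
    done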