Open bugs in Obnam
If you have a problem with Obnam, please send mail to the mailing list. Don't add one to this list. See the contact page for information about the list. This wiki page is meant to help developers keep track of confirmed bugs, and not as a support channel. (This changed on 2012-07-08.)
See bug-reporting page for hints on what to include in a bug report, if you're unsure.
See also:
- done
- non-wishlist (and also not performance related)
- wishlist bugs only
- performance bugs only
- Bugs in Debian
If the server's disk is full, Obnam seems to give no useful error message:
2015-08-29 16:31:42 ERROR Can't back up /home/liw/ick/liw.state/extrautils/builds/106/build.log: RD5FA4X: System error: chunks/1052/1065/3340/212da18852839223: None: Failure
2015-08-29 16:31:42 ERROR OSError(None, 'Failure')
It looks like Obnam is getting too little info from paramiko, but maybe that can be fixed.
Reported by Bazyli Brzóska, http://listmaster.pepperfish.net/pipermail/obnam-dev-obnam.org/2015-April/000144.html
Backing up two hardlinks to the same file, then mounting the backup repository with the FUSE plugin, results in the hardlinks being shown as having different inode numbers. The mounted view thus does not expose the hardlinks correctly.
Reported by Ilya Zonov (http://listmaster.pepperfish.net/pipermail/obnam-support-obnam.org/2015-April/003516.html; the list archive displays the message incorrectly, but it can be decoded with base64).
Only the last file ends up in the repository.
See http://listmaster.pepperfish.net/pipermail/obnam-support-obnam.org/2014-June/003103.html for details.
Valery Yundin reports:
It is a problem not only when backup root is a symbolic link, but also when any parent directory of backup root is a symbolic link. Unless you explicitly ask to restore a directory which is already below symlink.
http://listmaster.pepperfish.net/pipermail/obnam-support-obnam.org/2014-April/002976.html
--liw
Alexander Sitnik reported that "obnam fsck" checks all chunks in all generations, so the same chunks get checked over and over. It should speed up "obnam fsck" if each chunk were checked only once. I note that this may need to be done with some care so that memory use doesn't increase too much. --liw
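A minimal sketch of the idea, not Obnam's actual fsck code. It assumes hypothetical inputs: `generations` is an iterable of per-generation chunk id lists, and `check_chunk` is whatever verifies a single chunk. The `seen` set is where the memory concern shows up, since it grows with the number of distinct chunks in the repository.

```python
def check_chunks_once(generations, check_chunk):
    """Call check_chunk once per distinct chunk id; return how many were checked.

    Both arguments are hypothetical stand-ins, not Obnam's real API.
    """
    seen = set()  # one entry per distinct chunk id, so memory grows with the repo
    for chunk_ids in generations:
        for chunk_id in chunk_ids:
            if chunk_id in seen:
                continue  # already verified while checking an earlier generation
            seen.add(chunk_id)
            check_chunk(chunk_id)
    return len(seen)
```

If chunk ids are integers, a more compact structure than a set of Python objects might keep the memory cost down.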
It would be good if Obnam could have a setting for what the uid and gid (when run as root) and mode should be for new files it creates in the repository.
See:
The following suggestion was sent to me by private e-mail (not sure if I can show the sender's name):
Date: Mon, 14 Oct 2013 11:48:41 +1100
Subject: Re: Obnam repo size with .sql dump files seem too big
Lars,
We implemented a strategy for identifying repeated chunks, even in
gzip-compressed files that have changes between versions. It might
work for obnam. The downside is it creates variable-sized chunks,
though you can set upper and lower limits.
Firstly, we identify chunk boundaries by using a rolling checksum
like Adler32, rolling over say a 128 byte contiguous region. Any
other efficient FIR (box-car) checksum algorithm would work. When
the bottom N bits of the checksum are zero, declare a block
boundary. This gives blocks of average size 2**N. If you want more
certainty, you can enforce a lower limit of 2**(N-3), and upper
limit of 2**N or 2**(N+1), for example (thereby either creating,
or ignoring, the boundaries that are defined by the checksum).
These variable-sized chunks will re-synchronise after differences
in two streams. To make it work with deflate, we flush the
compression context (the dictionary, we maintain the history,
losing 1-2% of compression) on each block boundary… but I'm not
sure that compression is necessary for obnam.
By choosing N appropriately, you get the block size you want. We
used N=12, for internet delivery (using HTTP subrange requests) of
updated files. For each file, we publish a catalog of Adler32 and
SHA-1 block checksums. Clients download that using HTTP, then
analyse their files before making requests for blocks they lack.
The invention of doing this in a deflate stream is due to Tim
Adam. When we discover we already have a given block, we can prime
the decompressor with that history before decompressing a received
update block. Really it should be implemented using 7zip instead
of deflate; a 32KB quotation history is too small.
I haven't tried this yet. It will require a new repository format version, so I've been working on adding that support instead. --liw
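For reference, the boundary rule from the e-mail can be sketched roughly as follows. This is an illustration under assumptions, not the sender's code and not anything in Obnam: a plain box-car sum over the last `window` bytes stands in for Adler32, the function name `split_chunks` is made up, and the deflate handling is left out entirely.

```python
def split_chunks(data, n_bits=12, window=128):
    """Split bytes into variable-sized chunks at content-defined boundaries."""
    mask = (1 << n_bits) - 1
    min_size = 1 << (n_bits - 3)  # lower limit suggested in the e-mail
    max_size = 1 << (n_bits + 1)  # one of the upper limits suggested there
    chunks = []
    start = 0
    rolling = 0  # box-car sum of the last `window` bytes (stand-in for Adler32)
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # drop the byte leaving the window
        size = i + 1 - start
        # A boundary is declared when the bottom n_bits of the sum are zero,
        # ignored if the chunk is too small, and forced if it is too large.
        at_boundary = (rolling & mask) == 0 and size >= min_size
        if at_boundary or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks
```

Because the sum depends only on the last `window` bytes, inserting data near the start of a file shifts the boundaries along with the content, so later chunks tend to come out identical and can be deduplicated.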
Currently, there seems to be no easy way to forget all (or all but the newest) checkpoint generations. Something like
obnam --keep 1c forget
would be nice.
-- weinzwang
Obnam does not currently seem to notice when the sftp connection breaks. It should, and it should then abort the backup. --liw
Obnam needs tests for using every filesystem type available as live data or repository.
Not sure how to arrange that without root access, but there's a need to do that.
Instead of in-place conversions, which are error prone and clunky, a better way would be nice. Maybe some kind of dump/undump pair, using a streamable format?
Obnam could do with a mode in which it backs up the data from a block device, instead of the device node. If the block device contains a filesystem, it should back up only the parts of the device that are used by the filesystem, and skip unused parts. That could be used for backing up disk images as well.
Obnam should, arguably, use ctime changes to trigger backups, so that if a file's size and mtime are the same, because whatever fool program modified a file reset the mtime, Obnam will still back up the changed data.
I have code for this, but it requires a repository format change, and breaks the upgrade from format 5 to 6. --liw
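A rough sketch of the comparison, assuming a hypothetical `FileMeta` record for what was stored in the previous generation. This is not the code mentioned above and not the repository format; it only shows why adding ctime to the comparison catches a file whose mtime was reset.

```python
import os
from collections import namedtuple

# Hypothetical per-file record kept from the previous backup run.
FileMeta = namedtuple("FileMeta", "size mtime_ns ctime_ns")


def meta_from_stat(st):
    """Build a FileMeta from an os.stat_result."""
    return FileMeta(st.st_size, st.st_mtime_ns, st.st_ctime_ns)


def needs_backup(old, new):
    """Return True if the file is new or any of size, mtime, ctime differ."""
    return old is None or old != new
```

A program can set mtime back with utime(), but doing so updates ctime, so `needs_backup(old, meta_from_stat(os.lstat(path)))` would still report the file as changed.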
From Enrico: It might be good to have a way for Obnam to automatically exclude certain kinds of common stuff, such as web browser caches, Liferea caches, etc. This should be easy to enable, and should be off by default (safe defaults are important).
If you accidentally back up some large or sensitive files, but don't want to delete all the generations they're in, it would be handy for Obnam to be able to delete just the specific files from the generations, and leave the rest.
When --one-file-system is used, it would be nice to not cross bind-mounts. No idea how to figure that out, but it must be possible. --liw
You could look at the inode numbers for . and ./foodir/.. and check they're the same? -- kinnison
The inode check will not work if foodir is a symlink. --mathstuf
Make obnam fsck remove extraneous files (e.g., tmp*). --liw
It seems obnam mount (the FUSE plugin) can't handle a client without non-checkpoint generations. This is unfortunate, even if it is fairly unlikely to happen. Should be easy enough to fix. --liw
S.B. suggests that backup generations have an optional description.
- Named generations -- There are certain generations that are more important than others. Some are automatically created by Obnam itself, some are routinely scheduled, and some were explicitly created. For example, I always run Obnam immediately before traveling with my laptop in case it gets stolen or broken. The same goes for backups before major system upgrades. It would be nice to have something approximately analogous to the Windows "restore point" functionality, which has a description field. Sometimes they are only automatically created system checkpoints. But if the user explicitly creates a new restore point, he can add the description "before traveling to Europe" or "before upgrading OS" or whatever. Similarly, the automatic backup script could be programmed to label it as "cron backup".
S.B. suggests that generations could be tagged so they aren't automatically deleted.
- Unforgettable generations -- In scenarios similar to the above, I would also find it useful to be able to mark certain important generations as "unforgettable". That way, when I run an automatic time based forget command, I can be sure that it will preserve certain milestone generations, even if they weren't the last generation of the month or the week or the day or whatever.
There need to be tools and documentation for key management with Obnam.
How does one replace one's key, or subkey, when it expires?
From Joey Hess:
My take on this is that, by choosing to use a tool that uses hashes, I am giving up (near-)absolute certainty for speed, or space, or whatever. So it's important that the hash type be good at collision resistance (for example, no two likely filenames should hash the same; "/etc/passwd" should only tend to collide with blobs that are very unlike a filename). It's also important that the tool be upfront about using hashes, and about what hash it uses. And if it's not designed to allow swapping the hash out when it gets broken, I will trust it less (hello git).
Ah, the replacement of hash functions is an interesting problem.
For pathnames, it's not at all important, I think, except perhaps for performance, since pathnames will be compared byte-by-byte instead of by hashes.
For file data, replacing is easy, if one is willing to back up everything from scratch. Supporting several hashes in the same backup store is a little bit more work, but not a whole lot: instead of having just one tree for mapping checksums to chunk identifiers, one would have one per checksum algorithm.
--liw
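The "one tree per checksum algorithm" idea could look roughly like this. It is a sketch with assumptions: plain dicts stand in for the repository's trees, the class name `ChunkIndex` is made up, and sha256 and sha512 are picked only as examples.

```python
import hashlib


class ChunkIndex:
    """Map checksums to chunk ids, with one mapping per checksum algorithm."""

    def __init__(self, algorithms=("sha256", "sha512")):
        # One dict per algorithm stands in for one tree per algorithm.
        self.trees = {name: {} for name in algorithms}

    def add(self, chunk_id, data):
        """Record chunk_id under the checksum of data for every algorithm."""
        for name, tree in self.trees.items():
            digest = hashlib.new(name, data).hexdigest()
            tree.setdefault(digest, []).append(chunk_id)

    def find(self, name, data):
        """Return chunk ids whose checksum under algorithm `name` matches data."""
        digest = hashlib.new(name, data).hexdigest()
        return self.trees[name].get(digest, [])
```

Retiring a broken hash would then amount to dropping its tree and no longer consulting it.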
Obnam does not support the ext2/3/4 chattr attributes. It should back them up and set them on restore, when possible.
In addition, it should support the d attribute to exclude files from being backed up.
--liw
Obnam needs a way to remove clients from the repository. The current remove-client command just deals with encryption.
Suggested-by: Daniel Silverstone
Obnam needs a way to rename clients in the client list.
Suggested-by: Daniel Silverstone
If a file is sparse, and has a large hole, it would be good to skip over it with SEEK_HOLE and SEEK_DATA. --liw
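A minimal sketch of walking only the data regions of a file, using Python's `os.SEEK_DATA` and `os.SEEK_HOLE`. It assumes a platform and filesystem where `lseek` supports them (on filesystems without hole support the whole file is typically reported as one data region). The function name `data_regions` is made up for illustration.

```python
import errno
import os


def data_regions(path):
    """Yield (start, end) byte ranges of path that contain data, skipping holes."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:
                    break  # nothing but a hole between offset and end of file
                raise
            # There is an implicit hole at end of file, so this always succeeds.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            yield start, end
            offset = end
    finally:
        os.close(fd)
```

A backup would then read only the returned ranges and record the gaps as holes.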
Obnam is currently using paramiko as the SFTP implementation. It is a bit more limited than the SFTP protocol is, and so some things that Obnam should be doing, such as restoring hardlinks across SFTP, are not possible. There may also be some bugs with regard to timestamp handling.
Possible fixes:
- patch paramiko to support more of SFTP
- switch to twisted's conch or libssh or http://pypi.python.org/pypi/ssh/ or python-ssh2
--liw
Obnam has no test for what happens when the filesystem fills up.
The idea is to change or extend the --exclude-caches feature so that one can configure which filename Obnam looks for when deciding whether to skip a directory.
Changing --exclude-caches seems wrong to me: it has a specific purpose (to implement the cache directory tagging spec, http://www.bford.info/cachedir/spec.html).
Adding a new option to ignore directories that contain a specific file (or directory) would be fine.
--liw
bwh points out that the owner of the directory and the tag file should be the same.
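A sketch of what such a check might look like, including the ownership test. The function name and the idea of passing the tag name as a parameter are assumptions for illustration; no such option exists in Obnam as far as this page says.

```python
import os


def has_tag_file(dirname, tag_name):
    """Return True if dirname contains tag_name and both have the same owner."""
    try:
        dir_st = os.lstat(dirname)
        tag_st = os.lstat(os.path.join(dirname, tag_name))
    except OSError:
        return False  # directory or tag is missing or unreadable: do not skip
    # Only honour the tag if it belongs to the directory's owner, so another
    # user cannot exclude someone else's directory by dropping a file into it.
    return dir_st.st_uid == tag_st.st_uid
```

A backup run would call this for each directory and skip those for which it returns True.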
Obnam should, at least optionally, use fsync or other methods to ensure that everything gets committed to disk by the kernel by the end of a backup run. --liw
I want this to not have a huge performance impact, though. Learning from the lessons of dpkg, sqlite/liferea/firefox, etc, and using fsync/fdatasync and sync_file_range in the right ways is going to be necessary. --liw
Not doable over sftp, of course. --liw
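For a local repository, the basic pattern is roughly the following. It is a sketch only: the function name is made up, it syncs a single file rather than batching syncs at the end of a run (which is where the performance care would go), and it assumes a POSIX system where a directory can be opened and fsynced.

```python
import os


def write_durably(path, data):
    """Write data to path and ask the kernel to commit it to disk."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to push the file to disk
    # Sync the containing directory too, so the new directory entry survives.
    dir_fd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Calling fsync per file like this is the expensive way to do it; syncing once per batch of files is one way the cost might be reduced.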