bugs

If the server's disk is full, Obnam seems to give no useful error message:

2015-08-29 16:31:42 ERROR Can't back up
/home/liw/ick/liw.state/extrautils/builds/106/build.log: RD5FA4X:
System error: chunks/1052/1065/3340/212da18852839223: None:
Failure

2015-08-29 16:31:42 ERROR OSError(None, 'Failure')

It looks like Obnam is getting too little info from paramiko, but maybe that can be fixed.

Reported by Bazyli Brzóska, http://listmaster.pepperfish.net/pipermail/obnam-dev-obnam.org/2015-April/000144.html

Posted Sat Aug 29 13:36:50 2015

Backing up two hardlinks to the same file, then mounting the backup repository with the FUSE plugin, results in the hardlinks being shown with different inode numbers. The mount thus does not expose the hardlinks correctly.

Reported by Ilya Zonov (http://listmaster.pepperfish.net/pipermail/obnam-support-obnam.org/2015-April/003516.html; the list archive displays the message incorrectly, but it can be decoded with base64).

Posted Sat Aug 29 13:16:47 2015

Only the last file ends up in the repository.

See http://listmaster.pepperfish.net/pipermail/obnam-support-obnam.org/2014-June/003103.html for details.

Posted Sun Nov 30 09:04:48 2014

Valery Yundin reports:

It is a problem not only when the backup root is a symbolic link, but also when any parent directory of the backup root is a symbolic link, unless you explicitly ask to restore a directory which is already below the symlink.

http://listmaster.pepperfish.net/pipermail/obnam-support-obnam.org/2014-April/002976.html

--liw

Posted Wed Jun 4 16:23:59 2014

Alexander Sitnik reported that "obnam fsck" checks all chunks in all generations, so the same chunks get checked over and over. It should speed up "obnam fsck" if each chunk were checked only once. I note that this may need to be done with some care so that memory use doesn't increase too much. --liw
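A minimal sketch of the "check each chunk only once" idea, in Python. The shape of `generations` (an iterable of iterables of chunk ids) and the `check_chunk` callback are assumptions made for illustration; they are not Obnam's actual API.

```python
def check_chunks_once(generations, check_chunk):
    """Call check_chunk(chunk_id) once per distinct chunk id.

    Both `generations` and `check_chunk` are hypothetical stand-ins
    for whatever fsck really iterates over.
    """
    seen = set()  # grows with the number of distinct chunk ids
    checked = 0
    for chunk_ids in generations:
        for chunk_id in chunk_ids:
            if chunk_id in seen:
                continue  # already checked via an earlier generation
            seen.add(chunk_id)
            check_chunk(chunk_id)
            checked += 1
    return checked
```

The `seen` set is where the memory concern shows up: it holds one entry per distinct chunk id for the whole run, so a more compact structure might be needed for large repositories.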

Posted Wed Jun 4 14:26:55 2014

It would be good if Obnam could have a setting for what the uid and gid (when run as root) and mode should be for new files it creates in the repository.

See:

Posted Tue Jun 3 14:41:40 2014 Tags: obnam-wishlist

The following suggestion was sent to me by private e-mail (not sure if I can show the sender's name):

Date: Mon, 14 Oct 2013 11:48:41 +1100
Subject: Re: Obnam repo size with .sql dump files seem too big

Lars,

We implemented a strategy for identifying repeated chunks, even in
gzip-compressed files that have changes between versions. It might
work for obnam. The downside is it creates variable-sized chunks,
though you can set upper and lower limits.

Firstly, we identify chunk boundaries by using a rolling checksum
like Adler32, rolling over say a 128 byte contiguous region. Any
other efficient FIR (box-car) checksum algorithm would work. When
the bottom N bits of the checksum are zero, declare a block
boundary. This gives blocks of average size 2**N. If you want more
certainty, you can enforce a lower limit of 2**(N-3), and upper
limit of 2**N or 2**(N+1), for example (thereby either creating,
or ignoring, the boundaries that are defined by the checksum).

These variable-sized chunks will re-synchronise after differences
in two streams. To make it work with deflate, we flush the
compression context (the dictionary, we maintain the history,
losing 1-2% of compression) on each block boundary… but I'm not
sure that compression is necessary for obnam.

By choosing N appropriately, you get the block size you want. We
used N=12, for internet delivery (using HTTP subrange requests) of
updated files. For each file, we publish a catalog of Adler32 and
SHA-1 block checksums. Clients download that using HTTP, then
analyse their files before making requests for blocks they lack.

The invention of doing this in a deflate stream is due to Tim
Adam. When we discover we already have a given block, we can prime
the decompressor with that history before decompressing a received
update block. Really it should be implemented using 7zip instead
of deflate; a 32KB quotation history is too small.

I haven't tried this yet. It will require a new repository format version, so I've been working on adding that support instead. --liw
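The boundary rule from the e-mail can be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: a plain box-car byte sum stands in for Adler32, the function name `split_chunks` and its parameters are made up, and the limits used are 2**(N-3) and 2**(N+1). The deflate handling described in the e-mail is not covered.

```python
def split_chunks(data, n_bits=12, window=128):
    """Split bytes into variable-sized chunks at content-defined boundaries.

    A boundary is declared when the low n_bits of a rolling sum over the
    last `window` bytes are zero. The rolling sum is a simple box-car sum
    used here in place of Adler32 (an assumption, for brevity).
    """
    min_size = 1 << (n_bits - 3)   # lower limit: ignore earlier boundaries
    max_size = 1 << (n_bits + 1)   # upper limit: force a boundary
    mask = (1 << n_bits) - 1
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # drop the byte leaving the window
        size = i - start + 1
        at_boundary = (rolling & mask) == 0 and size >= min_size
        if at_boundary or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks
```

Because the rolling sum depends only on the last `window` bytes, two streams that differ near the start tend to pick the same boundaries again further on, which is the re-synchronisation the e-mail describes.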

Posted Sat May 17 15:42:15 2014

Currently, there seems to be no easy way to forget all (or all but the newest) checkpoint generations. Something like

obnam --keep 1c forget

would be nice.

-- weinzwang

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

Obnam does not currently seem to notice when the sftp connection breaks. It should, and it should then abort the backup. --liw

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

Obnam needs tests for using every filesystem type available as live data or repository.

Not sure how to arrange that without root access, but there's a need to do that.

Posted Tue Apr 22 17:49:48 2014

Instead of in-place conversions, which are error prone and clunky, a better way would be nice. Maybe some kind of dump/undump pair, using a streamable format?

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

Obnam could do with a mode in which it backs up the data from a block device, instead of the device node. If the block device contains a filesystem, it should back up only the parts of the device that are used by the filesystem, and skip unused parts. That could be used for backing up disk images as well.

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

Obnam should, arguably, use ctime changes to trigger backups, so that if a file's size and mtime are unchanged, because whatever fool program modified the file reset the mtime, Obnam will still back up the changed data.

I have code for this, but it requires a repository format change, and breaks the upgrade from format 5 to 6. --liw
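For illustration only, the comparison might look something like this in Python. `FileState`, `snapshot` and `needs_backup` are hypothetical names, and this is not the code mentioned above; it only shows the size/mtime/ctime check itself.

```python
import os
from collections import namedtuple

# Hypothetical record of what was seen at the previous backup.
FileState = namedtuple("FileState", "size mtime_ns ctime_ns")


def snapshot(path):
    """Collect the fields used for change detection (no symlink following)."""
    st = os.lstat(path)
    return FileState(st.st_size, st.st_mtime_ns, st.st_ctime_ns)


def needs_backup(old, new, use_ctime=True):
    """Decide whether a file should be backed up again.

    With use_ctime=False this is the size+mtime check only, which misses
    files whose mtime was reset after modification.
    """
    if old is None:
        return True  # never backed up before
    if (old.size, old.mtime_ns) != (new.size, new.mtime_ns):
        return True
    return use_ctime and old.ctime_ns != new.ctime_ns
```

Storing the extra ctime field per file is presumably what makes this a repository format change.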

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

From Enrico: It might be good to have a way for Obnam to automatically exclude certain kinds of common stuff, such as web browser caches, Liferea caches, etc. This should be easy to enable, and should be off by default (safe defaults are important).

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

If you accidentally back up some large or sensitive files, but don't want to delete all the generations they're in, it would be handy for Obnam to be able to delete just the specific files from the generations, and leave the rest.

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

When --one-file-system is used, it would be nice to not cross bind-mounts. No idea how to figure that out, but it must be possible. --liw

You could look at the inode numbers for . and ./foodir/.. and check they're the same? -- kinnison

The inode check will not work if foodir is a symlink. --mathstuf
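A small Python sketch of the comparison suggested above, mainly to make the symlink caveat concrete. `parent_matches` is a made-up name, and this only illustrates the check itself; it is not a claim that the check is sufficient to detect bind mounts.

```python
import os


def parent_matches(directory, name):
    """True if <directory>/<name>/.. resolves back to <directory>.

    Compares (st_dev, st_ino) rather than the inode number alone, since
    inode numbers are only unique within one filesystem.
    """
    here = os.stat(directory)
    up = os.stat(os.path.join(directory, name, ".."))
    return (here.st_dev, here.st_ino) == (up.st_dev, up.st_ino)
```

For a symlinked `name`, the `..` is resolved relative to the link target, so the function returns False even though no mount is involved; a caller would have to rule that case out first, for example with `os.path.islink`.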

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

Make obnam fsck remove extraneous files (e.g., tmp*). --liw

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

It seems obnam mount (the FUSE plugin) can't handle a client without non-checkpoint generations. This is unfortunate, even if it is fairly unlikely to happen. Should be easy enough to fix. --liw

Posted Tue Apr 22 17:49:48 2014

S.B. suggests that backup generations have an optional description.

Named generations -- There are certain generations that are more important than others. Some are automatically created by Obnam itself, some are routinely scheduled, and some were explicitly created. For example, I always run Obnam immediately before traveling with my laptop in case it gets stolen or broken. The same goes for backups before major system upgrades. It would be nice to have something approximately analogous to the Windows "restore point" functionality, which has a description field. Sometimes they are only automatically created system checkpoints. But if the user explicitly creates a new restore point, he can add the description "before traveling to Europe" or "before upgrading OS" or whatever. Similarly, the automatic backup script could be programmed to label it as "cron backup".

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

S.B. suggests that generations could be tagged so they aren't automatically deleted.

Unforgettable generations -- In scenarios similar to the above, I would also find it useful to be able to mark certain important generations as "unforgettable". That way, when I run an automatic time based forget command, I can be sure that it will preserve certain milestone generations, even if they weren't the last generation of the month or the week or the day or whatever.

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

There need to be tools and documentation for key management with Obnam.

How does one replace one's key, or subkey, when it expires?

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

From Joey Hess:

My take on this is that, by choosing to use a tool that uses hashes, I am giving up (near-)absolute certainty for speed, or space, or whatever. So it's important that the hash type be good at collision resistance (for example, no two likely filenames should hash the same; "/etc/passwd" should only tend to collide with blobs that are very unlike a filename). It's also important that the tool be upfront about using hashes, and about what hash it uses. And if it's not designed to allow swapping the hash out when it gets broken, I will trust it less (hello git).

Ah, the replacement of hash functions is an interesting problem.

For pathnames, it's not at all important, I think, except perhaps for performance, since pathnames will be compared byte-by-byte instead of by hashes.

For file data, replacing is easy, if one is willing to back up everything from scratch. Supporting several hashes in the same backup store is a little bit more work, but not a whole lot: instead of having just one tree for mapping checksums to chunk identifiers, one would have one per checksum algorithm.
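As a rough Python sketch of "one tree per checksum algorithm": plain dicts stand in for the repository's trees, and the class name `ChunkIndex` and the choice of algorithms are assumptions made for illustration.

```python
import hashlib


class ChunkIndex:
    """Maps checksums to chunk ids, with one mapping per algorithm."""

    def __init__(self, algorithms=("sha256", "sha512")):
        # One independent "tree" (here just a dict) per checksum algorithm.
        self.trees = {name: {} for name in algorithms}

    def add(self, chunk_id, data):
        """Record chunk_id under its checksum in every tree."""
        for name, tree in self.trees.items():
            digest = hashlib.new(name, data).hexdigest()
            tree.setdefault(digest, []).append(chunk_id)

    def find(self, algorithm, data):
        """Return chunk ids whose checksum matches under one algorithm."""
        digest = hashlib.new(algorithm, data).hexdigest()
        return self.trees[algorithm].get(digest, [])
```

Retiring a broken hash would then amount to dropping its tree, provided every chunk has already been indexed under a replacement.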

--liw

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

Obnam does not support the ext2/3/4 chattr attributes. It should back them up and set them on restore, when possible.

In addition, it should support the d attribute to exclude files from being backed up.

--liw

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

Obnam needs a way to remove clients from the repository. The current remove-client command just deals with encryption.

Suggested-by: Daniel Silverstone

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

Obnam needs a way to rename clients in the client list.

Suggested-by: Daniel Silverstone

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

If a file is sparse, and has a large hole, it would be good to skip over it with SEEK_HOLE and SEEK_DATA. --liw
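A minimal Python sketch of walking a file's data regions with SEEK_DATA and SEEK_HOLE. It assumes a platform where `os.SEEK_DATA` and `os.SEEK_HOLE` are available (for example Linux); `data_extents` is a hypothetical helper name.

```python
import errno
import os


def data_extents(fd):
    """Yield (offset, length) for each region of the file that holds data.

    On filesystems that do not track holes, the whole file is typically
    reported as a single data extent.
    """
    end = os.fstat(fd).st_size
    offset = 0
    while offset < end:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:
                return  # only a hole remains after offset
            raise
        # End of file counts as a hole, so this always returns <= end.
        hole = os.lseek(fd, start, os.SEEK_HOLE)
        yield start, hole - start
        offset = hole
```

Note that `os.lseek` moves the file offset, so a reader would want `os.pread` (or an explicit seek) when fetching the data for each extent.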

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

Obnam is currently using paramiko as the SFTP implementation. It is a bit more limited than the SFTP protocol is, and so some things that Obnam should be doing, such as restoring hardlinks across SFTP, are not possible. There may also be some bugs with regard to timestamp handling.

Possible fixes:

patch paramiko to support more of SFTP
switch to twisted's conch or libssh or http://pypi.python.org/pypi/ssh/ or python-ssh2

--liw

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

Obnam has no test for what happens when the filesystem fills up.

Posted Tue Apr 22 17:49:48 2014

The idea is to change or extend the --exclude-caches feature so that one can configure which filename Obnam looks for to make it skip the directory.

Changing --exclude-caches seems wrong to me: it has a specific purpose (to implement the cache directory tagging spec, http://www.bford.info/cachedir/spec.html).

Adding a new option to ignore directories that contain a specific file (or directory) would be fine.

--liw

bwh points out that the owner of the directory and the tag file should be the same.
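A sketch, in Python, of what such an option might do: skip any directory containing a given tag file, but only when the tag file and the directory have the same owner. The function names and the example tag name `.nobackup` are hypothetical; no such Obnam option is implied.

```python
import os


def is_tagged(dirpath, tag_name):
    """True if dirpath contains tag_name owned by the directory's owner."""
    try:
        tag = os.lstat(os.path.join(dirpath, tag_name))
        directory = os.lstat(dirpath)
    except OSError:
        return False  # missing or unreadable: treat as not tagged
    return tag.st_uid == directory.st_uid


def walk_without_tagged(root, tag_name):
    """Yield directories under root, pruning tagged ones and their subtrees."""
    for dirpath, dirnames, _filenames in os.walk(root):
        if is_tagged(dirpath, tag_name):
            dirnames[:] = []  # do not descend any further
            continue
        yield dirpath
```

For example, `walk_without_tagged("/home", ".nobackup")` would visit every directory except those holding a `.nobackup` file they own, which keeps one user from excluding another user's directory by dropping a tag file into it.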

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

Obnam should, at least optionally, use fsync or other methods to ensure that everything gets committed to disk by the kernel by the end of a backup run. --liw

I want this to not have a huge performance impact, though. Learning from the lessons of dpkg, sqlite/liferea/firefox, etc, and using fsync/fdatasync and sync_file_range in the right ways is going to be necessary. --liw

Not doable over sftp, of course. --liw
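For a local repository, the basic pattern might look like the following Python sketch. `write_durably` is a hypothetical helper, not Obnam code, and it syncs every file individually, which is exactly the kind of cost that would need to be batched or tuned to avoid a large performance hit.

```python
import os


def write_durably(path, data):
    """Write data to path so that it survives a crash once this returns."""
    tmp = path + ".tmp"  # hypothetical temporary name next to the target
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer to the kernel
        os.fsync(f.fileno())   # ask the kernel to commit the file contents
    os.replace(tmp, path)      # atomically move the file into place
    # Sync the containing directory so the rename itself is committed.
    dirfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
```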

Posted Tue Apr 22 17:49:48 2014 Tags: obnam-wishlist

Open bugs in Obnam