It would be good if Obnam could have a setting for what the uid and gid (when run as root) and mode should be for new files it creates in the repository.
See:
Currently, there seems to be no easy way to forget all (or all but the newest) checkpoint generations. Something like
obnam --keep 1c forget
would be nice.
-- weinzwang
Obnam does not currently seem to notice when the sftp connection breaks. It should, and it should then abort the backup. --liw
Instead of in-place conversions, which are error prone and clunky, a better way would be nice. Maybe some kind of dump/undump pair, using a streamable format?
Obnam could do with a mode in which it backs up the data from a block device, instead of the device node. If the block device contains a filesystem, it should back up only the parts of the device that are used by the filesystem, and skip unused parts. That could be used for backing up disk images as well.
Obnam should, arguably, use ctime changes to trigger backups, so that if a file's size and mtime are unchanged, because whatever fool program modified the file reset the mtime, Obnam will still back up the changed data.
I have code for this, but it requires a repository format change, and breaks the upgrade from format 5 to 6. --liw
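A minimal sketch of the comparison described above, not the code liw mentions. The names FileState, file_state and needs_backup are made up for illustration. Note that ctime also changes on chmod, chown and hard-link creation, so this check would re-read some files whose data has not changed.

```python
import os
from collections import namedtuple

# Hypothetical record of what was seen at the previous backup.
FileState = namedtuple("FileState", "size mtime_ns ctime_ns")


def file_state(path):
    """Capture the fields used for change detection (lstat: do not follow symlinks)."""
    st = os.lstat(path)
    return FileState(st.st_size, st.st_mtime_ns, st.st_ctime_ns)


def needs_backup(previous, current):
    """Return True if the file data should be read again.

    Size and mtime alone miss a program that rewrites the data and then
    resets mtime; user space cannot set ctime back, so it still differs.
    """
    if previous.size != current.size:
        return True
    if previous.mtime_ns != current.mtime_ns:
        return True
    return previous.ctime_ns != current.ctime_ns
```

Storing the ctime per file is what would need the repository format change mentioned above.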
From Enrico: It might be good to have a way for Obnam to automatically exclude certain kinds of common stuff, such as web browser caches, Liferea caches, etc. This should be easy to enable, and should be off by default (safe defaults are important).
If you accidentally back up some large or sensitive files, but don't want to delete all the generations they're in, it would be handy for Obnam to be able to delete just the specific files from the generations, and leave the rest.
When --one-file-system is used, it would be nice to not cross bind-mounts. No idea how to figure that out, but it must be possible. --liw
You could look at the inode numbers for . and ./foodir/.. and check they're the same? -- kinnison
The inode check will not work if foodir is a symlink. --mathstuf
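A different approach from the inode check, sketched here as an assumption rather than something anyone above proposed: on Linux, /proc/self/mountinfo lists every mount point, including bind mounts that share a device number with their source. The function name mount_points is made up; the field layout (mount point is the fifth field, with octal escapes such as \040 for a space) is from the proc(5) manual page.

```python
import re


def _unescape(field):
    """Decode the octal escapes (e.g. \\040 for space) used in mountinfo."""
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def mount_points(mountinfo_text):
    """Return the set of mount points named in the text of a mountinfo file."""
    points = set()
    for line in mountinfo_text.splitlines():
        fields = line.split(" ")
        if len(fields) < 5:
            continue  # not a well-formed mountinfo line
        points.add(_unescape(fields[4]))
    return points
```

With --one-file-system, a backup could then refuse to descend into any directory whose absolute path is in mount_points(open("/proc/self/mountinfo").read()). This is Linux-only and does nothing for other platforms.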
Make obnam fsck remove extraneous files (e.g., tmp*). --liw
S.B. suggests that backup generations have an optional description.
- Named generations -- There are certain generations that are more important than others. Some are automatically created by Obnam itself, some are routinely scheduled, and some were explicitly created. For example, I always run Obnam immediately before traveling with my laptop in case it gets stolen or broken. The same goes for backups before major system upgrades. It would be nice to have something approximately analogous to the Windows "restore point" functionality, which has a description field. Sometimes they are only automatically created system checkpoints. But if the user explicitly creates a new restore point, he can add the description "before traveling to Europe" or "before upgrading OS" or whatever. Similarly, the automatic backup script could be programmed to label it as "cron backup".
S.B. suggests that generations could be tagged so they aren't automatically deleted.
- Unforgettable generations -- In scenarios similar to the above, I would also find it useful to be able to mark certain important generations as "unforgettable". That way, when I run an automatic time based forget command, I can be sure that it will preserve certain milestone generations, even if they weren't the last generation of the month or the week or the day or whatever.
There need to be tools and documentation for key management with Obnam.
How does one replace one's key, or subkey, when it expires?
From Joey Hess:
My take on this is that, by choosing to use a tool that uses hashes, I am giving up (near-)absolute certainty for speed, or space, or whatever. So it's important that the hash type be good at collision resistance (for example, no two likely filenames should hash the same; "/etc/passwd" should only tend to collide with blobs that are very unlike a filename). It's also important that the tool be upfront about using hashes, and about what hash it uses. And if it's not designed to allow swapping the hash out when it gets broken, I will trust it less (hello git).
Ah, the replacement of hash functions is an interesting problem.
For pathnames, it's not at all important, I think, except perhaps for performance, since pathnames will be compared byte-by-byte instead of by hashes.
For file data, replacing is easy, if one is willing to back up everything from scratch. Supporting several hashes in the same backup store is a little bit more work, but not a whole lot: instead of having just one tree for mapping checksums to chunk identifiers, one would have one per checksum algorithm.
--liw
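A rough sketch of the "one tree per checksum algorithm" idea, using plain dicts in place of Obnam's real trees. The class name ChunkIndex and the choice of sha256 and sha512 are assumptions for illustration only.

```python
import hashlib


class ChunkIndex:
    """Maps checksums to chunk identifiers, with one mapping per algorithm."""

    def __init__(self, algorithms=("sha256", "sha512")):
        # One independent "tree" (here just a dict) per checksum algorithm.
        self._trees = {name: {} for name in algorithms}

    def add(self, chunk_id, data):
        """Record chunk_id under the checksum of data in every tree."""
        for name, tree in self._trees.items():
            digest = hashlib.new(name, data).hexdigest()
            tree.setdefault(digest, set()).add(chunk_id)

    def find(self, algorithm, data):
        """Return the chunk ids whose checksum under `algorithm` matches data."""
        digest = hashlib.new(algorithm, data).hexdigest()
        return set(self._trees[algorithm].get(digest, ()))
```

Retiring a broken hash would then amount to dropping its tree, provided every chunk has already been indexed under the replacement.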
Obnam does not support the ext2/3/4 chattr attributes. It should back them up and set them on restore, when possible.
In addition, it should support the d attribute to exclude files from being backed up.
--liw
Obnam needs a way to remove clients from the repository. The current remove-client command just deals with encryption.
Suggested-by: Daniel Silverstone
Obnam needs a way to rename clients in the client list.
Suggested-by: Daniel Silverstone
If a file is sparse, and has a large hole, it would be good to skip over it with SEEK_HOLE and SEEK_DATA. --liw
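A minimal sketch of walking the data extents of a file with SEEK_DATA and SEEK_HOLE, assuming Python 3.3+ on a platform where os.SEEK_DATA and os.SEEK_HOLE are defined (Linux, for example). The function name data_extents is made up. On filesystems that do not track holes, the kernel reports the whole file as a single data extent, so the result is still correct, just not faster.

```python
import errno
import os


def data_extents(fd):
    """Return a list of (start, end) byte ranges that contain data.

    Everything outside the returned ranges is a hole and reads as zeros,
    so a backup would not need to read it.
    """
    size = os.fstat(fd).st_size
    extents = []
    offset = 0
    while offset < size:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:
                break  # nothing but a hole between offset and end of file
            raise
        # There is always an implicit hole at end of file, so this succeeds.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        extents.append((start, end))
        offset = end
    return extents
```

The function moves the file offset of fd, so a caller would need to seek before reading.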
Obnam is currently using paramiko as the SFTP implementation. It is a bit more limited than the SFTP protocol is, and so some things that Obnam should be doing, such as restoring hardlinks across SFTP, are not possible. There may also be some bugs with regard to timestamp handling.
Possible fixes:
- patch paramiko to support more of SFTP
- switch to twisted's conch or libssh or http://pypi.python.org/pypi/ssh/ or python-ssh2
--liw
The idea is to change or extend the --exclude-caches feature so that one can configure which filename Obnam looks for to decide that a directory should be skipped.
Changing --exclude-caches seems wrong to me: it has a specific purpose (to implement the cache directory tagging spec, http://www.bford.info/cachedir/spec.html).
Adding a new option to ignore directories that contain a specific file (or directory) would be fine.
--liw
bwh points out that the owner of the directory and the tag file should be the same.
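A small sketch of what such a check might look like, including the same-owner condition bwh mentions. The function name should_skip and the tag name ".nobackup" in the usage note are hypothetical; no such Obnam option exists yet.

```python
import os


def should_skip(directory, tag_name):
    """Return True if directory contains tag_name and both have the same owner.

    Requiring the same owner stops another user from excluding a directory
    from someone else's backup by dropping a tag file into it.
    """
    tag_path = os.path.join(directory, tag_name)
    try:
        dir_stat = os.lstat(directory)
        tag_stat = os.lstat(tag_path)  # lstat: do not follow a symlinked tag
    except FileNotFoundError:
        return False
    return dir_stat.st_uid == tag_stat.st_uid
```

For example, should_skip("/home/liw/scratch", ".nobackup") would be True only if /home/liw/scratch/.nobackup exists and is owned by the owner of the directory.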
Obnam should, at least optionally, use fsync or other methods to ensure that everything gets committed to disk by the kernel by the end of a backup run. --liw
I want this to not have a huge performance impact, though. Learning from the lessons of dpkg, sqlite/liferea/firefox, etc, and using fsync/fdatasync and sync_file_range in the right ways is going to be necessary. --liw
Not doable over sftp, of course. --liw
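For the local-disk case, a sketch of the usual write, fsync, rename, fsync-the-directory sequence, assuming a POSIX system where a directory can be opened and fsynced (Linux allows this). The function name write_durably and the ".tmp" suffix are made up. Doing this for every file is the naive, slow variant; batching the syncs at the end of a run would be the part that needs care.

```python
import os


def write_durably(path, data):
    """Write data to path so that it survives a crash once this returns."""
    tmp = path + ".tmp"  # hypothetical temporary name next to the target
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()             # push Python's buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to commit the file data
    os.rename(tmp, path)      # atomic replace on POSIX filesystems
    # The rename itself lives in the directory, so sync that as well.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```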