First of all, I realise that Obnam stores full paths because it is necessary for saving every file in the system, even when belonging to different users.

However, in cases where only a single directory needs backing up, this could be relaxed so that the backup stores only the path given on the command line, in the spirit of rsync.

When does this show up? For example, when migrating from another backup system, the easiest approach is to dump all the generations from the old system, one by one, into a temporary location, and to run Obnam on each generation in order to replicate the same history. However, since Obnam stores full paths, the path to the temporary directory used for the migration is stored as well. This can happen on production servers, where restoring each generation into the original directory where the data belongs is not possible.

Rickard Nilsson suggested on the mailing list adding a "root" option that could be used to strip the first part of the path, namely the part not mentioned on the command line. The stored path would then be computed like this: root + given_path.

~$ obnam backup --repository=/media/backups/... --root=/ mydata

...would be stored as /mydata instead of /home/${USER}/mydata
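
To make the "root + given_path" rule concrete, here is a minimal sketch in Python. The function name stored_path is made up for illustration and is not part of Obnam; it only shows how the stored path could be derived from the two values, assuming POSIX-style paths.

```python
import posixpath


def stored_path(root, given_path):
    """Return the path to record in the backup: root + given_path.

    Hypothetical helper, not an Obnam function.
    """
    # Drop any leading slash so the given path is always joined under root.
    relative = given_path.lstrip("/")
    return posixpath.normpath(posixpath.join(root, relative))


# With --root=/ and "mydata" on the command line:
print(stored_path("/", "mydata"))
```

With a different root, for example stored_path("/srv", "mydata/docs"), the same rule gives a path under /srv.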

Thank you for taking this into consideration!


If this gets implemented, I suggest the following:

  • The Repository class will provide a hook for mangling the pathnames.
  • The hook will get the pathname as it exists in live data and will return the pathname to store in the backup.
  • The hook will be called at every point where live data pathnames are used by Repository.
  • Someone writes a plugin that adds the suitable functionality.
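
A rough sketch of how such a hook might look, in Python. This is not Obnam's actual Repository or plugin API: the names PathMangler, add_callback, mangle and make_prefix_stripper are all invented for illustration, and /tmp/migration is just an example of a temporary directory.

```python
class PathMangler:
    """Hypothetical stand-in for the hook; not Obnam's real Repository API."""

    def __init__(self):
        self._callbacks = []

    def add_callback(self, callback):
        # A plugin would register its mangling function here.
        self._callbacks.append(callback)

    def mangle(self, live_pathname):
        # Takes the pathname as it exists in live data and returns
        # the pathname to store in the backup.
        stored = live_pathname
        for callback in self._callbacks:
            stored = callback(stored)
        return stored


def make_prefix_stripper(prefix):
    """Build a callback that removes a leading directory prefix."""
    prefix = prefix.rstrip("/")

    def strip(pathname):
        if pathname == prefix:
            return "/"
        if pathname.startswith(prefix + "/"):
            return pathname[len(prefix):]
        # Paths outside the prefix are left untouched.
        return pathname

    return strip


hook = PathMangler()
hook.add_callback(make_prefix_stripper("/tmp/migration"))
print(hook.mangle("/tmp/migration/home/alice"))
```

Keeping the stripping logic in a callback rather than in the hook itself matches the idea that a plugin supplies the actual functionality.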

--liw


For the fun of it I added a mangle_filename() method to Repository. What I quickly learned: if you back up /root/bar, the process backs up "/", "/root" and "/root/bar". In reality you only want "bar" in the backup.
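
The following Python sketch illustrates why the parent directories show up. It does not use any Obnam code; path_and_ancestors is a made-up helper that lists an absolute path together with every directory above it, which is the set of entries a per-path mangling hook would be asked about.

```python
import posixpath


def path_and_ancestors(pathname):
    """Return pathname preceded by all of its ancestors, root first.

    Hypothetical helper for illustration; only absolute paths are accepted.
    """
    if not pathname.startswith("/"):
        raise ValueError("expected an absolute path: %r" % pathname)
    pathname = posixpath.normpath(pathname)
    entries = []
    while True:
        entries.append(pathname)
        parent = posixpath.dirname(pathname)
        if parent == pathname:
            # Reached the top of the tree.
            break
        pathname = parent
    return list(reversed(entries))


for entry in path_and_ancestors("/root/bar"):
    print(entry)
```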

So the mangling hook would have to be allowed to drop paths entirely, but this feels very crude.

I propose leaving Repository unchanged and treating it as simply receiving virtual paths from its callers. The functions that call into Repository should be changed instead, in this case the backup command. I have stopped here.

-- Elrond

If this ever gets implemented, the paths returned by a filename mangling hook will still have to be absolute, I think. --liw

However, I don't particularly want this feature. It has the promise of hours upon hours of debugging over email once people start using it. If someone makes a clean patch (with tests and everything), I'll consider it, but keeping this bug open for years hasn't resulted in that. done --liw