Thursday, April 28, 2011

Rsync, Level: Expert

Learning all kinds of new things, like how to escape spaces in filenames. When I tried the command
sudo rsync -avr --progress /etc/ /media/FreeAgent Backup/etc/
I got only error messages. What works is quoting,
sudo rsync -avr --progress /etc/ "/media/FreeAgent Backup/etc/"
or escaping:
sudo rsync -avr --progress /etc/ /media/FreeAgent\ Backup/etc/
I back up /etc as it's small and may be useful. The main thing to back up, of course, is /home:
sudo rsync -avr --progress /home/valorie/ "/media/FreeAgent Backup/home/"
where /home/valorie/ is the source, and "/media/FreeAgent Backup/home/" is the destination. Thanks to sbeattie and maco in #ubuntu-women for showing me this.
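
To see why the unquoted command fails: the shell splits its arguments on unescaped spaces before rsync ever runs, so rsync is handed /media/FreeAgent and Backup/etc/ as two separate paths. A minimal sketch of that behaviour follows; count_args is a made-up helper for this illustration, not part of rsync.

```shell
#!/bin/bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

count_args /etc/ /media/FreeAgent Backup/etc/     # unquoted: the path is split in two
count_args /etc/ "/media/FreeAgent Backup/etc/"   # quoted: one destination argument
count_args /etc/ /media/FreeAgent\ Backup/etc/    # escaped: one destination argument
```

The first call prints 3, the other two print 2, which is what rsync needs: one source and one destination.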

Also, empty trash BEFORE running a backup of /home!
sudo rm -r /home/valorie/.local/share/Trash/*
In #ubuntu-women, JanC suggested using "the GUI or a special commandline tool to empty the trash, as it keeps references around etc. The trash-cli package contains a command empty-trash." When I asked what she meant by references, JanC explained, "references as to what file has to be restored to where (if you ever want to do so). And of course there are separate trash directories on every disk you have etc. AFAIK trash-cli implements the XDG spec about this, just like GNOME & KDE do."

In #kubuntu-offtopic, James147 suggested that rather than just emptying the trash, I use --exclude=, i.e. rsync ... --exclude=*/Trash/* --exclude=*.tmp --exclude=*.bak ... etc. He suggested "useful patterns might be: *.tmp *.bak *.backup *~ *.swp */lost+found/*". I really wish I had thought to do this, and I definitely will in future. Watching that stuff scroll past as I waited was not fun!

I asked, "so each of the excludes needs to be separate, like: --exclude=*/Trash/* --exclude=*.tmp --exclude=*.bak"

[15:27] <james147> yes
[15:27] <james147> each pattern you want to exclude needs a separate --exclude=
[15:27] <james147> ^^ can get long but thats what scripts are for :D
[15:28] <valorie> that maybe my next step
[15:28] <valorie> for now, a good string I can copy/paste will be good

[15:29] <james147> valorie: a script isn't hard to do...

[15:29] <valorie> the up-arrow in bash makes things pretty easy
[15:30] <james147> valorie: can be as simple as
#!/bin/bash
rsync ... --exclude=*.tmp --exclude=/Trash/* ... # add and edit this line till it suits your needs
[15:30] <james147> ^^ then you just need to run "chmod +x scriptname"
[15:31] <james147> then run it with ./scriptname or bash scriptname or sh scriptname
[15:31] <james147> (use sudo if it needs root)
[15:31] <james147> valorie: bash scripts can just be a bunch of commands you want to run
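
As a rough sketch of what such a script might look like, using the paths and patterns mentioned in this post: run_backup and its "preview"/"run" modes are made up for this illustration, and the pattern list is only a starting point. Each pattern is quoted so the shell passes the wildcards through to rsync untouched.

```shell
#!/bin/bash
# Source and destination taken from the commands earlier in this post.
SRC="/home/valorie/"
DEST="/media/FreeAgent Backup/home/"

# run_backup is a hypothetical helper.
#   run_backup preview   prints the rsync arguments, one per line
#   run_backup run       actually calls rsync with them
run_backup() {
    mode=$1
    set -- -av --progress            # -a already implies -r
    # One --exclude= option per pattern.
    for pattern in '*/Trash/*' '*.tmp' '*.bak' '*.backup' '*~' '*.swp' '*/lost+found/*'; do
        set -- "$@" "--exclude=$pattern"
    done
    set -- "$@" "$SRC" "$DEST"
    if [ "$mode" = "run" ]; then
        rsync "$@"
    else
        printf '%s\n' "$@"
    fi
}

run_backup preview
```

Once the preview looks right, changing the last line to run_backup run (and starting the script with sudo if it needs root) would perform the backup.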

I'm not quite ready to wrap all this into a script, but will definitely think about doing so before my next backup. http://www.linux.com/news/enterprise/storage/8200-back-up-like-an-expert-with-rsync discusses scripting the backup process too. For now, on to Natty Narwhal!

11 comments:

  1. No need to reinvent the wheel:
    http://www.dirvish.org/

    :)

  2. An alternative way to specify filters is to use the -F rsync option. If you do so, you can specify your filters in files named .rsync-filter. These files follow a simple syntax. In your case you could have created a .rsync-filter in $HOME with the following content:

    - Trash
    - *.tmp
    - *.bak

    You can learn more about .rsync-filter files in the rsync man page.

  3. I've worked with rsync a couple of times, but it can get complex quickly. These days I've settled on rsnapshot. It uses rsync for multiple back-ups (using hard links so files that don't change from one back-up to another aren't backed up multiple times).

    With the hard linking, only changed files get backed up, meaning back-ups are a lot quicker than backing up everything every time. Cron can run it recurringly, so I have back-ups every four hours for the day, every day for a week, and every week for the month.

    Setting up things manually by writing out a script using rsync is great for learning how rsync works. rsnapshot is great for when you want to get a nice back-up system up and running so you can go play with something else ;)

  4. Instead of repeating "--exclude=...", you can put all your exclude patterns in one file, one pattern per line, and tell rsync to load this pattern file with the "--exclude-from=/path/to/pattern/file" option.

  5. Strongly recommend grsync; it's a great frontend for rsync!

  6. Rather than using a lot of '--exclude' options,
    it's better to use '--exclude-from=SOME_FILE',
    since there are a lot of things you don't want
    to back up. Here is an example:

    --------SOME_FILE-------->8
    # Special
    .Trash

    # Cache directories
    cache
    .cache
    .fontconfig
    .thumbnails
    .config/chromium/Default/Application Cache

    .java
    .netx
    .adobe/Flash_Player/AssetCache
    .mozilla/extensions
    .mozilla/firefox/*.default/Cache
    .mozilla/firefox/*.default/OfflineCache
    .mozilla/firefox/*.default/bookmarkbackups
    .macromedia/Flash_Player

    # History files
    .bash_history
    .lesshst
    .recently-used
    .recently-used.xbel
    .viminfo
    .vim/.netrwhist

    # Session-related & co.
    session
    .dbus/session-bus
    .tmp/orbit-*
    .tmp/pulse-*
    .tmp/virtual-*
    .mozilla/firefox/Crash Reports
    .*.sw*
    .gvfs
    .dmrc
    .dvdcss
    .esd_auth
    .pulse
    .pulse-cookie
    .sudo_as_admin_successful
    .update-notifier
    .ICEauthority
    .Xauthority
    .xsession-errors*
    ---------------->8

  7. Speaking of escaping, it seems the chatlog got parsed as raw HTML. Here, have some < and > escapes.

  8. You can also use luckybackup, a front-end for rsync.

  9. @workman161 - thanks, fixed!

    @all - As for the alternatives to rsync, I might try them again once I understand the process better. Luckybackup is what I used for my backup-which-didn't-exist.

  10. http://dropbox.leftyfb.com/backup.sh.txt

    That's a script I've been working on for a while now. It uses rsync and hardlinks similar to rsnapshot. I use the same script in cron on 12 different machines by only giving the machine name as the first argument. It's got checks for accessing a remote machine and for the backup location existing.

    I'm also adding in the ability to back up through a middle-man machine via ssh, sort of like tunneling.

  11. Option 'a' equals -rlptgoD, so there's no need for the extra 'r' in '-avr';
    '-av' will do the same.

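
Several of the comments above suggest --exclude-from. A minimal sketch of that approach, under some assumptions: write_exclude_file is a made-up helper, the pattern file lives in a temporary location chosen by mktemp, and the short pattern list is only illustrative. rsync skips blank lines and lines starting with # in such a file.

```shell
#!/bin/bash
# write_exclude_file is a hypothetical helper: it writes a small pattern file to the given path.
write_exclude_file() {
    cat > "$1" <<'EOF'
# Trash and caches
*/Trash/*
.cache
.thumbnails

# Editor and temporary files
*.tmp
*.bak
*~
*.swp
EOF
}

exclude_file=$(mktemp)
write_exclude_file "$exclude_file"

# Print the command rather than running it, so it can be reviewed first.
printf 'rsync -av --progress --exclude-from=%s /home/valorie/ "/media/FreeAgent Backup/home/"\n' "$exclude_file"
```

For regular use, the pattern file would more naturally sit at a fixed path of your choosing instead of a temporary one.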