Syncing external hard-drive with Dropbox for backup

Simple trick to run a secondary Dropbox daemon in Linux. Useful to generate alternative backups.

This little project started because Bitcasa is discontinuing its Personal Drive product, which I used to use.
That forced me to move to another cloud storage provider, and I decided on Dropbox.
(During this process I found out how broken Bitcasa is/was and got really furious, but that will be a topic for another blog post.)

One of the things I liked about Bitcasa is that they provided a FUSE filesystem that I could just mount anywhere.
There was no "syncing" in the sense that the files only existed at the cloud provider.
It would download chunks of the requested files on demand and keep them in a cache.
This meant I didn't have to worry about disk space on my physical hard-drive.

Dropbox, on the other hand, doesn't work like this.
When you set up the daemon, you select a folder to be mirrored to the cloud.
The daemon monitors changes in the folder and in the cloud and keeps both copies in sync.
The problem is that the device holding the Dropbox folder needs as much free space as the content stored in Dropbox.
For my immediate situation that would work, but it is definitely not going to scale:
I have a 256 GB disk and around 100 GB of data to store in Dropbox.

One possibility is to restrict which content gets mirrored.
This gives you a partial sync of your Dropbox account in your local folder.
But after what happened to me with Bitcasa (I lost files, MANY files), I want a physical backup copy on an external hard-drive to be on the safe side.

Approach

After some research, I decided on the following approach.

I run an instance of Dropbox solely for syncing my external hard-drive. This way it doesn't interfere with the files that I want always synced on my desktop.

I start the external hard-drive's Dropbox instance manually and have deliberately not automated this. If I accidentally delete something from Dropbox, the backup still has it, and the deletion won't reach the backup until I tell that instance to sync.

Running a second instance of Dropbox

Dropbox installs the folders .dropbox and .dropbox-dist under the home directory.

The former holds all the configuration for the Dropbox instance, while the latter holds the dropboxd binary and the files it requires.

If you try executing dropboxd, it will complain saying that Dropbox is already running (for syncing the folder in the home directory).

The key to running more than one Dropbox instance is knowing how Dropbox determines the location of the .dropbox configuration folder.
This folder stores all the configuration for an instance, its cached elements, and the pid file, which is what prevents multiple instances from using the same configuration.
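
To illustrate the role of the pid file, here is a minimal sketch of how one might check whether an instance already appears to be using a given configuration location. The file name dropbox.pid and the helper name dropbox_instance_running are assumptions of mine, so check what your installation actually writes under .dropbox before relying on it.

```shell
#!/bin/sh
# Sketch: report whether a Dropbox daemon appears to be running for a given home.
# Assumption: the daemon writes its process ID to <home>/.dropbox/dropbox.pid.
# dropbox_instance_running is a hypothetical helper name.
dropbox_instance_running() {
    pid_file="$1/.dropbox/dropbox.pid"

    # No pid file: nothing has claimed this configuration folder.
    [ -f "$pid_file" ] || return 1

    pid=$(cat "$pid_file")
    [ -n "$pid" ] || return 1

    # kill -0 sends no signal; it only tests that the process exists
    # and that we are allowed to signal it.
    kill -0 "$pid" 2>/dev/null
}
```

For example, dropbox_instance_running /mnt/external-hd && echo "already running". A stale pid file whose number has since been reused by another process would give a false positive, so treat the result as a hint rather than a guarantee.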

Dropbox looks for the configuration directory at $HOME/.dropbox.
Thus, by changing the value of the HOME environment variable when we execute dropboxd, we can change the configuration folder and run as many instances as we want.

I mount my external hard-drive on /mnt/external-hd/, so I just execute HOME=/mnt/external-hd/ /home/santiago/.dropbox-dist/dropboxd.
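
Since I start this instance by hand, the command above could be wrapped in a small function, sketched below. The two paths are the ones from this post, while the variable names DROPBOXD and ALT_HOME and the function name start_secondary_dropbox are just illustrative choices.

```shell
#!/bin/sh
# Sketch: start a secondary Dropbox daemon whose configuration lives
# on the external drive. Both defaults are the paths used in this post;
# override DROPBOXD or ALT_HOME in the environment for a different setup.
DROPBOXD="${DROPBOXD:-/home/santiago/.dropbox-dist/dropboxd}"
ALT_HOME="${ALT_HOME:-/mnt/external-hd}"

start_secondary_dropbox() {
    # HOME is overridden for this one process only, so the regular
    # instance running from the real home directory is unaffected.
    HOME="$ALT_HOME" "$DROPBOXD" "$@"
}
```

After sourcing the file and mounting the drive, running start_secondary_dropbox launches the daemon against /mnt/external-hd/.dropbox. Because the assignment is a prefix to the command, HOME keeps its normal value in the calling shell.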

The first time it will ask for the instance setup information: account, password, location of the mirrored folder, etc. After the first time, it will run silently.

One caveat: if the mount directory of your external hard-drive changes, be careful when starting the external hard-drive's Dropbox instance.
If Dropbox thinks you have deleted the data, it will sync that deletion upstream and you will lose the data.
To prevent this, before running it, create a symlink from the old location to the new one, and then move the folder to the new location using Dropbox's configuration setup.
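
The precautions for a changed mount directory could be scripted roughly as follows. This is only a sketch: both helper names are hypothetical, and the checks are my own suggestion for guarding against an empty or missing folder, not something Dropbox provides.

```shell
#!/bin/sh
# Sketch of two hypothetical helpers for the changed-mount case described above.

# Succeeds only if the mirrored folder exists and contains at least one entry,
# so the daemon is never started against an empty (for example unmounted) location.
mirror_looks_populated() {
    folder="$1"
    [ -d "$folder" ] || return 1
    # ls -A prints every entry except . and ..; no output means an empty folder.
    [ -n "$(ls -A "$folder" 2>/dev/null)" ]
}

# Makes the old mount path resolve to the new one.
# Refuses if anything already exists at the old path, because ln -s would
# otherwise place the link inside an existing directory.
link_old_location() {
    old="$1"
    new="$2"
    if [ -e "$old" ] || [ -L "$old" ]; then
        return 1
    fi
    ln -s "$new" "$old"
}
```

A possible sequence, assuming the mirrored folder is called Dropbox and the drive now mounts at a made-up path /mnt/new-location: run link_old_location /mnt/external-hd /mnt/new-location, then mirror_looks_populated /mnt/external-hd/Dropbox, and only start the daemon if both succeed.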
