A partial archive of meta.discourse.org as of Tuesday July 18, 2017.

Rails Girls Summer of Code 2017: Backup Providers

eviltrout

We’ve officially kicked off the Rails Girls Summer of Code 2017 with our group, berlin diamonds.

They’re going to be working on adding extra backup providers to Discourse. This topic can be used by the teams involved for questions and answers, as well as to track progress.


Background

Discourse has always given administrators the ability to back up their community data. It is important to us as an open source project that, regardless of where you run your forum, you are able to take your data with you.

Currently, we support downloading data dumps to your local computer as well as uploading data to Amazon S3. There is also a plugin for uploading to Dropbox available that could be used as an example.


First Steps

The team has decided to focus on a backup provider for Google Drive. Here’s how I’d suggest tackling this problem, although the team is free to approach this on their own schedule and pace:

  1. Become very familiar with the Dropbox plugin for backups. Try installing it locally and using it to connect to Dropbox. @Falco is also around to answer questions you might have about how it works and how it is designed.

  2. Review how Google Drive’s API works. What is required to upload a file there? Are there any rubygems you can use to contact the API easily?

  3. Create a new repository for the Google Drive backup provider, and try to build it up from the simplest possible thing that could work. Use Site Settings for any API keys and variables. When in doubt, base it on the Dropbox plugin.

  4. Discourse will provide code reviews and offer assistance along the way.

  5. The plugin is working! :birthday:

Once we’re done with the above, we can figure out the next steps for more providers, and perhaps DRY up the plugins to avoid repeated code.

kajatiger

@Falco Can I install the dropbox plugin without using the docker container? I would like to just clone it into the discourse/plugins folder…

sam

Yes, just cloning it into the plugins folder should do the trick!

eviltrout

To be more specific, you can do something like this locally:

$ cd discourse/plugins
$ git clone git@github.com:xfalcox/discourse-backups-to-dropbox.git
$ cd ..
$ rm -rf tmp
$ bundle exec rails server

Then the plugin will be installed in plugins/discourse-backups-to-dropbox

Jen_Lijo

Hi! This is Jen, and together with @kajatiger I will be working on creating the Google Drive plugin during this Rails Girls Summer of Code. Happy to meet the Discourse community! :allthethings:

Jen_Lijo

hi @Falco! should we be able to do backups from dev? We keep getting a “backup failed” with a long bunch of errors in our log.

[2017-07-06 13:32:48] EXCEPTION: Failed to archive uploads.
tar: Option --warning=no-file-changed is not supported

@kajatiger

sam

Looks to me like the version of the tar program you are running is missing some features we need. What OS are you running? If this is a Mac, can you try brew install tar?

Jen_Lijo

We’ve solved the issue by installing a Linux-compatible version of tar on OS X:

$ brew install coreutils
$ brew install gnu-tar --with-default-names

kajatiger

Here is a little update on what we have been doing so far:

  • we looked at the Dropbox plugin and learned that it overrides the after_create_hook method inside the backup.rb model with a Backup.class_eval
  • we used the same method to hook into the backup in Discourse
  • we realized that Google Drive OAuth uses a complex way to get the access_token for a certain user with their account, so
  • we decided to use another way of authorization: the Google Drive Service Account (https://github.com/gimite/google-drive-ruby/blob/master/doc/authorization.md look at the 3rd option here)
  • we successfully sent our backup files to our service account in development mode
  • we pushed our changes to a repository (https://github.com/berlindiamonds/discourse-googledrive-backup)
  • we started writing tests with RSpec

It’s kind of working, but still needs a lot of improvement. Please feel free to comment, criticize, suggest and ask :slight_smile:
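
As a rough illustration of the class_eval approach described in the list above, here is a minimal, self-contained sketch. The Backup class below is only a stand-in for the real Discourse model, and DriveUploadLog is a made-up name that takes the place of the actual Google Drive upload. Neither reflects the plugin's real code.

```ruby
# Stand-in for Discourse's Backup model (assumption: the real class
# lives in the Discourse code base and does much more than this).
class Backup
  attr_reader :filename

  def initialize(filename)
    @filename = filename
  end

  # Hook that is run once a backup has been created.
  def after_create_hook
    # core behaviour would go here
  end
end

# Hypothetical collector standing in for the Google Drive upload.
module DriveUploadLog
  def self.uploaded
    @uploaded ||= []
  end
end

# What a plugin can do: reopen the class and replace the hook.
Backup.class_eval do
  def after_create_hook
    DriveUploadLog.uploaded << filename
  end
end

# Calling the hook now runs the plugin's version.
Backup.new("backup-2017-07-06.tar.gz").after_create_hook
```

Because the hook is replaced rather than extended, whatever the original method did no longer runs, which is one reason a more reusable design may be worth considering later.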

Jen_Lijo

[status update]

Hi all!

We’ve created the first version of the Google Drive backup plugin. It works with a service account from Google Drive. How to use it is described in the README file :slight_smile:

This is the link to the repository:

We are open to suggestions / improvements, please let us know! It’s been a great experience to solve this little :discourse: “plugin” puzzle!

kajatiger

Here is also a new blog post that gives an overview: http://berlindiamonds.blogspot.de/2017/07/creating-plugin-for-discourse.html

Mittineague

If you add a url line to the plugin.rb file, you can have a link to the plugin’s “home” in the Installed Plugins list in the Admin - Plugins panel, e.g.

# name: discourse-backup-to-googledrive
# about: -
# version: 1.0
# authors: Kaja & Jen
# url: https://github.com/berlindiamonds/discourse-googledrive-backup

eviltrout

Wow, you got that working fast, so great job! I do think we should spend a little time making it more awesome before moving on to another backup provider. I know you based this on @Falco’s work and that’s a great start, but I’d like to see us move towards something that is more reusable and easier to test.

DriveSynchronizer

This class currently does all the work in a sync class method, including finding the backups to synchronize and then copying them up to Google Drive.

I think it would be better if DriveSynchronizer used more principles of object-oriented programming, such as instance variables. I was thinking it might be nice if DriveSynchronizer were responsible for uploading one backup to Google Drive at a time, since they can’t be uploaded in bulk anyway. What if it worked like this?

ds = DriveSynchronizer.new(backup)
ds.sync

It would also be good if it had a method called can_sync? that returns true or false depending on whether the file can be synchronized.

Finally I think it would be good to have a reader method to return the backup that the synchronizer is for.
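
To make the shape of that design concrete, here is one possible sketch, not the actual class from the plugin or the pull request. BackupFile, the uploader: argument, and the rule used in can_sync? (a backup exists and has a non-zero size) are all assumptions made up for illustration. The real upload to Google Drive is replaced by a lambda so the example runs on its own.

```ruby
# Hypothetical stand-in for a Discourse backup record.
BackupFile = Struct.new(:filename, :size)

class DriveSynchronizer
  # Reader that returns the backup this synchronizer is for.
  attr_reader :backup

  # `uploader` is an assumption: any object that responds to call(backup).
  # In a real plugin it would wrap the Google Drive client.
  def initialize(backup, uploader: ->(_backup) { true })
    @backup = backup
    @uploader = uploader
  end

  # Assumed rule: a backup can be synced if it exists and is not empty.
  def can_sync?
    !backup.nil? && backup.size.to_i > 0
  end

  # Uploads the single backup; returns false if it cannot be synced.
  def sync
    return false unless can_sync?
    @uploader.call(backup)
    true
  end
end

# Usage: record what would have been uploaded instead of calling an API.
sent = []
ds = DriveSynchronizer.new(
  BackupFile.new("backup.tar.gz", 1024),
  uploader: ->(b) { sent << b.filename }
)
ds.sync
```

Passing the uploader in from outside is just one way to keep the class easy to test, since a spec can check what was sent without talking to Google Drive.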

I’ve created a pull request for your repo that adds a bunch of failing tests for the behavior I’ve described above.

You can consider this a form of test driven development, where the specification is the tests and the exercise is to make your code pass them :slight_smile:

To run the tests, once you’ve accepted the pull request, from your discourse directory do this:

bundle exec rake plugin:spec["discourse-googledrive-backup"]

You should see 5 test failures. Your goal will be to get all those passing. In the process I hope you’ll learn how tests work in Discourse and how I’d like to see the class designed. As always feel free to post questions here on meta or in our slack channel :slight_smile:

Cheers!