A partial archive of meta.discourse.org as of Tuesday July 18, 2017.

Installing on Kubernetes


I'd like to install Discourse on our Kubernetes cluster. Kubernetes does not use the docker command to deploy images, so I'd need to build a Discourse image locally, push it to our private Docker registry, and then have Kubernetes pull the image from there. This is how we deploy other Docker-based software.

Can I do this by using ./launcher bootstrap locally, then pushing the built image to our registry? Are there likely to be any problems from this approach? How would upgrading work in this case? (Kubernetes container instances can be destroyed at any time, so to upgrade, we'd want to build a new image and re-deploy, which Kubernetes lets us do without downtime.)

I understand I'll need to provide settings for database (via environment variables), volume mounts for storage, etc. We plan to use our existing redundant Redis, PostgreSQL and GlusterFS services for this, like we do with other Docker applications.


Yes, that's exactly what we do in our infrastructure: build the image, re-tag it, push it to our private repo, then pull from that repo to deploy.
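
A minimal sketch of that workflow, assuming a private registry at registry.example.com (the registry host, image name, and tag here are all placeholders, not the real ones):

```shell
# Build the image from the container definition (e.g. containers/app.yml)
./launcher bootstrap app

# Re-tag the resulting local image for the private registry
# (local_discourse/app is the launcher's usual output name; verify with `docker images`)
docker tag local_discourse/app registry.example.com/discourse:v1.8.3

# Push to the private registry; Kubernetes nodes then pull from there
docker push registry.example.com/discourse:v1.8.3
```

To upgrade, you repeat the bootstrap on a build host, push a new tag, and roll the Kubernetes deployment over to it.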

A note on GlusterFS: we used to use it, but we hit some extreme failure cases that led us down a winding road of trying Ceph (which also had issues) before finally moving to NetApp, which has been rock solid.


Thanks, Sam. How do you handle upgrades? I noticed the Discourse version is hardcoded in the launcher script; manually updating it there feels like the wrong approach.

Re Gluster: yes, I'm not terribly happy with it; I'm thinking of moving back to NFS with DRBD and Pacemaker. We aren't at a scale where we can justify the expense of dedicated storage hardware yet.


There are a bunch of tricks. You can always specify a custom base image in your YAML file, e.g.:

Eg: https://github.com/SamSaffron/message_bus/blob/master/examples/chat/docker_container/chat.yml

But you shouldn't really need to do that. For upgrades, simply bootstrap a new image and push that out; you can add hooks to run apt-get upgrade and so on.
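
For reference, the launcher reads a `base_image` key from the container definition, so a sketch of pinning a custom base might look like this (the image name and tag are hypothetical):

```yaml
# containers/app.yml — pin a custom base image (name/tag are placeholders)
base_image: "registry.example.com/discourse_base:custom"

templates:
  - "templates/web.template.yml"
```

This only changes what the bootstrap builds on top of; the normal bootstrap/push/deploy cycle stays the same.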


Okay, I think I'm missing something about how the launcher script (or pups) works here.

I have one YAML file for each instance, which looks like this:

discourse@hawthorn:~/discourse$ cat containers/clicsargent.yml
templates:
  - "templates/torchbox-web.template.yml"
  - "private/clicsargent.template.yml"

env:
  DISCOURSE_HOSTNAME: 'clicsargent-stage-discourse.torchboxapps.com'
  DISCOURSE_DB_USERNAME: clicsargent_discourse
  DISCOURSE_DB_NAME: clicsargent_discourse
torchbox-web.template.yml sets the global database configuration:

discourse@hawthorn:~/discourse$ cat templates/torchbox-web.template.yml
# This is a Discourse build for Torchbox servers on Kubernetes.
templates:
  - "templates/web.template.yml"
  - "templates/web.ratelimited.template.yml"

env:
  LANG: en_US.UTF-8
  DISCOURSE_DEVELOPER_EMAILS: 'sysadmin@torchbox.com'
  DISCOURSE_DB_HOST: postgres-2.itl.rslon.torchbox.net
  DISCOURSE_REDIS_HOST: redis-1-cache.itl.rslon.torchbox.net
  DISCOURSE_SMTP_ADDRESS: mailer.itl.svc.torchbox.net

Then I set a database password in a separate file, so it doesn't need to go in the Git repository:

discourse@hawthorn:~/discourse$ cat private/clicsargent.template.yml 

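(The file contents didn't survive in this archive. For illustration only, a password-only private template would presumably be just an env block; the value here is obviously a placeholder:)

```yaml
env:
  DISCOURSE_DB_PASSWORD: 'not-the-real-password'
```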
However, running the build doesn't seem to pick up the correct database settings from the templates:

discourse@hawthorn:~/discourse$ ./launcher bootstrap clicsargent
 Failed to initialize site default
rake aborted!
PG::ConnectionBad: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

If I move all the database settings except password to containers/<instance>.yml, then it does work, and it correctly picks up the password from private/<instance>.yml.

What am I missing here? My understanding was that all the env settings would be merged from templates.


Sounds like a pups bug/problem, so that's where you'll want to focus your debugging efforts.


Yeah, @tgxworld mentioned this to me in the past... it is a launcher bug: it only does one level of template inheritance, not two.

I want to get it fixed; maybe try patching that bash file to support it.


Have you considered using https://hub.docker.com/r/bitnami/discourse/ ? It seems with that container you can just supply env vars, without needing to run the launcher or build your own images.. I think.. :slight_smile:

How amazing would a Kubernetes Helm Chart, or even an Operator for Discourse be... :open_mouth:


Sure, please see

There are a lot of details there.


Tried bitnami/discourse:latest (1.8.3-dirty as of this writing) and I'm having a few issues with Sidekiq.

And there's probably no way for me to fix it myself, as Bitnami's packaging is all a black box.


I bootstrap with the following in containers/app.yml:

templates:
  - "templates/web.template.yml"
  - "templates/redis.template.yml"

env:
  LANG: en_US.UTF-8
  DISCOURSE_HOSTNAME: 'discourse.mysite.org'
  DISCOURSE_DB_USERNAME: discourseuser
  DISCOURSE_DB_NAME: discourse
  DISCOURSE_DEVELOPER_EMAILS: 'jasmin.hassan@mysite.org'
  DISCOURSE_DB_HOST: 'x.x.x.rds.amazonaws.com'
  DISCOURSE_SMTP_ADDRESS: 'email-smtp.x.amazonaws.com'
  DISCOURSE_DB_PASSWORD: securepassword

Then I push to a private Docker Hub repo, write a YAML file for Kubernetes to pull the newly pushed private image, and apply it. However, without a “command” and/or “args” set in the Kubernetes deployment YAML, the container/pod starts up but immediately errors with:

I, [2017-07-15T12:52:21.697829 #13]  INFO -- : Loading --stdin
/pups/lib/pups/config.rb:23:in `initialize': undefined method `[]' for nil:NilClass (NoMethodError)
        from /pups/lib/pups/cli.rb:27:in `new'
        from /pups/lib/pups/cli.rb:27:in `run'
        from /pups/bin/pups:8:in `<main>'

After some research and digging in, I realized I had to set a custom command in the Kubernetes YAML file, so part of it might look like:

- image: myorg/discourse:latest
  name: discourse
  command: ["/bin/bash"]
  args: ["-c", "cd /var/www/discourse && bin/bundle exec rails server && bin/bundle exec sidekiq -q critical,low,default -d -l log/sidekiq.log && nginx"]
  imagePullPolicy: Always
  ports:
    - containerPort: 80
  resources:
    limits:
      memory: "2Gi"

Then the ENV vars (postgres, redis, smtp, etc.) and volume mounts.
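
For reference, that part of the deployment spec might look something like this (hostnames, secret names, and mount paths below are placeholders, not the real values):

```yaml
  env:
    - name: DISCOURSE_DB_HOST
      value: "postgres.example.internal"
    - name: DISCOURSE_REDIS_HOST
      value: "redis.default.svc.cluster.local"
  volumeMounts:
    - name: shared-data
      mountPath: /shared
```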

However, according to the logs, the Puma server (tcp/3000) dies silently after daemonizing.
Fix (from containers/app.yml):
sed -i 's#/home/discourse/discourse#/var/www/discourse#' config/puma.rb

The site then loads, but none of the assets (CSS, JS, etc.) are served. Fix:
sed -i 's/GlobalSetting.serve_static_assets/true/' config/environments/production.rb

So basically I ended up with this additional section in containers/app.yml:

  - exec:
      cd: /var/www/discourse
      cmd:
        - sed -i 's#/home/discourse/discourse#/var/www/discourse#' config/puma.rb
        - sed -i 's/GlobalSetting.serve_static_assets/true/' config/environments/production.rb
        - bash -c "touch -a /shared/log/rails/{sidekiq,puma.err,puma}.log"
        - bash -c "ln -s /shared/log/rails/{sidekiq,puma.err,puma}.log log/"

Additionally, because SSL termination happens outside the container, actions like trigger/delete/retry at https://discourse/sidekiq return 403 Forbidden errors, and puma.err.log complains about HttpOrigin.
I fixed that by adding:

    - sed -i 's/default \$scheme;/default https;/' /etc/nginx/conf.d/discourse.conf

and rebuilding.

Latest build as of today: v1.9.0.beta4 +61


We usually use NGINX+Unicorn as our minimal deployment unit and boot via runit. This should run fine with minimal hacking. Is there a reason you're trying to decompose this further?

Also, I would definitely split Redis out in this kind of setup. Mixing it with the app makes it very hard to scale.


Hi Sam,

Thanks for replying. No, I have no particular reason to decompose it, except that Kubernetes expects the entrypoint process not to exit; it monitors that process and restarts the container if it dies. Ideally I'd run supervisord to spawn and monitor the main processes.
So you're saying I can just supply /sbin/boot as the command for the container to run?
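If that works, the container spec would collapse to something like this (a sketch, untested; the image name is the placeholder from earlier):

```yaml
- image: myorg/discourse:latest
  name: discourse
  command: ["/sbin/boot"]
```

Since /sbin/boot stays in the foreground and runit supervises the services underneath it, that would also satisfy Kubernetes' expectation of a non-exiting entrypoint.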

Redis runs in Kubernetes and is only accessible as a service. Because I can't connect to it during the manual bootstrap (I'm not yet inside that kube cluster), the bootstrap fails unless I give it a valid Redis server. So I temporarily bootstrap with Redis included, then override the Redis env vars when running the container in Kubernetes. A mere workaround.

Also, is there a reason against linking $home/plugins to /shared/plugins?