Running psiTurk on Heroku
Heroku is a cloud platform for running applications. You can run psiTurk on Heroku by preparing a git repository and pushing it to Heroku, which will deploy and automatically run the code for you.
The benefits of Heroku include the following:
- It’s somewhat easier to manage than Amazon Web Services EC2 for the tech-wary (no need for security groups, no need to ssh in).
- You can set up a free PostgreSQL server, which is highly recommended over the default SQLite database that psiTurk uses. A database server is required on Heroku because files, including participants.db, are ephemeral: without one, data would be lost every time the app spins down.
- You get free SSL for hosting your own ad.
- It’s scalable.
- You get a Heroku buffering server in front of your psiTurk gunicorn instance, which helps with performance a little bit.
One downside with Heroku is that it can get expensive if you need any kind of horsepower beyond 512MB memory and one node.
What follows is a step-by-step tutorial for setting up a psiTurk example experiment on Heroku (both the experiment itself and ad) with a PostgreSQL database for collecting data.
All commands listed in this tutorial are meant to be typed into your terminal application.
Go to the Heroku website and create a new account if you don’t already have one.
If you don’t already have a psiturk experiment:
Create a psiTurk example at a desired location
Navigate into your newly created psiTurk example folder:
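The two steps above might look like the following. This is a sketch that assumes psiTurk is already installed and on your PATH, and that the psiturk-setup-example script creates a folder named psiturk-example in the current directory.

```shell
# Create the example experiment in the current directory
# (assumes psiTurk is installed and on your PATH).
psiturk-setup-example

# Move into the folder the script created
# (assumed to be named psiturk-example).
cd psiturk-example
```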
If you are starting from an already-existing psiturk project:
Navigate to your project root directory.
If your experiment is not already in a git repository, initialize a Git repository in the root directory of your psiTurk project (your current working directory):
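As a minimal sketch, assuming git is installed and your shell is already in the project root:

```shell
# Initialize an empty Git repository in the current directory.
git init

# Confirm that the repository metadata folder now exists.
test -d .git && echo "git repository initialized"
```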
Log in to Heroku, entering your Heroku credentials when prompted for them:
Create a new app on Heroku:
Running this command will add a git remote to your .git/config file, so that any heroku commands run from your project folder will be run against your newly created Heroku app.
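The login and app-creation steps can be sketched as follows, assuming the Heroku CLI is installed. Called with no arguments, heroku create lets Heroku pick an app name for you. The last command is only an optional check that the remote was added.

```shell
# Authenticate the Heroku CLI (prompts for your credentials).
heroku login

# Create a new app. Run this from the root of your psiTurk project
# so that the git remote is added to this repository.
heroku create

# Optional check: a remote named "heroku" should now be listed.
git remote -v
```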
Run the following psiturk shell command:
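In psiTurk 3 this step is believed to correspond to the psiturk-heroku-config script. Treat the exact command name as an assumption and check it against your installed psiTurk version if the command is not found.

```shell
# Assumed command name (psiTurk 3). Run it from the root of your experiment,
# after `heroku create`, so that `heroku config:set` targets the right app.
psiturk-heroku-config
```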
Running this command copies all files from psiturk’s heroku_files folder into your experiment’s root directory. These are needed for your experiment to run on Heroku.
This command also runs heroku config:set ON_CLOUD=1 in your shell on your behalf. This sets an environment variable called ON_CLOUD to the value 1 in your heroku app’s environment. Setting ON_CLOUD=1 in your environment tells psiturk to use some sensible defaults for several config settings.
Heads up! The sample config.txt file generated by psiturk 3 shows defaults in your config.txt commented out (prepended with a ;). Cloud defaults will override any defaults that are commented out in your config.txt.
But if those settings are set (uncommented) in your config.txt, then your config.txt values will override the cloud defaults. To remedy this, you will need to either:
- change them in your config.txt or re-comment them out, or
- set environment variables on heroku for the corresponding cloud defaults, which take precedence over your config.txt.
For the latter, any of the config settings can be overridden in the heroku environment by using heroku config:set. For example, to override a config.txt threads setting on heroku, one could run the following:
heroku config:set PSITURK_THREADS=1
Set a database that your heroku app will use.
To get a free heroku-hosted postgresql database:
Create a Postgres database on the newly created Heroku app:
heroku addons:create heroku-postgresql
This will provision a psiturk-compatible postgresql database, and set an environment variable on your app called DATABASE_URL that points to your database.
To see the DATABASE_URL given to you by heroku for this newly-provisioned postgresql database, you can run the following:
This URL includes your username and password. Anyone who has access to the database_url can connect to your database and has access to the data stored in it!
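To see why the URL is sensitive, the sketch below splits a made-up DATABASE_URL into its parts using plain shell parameter expansion. The credentials, host, and database name are placeholders. On a real app the value would come from running heroku config:get DATABASE_URL.

```shell
# Placeholder value for illustration only. On a real app, obtain it with:
#   heroku config:get DATABASE_URL
DATABASE_URL='postgres://exampleuser:examplepass@db.example.com:5432/exampledb'

rest="${DATABASE_URL#*://}"   # drop the scheme (postgres://)
creds="${rest%%@*}"           # keep the part before the @ (user:password)
db_user="${creds%%:*}"        # text before the first colon
db_pass="${creds#*:}"         # text after the first colon

# Both credentials are readable by anyone who can see the URL.
echo "user=$db_user password=$db_pass"
```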
If you already have a publicly-accessible database hosted elsewhere:
Then you can do one of the following:
- list its url as your database_url in your config.txt and be sure that DATABASE_URL is not set in your heroku environment (check heroku config), or
- set its url in your heroku environment (heroku config:set DATABASE_URL=your-url)
psiTurk prefers environment variables over all other config file settings. Most environment settings need to prepend PSITURK_ to the corresponding config setting name, with the exception of two environment variables, DATABASE_URL among them. These two, if present in the environment, are respected even if not prepended by PSITURK_. This means that if DATABASE_URL is set in your heroku environment, it will override any database_url setting you have in config.txt.
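The naming rule can be illustrated with a small hypothetical helper. It is not part of psiTurk. It assumes the environment variable is the upper-cased setting name with a PSITURK_ prefix, as in the threads example, and treats DATABASE_URL as one of the unprefixed exceptions.

```shell
# Hypothetical helper (not part of psiTurk): print the environment-variable
# name that would override a given config.txt setting.
# Assumption: the name is the upper-cased setting with a PSITURK_ prefix.
psiturk_env_name() {
  case "$1" in
    DATABASE_URL|database_url)
      # Respected as-is, without the PSITURK_ prefix.
      printf 'DATABASE_URL\n' ;;
    *)
      printf 'PSITURK_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')" ;;
  esac
}

psiturk_env_name threads
psiturk_env_name database_url
```

The printed name can then be used with heroku config:set, for example heroku config:set PSITURK_THREADS=1.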
Set your AWS credentials as environment variables within your heroku app, replacing <XYZ> with your access and secret keys for Amazon Web Services:
heroku config:set AWS_ACCESS_KEY_ID=<XYZ>
heroku config:set AWS_SECRET_ACCESS_KEY=<XYZ>
Stage all the files in your psiTurk example to your Git repository:
git add .
Commit all the staged files to your Git repository:
git commit -m "Initial commit"
Push the code to your Heroku git remote. This triggers a build process on Heroku, which in turn runs the command specified in Procfile and automatically launches your psiTurk server on the Heroku platform:
git push heroku master
Any time you want to push changes to your heroku-hosted psiturk experiment, you will need to repeat the above flow of git add, git commit, and git push heroku master.
You can run through your heroku-hosted experiment by visiting your heroku app’s url.
To get your app’s url, run heroku domains from the root of your local psiturk app, and visit your app’s reported domain url in a browser. From that url, you can conveniently obtain a debugging url by clicking “Begin by viewing the ad.”
To download data from your heroku app using a locally-run psiturk, set your local psiTurk app to use the same database that your experiment uses when it runs on heroku.
To do so, get the DATABASE_URL of your heroku psiturk instance by running heroku config, and set the database url in any of the following local places:
- the database_url setting in your config.txt,
- a .env file in the root of your psiturk app, or
- your own local environment.
If you opt to set your database url in your config.txt file, then be cautious about sharing your experiment code – the url contains your database username and password!
Once your local psiturk app uses the same database as your heroku app, then you can run the following to download your experiment data, regardless of whether you have run through your experiment hosted locally or on Heroku:
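A sketch of the download step, assuming psiTurk 3, where shell commands such as download_datafiles can be passed directly to the psiturk executable. In older versions the same command may instead need to be entered at the psiTurk shell prompt.

```shell
# Run from the root of your local psiTurk project, with the database url
# pointing at the same database your Heroku app uses.
psiturk download_datafiles

# List the CSV files now present in the current directory.
ls -1 *.csv
```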
This should generate three datafiles for you in your local directory: trialdata.csv, questiondata.csv, and eventdata.csv.
Congratulations, you’ve now gathered data from an experiment running on Heroku!
psiTurk will look for a file called .env in the root of your psiturk app and read in any KEY=VALUE settings in there as environment variables for your psiturk app. Therefore, one could put the following content in a file called .env to set the database_url:
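As an illustration, the snippet below writes such a file from the shell. The URL is a placeholder with made-up credentials. Substitute the DATABASE_URL value that heroku config reports for your app.

```shell
# Write a .env file in the root of the psiTurk project.
# The URL below is a placeholder; replace it with your app's DATABASE_URL.
cat > .env <<'EOF'
DATABASE_URL=postgres://exampleuser:examplepass@db.example.com:5432/exampledb
EOF

# Show the resulting file.
cat .env
```

Because this file holds your database username and password, the same caution about sharing your experiment code applies to it.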
To post a hit to MTurk that uses your heroku app, set your local psiTurk config.txt’s ad_url settings to point to your heroku app. The easiest way to do this is to set ad_url_domain in your config.txt’s [HIT Configuration] section to equal your heroku domain name.
For example, if running heroku domains reported that your heroku domain was example-app.herokuapp.com, then you would simply set ad_url_domain = example-app.herokuapp.com in your config.txt’s [HIT Configuration] section. With that, HITs posted to mturk should correctly point to your heroku app.
See the Hit Configuration – Ad Url for more information.
From your local psiTurk session, you can now create and modify HITs. When these are accessed by Amazon Mechanical Turk workers, the workers will be directed to the psiTurk session running on your Heroku app. This means that it is never necessary to launch psiTurk and run the server on command from anywhere to run an experiment on Heroku. The server is automatically running, accessible via your Heroku domain url. (Of course, if you want to debug locally, you can still run a local server.)
If you stay on the “Free” Heroku tier, your app will go to “sleep” after a period of inactivity. If your app has gone to sleep, it will take a few seconds before it responds if you visit its url. It should respond quickly once it “awakens”. Consider upgrading to a “Hobby” heroku dyno to prevent your app from going to sleep.
If you want to run commands against your postgresql db, you can run heroku pg:psql to connect, from where you can issue postgres commands. You can also connect directly to your heroku postgres db by installing and running postgresql on your local machine, and passing the DATABASE_URL that your heroku app uses as a command-line option.
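The two ways of connecting can be sketched as follows. Run one or the other, since each opens an interactive session. The second form assumes a locally installed psql client and relies on psql accepting a connection URL as its database argument.

```shell
# Option 1: open a psql session through the Heroku CLI.
heroku pg:psql

# Option 2: connect with a locally installed psql client, passing the
# connection URL that Heroku reports for your app.
psql "$(heroku config:get DATABASE_URL)"
```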