Tracking a box office draft in a Google Sheet
This repository is a tool used to track a box office draft in a Google Sheet. It produces a dashboard in the Google Sheet that shows the box office data for the year.
Here is an example of the dashboard:
In this example, I drafted with 4 friends. We did a snake draft, and we each picked 15 movies. There's a concept of "scored revenue," which is the revenue of a movie multiplied by a multiplier. The multipliers are set in the config.yml
file. You can set a multiplier for a round or a movie. In this example, the revenue of the movies that we drafted in the last round is multiplied by 5 for scoring purposes. You can also set a multiplier for a movie, which is useful if you want to ensure a movie doesn't skew the results too much.
There are 3 main sections of the dashboard:
- Standings
- On the top left, each drafter's name is listed with their scored revenue, as well as the number of movies they drafted that have been released. There's also a column to show how many picks they chose that were optimal, which is a pick that was better than any movie available at the time of that pick.
- Released Movies
- Every movie that has been drafted or is listed in
manual_adds.csv
is listed here. - The movies are sorted by scored revenue.
- On the far right column, there's a flag that shows if the movie is still updating. If no movies are still updating, then the timestamp next to the dashboard will show a message.
- Every movie that has been drafted or is listed in
- Worst Picks
- The worst picks are listed here, which are the movies that missed out on the most revenue. For example, if you picked a movie first that made $100 million, and then the next best movie made $200 million, you missed out on $100 million.
- The movies are sorted by the maximum amount of revenue they missed out on.
Throughout the year, I ensure that each movie has the correct title in the box_office_draft.csv
file. Each movie will only be picked up if it's in the top 200 on the Box Office Mojo Worldwide page.
The way I originally set this up was to scrape the box office data from the Box Office Mojo page each day and store it in an S3 bucket. Then I would read the data from the S3 bucket and load it into the Google Sheet. This is more accurate because it's possible for a smaller movie to show up in the top 200 at one point in time, but then drop out of the top 200 by the end of the year as larger movies get released. I have some logging in the code to record these movies that may have missed revenue.
If you want to do a smaller draft or don't want to set up an S3 bucket, you can choose for a draft to pull from the Box Office Mojo page each time the dashboard is updated. This is easier to set up, but it's not as accurate as the S3 method.
To run your own drafts, you will need a config.yml
file in the config
folder. This file should contain an ID for each dashboard you want to update. The top-level tag is dashboards
, and each ID must include the following keys:
name
: Name of the draft.- This will be the title of the dashboard.
description
: Description of the draft.- This isn't used for anything but is useful for you to distinguish between drafts.
year
: Release year of movies to be scraped.- This is used to determine which Box Office Mojo page to scrape. Only movies released in this year will be added to the dashboard.
update_type
: Can bescrape
orS3
.scrape
: Scrapes box office data from Box Office Mojo.S3
: Assumes box office data is already in an S3 bucket and loads it into DuckDB.
sheet_name
: Name of the Google Sheet where the dashboard will be written.
Optional keys include:
gspread_credentials_name
: Name of the gspread credentials variable in the.env
file.- Defaults to
GSPREAD_CREDENTIALS_<year>
.
- Defaults to
- Under the
bucket
key, you can set the following keys ifupdate_type
isS3
:bucket
: Name of the S3 bucket (and path, if applicable) that contains the box office data.s3_read_access_key_id_var_name
: Name of the S3 access key id variable in the.env
file.- This access_key must have read access to the bucket.
- Defaults to
S3_ACCESS_KEY_ID
.
s3_read_secret_access_key_var_name
: Name of the S3 secret access key variable in the.env
file.- This secret_key must have read access to the bucket.
- Defaults to
S3_SECRET_ACCESS_KEY
.
s3_write_access_key_id_var_name
: Name of the S3 access key id variable in the.env
file.- This access_key must have write access to the bucket.
- Defaults to
S3_ACCESS_KEY_ID
.
s3_write_secret_access_key_var_name
: Name of the S3 secret access key variable in the.env
file.- This secret_key must have write access to the bucket.
- Defaults to
S3_SECRET_ACCESS_KEY
.
For each ID in your config.yml
file, you will also need:
- A Google Sheet named
sheet_name
with a tab called "Dashboard"- This sheet will be used to display the box office data for the year.
- See below about how to set up access to this sheet.
- The sheet must have the following tabs:
Draft
- Source of truth for the box office draft.
- Columns: round, overall_pick, name, movie
round
: Round number of draftoverall_pick
: Overall pick number of draftname
: Name of the person pickingmovie
: Movie picked
Dashboard
- Dashboard that displays the box office data for the year.
- This script updates it every day.
Multipliers and Exclusions
- Multipliers and exclusions for the box office draft.
- Columns: type, value, multiplier
type
: Type of record. Can bemovie
,round
, orexclusion
.value
: Value of record. Either a round number, a movie title for a multiplier, or a movie title to exclude.multiplier
: Multiplier for the record, if applicable.
Manual Adds
- Manual adds for the box office draft. Used for movies that are not in the top 200 at the end of the year.
- Columns: title, revenue, domestic_rev, foreign_rev, release_date
title
: Title of the movierevenue
: Revenue of the moviedomestic_rev
: Domestic revenue of the movieforeign_rev
: Foreign revenue of the movierelease_date
: Release date of the movie
- The following variables in your
.env
file:MODAL_TOKEN_ID
andMODAL_TOKEN_SECRET
- Token ID and secret for the modal account.
GSPREAD_CREDENTIALS_<year>
- Credentials for the Google Sheet that data will be written to for that year. Here is how to generate these credentials and add the account to the Google Sheet.
- You can change the name of this variable in the
config.yml
file using thegspread_credentials_name
key, but it defaults toGSPREAD_CREDENTIALS_<year>
.
S3_ACCESS_KEY_ID
andS3_SECRET_ACCESS_KEY
- Only needed if
update_type
isS3
. - For security, I recommend creating 2 pairs of keys, one for read access and one for write access.
- You can change the name of these in the
config.yml
file using thes3_read_access_key_id_var_name
,s3_read_secret_access_key_var_name
,s3_write_access_key_id_var_name
, ands3_write_secret_access_key_var_name
keys, but they default toS3_ACCESS_KEY_ID
andS3_SECRET_ACCESS_KEY
.
- Only needed if
If any of your dashboards have an update_type
of S3
, the script will run an ETL process to load the data from Box Office Mojo into your bucket. If you registered a modal account, S3 bucket, and have the correct keys in your .env
file, the script will update the data in the bucket every day at 4am UTC.
Here is an example of the config/config.yml
file:
dashboards:
my_2025_draft:
name: 2025 Fantasy Box Office Standings
description: Draft with Holden, Troy, Nahiyan, and Larry for 2025
year: 2025
update_type: s3
sheet_name: 2025 Fantasy Box Office Draft
gspread_credentials_name: GSPREAD_CREDENTIALS_FRIENDS_2025
# The following are only needed if update_type is s3
bucket:
bucket: box-office-tracking
s3_write_access_key_id_var_name: S3_WRITE_ACCESS_KEY_ID_MY_2025_DRAFT
s3_write_secret_access_key_var_name: S3_WRITE_SECRET_ACCESS_KEY_MY_2025_DRAFT
s3_read_access_key_id_var_name: S3_READ_ACCESS_KEY_ID_MY_2025_DRAFT
s3_read_secret_access_key_var_name: S3_READ_SECRET_ACCESS_KEY_MY_2025_DRAFT
- Download this repo, either by cloning it or downloading the zip file.
- Set up a modal account.
- (Optional) Set up a S3 bucket to store the box office data. I like to use Digital Ocean Spaces.
- Create a google sheet with a tab called "Dashboard". Follow this guide to set up APIaccess to this sheet.
- Create a
.env
file in the root directory that follows the format above. - Create a
config/config.yml
file in the root directory that follows the format above. - Add your draft data to the
assets/<folder_name>
folder. See the section above to follow the format for these files. - If you want to run the script locally, you can do so by installing python and running
poetry install
and thenpoetry run python sync_and_update.py
.
To deploy the sync_and_update.py
script, use the provided shell script deploy_modal.sh
. This script automates the deployment process using Poetry and Modal.
- Ensure that both
poetry
andmodal
are installed and available in your system's PATH.
-
Make the Script Executable: If you haven't already, make the script executable:
chmod +x deploy_modal.sh
-
Basic Deployment: Run the following command to deploy the script:
./deploy_modal.sh
-
Add .env file to modal secrets: Go to your app on modal.com and add your
.env
file to a new secret calledbox-office-tracking-secrets
.
The S3 sync updates the data in the bucket every day at 4am UTC, and the dashboard is updated every day at 5am UTC.