You can go multiple routes to setup and run Paperless:
- Use the easy install docker script
- Pull the image from Docker Hub
- Build the Docker image yourself
- Install Paperless directly on your system manually (bare metal)
The Docker routes are quick & easy. These are the recommended routes. This configures all the stuff from the above automatically so that it just works and uses sensible defaults for all configuration options. Here you find a cheat-sheet for docker beginners: CLI Basics
The bare metal route is complicated to setup but makes it easier should you want to contribute some code back. You need to configure and run the above mentioned components yourself.
Paperless provides an interactive installation script. This script will ask you for a couple configuration options, download and create the necessary configuration files, pull the docker image, start paperless and create your user account. This script essentially performs all the steps described in Docker setup automatically.
-
Make sure that Docker and Docker Compose are installed.
!!! tip
See the Docker installation instructions at https://docs.docker.com/engine/install/
-
Download and run the installation script:
$ bash -c "$(curl --location --silent --show-error https://raw.githubusercontent.com/paperless-ngx/paperless-ngx/main/install-paperless-ngx.sh)"
!!! note
macOS users will need to install e.g. [gnu-sed](https://formulae.brew.sh/formula/gnu-sed) with support for running as `sed`.
-
Login with your user and create a folder in your home-directory to have a place for your configuration files and consumption directory.
$ mkdir -v ~/paperless-ngx
-
Go to the /docker/compose directory on the project page and download one of the
docker-compose.*.yml
files, depending on which database backend you want to use. Rename this file todocker-compose.yml
. If you want to enable optional support for Office documents, download a file with-tika
in the file name. Download thedocker-compose.env
file and the.env
file as well and store them in the same directory.!!! tip
For new installations, it is recommended to use PostgreSQL as the database backend.
-
Install Docker and Docker Compose.
!!! warning
If you want to use the included `docker-compose.*.yml` file, you need to have at least Docker version **17.09.0** and Docker Compose version **v2**. To check do: `docker compose version` or `docker -v` See the [Docker installation guide](https://docs.docker.com/engine/install/) on how to install the current version of Docker for your operating system or Linux distribution of choice. To get the latest version of Docker Compose, follow the [Docker Compose installation guide](https://docs.docker.com/compose/install/linux/) if your package repository doesn't include it.
-
Modify
docker-compose.yml
to your preferences. You may want to change the path to the consumption directory. Find the line that specifies where to mount the consumption directory:- ./consume:/usr/src/paperless/consume
Replace the part BEFORE the colon with a local directory of your choice:
- /home/jonaswinkler/paperless-inbox:/usr/src/paperless/consume
Don't change the part after the colon or paperless won't find your documents.
You may also need to change the default port that the webserver will use from the default (8000):
ports: - 8000:8000
Replace the part BEFORE the colon with a port of your choice:
ports: - 8010:8000
Don't change the part after the colon or edit other lines that refer to port 8000. Modifying the part before the colon will map requests on another port to the webserver running on the default port.
Rootless
!!! warning
It is currently not possible to run the container rootless if additional languages are specified via `PAPERLESS_OCR_LANGUAGES`.
If you want to run Paperless as a rootless container, you will need to do the following in your
docker-compose.yml
:- set the
user
running the container to map to thepaperless
user in the container. This value (user_id
below), should be the same id thatUSERMAP_UID
andUSERMAP_GID
are set to in the next step. SeeUSERMAP_UID
andUSERMAP_GID
here.
Your entry for Paperless should contain something like:
webserver: image: ghcr.io/paperless-ngx/paperless-ngx:latest user: <user_id>
- set the
-
Modify
docker-compose.env
, following the comments in the file. The most important change is to setUSERMAP_UID
andUSERMAP_GID
to the uid and gid of your user on the host system. Useid -u
andid -g
to get these.This ensures that both the docker container and you on the host machine have write access to the consumption directory. If your UID and GID on the host system is 1000 (the default for the first normal user on most systems), it will work out of the box without any modifications.
id "username"
to check.!!! note
You can copy any setting from the file `paperless.conf.example` and paste it here. Have a look at [configuration](configuration.md) to see what's available.
!!! note
You can utilize Docker secrets for configuration settings by appending `_FILE` to configuration values. For example [`PAPERLESS_DBUSER`](configuration.md#PAPERLESS_DBUSER) can be set using `PAPERLESS_DBUSER_FILE=/var/run/secrets/password.txt`.
!!! warning
Some file systems such as NFS network shares don't support file system notifications with `inotify`. When storing the consumption directory on such a file system, paperless will not pick up new files with the default configuration. You will need to use [`PAPERLESS_CONSUMER_POLLING`](configuration.md#PAPERLESS_CONSUMER_POLLING), which will disable inotify. See [here](configuration.md#polling).
-
Run
docker compose pull
. This will pull the image. -
To be able to login, you will need a super user. To create it, execute the following command:
$ docker compose run --rm webserver createsuperuser
or using docker exec from within the container:
$ python3 manage.py createsuperuser
This will prompt you to set a username, an optional e-mail address and finally a password (at least 8 characters).
-
Run
docker compose up -d
. This will create and start the necessary containers. -
The default
docker-compose.yml
exports the webserver on your local port8000. If you did not change this, you should now be able to visit your Paperless instance at
http://127.0.0.1:8000
or your servers IP-Address:8000. Use the login credentials you have created with the previous step.
-
Clone the entire repository of paperless:
git clone https://github.com/paperless-ngx/paperless-ngx
The main branch always reflects the latest stable version.
-
Copy one of the
docker/compose/docker-compose.*.yml
todocker-compose.yml
in the root folder, depending on which database backend you want to use. Copydocker-compose.env
into the project root as well. -
In the
docker-compose.yml
file, find the line that instructs Docker Compose to pull the paperless image from Docker Hub:webserver: image: ghcr.io/paperless-ngx/paperless-ngx:latest
and replace it with a line that instructs Docker Compose to build the image from the current working directory instead:
webserver: build: context: .
-
Follow steps 3 to 8 of Docker Setup. When asked to run
docker compose pull
to pull the image, do$ docker compose build
instead to build the image.
Paperless runs on linux only. The following procedure has been tested on a minimal installation of Debian/Buster, which is the current stable release at the time of writing. Windows is not and will never be supported.
-
Install dependencies. Paperless requires the following packages.
python3
- 3.9 - 3.11 are supportedpython3-pip
python3-dev
default-libmysqlclient-dev
for MariaDBpkg-config
for mysqlclient (python dependency)fonts-liberation
for generating thumbnails for plain text filesimagemagick
>= 6 for PDF conversiongnupg
for handling encrypted documentslibpq-dev
for PostgreSQLlibmagic-dev
for mime type detectionmariadb-client
for MariaDB compile timemime-support
for mime type detectionlibzbar0
for barcode detectionpoppler-utils
for barcode detection
Use this list for your preferred package management:
python3 python3-pip python3-dev imagemagick fonts-liberation gnupg libpq-dev default-libmysqlclient-dev pkg-config libmagic-dev mime-support libzbar0 poppler-utils
These dependencies are required for OCRmyPDF, which is used for text recognition.
unpaper
ghostscript
icc-profiles-free
qpdf
liblept5
libxml2
pngquant
(suggested for certain PDF image optimizations)zlib1g
tesseract-ocr
>= 4.0.0 for OCRtesseract-ocr
language packs (tesseract-ocr-eng
,tesseract-ocr-deu
, etc)
Use this list for your preferred package management:
unpaper ghostscript icc-profiles-free qpdf liblept5 libxml2 pngquant zlib1g tesseract-ocr
On Raspberry Pi, these libraries are required as well:
libatlas-base-dev
libxslt1-dev
You will also need
build-essential
,python3-setuptools
andpython3-wheel
for installing some of the python dependencies. -
Install
redis
>= 6.0 and configure it to start automatically. -
Optional. Install
postgresql
and configure a database, user and password for paperless. If you do not wish to use PostgreSQL, MariaDB and SQLite are available as well.!!! note
On bare-metal installations using SQLite, ensure the [JSON1 extension](https://code.djangoproject.com/wiki/JSON1Extension) is enabled. This is usually the case, but not always.
-
Create a system user with a new home folder under which you wish to run paperless.
adduser paperless --system --home /opt/paperless --group
-
Get the release archive from https://github.com/paperless-ngx/paperless-ngx/releases for example with
curl -O -L https://github.com/paperless-ngx/paperless-ngx/releases/download/v1.10.2/paperless-ngx-v1.10.2.tar.xz
Extract the archive with
tar -xf paperless-ngx-v1.10.2.tar.xz
and copy the contents to the home folder of the user you created before (
/opt/paperless
).Optional: If you cloned the git repo, you will have to compile the frontend yourself, see here and use the
build
step, notserve
. -
Configure paperless. See configuration for details. Edit the included
paperless.conf
and adjust the settings to your needs. Required settings for getting paperless running are:PAPERLESS_REDIS
should point to your redis server, such as redis://localhost:6379.PAPERLESS_DBENGINE
optional, and should be one ofpostgres
,mariadb
, orsqlite
PAPERLESS_DBHOST
should be the hostname on which your PostgreSQL server is running. Do not configure this to use SQLite instead. Also configure port, database name, user and password as necessary.PAPERLESS_CONSUMPTION_DIR
should point to a folder which paperless should watch for documents. You might want to have this somewhere else. Likewise,PAPERLESS_DATA_DIR
andPAPERLESS_MEDIA_ROOT
define where paperless stores its data. If you like, you can point both to the same directory.PAPERLESS_SECRET_KEY
should be a random sequence of characters. It's used for authentication. Failure to do so allows third parties to forge authentication credentials.PAPERLESS_URL
if you are behind a reverse proxy. This should point to your domain. Please see configuration for more information.
Many more adjustments can be made to paperless, especially the OCR part. The following options are recommended for everyone:
- Set
PAPERLESS_OCR_LANGUAGE
to the language most of your documents are written in. - Set
PAPERLESS_TIME_ZONE
to your local time zone.
!!! warning
Ensure your Redis instance [is secured](https://redis.io/docs/getting-started/#securing-redis).
-
Create the following directories if they are missing:
/opt/paperless/media
/opt/paperless/data
/opt/paperless/consume
Adjust as necessary if you configured different folders. Ensure that the paperless user has write permissions for every one of these folders with
ls -l -d /opt/paperless/media
If needed, change the owner with
sudo chown paperless:paperless /opt/paperless/media sudo chown paperless:paperless /opt/paperless/data sudo chown paperless:paperless /opt/paperless/consume
-
Install python requirements from the
requirements.txt
file. It is up to you if you wish to use a virtual environment or not. First you should update your pip, so it gets the actual packages.sudo -Hu paperless pip3 install -r requirements.txt
This will install all python dependencies in the home directory of the new paperless user.
-
Go to
/opt/paperless/src
, and execute the following commands:# This creates the database schema. sudo -Hu paperless python3 manage.py migrate # This creates your first paperless user sudo -Hu paperless python3 manage.py createsuperuser
-
Optional: Test that paperless is working by executing
# Manually starts the webserver sudo -Hu paperless python3 manage.py runserver
and pointing your browser to http://localhost:8000 if accessing from the same devices on which paperless is installed. If accessing from another machine, set up systemd services. You may need to set
PAPERLESS_DEBUG=true
in order for the development server to work normally in your browser.!!! warning
This is a development server which should not be used in production. It is not audited for security and performance is inferior to production ready web servers.
!!! tip
This will not start the consumer. Paperless does this in a separate process.
-
Setup systemd services to run paperless automatically. You may use the service definition files included in the
scripts
folder as a starting point.Paperless needs the
webserver
script to run the webserver, theconsumer
script to watch the input folder,taskqueue
for the background workers used to handle things like document consumption and thescheduler
script to run tasks such as email checking at certain times .!!! note
The `socket` script enables `gunicorn` to run on port 80 without root privileges. For this you need to uncomment the `Require=paperless-webserver.socket` in the `webserver` script and configure `gunicorn` to listen on port 80 (see `paperless/gunicorn.conf.py`).
You may need to adjust the path to the
gunicorn
executable. This will be installed as part of the python dependencies, and is either located in thebin
folder of your virtual environment, or in~/.local/bin/
if no virtual environment is used.These services rely on redis and optionally the database server, but don't need to be started in any particular order. The example files depend on redis being started. If you use a database server, you should add additional dependencies.
!!! warning
The included scripts run a `gunicorn` standalone server, which is fine for running paperless. It does support SSL, however, the documentation of GUnicorn states that you should use a proxy server in front of gunicorn instead. For instructions on how to use nginx for that, [see the wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Using-a-Reverse-Proxy-with-Paperless-ngx#nginx).
!!! warning
If celery won't start (check with `sudo systemctl status paperless-task-queue.service` for paperless-task-queue.service and paperless-scheduler.service ) you need to change the path in the files. Example: `ExecStart=/opt/paperless/.local/bin/celery --app paperless worker --loglevel INFO`
-
Optional: Install a samba server and make the consumption folder available as a network share.
-
Configure ImageMagick to allow processing of PDF documents. Most distributions have this disabled by default, since PDF documents can contain malware. If you don't do this, paperless will fall back to ghostscript for certain steps such as thumbnail generation.
Edit
/etc/ImageMagick-6/policy.xml
and adjust<policy domain="coder" rights="none" pattern="PDF" />
to
<policy domain="coder" rights="read|write" pattern="PDF" />
-
Optional: Install the jbig2enc encoder. This will reduce the size of generated PDF documents. You'll most likely need to compile this by yourself, because this software has been patented until around 2017 and binary packages are not available for most distributions.
-
Optional: If using the NLTK machine learning processing (see
PAPERLESS_ENABLE_NLTK
for details), download the NLTK data for the Snowball Stemmer, Stopwords and Punkt tokenizer to yourPAPERLESS_DATA_DIR/nltk
. Refer to the NLTK instructions for details on how to download the data.
Migration is possible both from Paperless-ng or directly from the 'original' Paperless.
Paperless-ngx is meant to be a drop-in replacement for Paperless-ng and
thus upgrading should be trivial for most users, especially when using
docker. However, as with any major change, it is recommended to take a
full backup first. Once you are ready, simply change the docker image to
point to the new source. E.g. if using Docker Compose, edit
docker-compose.yml
and change:
image: jonaswinkler/paperless-ng:latest
to
image: ghcr.io/paperless-ngx/paperless-ngx:latest
and then run docker compose up -d
which will pull the new image
recreate the container. That's it!
Users who installed with the bare-metal route should also update their
Git clone to point to https://github.com/paperless-ngx/paperless-ngx
,
e.g. using the command
git remote set-url origin https://github.com/paperless-ngx/paperless-ngx
and then pull the latest version.
At its core, paperless-ngx is still paperless and fully compatible. However, some things have changed under the hood, so you need to adapt your setup depending on how you installed paperless.
This setup describes how to update an existing paperless Docker installation. The important things to keep in mind are as follows:
- Read the changelog and take note of breaking changes.
- You should decide if you want to stick with SQLite or want to migrate your database to PostgreSQL. See documentation for details on how to move your data from SQLite to PostgreSQL. Both work fine with paperless. However, if you already have a database server running for other services, you might as well use it for paperless as well.
- The task scheduler of paperless, which is used to execute periodic tasks such as email checking and maintenance, requires a redis message broker instance. The Docker Compose route takes care of that.
- The layout of the folder structure for your documents and data remains the same, so you can just plug your old docker volumes into paperless-ngx and expect it to find everything where it should be.
Migration to paperless-ngx is then performed in a few simple steps:
-
Stop paperless.
$ cd /path/to/current/paperless $ docker compose down
-
Do a backup for two purposes: If something goes wrong, you still have your data. Second, if you don't like paperless-ngx, you can switch back to paperless.
-
Download the latest release of paperless-ngx. You can either go with the Docker Compose files from here or clone the repository to build the image yourself (see above). You can either replace your current paperless folder or put paperless-ngx in a different location.
!!! warning
Paperless-ngx includes a `.env` file. This will set the project name for docker compose to `paperless`, which will also define the name of the volumes by paperless-ngx. However, if you experience that paperless-ngx is not using your old paperless volumes, verify the names of your volumes with ``` shell-session $ docker volume ls | grep _data ``` and adjust the project name in the `.env` file so that it matches the name of the volumes before the `_data` part.
-
Download the
docker-compose.sqlite.yml
file todocker-compose.yml
. If you want to switch to PostgreSQL, do that after you migrated your existing SQLite database. -
Adjust
docker-compose.yml
anddocker-compose.env
to your needs. See Docker setup details on which edits are advised. -
In order to find your existing documents with the new search feature, you need to invoke a one-time operation that will create the search index:
$ docker compose run --rm webserver document_index reindex
This will migrate your database and create the search index. After that, paperless will take care of maintaining the index by itself.
-
Start paperless-ngx.
$ docker compose up -d
This will run paperless in the background and automatically start it on system boot.
-
Paperless installed a permanent redirect to
admin/
in your browser. This redirect is still in place and prevents access to the new UI. Clear your browsing cache in order to fix this. -
Optionally, follow the instructions below to migrate your existing data to PostgreSQL.
As with any upgrades and large changes, it is highly recommended to create a backup before starting. This assumes the image was running using Docker Compose, but the instructions are translatable to Docker commands as well.
- Stop and remove the paperless container
- If using an external database, stop the container
- Update Redis configuration
a) If
REDIS_URL
is already set, change it toPAPERLESS_REDIS
and continue to step 4. b) Otherwise, in thedocker-compose.yml
add a new service for Redis, following the example compose files c) Set the environment variablePAPERLESS_REDIS
so it points to the new Redis container - Update user mapping
a) If set, change the environment variable
PUID
toUSERMAP_UID
b) If set, change the environment variablePGID
toUSERMAP_GID
- Update configuration paths
a) Set the environment variable
PAPERLESS_DATA_DIR
to/config
- Update media paths
a) Set the environment variable
PAPERLESS_MEDIA_ROOT
to/data/media
- Update timezone
a) Set the environment variable
PAPERLESS_TIME_ZONE
to the same value asTZ
- Modify the
image:
to point toghcr.io/paperless-ngx/paperless-ngx:latest
or a specific version if preferred. - Start the containers as before, using
docker compose
.
Moving your data from SQLite to PostgreSQL or MySQL/MariaDB is done via executing a series of django management commands as below. The commands below use PostgreSQL, but are applicable to MySQL/MariaDB with the
!!! warning
Make sure that your SQLite database is migrated to the latest version.
Starting paperless will make sure that this is the case. If your try to
load data from an old database schema in SQLite into a newer database
schema in PostgreSQL, you will run into trouble.
!!! warning
On some database fields, PostgreSQL enforces predefined limits on
maximum length, whereas SQLite does not. The fields in question are the
title of documents (128 characters), names of document types, tags and
correspondents (128 characters), and filenames (1024 characters). If you
have data in these fields that surpasses these limits, migration to
PostgreSQL is not possible and will fail with an error.
!!! warning
MySQL is case insensitive by default, treating values like "Name" and
"NAME" as identical. See [MySQL caveats](advanced_usage.md#mysql-caveats) for details.
!!! warning
MySQL also enforces limits on maximum lengths, but does so differently than
PostgreSQL. It may not be possible to migrate to MySQL due to this.
!!! warning
Using mariadb version 10.4+ is recommended. Using the `utf8mb3` character set on
an older system may fix issues that can arise while setting up Paperless-ngx but
`utf8mb3` can cause issues with consumption (where `utf8mb4` does not).
-
Stop paperless, if it is running.
-
Tell paperless to use PostgreSQL:
a) With docker, copy the provided
docker-compose.postgres.yml
file todocker-compose.yml
. Remember to adjust the consumption directory, if necessary. b) Without docker, configure the database in yourpaperless.conf
file. See configuration for details. -
Open a shell and initialize the database:
a) With docker, run the following command to open a shell within the paperless container:
``` shell-session $ cd /path/to/paperless $ docker compose run --rm webserver /bin/bash ``` This will launch the container and initialize the PostgreSQL database.
b) Without docker, remember to activate any virtual environment, switch to the
src
directory and create the database schema:``` shell-session $ cd /path/to/paperless/src $ python3 manage.py migrate ``` This will not copy any data yet.
-
Dump your data from SQLite:
$ python3 manage.py dumpdata --database=sqlite --exclude=contenttypes --exclude=auth.Permission > data.json
-
Load your data into PostgreSQL:
$ python3 manage.py loaddata data.json
-
If operating inside Docker, you may exit the shell now.
$ exit
-
Start paperless.
Lets say you migrated to Paperless-ngx and used it for a while, but decided that you don't like it and want to move back (If you do, send me a mail about what part you didn't like!), you can totally do that with a few simple steps.
Paperless-ngx modified the database schema slightly, however, these changes can be reverted while keeping your current data, so that your current data will be compatible with original Paperless. Thumbnails were also changed from PNG to WEBP format and will need to be re-generated.
Execute this:
$ cd /path/to/paperless
$ docker compose run --rm webserver migrate documents 0023
Or without docker:
$ cd /path/to/paperless/src
$ python3 manage.py migrate documents 0023
After regenerating thumbnails, you'll need to clear your cookies (Paperless-ngx comes with updated dependencies that do cookie-processing differently) and probably your cache as well.
Paperless runs on Raspberry Pi. However, some things are rather slow on the Pi and configuring some options in paperless can help improve performance immensely:
- Stick with SQLite to save some resources.
- Consider setting
PAPERLESS_OCR_PAGES
to 1, so that paperless will only OCR the first page of your documents. In most cases, this page contains enough information to be able to find it. PAPERLESS_TASK_WORKERS
andPAPERLESS_THREADS_PER_WORKER
are configured to use all cores. The Raspberry Pi models 3 and up have 4 cores, meaning that paperless will use 2 workers and 2 threads per worker. This may result in sluggish response times during consumption, so you might want to lower these settings (example: 2 workers and 1 thread to always have some computing power left for other tasks).- Keep
PAPERLESS_OCR_MODE
at its default valueskip
and consider OCR'ing your documents before feeding them into paperless. Some scanners are able to do this! - Set
PAPERLESS_OCR_SKIP_ARCHIVE_FILE
towith_text
to skip archive file generation for already ocr'ed documents, oralways
to skip it for all documents. - If you want to perform OCR on the device, consider using
PAPERLESS_OCR_CLEAN=none
. This will speed up OCR times and use less memory at the expense of slightly worse OCR results. - If using docker, consider setting
PAPERLESS_WEBSERVER_WORKERS
to 1. This will save some memory. - Consider setting
PAPERLESS_ENABLE_NLTK
to false, to disable the more advanced language processing, which can take more memory and processing time.
For details, refer to configuration.
!!! note
Updating the
[automatic matching algorithm](advanced_usage.md#automatic-matching) takes quite a bit of time. However, the update mechanism
checks if your data has changed before doing the heavy lifting. If you
experience the algorithm taking too much cpu time, consider changing the
schedule in the admin interface to daily. You can also manually invoke
the task by changing the date and time of the next run to today/now.
The actual matching of the algorithm is fast and works on Raspberry Pi
as well as on any other device.
Please see the wiki for user-maintained documentation of using nginx with Paperless-ngx.
Please see the wiki for user-maintained documentation of how to configure security tools like Fail2ban with Paperless-ngx.