add script to initialise virtualenv (apache#22971)

Co-authored-by: Jarek Potiuk <[email protected]>
joppevos and potiuk authored Apr 21, 2022
1 parent 03f7d85 commit 03bef08
Showing 6 changed files with 214 additions and 28 deletions.
4 changes: 2 additions & 2 deletions BREEZE.rst
@@ -454,7 +454,7 @@ Regular development tasks:
* Enter interactive shell in CI container when ``shell`` (or no command) is specified
* Start containerised, development-friendly airflow installation with ``breeze start-airflow`` command
* Build documentation with ``breeze build-docs`` command
* Initialize local virtualenv with ``./breeze-legacy initialize-local-virtualenv`` command
* Initialize local virtualenv with ``./scripts/tools/initialize_virtualenv.py`` command
* Build CI docker image with ``breeze build-image`` command
* Cleanup breeze with ``breeze cleanup`` command
* Run static checks with autocomplete support ``breeze static-check`` command
@@ -969,7 +969,7 @@ To use your host IDE with Breeze:

.. code-block:: bash
./breeze-legacy initialize-local-virtualenv --python 3.8
./scripts/tools/initialize_virtualenv.py
.. warning::
Make sure that you use the right Python version in this command - matching the Python version you have
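For example, a quick sanity check before running the script might look like this (an illustrative sketch; the script picks up the Python version of the interpreter in the activated virtualenv):

.. code-block:: bash

    # confirm the activated virtualenv uses the interpreter you want Airflow built against
    which python
    python --version

    # then initialize it from the root of the Airflow sources
    ./scripts/tools/initialize_virtualenv.py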
39 changes: 18 additions & 21 deletions CONTRIBUTING.rst
@@ -257,13 +257,14 @@ to make them immediately visible in the environment.

.. code-block:: bash
mkvirtualenv myenv --python=python3.7
mkvirtualenv myenv --python=python3.9
5. Initialize the created environment:

.. code-block:: bash
./breeze-legacy initialize-local-virtualenv --python 3.7
./scripts/tools/initialize_virtualenv.py
6. Open your IDE (for example, PyCharm) and select the virtualenv you created
as the project's default virtualenv in your IDE.
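Putting the virtualenv creation and initialization steps together, a minimal sketch (assuming a plain ``venv`` named ``airflow-env`` instead of ``mkvirtualenv``, and the default ``devel`` extra; run the script from the root of your Airflow sources):

.. code-block:: bash

    python3.9 -m venv ~/airflow-env           # or: mkvirtualenv airflow-env --python=python3.9
    source ~/airflow-env/bin/activate
    cd /path/to/your/airflow/sources
    ./scripts/tools/initialize_virtualenv.py devel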
@@ -886,39 +887,33 @@ There are several sets of constraints we keep:

We also have constraints with "source-providers" but they are used i

The first ones can be used as constraints file when installing Apache Airflow in a repeatable way.
The first two can be used as constraints file when installing Apache Airflow in a repeatable way.
It can be done from the sources:

.. code-block:: bash
from the PyPI package:

pip install -e . \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.7.txt"
.. code-block:: bash
pip install apache-airflow==2.2.5 \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.2.5/constraints-3.7.txt"
or from the PyPI package:
When you install airflow from sources (in editable mode) you should use "constraints-source-providers"
instead (this accounts for the case when some providers have not yet been released and have conflicting
requirements).

.. code-block:: bash
pip install apache-airflow \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.7.txt"
pip install -e . \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-source-providers-3.7.txt"
This works also with extras - for example:

.. code-block:: bash
pip install .[ssh] \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.7.txt"
As of apache-airflow 1.10.12 it is also possible to use constraints directly from GitHub using specific
tag/hash name. We tag commits working for particular release with constraints-<version> tag. So for example
fixed valid constraints 1.10.12 can be used by using ``constraints-1.10.12`` tag:

.. code-block:: bash
pip install ".[ssh]" \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-source-providers-3.7.txt"
pip install apache-airflow[ssh]==1.10.12 \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-1.10.12/constraints-3.7.txt"
There are different sets of fixed constraint files for different Python major/minor versions and you should
use the right file for the right Python version.
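As a sketch, the matching file can be picked automatically from the interpreter version (the same ``python --version`` trick used in the installation docs):

.. code-block:: bash

    PYTHON_VERSION="$(python --version | cut -d " " -f 2 | cut -d "." -f 1-2)"
    pip install -e . \
      --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-source-providers-${PYTHON_VERSION}.txt"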
@@ -940,7 +935,9 @@ if the tests are successful.
Documentation
=============

Documentation for ``apache-airflow`` package and other packages that are closely related to it ie. providers packages are in ``/docs/`` directory. For detailed information on documentation development, see: `docs/README.rst <docs/README.rst>`_
Documentation for the ``apache-airflow`` package and other packages that are closely related to it, i.e.
provider packages, is in the ``/docs/`` directory. For detailed information on documentation development,
see: `docs/README.rst <docs/README.rst>`_

Static code checks
==================
4 changes: 2 additions & 2 deletions CONTRIBUTORS_QUICK_START.rst
@@ -321,7 +321,7 @@ Installing airflow in the local virtual environment ``airflow-env`` with breeze.

.. code-block:: bash
$ ./breeze-legacy initialize-local-virtualenv --python 3.8
$ ./scripts/tools/initialize_virtualenv.py
3. Add the following line to ~/.bashrc in order to call the breeze command from anywhere.

@@ -1132,7 +1132,7 @@ Installing airflow in the local virtual environment ``airflow-env`` with breeze.
.. code-block:: bash
$ sudo apt-get install sqlite libsqlite3-dev default-libmysqlclient-dev postgresql
$ ./breeze-legacy initialize-local-virtualenv --python 3.8
$ ./scripts/tools/initialize_virtualenv.py
2. Add the following line to ~/.bashrc in order to call the breeze command from anywhere.
6 changes: 3 additions & 3 deletions dev/provider_packages/README.md
@@ -227,19 +227,19 @@ that any new added providers are not added as packages (in case they are not yet

```shell script
INSTALL_PROVIDERS_FROM_SOURCES="true" pip install -e ".[devel_all]" \
--constraint https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.6.txt
--constraint https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-source-providers-3.7.txt
```

Note that you might need to add some extra dependencies to your system to install "devel_all" - many
dependencies are needed to make a clean install - the `Breeze` environment has all the
dependencies installed in case you have problems with setting up your local virtualenv.

You can also use `breeze` to prepare your virtualenv (it will print extra information if some
You can also use the script `initialize_virtualenv.py` to prepare your virtualenv (it will print extra information if some
dependencies are missing/installation fails and it will also reset your SQLite test db in
the `${HOME}/airflow` directory):

```shell script
./breeze initialize-local-virtualenv
./scripts/tools/initialize_virtualenv.py
```
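
The script accepts a single comma-separated extras argument and falls back to the `devel` extra when none is given, for example:

```shell script
./scripts/tools/initialize_virtualenv.py devel_all
# or a smaller, targeted set of extras
./scripts/tools/initialize_virtualenv.py devel,google,amazon
```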

You can find description of all the commands and more information about the "prepare"
3 changes: 3 additions & 0 deletions docs/apache-airflow/installation/installing-from-pypi.rst
@@ -136,6 +136,9 @@ the time of preparing of the airflow version. However, usually you can use "main"
to install the latest version of providers. Usually the providers work with most versions of Airflow; if there
are any incompatibilities, they will be captured as package dependencies.

Note that "main" is just an example - you might need to choose a specific Airflow version to install providers
in a specific version.

.. code-block:: bash
PYTHON_VERSION="$(python --version | cut -d " " -f 2 | cut -d "." -f 1-2)"
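As an illustrative sketch of the pinned-version variant (the provider package name below is only an example):

.. code-block:: bash

    PYTHON_VERSION="$(python --version | cut -d " " -f 2 | cut -d "." -f 1-2)"
    pip install "apache-airflow-providers-google" \
      --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.2.5/constraints-${PYTHON_VERSION}.txt"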
186 changes: 186 additions & 0 deletions scripts/tools/initialize_virtualenv.py
@@ -0,0 +1,186 @@
#!/usr/bin/env python3

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import os
import shlex
import shutil
import subprocess
import sys
from pathlib import Path

if __name__ not in ("__main__", "__mp_main__"):
    raise SystemExit(
        "This file is intended to be executed as an executable program. You cannot use it as a module. "
        f"To run this script, run the ./{__file__} command"
    )


def clean_up_airflow_home(airflow_home: Path):
    if airflow_home.exists():
        print(f"Removing {airflow_home}")
        shutil.rmtree(airflow_home, ignore_errors=True)


def check_if_in_virtualenv() -> bool:
    return hasattr(sys, 'real_prefix') or (hasattr(sys, 'base_prefix') and sys.base_prefix != sys.prefix)


def check_for_package_extras() -> str:
    """
    check if the user provided any extra packages to install.
    defaults to package 'devel'.
    """
    if len(sys.argv) > 1:
        if len(sys.argv) > 2:
            print("Provide extras as 1 argument like: \"devel,google,snowflake\"")
            sys.exit(1)
        return sys.argv[1]
    return "devel"


def pip_install_requirements() -> int:
    """
    Install the requirements of the current python version.
    Return 0 if success, anything else is an error.
    """
    extras = check_for_package_extras()
    print(
        f"""
Installing requirements.

Airflow is installed with "{extras}" extra.

----------------------------------------------------------------------------------------

IMPORTANT NOTE ABOUT EXTRAS !!!

You can specify extras as a single comma-separated parameter to install. For example:

* google,amazon,microsoft.azure
* devel_all

Note that "devel_all" installs all possible dependencies and we have > 600 of them,
which might not be possible to install cleanly on your host because of lack of
system packages. It's easier to install extras one-by-one as needed.

----------------------------------------------------------------------------------------
"""
    )
    version = get_python_version()
    constraint = (
        f"https://raw.githubusercontent.com/apache/airflow/constraints-main/"
        f"constraints-source-providers-{version}.txt"
    )
    pip_install_command = ["pip", "install", "-e", f".[{extras}]", "--constraint", constraint]
    quoted_command = " ".join([shlex.quote(parameter) for parameter in pip_install_command])
    print()
    print(f"Running command: \n {quoted_command}\n")
    e = subprocess.run(pip_install_command)
    return e.returncode


def get_python_version() -> str:
    """
    return the version of python we are running.
    """
    major = sys.version_info[0]
    minor = sys.version_info[1]
    return f"{major}.{minor}"


def main():
    """
    Set up the local virtual environment.
    """
    airflow_home_dir = Path(os.environ.get("AIRFLOW_HOME", Path.home() / "airflow"))
    airflow_sources = Path(__file__).resolve().parents[2]

    if not check_if_in_virtualenv():
        print(
            "Local virtual environment not activated.\nPlease create and activate it "
            "first. (for example using 'python3 -m venv venv && source venv/bin/activate')"
        )
        sys.exit(1)

    print("Initializing environment...")
    print(f"This will remove the folder {airflow_home_dir} and reset all the databases!")
    response = input("Are you sure? (y/N/q) ")
    if response != "y":
        sys.exit(2)

    print(f"\nWiping and recreating {airflow_home_dir}")

    if airflow_home_dir.resolve() == airflow_sources:
        print("AIRFLOW_HOME and the source code are in the same path")
        print(
            f"When running this script it will delete all files in path {airflow_home_dir} "
            "to clear dynamic files like config/logs/db"
        )
        print("Please move the airflow source code elsewhere to avoid deletion")

        sys.exit(3)

    clean_up_airflow_home(airflow_home_dir)

    return_code = pip_install_requirements()

    if return_code != 0:
        print(
            "To solve persisting issues with the installation, you might need the "
            "prerequisites installed on your system.\n"
            "Try running the command below and rerun the virtualenv installation\n"
        )

        os_type = sys.platform
        if os_type == "darwin":
            print("brew install sqlite mysql postgresql openssl")
            print('export LDFLAGS="-L/usr/local/opt/openssl/lib"')
            print('export CPPFLAGS="-I/usr/local/opt/openssl/include"')
        else:
            print(
                "sudo apt install build-essential python3-dev libsqlite3-dev openssl "
                "sqlite default-libmysqlclient-dev libmysqlclient-dev postgresql"
            )
        sys.exit(4)

    # Reset the development sqlite database, pointing DAGs/plugins at an empty folder
    print("\nResetting AIRFLOW sqlite database...")
    env = os.environ.copy()
    env["AIRFLOW__CORE__LOAD_EXAMPLES"] = "False"
    env["AIRFLOW__CORE__UNIT_TEST_MODE"] = "False"
    env["AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_ENABLED"] = "False"
    env["AIRFLOW__CORE__DAGS_FOLDER"] = f"{airflow_sources}/empty"
    env["AIRFLOW__CORE__PLUGINS_FOLDER"] = f"{airflow_sources}/empty"
    subprocess.run(["airflow", "db", "reset", "--yes"], env=env)

    # Reset the unit test sqlite database as well
    print("\nResetting AIRFLOW sqlite unit test database...")
    env = os.environ.copy()
    env["AIRFLOW__CORE__LOAD_EXAMPLES"] = "True"
    env["AIRFLOW__CORE__UNIT_TEST_MODE"] = "False"
    env["AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_ENABLED"] = "False"
    env["AIRFLOW__CORE__DAGS_FOLDER"] = f"{airflow_sources}/empty"
    env["AIRFLOW__CORE__PLUGINS_FOLDER"] = f"{airflow_sources}/empty"
    subprocess.run(["airflow", "db", "reset", "--yes"], env=env)

    print("\nInitialization of environment complete! Go ahead and develop Airflow!")


if __name__ == "__main__":
    main()
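
A typical invocation of the script, run from the repository root inside an already-activated virtualenv (a sketch; the extras argument is optional and defaults to `devel`):

```bash
# wipes ${AIRFLOW_HOME:-~/airflow}, installs airflow in editable mode with the
# "constraints-source-providers" constraints, then resets the sqlite databases
./scripts/tools/initialize_virtualenv.py devel,google
```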
