fix: attempt to make upgrade much clearer

`upgrade` had several issues, which are summarized here: https://discuss.overhang.io/t/confusing-instructions-during-upgrade/2281/7

- The docs say that you should run quickstart, but what most people will see is the big command `tutor local upgrade --from=lilac` verbatim paragraph.
- The local upgrade command should be very explicit about the fact that users need to run quickstart.
- Maybe the name of the local upgrade command should be improved.
- When upgrading tutor from one major release to the next, there should be a more explicit warning to inform users of what they are doing (see this other conversation).
- We should tell people that they almost certainly need to enable the tutor and the mfe plugins, if they are not enabled during upgrade.
- A link to all of the breaking changes from the changelog should be prominently displayed during upgrade.
- The docs should emphasize that upgrading from one major release to the next is potentially a risky endeavor and that downgrading is not possible. The docs should also link to the changelog.

This commit has grown slightly beyond the intended scope, but the changes should be mostly positive.
overhangio · Jan 8, 2022 · 4dc772d
1 parent 1daba42
Showing 9 changed files with 243 additions and 142 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,10 +4,11 @@ Note: Breaking changes between versions are indicated by "💥".
 
 ## Unreleased
 
+- [Improvement] Provide much more comprehensive instructions when upgrading.
 - [Bugfix] During upgrade, make sure that environment is up-to-date prior to prompting to rebuild the custom images.
 - [Bugfix] Fix ownership of mysql data, in particular when upgrading a Kubernetes cluster to Maple.
 - [Bugfix] Ensure that ``tutor k8s upgrade`` is run during ``tutor k8s quickstart``, when necessary.
-- [Bugfix] By default, upgrade from Lilac and not Koa during ``tutor k8s upgrade``.
+- 💥[Bugfix] By default, detect the current version during ``tutor k8s/local upgrade``.
 - [Bugfix] Fix upgrading from Lilac to Maple on Kubernetes by deleting deployments and services.
 
 ## v13.0.3 (2022-01-04)

diff --git a/docs/install.rst b/docs/install.rst
@@ -87,11 +87,45 @@ Tutor can be launched on Amazon Web Services very quickly with the `official Tut
 Upgrading
 ---------
 
-With Tutor, it is very easy to upgrade to a more recent Open edX or Tutor release. Just install the latest ``tutor`` version (using either methods above) and run the ``quickstart`` command again. If you have :ref:`customised <configuration_customisation>` your docker images, you will have to re-build them prior to running ``quickstart``.
+To upgrade Open edX or benefit from the latest features and bug fixes, you should simply upgrade Tutor. Start by upgrading the "tutor" package and its dependencies::
 
-``quickstart`` should take care of automatically running the upgrade process. If for some reason you need to *manually* upgrade from an Open edX release to the next, you should run ``tutor local upgrade``. For instance, to upgrade from Lilac to Maple, run::
+    pip install --upgrade tutor[full]
 
+Then run the ``quickstart`` command again. Depending on your deployment target, run either::
+
+    tutor local quickstart # for local installations
+    tutor k8s quickstart   # for Kubernetes installation
+
+Upgrading with custom Docker images
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you run :ref:`customised <configuration_customisation>` Docker images, you need to rebuild them prior to running ``quickstart``::
+
+    tutor config save
+    tutor images build all # specify here the images that you need to build
+    tutor local quickstart
+
+Upgrading to a new Open edX release
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Major Open edX releases are published twice a year, in June and December, by the Open edX `Build/Test/Release working group <https://discuss.openedx.org/c/working-groups/build-test-release/30>`__. When a new Open edX release comes out, Tutor gets a major version bump (see :ref:`versioning`). Such an upgrade typically includes multiple breaking changes. Any upgrade is final, because downgrading is not supported. Thus, when upgrading your platform from one major version to the next, it is strongly recommended to do the following:
+
+1. Read the changes listed in the `CHANGELOG.md <https://github.com/overhangio/tutor/blob/master/CHANGELOG.md>`__ file. Breaking changes are identified by a "💥".
+2. Perform a backup. On a local installation, this is typically done with::
+
+    tutor local stop
+    sudo rsync -avr "$(tutor config printroot)"/ /tmp/tutor-backup/
+
+3. If you created custom plugins, make sure that they are compatible with the newer release.
+4. Test the new release in a sandboxed environment.
+5. If you are running edx-platform, or some other repository from a custom branch, then you should rebase (and test) your changes on top of the latest release tag (see :ref:`edx_platform_fork`).
+
+The process for upgrading from one major release to the next works similarly to any other upgrade, with the ``quickstart`` command (see above). The single difference is that if the ``quickstart`` command detects that your tutor environment was generated with an older release, it will perform a few release-specific upgrade steps. These extra upgrade steps will be performed just once. But they will be ignored if you updated your local environment (for instance: with ``tutor config save``) prior to running ``quickstart``. This situation typically occurs if you need to re-build some Docker images (see above). In such a case, you should make use of the ``upgrade`` command. For instance, to upgrade a local installation from Lilac to Maple and rebuild some Docker images, run::
+
+    tutor config save
+    tutor images build all # list the images that should be rebuilt here
     tutor local upgrade --from=lilac
+    tutor local quickstart
 
 .. _autocomplete:
 

diff --git a/tests/test_env.py b/tests/test_env.py
@@ -213,11 +213,13 @@ def test_iter_values_named(self) -> None:
             ),
         )
 
+
+class CurrentVersionTests(unittest.TestCase):
     def test_current_version_in_empty_env(self) -> None:
         with temporary_root() as root:
             self.assertIsNone(env.current_version(root))
-            self.assertIsNone(env.current_release(root))
-            self.assertFalse(env.needs_major_upgrade(root))
+            self.assertIsNone(env.get_env_release(root))
+            self.assertIsNone(env.should_upgrade_from_release(root))
             self.assertTrue(env.is_up_to_date(root))
 
     def test_current_version_in_lilac_env(self) -> None:
@@ -230,8 +232,8 @@ def test_current_version_in_lilac_env(self) -> None:
             ) as f:
                 f.write("12.0.46")
             self.assertEqual("12.0.46", env.current_version(root))
-            self.assertEqual("lilac", env.current_release(root))
-            self.assertTrue(env.needs_major_upgrade(root))
+            self.assertEqual("lilac", env.get_env_release(root))
+            self.assertEqual("lilac", env.should_upgrade_from_release(root))
             self.assertFalse(env.is_up_to_date(root))
 
     def test_current_version_in_latest_env(self) -> None:
@@ -244,6 +246,6 @@ def test_current_version_in_latest_env(self) -> None:
             ) as f:
                 f.write(__version__)
             self.assertEqual(__version__, env.current_version(root))
-            self.assertEqual("maple", env.current_release(root))
-            self.assertFalse(env.needs_major_upgrade(root))
+            self.assertEqual("maple", env.get_env_release(root))
+            self.assertIsNone(env.should_upgrade_from_release(root))
             self.assertTrue(env.is_up_to_date(root))
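
The assertions in the tests above imply a simple contract for the renamed helpers: an empty environment and an up-to-date environment both yield `None`, while an older environment yields the name of the release to upgrade from. A minimal sketch of that contract follows. It is not the code from `tutor/env.py`: the function names, the `RELEASES` mapping and the string-based signatures are hypothetical (the real helpers take the project root and read the version from the environment), and the mapping is limited to the two releases that the tests mention.

```python
from typing import Optional

# Hypothetical mapping from major Tutor version to Open edX release name.
# Only the two releases exercised by the tests above are listed.
RELEASES = {"12": "lilac", "13": "maple"}
LATEST_RELEASE = "maple"


def release_for_version(env_version: Optional[str]) -> Optional[str]:
    """Return the release name for an environment version, or None if unknown."""
    if env_version is None:
        # Empty environment: no version file was found.
        return None
    major = env_version.split(".", maxsplit=1)[0]
    return RELEASES.get(major)


def release_to_upgrade_from(env_version: Optional[str]) -> Optional[str]:
    """Return the release to upgrade from, or None when no upgrade is needed."""
    release = release_for_version(env_version)
    if release is None or release == LATEST_RELEASE:
        return None
    return release
```

Returning `Optional[str]` instead of a boolean lets the caller both test whether an upgrade is required and name the source release in its messages, which is how the `quickstart` changes in this commit use the value.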
diff --git a/tutor/commands/k8s.py b/tutor/commands/k8s.py
@@ -51,11 +51,11 @@ def load_job(self, name: str) -> Any:
             job_name = job["metadata"]["name"]
             if not isinstance(job_name, str):
                 raise exceptions.TutorError(
-                    "Invalid job name: '{}'. Expected str.".format(job_name)
+                    f"Invalid job name: '{job_name}'. Expected str."
                 )
             if job_name == name:
                 return job
-        raise exceptions.TutorError("Could not find job '{}'".format(name))
+        raise exceptions.TutorError(f"Could not find job '{name}'")
 
     def active_job_names(self) -> List[str]:
         """
@@ -71,7 +71,7 @@ def active_job_names(self) -> List[str]:
         ]
 
     def run_job(self, service: str, command: str) -> int:
-        job_name = "{}-job".format(service)
+        job_name = f"{service}-job"
         job = self.load_job(job_name)
         # Create a unique job name to make it deduplicate jobs and make it easier to
         # find later. Logs of older jobs will remain available for some time.
@@ -83,7 +83,7 @@ def run_job(self, service: str, command: str) -> int:
             if not active_jobs:
                 break
             fmt.echo_info(
-                "Waiting for active jobs to terminate: {}".format(" ".join(active_jobs))
+                f"Waiting for active jobs to terminate: {' '.join(active_jobs)}"
             )
             sleep(5)
 
@@ -106,7 +106,9 @@ def run_job(self, service: str, command: str) -> int:
         job["spec"]["backoffLimit"] = 1
         job["spec"]["ttlSecondsAfterFinished"] = 3600
         # Save patched job to "jobs.yml" file
-        with open(tutor_env.pathjoin(self.root, "k8s", "jobs.yml"), "w") as job_file:
+        with open(
+            tutor_env.pathjoin(self.root, "k8s", "jobs.yml"), "w", encoding="utf-8"
+        ) as job_file:
             serialize.dump(job, job_file)
         # We cannot use the k8s API to create the job: configMap and volume names need
         # to be found with the right suffixes.
@@ -115,7 +117,7 @@ def run_job(self, service: str, command: str) -> int:
             "--kustomize",
             tutor_env.pathjoin(self.root),
             "--selector",
-            "app.kubernetes.io/name={}".format(job_name),
+            f"app.kubernetes.io/name={job_name}",
         )
 
         message = (
@@ -127,7 +129,7 @@ def run_job(self, service: str, command: str) -> int:
         fmt.echo_info(message)
 
         # Wait for completion
-        field_selector = "metadata.name={}".format(job_name)
+        field_selector = f"metadata.name={job_name}"
         while True:
             namespaced_jobs = K8sClients.instance().batch_api.list_namespaced_job(
                 k8s_namespace(self.config), field_selector=field_selector
@@ -137,13 +139,11 @@ def run_job(self, service: str, command: str) -> int:
             job = namespaced_jobs.items[0]
             if not job.status.active:
                 if job.status.succeeded:
-                    fmt.echo_info("Job {} successful.".format(job_name))
+                    fmt.echo_info(f"Job {job_name} successful.")
                     break
                 if job.status.failed:
                     raise exceptions.TutorError(
-                        "Job {} failed. View the job logs to debug this issue.".format(
-                            job_name
-                        )
+                        f"Job {job_name} failed. View the job logs to debug this issue."
                     )
             sleep(5)
         return 0
@@ -158,34 +158,41 @@ def k8s() -> None:
 @click.option("-I", "--non-interactive", is_flag=True, help="Run non-interactively")
 @click.pass_context
 def quickstart(context: click.Context, non_interactive: bool) -> None:
-    if tutor_env.needs_major_upgrade(context.obj.root):
+    run_upgrade_from_release = tutor_env.should_upgrade_from_release(context.obj.root)
+    if run_upgrade_from_release is not None:
         click.echo(fmt.title("Upgrading from an older release"))
         context.invoke(
             upgrade,
-            from_version=tutor_env.current_release(context.obj.root),
-            non_interactive=non_interactive,
+            from_release=tutor_env.get_env_release(context.obj.root),
         )
 
     click.echo(fmt.title("Interactive platform configuration"))
     context.invoke(
         config_save_command,
         interactive=(not non_interactive),
     )
-    config = tutor_config.load(context.obj.root)
-    if not config["ENABLE_WEB_PROXY"]:
-        fmt.echo_alert(
-            "Potentially invalid configuration: ENABLE_WEB_PROXY=false\n"
-            "This setting might have been defined because you previously set WEB_PROXY=true. This is no longer"
-            " necessary in order to get Tutor to work on Kubernetes. In Tutor v11+ a Caddy-based load balancer is"
-            " provided out of the box to handle SSL/TLS certificate generation at runtime. If you disable this"
-            " service, you will have to configure an Ingress resource and a certificate manager yourself to redirect"
-            " traffic to the caddy service. See the Kubernetes section in the Tutor documentation for more"
-            " information."
+
+    if run_upgrade_from_release and not non_interactive:
+        question = f"""Your platform is being upgraded from {run_upgrade_from_release.capitalize()}.
+
+If you run custom Docker images, you must rebuild and push them to your private repository now by running the following
+commands in a different shell:
+
+    tutor images build all # add your custom images here
+    tutor images push all
+
+Press enter when you are ready to continue"""
+        click.confirm(
+            fmt.question(question), default=True, abort=True, prompt_suffix=" "
         )
+
     click.echo(fmt.title("Starting the platform"))
     context.invoke(start)
+
     click.echo(fmt.title("Database creation and migrations"))
     context.invoke(init, limit=None)
+
+    config = tutor_config.load(context.obj.root)
     fmt.echo_info(
         """Your Open edX platform is ready and can be accessed at the following urls:
 
@@ -253,7 +260,7 @@ def start(context: Context, names: List[str]) -> None:
                 "--kustomize",
                 tutor_env.pathjoin(context.root),
                 "--selector",
-                "app.kubernetes.io/name={}".format(name),
+                f"app.kubernetes.io/name={name}",
             )
 
 
@@ -345,8 +352,8 @@ def scale(context: Context, deployment: str, replicas: int) -> None:
         *resource_namespace_selector(
             config,
         ),
-        "--replicas={}".format(replicas),
-        "deployment/{}".format(deployment),
+        f"--replicas={replicas}",
+        f"deployment/{deployment}",
     )
 
 
@@ -443,29 +450,41 @@ def wait(context: Context, name: str) -> None:
     wait_for_pod_ready(config, name)
 
 
-@click.command(help="Upgrade from a previous Open edX named release")
+@click.command(
+    short_help="Perform release-specific upgrade tasks",
+    help="Perform release-specific upgrade tasks. To perform a full upgrade remember to run `quickstart`.",
+)
 @click.option(
     "--from",
-    "from_version",
-    default="lilac",
+    "from_release",
     type=click.Choice(["ironwood", "juniper", "koa", "lilac"]),
 )
-@click.option("-I", "--non-interactive", is_flag=True, help="Run non-interactively")
-@click.pass_obj
-def upgrade(context: Context, from_version: str, non_interactive: bool) -> None:
-    upgrade_from(context, from_version, interactive=not non_interactive)
+@click.pass_context
+def upgrade(context: click.Context, from_release: Optional[str]) -> None:
+    if from_release is None:
+        from_release = tutor_env.get_env_release(context.obj.root)
+    if from_release is None:
+        fmt.echo_info("Your environment is already up-to-date")
+    else:
+        fmt.echo_alert(
+            "This command only performs a partial upgrade of your Open edX platform. "
+            "To perform a full upgrade, you should run `tutor k8s quickstart`."
+        )
+        upgrade_from(context.obj, from_release)
+    # We update the environment to update the version
+    context.invoke(config_save_command)
 
 
 def kubectl_exec(
     config: Config, service: str, command: str, attach: bool = False
 ) -> int:
-    selector = "app.kubernetes.io/name={}".format(service)
+    selector = f"app.kubernetes.io/name={service}"
     pods = K8sClients.instance().core_api.list_namespaced_pod(
         namespace=k8s_namespace(config), label_selector=selector
     )
     if not pods.items:
         raise exceptions.TutorError(
-            "Could not find an active pod for the {} service".format(service)
+            f"Could not find an active pod for the {service} service"
         )
     pod_name = pods.items[0].metadata.name
 
@@ -486,10 +505,10 @@ def kubectl_exec(
 
 
 def wait_for_pod_ready(config: Config, service: str) -> None:
-    fmt.echo_info("Waiting for a {} pod to be ready...".format(service))
+    fmt.echo_info(f"Waiting for a {service} pod to be ready...")
     utils.kubectl(
         "wait",
-        *resource_selector(config, "app.kubernetes.io/name={}".format(service)),
+        *resource_selector(config, f"app.kubernetes.io/name={service}"),
         "--for=condition=ContainersReady",
         "--timeout=600s",
         "pod",