Xavier law dpoa example (#34)

* workflows: law dpoa example (#25) (#27)
* workflows: update readme
* workflows: after local test
* workflows-PFCands: add template for plotting PFCands
* workflows-PFCands: working workflow
* workflows-PFCands: edit README
cms-dpoa · Apr 12, 2023 · commit 9aec608 · 1 parent 39cca75
Showing 1 changed file with 120 additions and 38 deletions.
diff --git a/workflows/PFCands_plotting/README.md b/workflows/PFCands_plotting/README.md
@@ -31,32 +31,46 @@ You should see:
 indexing tasks in 1 module(s)
 loading module 'dpoa.tasks', done
 
-module 'dpoa.tasks.test', 2 task(s):
+module 'dpoa.tasks.test', 5 task(s):
     - NanoProducer
-    - CreatePlots
+    - Repository
+    - CoffeaPlotting
+    - RDFPlotting
+    - Final
 
-written 2 task(s) to index file 
-'/law-dpoa-example-main/.law/index'
+written 5 task(s) to index file '/law-dpoa-example-main/.law/index'
 ```
 
 #### 2. Check the status of the Final task
 
 ```shell
-law run CreatePlots --print-status -1
+law run Final --print-status -1
 ```
 
 No tasks ran so far, so no output target should exist yet. You will see this output:
 
 ```output
 print task status with max_depth -1 and target_depth 0
 
-0 > CreatePlots()
-│     LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/CreatePlots/some_nice_plot.png)
+0 > Final()
+│     LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/Final/some_fake_file.txt)
 │       absent
 │
-└──1 > NanoProducer()
-         LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/NanoProducer/some_fake_file.root)
-           absent
+├──1 > RDFPlotting()
+│  │     LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/RDFPlotting/rdataframe_output)
+│  │       absent
+│  │
+│  └──2 > Repository()
+│           LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/Repository/cat-hackathon)
+│             absent
+│
+└──1 > CoffeaPlotting()
+   │     LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/CoffeaPlotting/coffea_output)
+   │       absent
+   │
+   └──2 > Repository()
+            LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/Repository/cat-hackathon)
+              absent
 ```
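The status tree above rests on the Luigi convention that a task counts as complete once its output target exists. The following is a minimal pure-Python sketch of that idea, assuming the dependency layout shown in the tree. The `Task` class and the `status_lines` helper are hypothetical stand-ins for illustration and are not part of law or Luigi.

```python
import os
import tempfile
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a task: a name, one output path, and its requirements."""
    name: str
    output: str  # path relative to the store directory
    requires: list = field(default_factory=list)


def status_lines(task, store_dir, depth=0):
    """Walk the tree depth first and report whether each task's output exists."""
    exists = os.path.exists(os.path.join(store_dir, task.output))
    state = "existent" if exists else "absent"
    lines = [f"{'   ' * depth}{depth} > {task.name}(): {state}"]
    for dep in task.requires:
        lines.extend(status_lines(dep, store_dir, depth + 1))
    return lines


# Dependency layout and output paths copied from the status tree above.
repository = Task("Repository", "Repository/cat-hackathon")
rdf = Task("RDFPlotting", "RDFPlotting/rdataframe_output", [repository])
coffea = Task("CoffeaPlotting", "CoffeaPlotting/coffea_output", [repository])
final = Task("Final", "Final/some_fake_file.txt", [rdf, coffea])

store = tempfile.mkdtemp()           # empty store: no task has run yet
before = status_lines(final, store)  # every target is absent

# Pretend that only Repository has run by creating its output directory.
os.makedirs(os.path.join(store, "Repository", "cat-hackathon"))
after = status_lines(final, store)
print("\n".join(after))
```

In this sketch `Repository` appears twice because both plotting tasks require it, which mirrors the two `Repository()` entries in the tree.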
 
 #### 3. Run the Final task
@@ -67,27 +81,27 @@ To trigger the *second* task, run:
 law run Final
 ```
 
-This should take only a few seconds to process.
+This relies on your local Docker installation, so Docker must be up and running.
 
 ```output
 ===== Luigi Execution Summary =====
 
-Scheduled 2 tasks of which:
-* 1 complete ones were encountered:
-    - 1 NanoProducer(...)
-* 1 ran successfully:
-    - 1 CreatePlots(...)
+Scheduled 4 tasks of which:
+* 4 ran successfully:
+    - 1 CoffeaPlotting(...)
+    - 1 Final(...)
+    - 1 RDFPlotting(...)
+    - 1 Repository(...)
 
 This progress looks :) because there were no failed tasks or missing dependencies
 
 ===== Luigi Execution Summary =====
-
 ```
 
-By default, this example uses a local scheduler, which - by definition - offers no visualization tools in the browser. If you want to see how the task tree is built and subsequently run ``luigid`` in a second terminal. This will start a central scheduler at [localhost:8082](localhost:8082) (the default address). To inform tasks (or rather *workers*) about the scheduler, either add ``--local-scheduler False`` to the ``law run`` command as such:
+By default, this example uses a local scheduler, which by definition offers no visualization tools in the browser. If you want to see how the task tree is built and subsequently run, run ``luigid`` in a second terminal. This will start a central scheduler at [localhost:8082](localhost:8082) (the default address). To inform tasks (or rather *workers*) about the scheduler, either add ``--local-scheduler False`` to the ``law run`` command as such:
 
 ```shell
-law run CreatePlots --local-scheduler False
+law run Final --local-scheduler False
 ```
 
 or set the ``local-scheduler`` value in the ``[luigi_core]`` config section in the ``law.cfg`` file to ``False``.
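As a rough illustration of the config option, the sketch below parses a hypothetical `law.cfg` excerpt with Python's standard `configparser` and reads the flag back. The excerpt is an assumption built from the section and key names mentioned above, the real `law.cfg` in the repository may contain more entries, and law reads the file through its own config machinery rather than this code.

```python
import configparser

# Hypothetical excerpt of law.cfg; only the section named in the text is shown.
LAW_CFG = """
[luigi_core]
local-scheduler: False
"""

parser = configparser.ConfigParser()
parser.read_string(LAW_CFG)

# False means workers talk to the central scheduler started by luigid.
use_local_scheduler = parser.getboolean("luigi_core", "local-scheduler")
print(use_local_scheduler)
```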
@@ -99,21 +113,33 @@ FIXME
 #### 4. Check the status again
 
 ```shell
-law run CreatePlots --print-status 1
+law run Final --print-status -1
 ```
 
 If step 3 succeeded, all output targets should exist:
 
 ```output
-print task status with max_depth 1 and target_depth 0
+print task status with max_depth -1 and target_depth 0
 
-0 > CreatePlots()
-│     LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/CreatePlots/some_nice_plot.png)
+0 > Final()
+│     LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/Final/some_fake_file.txt)
 │       existent
 │
-└──1 > NanoProducer()
-         LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/NanoProducer/some_fake_file.root)
-           existent
+├──1 > RDFPlotting()
+│  │     LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/RDFPlotting/rdataframe_output)
+│  │       existent
+│  │
+│  └──2 > Repository()
+│           LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/Repository/cat-hackathon)
+│             existent
+│
+└──1 > CoffeaPlotting()
+   │     LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/CoffeaPlotting/coffea_output)
+   │       existent
+   │
+   └──2 > Repository()
+            LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/Repository/cat-hackathon)
+              existent
 ```
 
 To see the status of the targets in the collection, i.e., the grouped outputs of the branch tasks,
@@ -129,18 +155,63 @@ You will have created the following tree in your directory:
 
 ```output
 store
-├── CreatePlots
-│   └── some_nice_plot.png
-└── NanoProducer
-    └── some_fake_file.root
-
-3 directories, 2 files
+├── CoffeaPlotting
+│   └── coffea_output
+│       ├── PF_n.png
+│       ├── PF_n.txt
+│       └── PF_pt.png
+├── Final
+│   └── some_fake_file.txt
+├── RDFPlotting
+│   └── rdataframe_output
+│       └── PFCands_pt.png
+└── Repository
+    └── cat-hackathon
+        ├── README.md
+        ├── analysis
+        │   ├── coffea
+        │   │   ├── README.md
+        │   │   └── coffea_plot.py
+        │   └── rdataframe
+        │       ├── README.md
+        │       └── rdf_plot.py
+        ├── data
+        │   └── doubleeg_nanoaod_eg.root
+        ├── production
+        │   └── pfnano
+        │       ├── README.md
+        │       └── pf_production.sh
+        └── workflows
+            ├── PFCands_plotting
+            │   ├── LICENSE
+            │   ├── README.md
+            │   ├── dpoa
+            │   │   ├── __init__.py
+            │   │   └── tasks
+            │   │       ├── __init__.py
+            │   │       ├── base.py
+            │   │       └── test.py
+            │   ├── law.cfg
+            │   └── setup.sh
+            └── law-dpoa-example
+                ├── LICENSE
+                ├── README.md
+                ├── dpoa
+                │   ├── __init__.py
+                │   └── tasks
+                │       ├── __init__.py
+                │       ├── base.py
+                │       └── test.py
+                ├── law.cfg
+                └── setup.sh
+
+21 directories, 29 files
 ```
 
 #### 6. Cleanup the results
 
 ```shell
-law run CreatePlots --remove-output -1
+law run Final --remove-output -1
 ```
 
 You should see:
@@ -150,11 +221,22 @@ remove task output with max_depth -1
 removal mode? [i*(interactive), d(dry), a(all)] a
 selected all mode
 
-0 > CreatePlots()
-│     LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/CreatePlots/some_nice_plot.png)
+0 > Final()
+│     LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/Final/some_fake_file.txt)
 │       removed
 │
-└──1 > NanoProducer()
-         LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/NanoProducer/some_fake_file.root)
-           removed
+├──1 > RDFPlotting()
+│  │     LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/RDFPlotting/rdataframe_output)
+│  │       removed
+│  │
+│  └──2 > Repository()
+│           LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/Repository/cat-hackathon)
+│             removed
+│
+└──1 > CoffeaPlotting()
+   │     LocalFileTarget(fs=local_fs, path=$DPOA_STORE_DIR/CoffeaPlotting/coffea_output)
+   │       removed
+   │
+   └──2 > Repository()
+            already handled
 ```
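The `already handled` entry appears because `Repository` is required by both plotting tasks but only needs to be processed once. A minimal sketch of such a de-duplicating walk, assuming the dependency layout shown in the output; the `removal_report` function and the `REQUIRES` mapping are hypothetical and do not reproduce law's implementation:

```python
def removal_report(task, requires, handled=None, depth=0):
    """Walk the task tree depth first and report each task only once as removed."""
    if handled is None:
        handled = set()
    indent = "   " * depth
    if task in handled:
        # Reached again through another parent: skip it and do not descend further.
        return [f"{indent}{depth} > {task}(): already handled"]
    handled.add(task)
    lines = [f"{indent}{depth} > {task}(): removed"]
    for dep in requires.get(task, []):
        lines.extend(removal_report(dep, requires, handled, depth + 1))
    return lines


# Dependency layout copied from the tree above (task name -> required tasks).
REQUIRES = {
    "Final": ["RDFPlotting", "CoffeaPlotting"],
    "RDFPlotting": ["Repository"],
    "CoffeaPlotting": ["Repository"],
}

report = removal_report("Final", REQUIRES)
print("\n".join(report))
```

Sharing one `handled` set across the whole walk is what keeps the second visit to `Repository` from attempting the removal again.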