Update TrackClusterMergeSplitter to output track-cluster associations (PFA0) #1699

ruse-traveler · 2025-01-09T22:57:46Z

Briefly, what does this PR introduce?

This PR updates the TrackClusterMergeSplitter algorithm to output both edm4eic::TrackClusterMatch and MC particle-cluster associations. In this process, it reaps what was sown by originally writing the algorithm to operate on protoclusters rather than clusters: the algorithm will now ingest fully formed clusters and update the relevant quantities.

What kind of change does this PR introduce?

Bug fix (issue #__)
New feature (issue #1645: Update Track-Cluster Merge/Splitter to output Track-Cluster Associations)
Documentation update
Other: __

Please check if this PR fulfills the following:

Tests for the changes have been added
Documentation has been added / updated
Changes have been communicated to collaborators

Does this PR introduce breaking changes? What changes might users need to make to their code?

No.

Does this PR change default behavior?

Yes. Track-cluster and MC particle-cluster associations will now be produced by the algorithm.

github-actions · 2025-02-04T22:13:50Z

Capybara summary for PR 1699

rec_dis_10x100_minQ2=0_craterlake
rec_dis_10x100_minQ2=1000_craterlake_tracking_only
rec_dis_18x275_minQ2=0_craterlake_18x275
rec_dis_18x275_minQ2=1000_craterlake_18x275
rec_dis_5x41_minQ2=0_craterlake_5x41
rec_e_1GeV_20GeV_craterlake
rec_pi_1GeV_20GeV_craterlake
Last updated 2025-03-17T10:35-04:00 (27711f5)

veprbl

A few comments here.

veprbl · 2025-03-24T01:04:51Z

src/algorithms/calorimetry/TrackClusterMergeSplitter.cc

-    // grab total energy
-    const float eClust = get_cluster_energy(clust) * m_cfg.sampFrac;
+    // lambda to compare MCParticles
+    auto compare = [](const edm4hep::MCParticle& lhs, const edm4hep::MCParticle& rhs) {


Use CompareObjectID?

Whoops! Good catch!

veprbl · 2025-03-24T02:15:15Z

src/algorithms/calorimetry/TrackClusterMergeSplitter.cc

+    // average of positions of clusters to merge
+    edm4hep::Vector3f rClust = new_clust.getPosition();
+    for (const auto& old_clust : old_clusts) {
+      rClust = rClust + ((old_clust.getEnergy() / eClust) * old_clust.getPosition());


Does this not completely ignore split_weights? Why not do it in the same loop over hits above?

Ahhh good catch!
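The reviewer's point can be illustrated with a minimal sketch in which the cluster energy and position are accumulated in a single loop over hits, with each hit scaled by its split weight. The types `Hit`, `WeightedHit`, and `ClusterKinematics` and the function `reconstruct` are hypothetical, and plain linear energy weighting is assumed for brevity; it is not claimed to match what RecoCoG or the PR actually does.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a calorimeter hit.
struct Hit {
  double energy;
  std::array<double, 3> position;
};

// A hit together with the fraction of its energy assigned to this cluster.
struct WeightedHit {
  Hit hit;
  double split_weight;
};

struct ClusterKinematics {
  double energy;
  std::array<double, 3> position;
};

// Accumulate energy and energy-weighted position in one pass, so that the
// split weights enter both quantities consistently.
ClusterKinematics reconstruct(const std::vector<WeightedHit>& hits) {
  ClusterKinematics out{0.0, {0.0, 0.0, 0.0}};
  for (const auto& wh : hits) {
    const double e = wh.hit.energy * wh.split_weight;
    out.energy += e;
    for (std::size_t i = 0; i < 3; ++i) {
      out.position[i] += e * wh.hit.position[i];
    }
  }
  // Normalize only when there is energy to normalize by.
  if (out.energy > 0.0) {
    for (std::size_t i = 0; i < 3; ++i) {
      out.position[i] /= out.energy;
    }
  }
  return out;
}
```

Averaging the positions of the old clusters by their full energies, as in the diff above, would give the same answer only when every split weight is 1.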

veprbl · 2025-03-24T02:22:09Z

src/algorithms/calorimetry/TrackClusterMergeSplitter.h

    >,
    algorithms::Output<
-      edm4eic::ProtoClusterCollection
+      edm4eic::ClusterCollection,


So, we were discussing that this should not re-implement shape calculation, and that was factorized out of RecoCoG, but a copy of RecoCoG still becomes part of this algorithm? And we don't actually output associations without shapes anymore, which, I believe, was the original point of this PR. This makes me think that we should have just made a factory to propagate associations from protoclusters to clusters instead of doing this change.

So the primary point of this PR was to output specifically track-cluster matches. Doing this requires updating the algorithm to operate on clusters rather than protoclusters since -- quite reasonably in my opinion! -- we don't have track-protocluster matches. I don't think an association propagator would help us in this context (esp. since it would necessitate a data-model change).

The partial duplication of RecoCoG then followed from that switch to clusters. My intent wasn't so much to completely duplicate the algorithm but to update various quantities in a way that's consistent with what RecoCoG does, including the handling of particle-cluster associations. My thinking is that the merging functionality here is useful beyond PF reconstruction, and so I would prefer that the produced clusters here are consistent with those that RecoCoG produces.

That being said, in the interest of keeping PF development moving, I would be willing to excise the RecoCoG bits and go with a "bare-bones" energy/position reconstruction for the time being. That should be good enough for implementing the rest of the baseline. We should revisit this topic, though, since this certainly won't be the last reclustering algorithm we write!

> So the primary point of this PR was to output specifically track-cluster matches. Doing this requires updating the algorithm to operate on clusters rather than protoclusters since -- quite reasonably in my opinion! -- we don't have track-protocluster matches. I don't think an association propagator would help us in this context (esp. since it would necessitate a data-model change).

Right, so my suggestion was to add a track-protocluster association type. Then a factory (separate from RecoCoG) would convert track-protocluster associations to track-cluster associations. I now realize this would require the factory to do matching between protoclusters and clusters, which is possible but not as trivial as I'd like it to be.

> The partial duplication of RecoCoG then followed from that switch to clusters. My intent wasn't so much to completely duplicate the algorithm but to update various quantities in a way that's consistent with what RecoCoG does, including the handling of particle-cluster associations. My thinking is that the merging functionality here is useful beyond PF reconstruction, and so I would prefer that the produced clusters here are consistent with those that RecoCoG produces.

Right, but it doesn't make things better if we do things "like RecoCoG, but sometimes not". And if we want to make this the default for the non-PF clustering path, then it makes even more sense to use RecoCoG.

> That being said, in the interest of keeping PF development moving, I would be willing to excise the RecoCoG bits and go with a "bare-bones" energy/position reconstruction for the time being. That should be good enough for implementing the rest of the baseline. We should revisit this topic, though, since this certainly won't be the last reclustering algorithm we write!

No need to excise parts of RecoCoG. If we go the way proposed in the PR, it had better be fit for your purpose. My counter-argument against expediency is that if TrackClusterMergeSplitter development is not aligned with the rest of the facilities in EICrecon, you can't expect everyone else to align with it, which is fine if you plan to do all of the development yourself.

Reading my response back, I feel that not only was my tone unprofessional and dismissive, I did not clearly articulate (or articulate at all) some of my questions and concerns, and for that I apologize.

The approach taken here was informed by the intent to eventually move this algorithm downstream of a centralized track-cluster matching algorithm. As such, this algorithm reclusters already constructed clusters. This PR was my first thought as to how to do such a reclustering and I sincerely believed it was aligned with our framework. However, it clearly isn't and certainly isn't the best approach.

The points you raised make sense, and it didn't occur to me to go back through the protoclusters. I'm not completely clear on how this might look, though, so help me understand. My first thought is that such an algorithm flow might look something like this:

<start>-->[protoclusters]--(Reco CoG)-->[clusters]--
--(track matching)-->[cluster matches]--
--(convert clusters)-->[protoclusters]--
--(copy matches)-->[protocluster matches]--
--(merge/split)-->[protoclusters + matches]-->(Reco CoG)-->[clusters]--
--(copy matches)-->[updated matches]--><end>

Is this in the right direction?

> Reading my response back, I feel that not only was my tone unprofessional and dismissive, I did not clearly articulate (or articulate at all) some of my questions and concerns, and for that I apologize.

No, it was all good points and I was able to finally understand some of these things better. I did have to push back on merge splitter being able to do its thing in isolation.

> The approach taken here was informed by the intent to eventually move this algorithm downstream of a centralized track-cluster matching algorithm. As such, this algorithm reclusters already constructed clusters. This PR was my first thought as to how to do such a reclustering and I sincerely believed it was aligned with our framework. However, it clearly isn't and certainly isn't the best approach.

I'm starting to think that edm4eic::ProtoCluster is not such a great type. edm4eic::Cluster is already a superset of it, and we could have used just that one type.

> The points you raised make sense, and it didn't occur to me to go back through the protoclusters. I'm not completely clear on how this might look, though, so help me understand. My first thought is that such an algorithm flow might look something like this:
>
> <start>-->[protoclusters]--(Reco CoG)-->[clusters]--
> --(track matching)-->[cluster matches]--
> --(convert clusters)-->[protoclusters]--
> --(copy matches)-->[protocluster matches]--
> --(merge/split)-->[protoclusters + matches]-->(Reco CoG)-->[clusters]--
> --(copy matches)-->[updated matches]--><end>
>
> Is this in the right direction?

To be honest, I cannot tell from a one-dimensional graph. This is what I think could work and provide maximal reuse of existing facilities:

flowchart TD
  RecHits --> CalorimeterIslandCluster:::algo --> IslandProtoClusters
  IslandProtoClusters --> CalorimeterClusterRecoCoG:::algo --> ClustersWithoutShapes --> CalorimeterClusterShape:::algo
  CalorimeterClusterRecoCoG:::algo --> ClusterAssociationsWithoutShapes --> CalorimeterClusterShape:::algo --> Clusters
  CalorimeterClusterShape:::algo --> ClusterAssociations
  Clusters --> TrackClusterMergeSplitter:::algo
  ClusterAssociations --> TrackClusterMergeSplitter:::algo
  CalorimeterTrackProjections --> TrackClusterMergeSplitter:::algo --> SplitMergeProtoClusters --> CalorimeterClusterRecoCoG':::algo
  TrackClusterMergeSplitter:::algo -->|remove?| SplitMergeProtoClustersAssociations
  TrackClusterMergeSplitter:::algo --> TrackSplitMergeProtoClusterMatches
  CalorimeterClusterRecoCoG':::algo --> SplitMergeClustersWithoutShapes --> CalorimeterClusterShape':::algo
  CalorimeterClusterRecoCoG':::algo --> SplitMergeClusterAssociationsWithoutShapes --> CalorimeterClusterShape':::algo --> SplitMergeClusters
  CalorimeterClusterShape':::algo --> SplitMergeClusterAssociations
  TrackSplitMergeProtoClusterMatches --> CalorimeterProtoClusterMatchPromotion:::algo --> SplitMergeClusterTrackMatches
  SplitMergeClusters --> ...
  SplitMergeClusterAssociations --> ...
  SplitMergeClusterTrackMatches --> ...
  classDef algo fill:#f96

We would need a new "CalorimeterProtoClusterMatchPromotion" algorithm operating on a new type for "TrackSplitMergeProtoClusterMatches".

(You can click "quote" on my message to see the source of the mermaid diagram and modify it, if you want to adjust it.)
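As a rough illustration of what such a promotion step might look like, here is a minimal sketch. Everything in it is an assumption: the match structs are hypothetical stand-ins rather than edm4eic types, and the sketch assumes that cluster `i` was reconstructed from protocluster `i`, i.e. that the downstream reconstruction preserves order and count. If that assumption does not hold, an explicit protocluster-to-cluster matching would be needed instead.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical track-protocluster match, referring to objects by index.
struct TrackProtoClusterMatch {
  std::size_t track;
  std::size_t protocluster;
  float weight;
};

// Hypothetical track-cluster match, referring to objects by index.
struct TrackClusterMatch {
  std::size_t track;
  std::size_t cluster;
  float weight;
};

// Promote matches under the ASSUMPTION that cluster i was built from
// protocluster i, so the protocluster index can be reused as the cluster index.
std::vector<TrackClusterMatch>
promote_matches(const std::vector<TrackProtoClusterMatch>& proto_matches,
                std::size_t n_clusters) {
  std::vector<TrackClusterMatch> out;
  out.reserve(proto_matches.size());
  for (const auto& match : proto_matches) {
    if (match.protocluster >= n_clusters) {
      throw std::out_of_range("match refers to a protocluster without a cluster");
    }
    out.push_back(TrackClusterMatch{match.track, match.protocluster, match.weight});
  }
  return out;
}
```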

I see! Thanks! The diagram is extremely helpful!!

> I did have to push back on merge splitter being able to do its thing in isolation.

Completely fair! I was falling into bad habits of writing big, monolithic pieces of code while working on this PR. The above topology looks a lot more workable!

> I'm starting to think that edm4eic::ProtoCluster is not such a great type. edm4eic::Cluster is already a superset of it, and we could have used just that one type.

If we stick with protoclusters for the time being, would we also need an algorithm to "demote" clusters into protoclusters that could be fed into the merge/splitter?

> If we stick with protoclusters for the time being, would we also need an algorithm to "demote" clusters into protoclusters that could be fed into the merge/splitter?

The diagram above has TrackClusterMergeSplitter consuming a collection of edm4eic::Clusters (don't you need cluster positions to do the matching?) and outputting edm4eic::ProtoClusters.

... so demotion happens inside TrackClusterMergeSplitter, at least according to what's drawn.

Ahhhh I see! Makes sense! That would make it pretty straightforward to keep track of the protocluster energies and positions without having to duplicate getEnergy() and getPosition() as functions.
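A minimal sketch of what "demotion" inside the merge/splitter could amount to is given below. The `Cluster` and `ProtoCluster` structs are simplified, hypothetical stand-ins and not the edm4eic types; hits are referred to by index, and the `scale` parameter is an illustrative way of applying a per-track split fraction.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical simplified cluster: reconstructed quantities plus constituent
// hits (by index) and per-hit weights.
struct Cluster {
  double energy;
  std::array<double, 3> position;
  std::vector<std::size_t> hits;
  std::vector<float> weights;
};

// Hypothetical simplified protocluster: only hits and weights.
struct ProtoCluster {
  std::vector<std::size_t> hits;
  std::vector<float> weights;
};

// Demote a cluster to a protocluster by keeping its hits and dropping the
// reconstructed quantities, which a later reconstruction step recomputes.
// Each hit weight is multiplied by `scale` (1 leaves the weights unchanged).
ProtoCluster demote(const Cluster& cluster, float scale = 1.0f) {
  ProtoCluster proto;
  proto.hits = cluster.hits;
  proto.weights.reserve(cluster.weights.size());
  for (const float w : cluster.weights) {
    proto.weights.push_back(w * scale);
  }
  return proto;
}
```

Because energy and position are dropped rather than copied, they do not need to be recomputed inside the merge/splitter itself.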

ruse-traveler added 3 commits January 9, 2025 15:45

Add hooks for track-cluster match outputs

68672a9

Update algorithm to operate on clusters

90a9ccc

Begin filling in cluster reconstruction calculation

335d224

github-actions bot added the topic: calorimetry, topic: barrel, topic: forward, and topic: backward labels Jan 9, 2025

pre-commit-ci bot and others added 14 commits January 9, 2025 22:58

[pre-commit.ci] auto fixes from pre-commit.com hooks

68fa03d

for more information, see https://pre-commit.ci

Add position calculation

f8605e8

Fix typos in input collection names

de78ffb

Merge branch 'output-splitmerge-track-associations' of github.com:eic…

1d04aa8

…/EICrecon into output-splitmerge-track-associations

Add missing edm4eic version header

30b92d3

Be safer with creating new clusters

021e11b

Rework weight calculation to accomodate cluster re-reconstruction

0307731

Fill track-cluster match output

f8d2050

[pre-commit.ci] auto fixes from pre-commit.com hooks

3dab6d8

for more information, see https://pre-commit.ci

Add missed edm4eic version header

bf31964

Merge branch 'output-splitmerge-track-associations' of github.com:eic…

178ecb8

…/EICrecon into output-splitmerge-track-associations

Add shape calculation

b4b0aca

Wire in associations

303f3f2

Copy associations of unused clusters into output

a2ff2f1

ruse-traveler temporarily deployed to github-pages February 4, 2025 22:12 — with GitHub Actions Inactive

Fill in mergerd cluster associations

f7476b4

ruse-traveler marked this pull request as ready for review February 4, 2025 22:53

[pre-commit.ci] auto fixes from pre-commit.com hooks

14deb24

for more information, see https://pre-commit.ci

pre-commit-ci bot temporarily deployed to github-pages February 4, 2025 23:33 Inactive

ruse-traveler requested review from Chao1009, veprbl and steinber February 5, 2025 15:13

veprbl reviewed Feb 6, 2025

src/algorithms/calorimetry/TrackClusterMergeSplitter.cc (outdated, resolved)

veprbl reviewed Feb 6, 2025

src/detectors/EHCAL/EHCAL.cc (outdated, resolved)

veprbl reviewed Feb 6, 2025

src/algorithms/calorimetry/TrackClusterMergeSplitter.h (outdated, resolved)

veprbl reviewed Feb 6, 2025

src/algorithms/calorimetry/TrackClusterMergeSplitter.h (outdated, resolved)

ruse-traveler added 2 commits February 11, 2025 13:16

Merge branch 'main' into output-splitmerge-track-associations

b5a364c

Fix HcalEndcapNClusterAssociations typo

8141330

ruse-traveler mentioned this pull request Feb 11, 2025

Cluster Shape Parameter Calculation Could Be Factorized #1733

Closed

ruse-traveler added 2 commits February 11, 2025 15:19

Merge branch 'output-splitmerge-track-associations' of github.com:eic…

1c17703

…/EICrecon into output-splitmerge-track-associations

Template ObjectID comparator

0c75eb9

ruse-traveler temporarily deployed to github-pages February 11, 2025 21:23 — with GitHub Actions Inactive

ruse-traveler and others added 3 commits February 22, 2025 07:02

Rewrite splitting weight calculation to use maps

bd55165

Merge branch 'main' into output-splitmerge-track-associations

32af87f

[pre-commit.ci] auto fixes from pre-commit.com hooks

cc282d2

for more information, see https://pre-commit.ci

pre-commit-ci bot temporarily deployed to github-pages February 22, 2025 12:38 Inactive

ruse-traveler added 4 commits March 10, 2025 09:58

Merge main

143df2a

Remove shape calculation from merge/splitter

8288600

Merge branch 'output-splitmerge-track-associations' of github.com:eic…

30c9ab2

…/EICrecon into output-splitmerge-track-associations

Make split/merge shape parameters consistent with RecoCoG

9d33912

ruse-traveler temporarily deployed to github-pages March 10, 2025 17:37 — with GitHub Actions Inactive

IWYU

a8537b1

veprbl temporarily deployed to github-pages March 11, 2025 21:59 — with GitHub Actions Inactive

ruse-traveler added 3 commits March 12, 2025 21:15

Merge branch 'main' into output-splitmerge-track-associations

004b053

Disable saving split/merge track-cluster matches to output

87c0458

Merge branch 'output-splitmerge-track-associations' of github.com:eic…

f79708b

…/EICrecon into output-splitmerge-track-associations

ruse-traveler temporarily deployed to github-pages March 13, 2025 02:00 — with GitHub Actions Inactive

Merge branch 'main' into output-splitmerge-track-associations

27711f5

ruse-traveler temporarily deployed to github-pages March 17, 2025 14:35 — with GitHub Actions Inactive

ruse-traveler requested a review from veprbl March 20, 2025 14:43

veprbl reviewed Mar 24, 2025
