-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update TrackClusterMergeSplitter to output track-cluster associations (PFA0) #1699
base: main
Are you sure you want to change the base?
Conversation
for more information, see https://pre-commit.ci
…/EICrecon into output-splitmerge-track-associations
for more information, see https://pre-commit.ci
…/EICrecon into output-splitmerge-track-associations
Capybara summary for PR 1699
|
for more information, see https://pre-commit.ci
…/EICrecon into output-splitmerge-track-associations
for more information, see https://pre-commit.ci
…/EICrecon into output-splitmerge-track-associations
…/EICrecon into output-splitmerge-track-associations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments here.
// grab total energy | ||
const float eClust = get_cluster_energy(clust) * m_cfg.sampFrac; | ||
// lambda to compare MCParticles | ||
auto compare = [](const edm4hep::MCParticle& lhs, const edm4hep::MCParticle& rhs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use CompareObjectID
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops! Good catch!
// average of positions of clusters to merge | ||
edm4hep::Vector3f rClust = new_clust.getPosition(); | ||
for (const auto& old_clust : old_clusts) { | ||
rClust = rClust + ((old_clust.getEnergy() / eClust) * old_clust.getPosition()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this not completely ignore split_weights? Why not do it in the same loop over hits above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh good catch!
>, | ||
algorithms::Output< | ||
edm4eic::ProtoClusterCollection | ||
edm4eic::ClusterCollection, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, we were discussing that this should not re-implement shape calculation, and that was factorized out of RecoCoG, but a copy of RecoCoG still becomes part of this algorithm? And we don't actually output associations without shapes anymore, which, I believe, was the original point of this PR. This makes me think that we should have just made a factory to propagate associations from protoclusters to clusters instead of doing this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the primary point of this PR was to output specifically track-cluster matches. To do this, this requires updating the algorithm to operate on clusters rather than protoclusters since -- quite reasonably in my opinion! -- we don't have track-protocluster matches. I don't think an association propagator would help us in this context (esp. since it would necessitate a data-model change).
The partial duplication of RecoCoG then followed from that switch to clusters. My intent wasn't so much to completely duplicate the algorithm but to update various quantities in a way that's consistent with what RecoCoG does, including the handling of particle-cluster associations. My thinking is that the merging functionality here is useful beyond PF reconstruction, and so I would prefer that the produced clusters here are consistent with those that RecoCoG produces.
That being said, in the interest of keeping PF development moving, I would be willing to excise the RecoCoG bits and go with a "bare-bones" energy/position reconstruction for the time being. That should be good enough for implementing the rest of the baseline. But we should revisit this topic, though, since this certainly won't be the last reclustering algorithm we write!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the primary point of this PR was to output specifically track-cluster matches. To do this, this requires updating the algorithm to operate on clusters rather than protoclusters since -- quite reasonably in my opinion! -- we don't have track-protocluster matches. I don't think an association propagator would help us in this context (esp. since it would necessitate a data-model change).
Right, so my suggestion was to add track-protocluster association type. Then a factory (that is separate from RecoCoG) would convert track-protocluster associations to track-cluster associations. I now realize this would require factory to do matching between protoclusters and clusters, which is possible but is not as trivial as I'd like it to be.
The partial duplication of RecoCoG then followed from that switch to clusters. My intent wasn't so much to completely duplicate the algorithm but to update various quantities in a way that's consistent with what RecoCoG does, including the handling of particle-cluster associations. My thinking is that the merging functionality here is useful beyond PF reconstruction, and so I would prefer that the produced clusters here are consistent with those that RecoCoG produces.
Right, but that's not making it better, if we do things "like RecoCoG, but sometimes not". And, if we want to make this the default for non-PF clustering path, then it makes even more sense to use RecoCoG.
That being said, in the interest of keeping PF development moving, I would be willing to excise the RecoCoG bits and go with a "bare-bones" energy/position reconstruction for the time being. That should be good enough for implementing the rest of the baseline. But we should revisit this topic, though, since this certainly won't be the last reclustering algorithm we write!
No need to excise parts of RecoCoG. If we go the way proposed in the PR, it better be fit for your purpose. My counter-argument against expediency is that if TrackClusterMergeSplitter development is not aligned with rest of the facilities in EICrecon, you can't expect everyone else to align with it, which is fine if your plan to do all of development yourself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading my response back, I feel that not only was my tone unprofessional and dismissive, I did not clearly articulate (or articulate at all) some of my questions and concerns, and for that I apologize.
The approach took here was informed by the intent to eventually move this algorithm downstream of a centralized track-cluster matching algorithm. As such, this algorithm reclusters already constructed clusters. This PR was my first thought as to how to do such a reclustering and I sincerely believed it was aligned with our framework. However, it clearly isn't and certainly isn't the best approach.
The points you raised make sense, and it didn't occur to me to go back through the protoclusters. I'm not completely clear on how this might look, though, so help me understand. My first thought is that such an algorithm flow might look something like this:
<start>-->[protoclusters]--(Reco CoG)-->[clusters]----(track matching)-->[cluster matches]--
--(convert clusters)-->[protoclusters]----(copy matches)-->[protocluster matches]--
--(merge/split)-->[protoclusters + matches]-->(Reco CoG)-->[clusters]--
--(copy matches)-->[updated matches]--><end>
Is this in the right direction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading my response back, I feel that not only was my tone unprofessional and dismissive, I did not clearly articulate (or articulate at all) some of my questions and concerns, and for that I apologize.
No, it was all good points and I was able to finally understand some of these things better. I did have to push back on merge splitter being able to do its thing in isolation.
The approach took here was informed by the intent to eventually move this algorithm downstream of a centralized track-cluster matching algorithm. As such, this algorithm reclusters already constructed clusters. This PR was my first thought as to how to do such a reclustering and I sincerely believed it was aligned with our framework. However, it clearly isn't and certainly isn't the best approach.
I'm starting to think that edm4eic::ProtoCluster
is not such a great type. edm4eic::Cluster
is already a superset of it, and we could have used just that one type.
The points you raised make sense, and it didn't occur to me to go back through the protoclusters. I'm not completely clear on how this might look, though, so help me understand. My first thought is that such an algorithm flow might look something like this:
<start>-->[protoclusters]--(Reco CoG)-->[clusters]----(track matching)-->[cluster matches]-- --(convert clusters)-->[protoclusters]----(copy matches)-->[protocluster matches]-- --(merge/split)-->[protoclusters + matches]-->(Reco CoG)-->[clusters]-- --(copy matches)-->[updated matches]--><end>
Is this in the right direction?
To be honest, I can not tell with one-dimensional graph. This is what I think could work and provide maximal reuse of existing facilities:
flowchart TD
RecHits--> CalorimeterIslandCluster:::algo --> IslandProtoClusters
IslandProtoClusters --> CalorimeterClusterRecoCoG:::algo --> ClustersWithoutShapes
--> CalorimeterClusterShape:::algo
CalorimeterClusterRecoCoG:::algo --> ClusterAssociationsWithoutShapes
--> CalorimeterClusterShape:::algo
--> Clusters
CalorimeterClusterShape:::algo --> ClusterAssociations
Clusters --> TrackClusterMergeSplitter:::algo
ClusterAssociations --> TrackClusterMergeSplitter:::algo
CalorimeterTrackProjections --> TrackClusterMergeSplitter:::algo
--> SplitMergeProtoClusters --> CalorimeterClusterRecoCoG':::algo
TrackClusterMergeSplitter:::algo -->|remove?| SplitMergeProtoClustersAssociations
TrackClusterMergeSplitter:::algo --> TrackSplitMergeProtoClusterMatches
CalorimeterClusterRecoCoG':::algo --> SplitMergeClustersWithoutShapes
--> CalorimeterClusterShape':::algo
CalorimeterClusterRecoCoG':::algo --> SplitMergeClusterAssociationsWithoutShapes
--> CalorimeterClusterShape':::algo
--> SplitMergeClusters
CalorimeterClusterShape':::algo --> SplitMergeClusterAssociations
TrackSplitMergeProtoClusterMatches --> CalorimeterProtoClusterMatchPromotion:::algo --> SplitMergeClusterTrackMatches
SplitMergeClusters --> ...
SplitMergeClusterAssociations --> ...
SplitMergeClusterTrackMatches --> ...
classDef algo fill:#f96
We would need a new "CalorimeterProtoClusterMatchPromotion" algorithm operating on a new type for "TrackSplitMergeProtoClusterMatches".
(you can click quote my message to see source code of mermaid diagram and modify, if you want to adjust it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see! Thanks! The diagram is extremely helpful!!
I did have to push back on merge splitter being able to do its thing in isolation.
Completely fair! I was falling into bad habits of writing big, monolithic pieces of code while working on this PR. The above topology looks a lot more workable!
I'm starting to think that edm4eic::ProtoCluster is not such a great type. edm4eic::Cluster is already a superset of it, and we could have used just that one type.
If we stick with protoclusters for the time being, would we also need an algorithm to "demote" clusters into protoclusters that could be fed into the merge/splitter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we stick with protoclusters for the time being, would we also need an algorithm to "demote" clusters into protoclusters that could be fed into the merge/splitter?
The diagram above has TrackClusterMergeSplitter
consuming a collection of edm4eic::Cluster
s (don't you need cluster positions to do the matching?) and outputting edm4eic::ProtoCluster
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... so demotion happens inside TrackClusterMergeSplitter, at least to what's drawn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhhh I see! Makes sense! That would make it pretty straightforward to keep track of the protocluster energies and positions without having to duplicate getEnergy()
and getPosition()
as functions.
Briefly, what does this PR introduce?
This PR updates the
TrackClusterMergeSplitter
algorithm to output bothedm4eic::TrackClusterMatch
and MC particle-cluster associations. In this process, it reaps what was sown by originally writing the algorithm to operate on protoclusters rather than clusters: the algorithm will now ingest fully formed cluster and update relevant quantities.What kind of change does this PR introduce?
Please check if this PR fulfills the following:
Does this PR introduce breaking changes? What changes might users need to make to their code?
No.
Does this PR change default behavior?
Yes. Track-cluster and MC particle-cluster associations will now be produced by the algorithm.