ENH: add schema for "projection" entry in start document #130
This PR could use a description or a link to some meeting notes if there are any.
On a Pilot call @dylanmcreynolds nudged us to move forward with this. I am personally happy with it, but given the importance and the cost of any future changes, I think we should have at least one more meeting to pick apart the structure and the names and consider alternatives.
I would like to see how this maps onto a more developed schema such as the NeXus SAS definition (https://manual.nexusformat.org/classes/applications/NXsas.html).
"type": "object",
"properties" : {
"stream": {"type": "string"},
"location": {"enum" : ["event", "configuration"]},
What if it's in the stop document? Or elsewhere in the EventDescriptor, such as the `source` or `shape`? Would a dotted object representation be simpler and more comprehensive?
Being in the start document is a far more compelling argument.
I'm skeptical of embracing the dot-ness (we have previously agreed that dot access on dicts is not great), and the code to munge that back to something we can actually use will be annoying, but on the other hand we can write the function once and stuff it in databroker.
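For illustration, the "write the function once" idea could look roughly like the sketch below. It assumes a dotted path names successive keys in nested dicts and that no key itself contains a dot. The name `resolve_dotted` and the sample document are hypothetical; neither comes from event-model or databroker.

```python
from typing import Any, Mapping


def resolve_dotted(doc: Mapping[str, Any], path: str) -> Any:
    """Walk nested mappings following a dotted path such as 'a.b.c'.

    Hypothetical helper: assumes no key itself contains a dot.
    Raises KeyError if a segment is missing or its parent is not a mapping.
    """
    current: Any = doc
    for segment in path.split("."):
        if not isinstance(current, Mapping):
            raise KeyError(f"{segment!r}: parent of this segment is not a mapping")
        current = current[segment]
    return current


# Made-up descriptor-like document, for illustration only.
descriptor = {"data_keys": {"image": {"source": "PV:DET1", "shape": [512, 512]}}}

shape = resolve_dotted(descriptor, "data_keys.image.shape")
source = resolve_dotted(descriptor, "data_keys.image.source")
```

A real implementation would also have to decide what to do when a key contains a literal dot, which this sketch does not handle.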
"required" : ["stream", "location", "field"],
"additionalProperties": false
},
"technique": {
The term "technique" might be too limiting. This provides a generic mechanism for mapping any externally-defined metadata schema to the contents of the documents to come. Those schemas might be broken up by experimental technique, by downstream analysis process (applicable to more than one technique such as "scattering" and "diffraction"), by institution, by domain, etc.
"data_remapping", "datamap", "DataMap", "Application", "application_definitions"?
I think we have a bunch of helper functions like `list_applications(h)`, `iter_applications(h, name=None)` that yield dicts (?) full of {base types or xarrays}?
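As a rough sketch of what such helpers might look like: the code below assumes `h` is a plain dict whose start document carries a list of named mappings under a `"projections"` key. That layout, the `"name"` key, and the sample header are all placeholders and are not confirmed by this PR. The sketch yields only the mapping entries; resolving them into base types or xarrays is left out.

```python
from typing import Any, Dict, Iterator, List, Optional

# Hypothetical header layout; none of these key names are confirmed by the PR.
Header = Dict[str, Any]


def list_applications(h: Header) -> List[str]:
    """Return the name of every application mapping declared in the header."""
    return [entry["name"] for entry in h["start"].get("projections", [])]


def iter_applications(h: Header, name: Optional[str] = None) -> Iterator[Dict[str, Any]]:
    """Yield each application mapping as a dict, optionally filtered by name."""
    for entry in h["start"].get("projections", []):
        if name is None or entry["name"] == name:
            yield entry


# Made-up header, for illustration only.
header = {
    "start": {
        "projections": [
            {"name": "example_sas", "projection": {"intensity": "det_image"}},
            {"name": "example_other", "projection": {}},
        ]
    }
}

names = list_applications(header)
selected = list(iter_applications(header, name="example_sas"))
```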
👍 for @stuartcampbell's request to use this on a fully-worked example or three before we commit to it.
@tacaswell on Slack:
I am in favor of proceeding on this but doing so in a separate experimental document type that can evolve quickly and make breaking changes as needed. I think @tacaswell suggested this in passing on Slack, as I tersely documented at the time, above.
There was a call on this subject today. Things there seemed to be strong consensus on:
Things that might need more thought or discussion to build strong consensus:
Things that need investigation:
First off, can someone please edit the top box here and describe clearly the intent of this PR, as previously requested? Without this focus stated, the discussion is not focused.
I have heard that NeXus has deprecated XML backends. This leads to the question of how much transformation to XML you plan to support. Namespacing is great, but there seems to be a huge debate on how or if to represent that in JSON. It seems like you could punt on some complexity if you don't intend to output XML.
That's right. HDF5 is now the only supported on-disk format for NeXus data files. The decision to drop the XML backend was made between 2012 and 2014-08. (Can't find the specific decision in the notes yet.)
Provide a semantic mapping of the keys in the collected data to a known set of keys drawn from an externally owned vocabulary.
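To make the intent concrete, here is a minimal sketch of such a mapping being applied. The `stream`, `location`, and `field` properties come from the schema under review; the vocabulary keys, field names, the `project` function, and the layout of `collected` are made up for illustration and do not describe an existing API.

```python
from typing import Any, Dict

# Hypothetical projection: vocabulary key -> where the value lives.
projection: Dict[str, Dict[str, str]] = {
    "sample_detector_distance": {
        "stream": "primary",
        "location": "configuration",
        "field": "det_distance",
    },
    "intensity": {
        "stream": "primary",
        "location": "event",
        "field": "det_image_sum",
    },
}

# Made-up collected data for one stream, split by location.
collected: Dict[str, Dict[str, Dict[str, Any]]] = {
    "primary": {
        "event": {"det_image_sum": 1234.5, "motor_x": 0.1},
        "configuration": {"det_distance": 2.0},
    }
}


def project(
    projection: Dict[str, Dict[str, str]],
    collected: Dict[str, Dict[str, Dict[str, Any]]],
) -> Dict[str, Any]:
    """Return {vocabulary_key: value} by following each mapping entry."""
    return {
        key: collected[spec["stream"]][spec["location"]][spec["field"]]
        for key, spec in projection.items()
    }


mapped = project(projection, collected)
```

Fields that the projection does not mention, such as `motor_x` here, are simply absent from the mapped result.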