-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dmdt feature code to lcstats #262
Add dmdt feature code to lcstats #262
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't seem to see the latest dmdt version. What is it called? It should be something like v202302??, right?
Also, we had removed 1/145,2/145,3/145 days that had been originally used with CRTS because of our cadence. However, if we wish to take advantage of the fast cadence where we sit at one location, we may want to introduce a couple bins at that end. When there is no high-cadence data those set of bins will be empty, but will not hurt the classification.
@AshishMahabal The current default for the bins is |
@AshishMahabal I should also mention that following the recipe of the first Scope paper, before generating features I plan to drop all but the first datapoint in a group of points separated by less than 30 mins. Thus there may be limited use for the short-cadence bins right now. |
Hi @AshishMahabal, please let me know if the intervals under |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. We should consider using the high cadence sequences at least separately. May be a dmdt just when there is high cadence data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question if you want the bins in the config instead.
tools/featureGeneration/lcstats.py
Outdated
warnings.filterwarnings("ignore") | ||
|
||
|
||
# Define bin intervals for dmdt histograms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this better in the config? Or do you prefer it in the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems better to put it in the config now that you mention it.
This PR adds the
compute_dmdt
function (adapted from https://github.com/dmitryduev/kowalski-archived/blob/7b9582fae6cd37bbd334bca228ef429d96e0e498/kowalski/dev/ingest_ZTF_source_features.py#L439) tolcstats.py
. Anumba
dependency is also added to the scope requirements to satisfy behind-the-scenes code for this feature.The default dmdt bin intervals have been changed from
v20200318
tov20200205
, allowing for longer-timescale variability to be distinguished on the dmdt histograms. See the added code below for the exact bin intervals.