Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add dmdt feature code to lcstats #262

Merged
merged 11 commits into from
Mar 7, 2023

Conversation

bfhealy
Copy link
Collaborator

@bfhealy bfhealy commented Feb 21, 2023

This PR adds the compute_dmdt function (adapted from https://github.com/dmitryduev/kowalski-archived/blob/7b9582fae6cd37bbd334bca228ef429d96e0e498/kowalski/dev/ingest_ZTF_source_features.py#L439) to lcstats.py. A numba dependency is also added to the scope requirements to satisfy behind-the-scenes code for this feature.

The default dmdt bin intervals have been changed from v20200318 to v20200205, allowing for longer-timescale variability to be distinguished on the dmdt histograms. See the added code below for the exact bin intervals.

Copy link
Collaborator

@AshishMahabal AshishMahabal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't seem to see the latest dmdt version. What is it called? It should be something like v202302??, right?
Also, we had removed 1/145,2/145,3/145 days that had been originally used with CRTS because of our cadence. However, if we wish to take advantage of the fast cadence where we sit at one location, we may want to introduce a couple bins at that end. When there is no high-cadence data those set of bins will be empty, but will not hurt the classification.

@bfhealy
Copy link
Collaborator Author

bfhealy commented Feb 21, 2023

@AshishMahabal The current default for the bins is v20200205 - I didn't rename it yet from the date that was in the code. I will take a look at adding some of those faster cadence bins again and create a new dictionary entry with something like v20230221. Should we still maintain a 26x26 shape for each histogram?

@bfhealy
Copy link
Collaborator Author

bfhealy commented Feb 21, 2023

@AshishMahabal I should also mention that following the recipe of the first Scope paper, before generating features I plan to drop all but the first datapoint in a group of points separated by less than 30 mins. Thus there may be limited use for the short-cadence bins right now.

@bfhealy
Copy link
Collaborator Author

bfhealy commented Feb 22, 2023

Hi @AshishMahabal, please let me know if the intervals under v20230221 are appropriate. They would maintain a 26x26 histogram, use the same magnitude bins as set for DR5, and extend the time baseline out to 2000 days.

@bfhealy bfhealy temporarily deployed to Integrate Pull Request March 1, 2023 17:56 — with GitHub Actions Inactive
@bfhealy bfhealy temporarily deployed to Integrate Pull Request March 1, 2023 17:57 — with GitHub Actions Inactive
@bfhealy bfhealy requested a review from AshishMahabal March 6, 2023 22:43
@bfhealy bfhealy temporarily deployed to Integrate Pull Request March 6, 2023 22:44 — with GitHub Actions Inactive
@bfhealy bfhealy temporarily deployed to Integrate Pull Request March 6, 2023 22:44 — with GitHub Actions Inactive
Copy link
Collaborator

@AshishMahabal AshishMahabal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. We should consider using the high cadence sequences at least separately. May be a dmdt just when there is high cadence data.

@bfhealy bfhealy temporarily deployed to Integrate Pull Request March 7, 2023 17:26 — with GitHub Actions Inactive
@bfhealy bfhealy had a problem deploying to Integrate Pull Request March 7, 2023 17:26 — with GitHub Actions Failure
@bfhealy bfhealy had a problem deploying to Integrate Pull Request March 7, 2023 18:41 — with GitHub Actions Failure
@bfhealy bfhealy temporarily deployed to Integrate Pull Request March 7, 2023 18:41 — with GitHub Actions Inactive
@bfhealy bfhealy requested a review from mcoughlin March 7, 2023 18:43
Copy link
Collaborator

@mcoughlin mcoughlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question if you want the bins in the config instead.

warnings.filterwarnings("ignore")


# Define bin intervals for dmdt histograms
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this better in the config? Or do you prefer it in the code?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems better to put it in the config now that you mention it.

@bfhealy bfhealy temporarily deployed to Integrate Pull Request March 7, 2023 20:15 — with GitHub Actions Inactive
@bfhealy bfhealy temporarily deployed to Integrate Pull Request March 7, 2023 20:15 — with GitHub Actions Inactive
@bfhealy bfhealy merged commit f3c9cc4 into ZwickyTransientFacility:main Mar 7, 2023
@bfhealy bfhealy deleted the migrate-feature-code branch March 8, 2023 23:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants