-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix drop log #9768
base: main
Are you sure you want to change the base?
Fix drop log #9768
Conversation
Yes this looks like a bug to me
Agreed, I wouldn't bother appending more and more
Agreed we shouldn't go this route unless we really need to / have strong motivating use cases at some point. |
@jsosulski FYI we're planning on releasing 0.24 in just over a week. Do you have time to work on this in that time frame? If not, do you want to work on it afterward for the next release, or would you prefer someone else try to help / take over? |
I will check today evening if I can fix the unit tests while still keeping their semantic logic the same as before. If not I will let you know. |
I managed to fix the unit test for my case but broke several others by doing so. So either I will retry for a future release, or someone else can try it. |
select_all = list(range(len(inst.drop_log))) | ||
for k in np.setdiff1d(select_all, key_selection): | ||
if reason not in drop_log[k]: | ||
drop_log[k] += (reason,) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this fix has an issue. Take for example, epochs1 = Epochs(raw, events, dict(a=1, b=2))
we could drop some events for event type a
, e.g., due to MEG2663. Then imagine we do epochs1['a']
. When this happens, this code will go over all indices that are not in selection
, but we know that some of the ones not in selection
were originally a
type -- and these will end up having 'IGNORED'
added to their drop log even though they shouldn't.
I'm not sure of a good way to fix this, but I'll push a test in a separate PR that exposes the problem in this one at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. Maybe I could add a case differentiation depending on whether the selection was done by class/event or by index. In the latter case, this issue cannot appear as we cannot select dropped epochs by index.
Btw, am I missing any other epochs[idx]
possibilities? I.e. idx
can be: string -> select by one or multiple events or it can be: slice/int/range -> select by epoch index.
I just saw in the docs idx
can be a pandas query, in which case I dont know if unselected epochs should get the 'IGNORED'
tag or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m I missing any other epochs[idx] possibilities
For this first I would look at the doc
https://mne.tools/stable/generated/mne.Epochs.html#mne.Epochs.__getitem__
idx can be a pandas query, in which case I dont know if unselected epochs should get the 'IGNORED' tag or not.
... then for the real answer to all these dig into mne.utils.mixin.py:GetEpochsMixin._getitem
After some detachment of this issue, I think I was trying to misuse the droplog for my purposes. Do you prefer I force push my changes (needed to rebase locally) to this PR, or close it and create a fresh one? |
Force push is better I think, keeps all discussion in one place |
Nevermind, I cant figure out a way to make this work without re-structuring the drop_log / selection / events setup, because lets say you have first five epochs as target events, and the next five epoch as nontargets, and the last 2 of each class are bad: epo # 5 target epochs, 5 nontarget epochs
epo.drop_bad() # 3 target epochs, 3 nontarget epochs
epo.selection # is now [0, 1 ,2, 5, 6, 7]
len(epo.events) # is now 6
epo.drop_log # is now ((), (), (), ('d'), ('d'), (), (), (), ('d'), ('d'))
epo_t = epo['target'] # 3 target epochs, 0 nontarget epochs
epo_t.selection # is now [0, 1, 2]
# Current MNE Version:
epo_t.drop_log # is now ((), (), (), ('d'), ('d'), ('IGNORED'), ('IGNORED'), ('IGNORED'), ('d'), ('d'))
# This PR as of now:
epo_t.drop_log # is now ((), (), (), ('IGNORED','d'), ('IGNORED','d'), ('IGNORED'), ('IGNORED'), ('IGNORED'), ('IGNORED','d'), ('IGNORED','d')) A workaround would be to call What I would need is to have the event-wise information for all ('d') drop log entries, but the events have been dropped as far as I can tell, and I can no longer conclude which of the drops belonged to which event. Or am I missing something? The only solution I can think of is to have the original events array somewhere in the epochs object available. Without this information I think only having |
For this specific use case could you make use of
I could see a |
Sure I can do that. I think I could also call A |
It's not a good idea to use private functions, if this is the preferred solution we'd want to make it public for you. I'm not sure what the best solution is here, what do you think allows us to make the smallest API changes / expansion while still enabling you to do what you need (and also preventing other users from being confused or making mistakes) @jsosulski ? |
It is no urgent need for my use case, as I can just call After thinking about it a bit more, I think the core issues is that Then drop log stats could then simply use Anyway this is a imo larger change, and to preserve backwards compatibility I would instead add, e.g., |
I think there are many (years worth)
Okay, want to update the PR this way? I don't fully follow the case(s) that have changed, I'm hoping that the updated tests will make it clearer. Another complementary option would be to add a short section in the tutorial (or example) docs about how to get the functionality and information you need using the current way of doing things (or way of doing things with the fix in this PR). |
Reference issue
Fixes #9726
What does this implement/fix?
Instead of adding
'IGNORED'
only to empty tuples in the drop log, this will append it to all drop log entries when slicing an epoch object to a subset.Additional information
I fixed one epoch unit test, which was straight forward, however the second one that fails is
test_drop_epochs_mult
where there is a section I don't really get:mne-python/mne/tests/test_epochs.py
Lines 2367 to 2373 in 6c0006d
There are several comparisons to empty lists
[]
which, as far as I understand the drop log correctly, can never be true/false as the entries can only be empty tuples, not empty lists. I could take a stab at fixing this unit test as well, once I understand it correctly.Also currently I assume that a drop reason should be present only once in a drop log entry, however, one could argue that multiples are okay, to e.g. reconstruct the order of subset selections if for some reason someone would be interested in this.
OT: Probably too big of a change, but using
'UNSELECTED'
or'DESELECTED'
as a drop reason when slicing epochs would make the drop log explain more precisely why the epochs were dropped. One could then add'IGNORED'
as a special case that incorporates'UNSELECTED'
and other common reasons, or just use an enum.