Hi, thanks for your interesting work.
I found the implementation of the NCE loss somewhat different from what is described in your paper.
The dictionary videos contain many redundant entries: a class label can appear in multiple videos and be collected in multiple batches. I noticed that all pairs of BSL-1K and dictionary features sharing the same class label are sampled as positive pairs, even when they belong to two different batches, which suggests that some positive pairs can be counted multiple times in the numerator.
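To illustrate the concern, here is a minimal numpy sketch (with hypothetical labels, not taken from the repo) of how a label-equality positive mask counts the same class pairing more than once when the dictionary contains redundant entries:

```python
import numpy as np

# Hypothetical labels: 3 BSL-1K clips and 4 dictionary entries.
# Class 0 appears in two dictionary entries (a redundant vocabulary),
# so a label-equality mask marks it positive twice for the same anchor.
bsl1k_labels = np.array([0, 1, 2])
dict_labels = np.array([0, 0, 1, 2])  # class 0 duplicated

# Positive mask: pairs that share a class label, as in a
# label-supervised contrastive loss.
pos_mask = bsl1k_labels[:, None] == dict_labels[None, :]

# Anchor 0 (class 0) has two positives; a numerator built from this
# mask would sum both similarity terms for the same sign.
print(pos_mask.sum(axis=1))  # -> [2 1 1]
```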
Finally, for each batch, the log ratio is computed by iterating over the duplicated dictionary entries (num_unique_dicts). According to the paper, it seems more reasonable to iterate over the BSL-1K videos.
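The two iteration orders generally give different batch losses, because they weight clips differently. A small numpy sketch with made-up per-pair loss terms (purely illustrative, not the repo's code):

```python
import numpy as np

# Hypothetical per-pair NCE terms for 2 BSL-1K clips x 3 dictionary
# entries; clip 0 has two matching dictionary entries, clip 1 has one.
# NaN marks non-positive pairs that contribute no term.
nce_terms = np.array([[0.5, 1.0, np.nan],
                      [np.nan, np.nan, 2.0]])

# Iterating over dictionary entries: every (duplicated) entry
# contributes one term, so clips with more matches dominate.
loss_over_dicts = np.nanmean(nce_terms, axis=0).mean()   # (0.5 + 1.0 + 2.0) / 3

# Iterating over BSL-1K videos: each clip contributes one averaged
# term, regardless of how many dictionary matches it has.
loss_over_videos = np.nanmean(nce_terms, axis=1).mean()  # (0.75 + 2.0) / 2
```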
Moreover, the difference between using a BSL-1K video or a dictionary video as the anchor for contrastive sampling is not reflected in the code. Is it just a concept for better explaining the construction of positive/negative pairs?
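For reference, in a generic InfoNCE over a similarity matrix, the anchor choice corresponds to which axis the softmax denominator is taken over, and the two directions generally give different losses. A hedged sketch (the function and names are hypothetical, not the repo's API):

```python
import numpy as np

def info_nce(sim, pos_mask, axis):
    """Generic InfoNCE over a similarity matrix `sim`.
    axis=1: BSL-1K clips are anchors (normalize over dictionary entries);
    axis=0: dictionary entries are anchors (normalize over clips).
    Illustrative only."""
    logits = np.exp(sim)
    log_ratio = np.log(logits / logits.sum(axis=axis, keepdims=True))
    # Average the negative log-ratio over the positive pairs.
    return -(log_ratio * pos_mask).sum() / pos_mask.sum()

rng = np.random.default_rng(0)
sim = rng.normal(size=(3, 4))
pos_mask = (np.array([0, 1, 2])[:, None]
            == np.array([0, 0, 1, 2])[None, :]).astype(float)

loss_video_anchor = info_nce(sim, pos_mask, axis=1)
loss_dict_anchor = info_nce(sim, pos_mask, axis=0)
# The two losses generally differ, so the anchor direction is not
# purely conceptual: it changes what is treated as a negative.
```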
Thanks a lot for your help~
Referenced code:

- bsldict/loss/loss.py, lines 149–154 (commit eea308a)
- bsldict/loss/loss.py, lines 169–172 (commit eea308a)
- bsldict/loss/loss.py, lines 177–179 (commit eea308a)