Implementation of MIL-NCE loss #2

Open

ChenYutongTHU opened this issue Jan 7, 2022 · 0 comments
Hi, thanks for your interesting work.
I found the implementation of the NCE loss to be somewhat different from what is described in your paper.

  1. The dictionary videos have many redundant entries, since a class label can appear in multiple videos and be collected in multiple batches. I notice that all pairs of BSL-1K and dictionary features sharing the same class label are treated as positive pairs, even when they belong to two different batches, which suggests that some positive pairs can be counted multiple times in the numerator.

    bsldict/loss/loss.py

    Lines 149 to 154 in eea308a

    for i, t in enumerate(num_unique_dicts):
        # find the set of pairs with the current dictionary class label
        curr_dict = targets_dict == t
        # find the bsl1k embeddings that share the same class label
        curr_bsl1k = match_multi[:, curr_dict][:, 0]
  2. Pairs from different batches that share the same label are excluded from the denominator, yet they are included in the numerator.

    bsldict/loss/loss.py

    Lines 169 to 172 in eea308a

    # Account for matches that occur in different batches
    pos_neg_mask = (curr_dict.unsqueeze(0) | curr_bsl1k.unsqueeze(1))
    pos_neg_mask *= ~diff_batch_match
    where_mask = torch.where(pos_neg_mask)
  3. Finally, for each batch, the log ratio is computed by iterating over the (duplicated) dictionary entries (num_unique_dicts). According to the paper, it seems more reasonable to iterate over the BSL-1K videos (see the sketch after this list).

    bsldict/loss/loss.py

    Lines 177 to 179 in eea308a

    for i, t in enumerate(num_unique_dicts):
        numerator[i] = torch.logsumexp(distances[pos_mask_list[i]], dim=0)
        denominator[i] = torch.logsumexp(distances[mask_list[i]], dim=0)
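
To make points 2 and 3 concrete, here is a minimal sketch of how I read the MIL-NCE objective: one term per BSL-1K anchor, with the numerator summed over that anchor's positive dictionary candidates and the denominator summed over positives and negatives together. The tensor names (emb_bsl1k, emb_dict, labels_bsl1k, labels_dict) are hypothetical, and this is only my reading of the paper, not the repo's implementation:

    import torch

    def mil_nce_loss(emb_bsl1k, emb_dict, labels_bsl1k, labels_dict, temperature=0.07):
        # emb_bsl1k: (B, D) BSL-1K clip embeddings (anchors)
        # emb_dict:  (M, D) dictionary clip embeddings (candidates)
        # labels_*:  class labels deciding which pairs count as positive
        sim = emb_bsl1k @ emb_dict.t() / temperature                      # (B, M)
        # boolean mask: each (anchor, dictionary) pair appears at most once,
        # so no positive pair can be double-counted in the numerator
        pos_mask = labels_bsl1k.unsqueeze(1) == labels_dict.unsqueeze(0)  # (B, M)
        # numerator: log-sum-exp over the anchor's positive candidates only
        numerator = torch.logsumexp(sim.masked_fill(~pos_mask, float("-inf")), dim=1)
        # denominator: log-sum-exp over positives AND negatives, so every pair
        # in the numerator also appears in the denominator
        denominator = torch.logsumexp(sim, dim=1)
        # keep only anchors that have at least one positive candidate
        valid = pos_mask.any(dim=1)
        return (denominator - numerator)[valid].mean()

With a formulation like this, the duplicate counting from point 1 would also disappear, since the boolean mask selects each (BSL-1K, dictionary) pair at most once.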

Moreover, the difference between using the BSL-1K video and the dictionary video as the anchor for contrastive sampling is not reflected in the code. Is it just a conceptual device for explaining how the positive/negative pairs are constructed?
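
To state my assumption explicitly: with a similarity matrix sim of shape (B, M) as in the sketch above, I would expect the anchor choice to show up only in which dimension the denominator normalizes over, e.g.:

    # BSL-1K clip as anchor: negatives are other dictionary entries (normalize over columns)
    denom_bsl1k_anchor = torch.logsumexp(sim, dim=1)  # shape (B,)
    # dictionary entry as anchor: negatives are other BSL-1K clips (normalize over rows)
    denom_dict_anchor = torch.logsumexp(sim, dim=0)   # shape (M,)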

Thanks a lot for your help~
