
Grad-CAM extended for ConvNet-RNN structures (optionally) #45

Open · wants to merge 1 commit into base: master

Conversation

evaldsurtans

No description provided.

@raghakot
Owner

Can you explain the use-case for this? I can't glean much info from the commit.

@evaldsurtans
Author

For example, a ConvNet-RNN looks like this:

model_target.add(TimeDistributed(Conv2D(32, (8, 8), strides=(4, 4), kernel_initializer=glorot_uniform(seed=init_seed), padding='same', activation='relu',
                                        input_shape=(high_dimensions_width, high_dimensions_height, high_dimensions_channels)), input_shape=(params['frames_back'], high_dimensions_width, high_dimensions_height, high_dimensions_channels)))
model_target.add(TimeDistributed(Conv2D(64, (4, 4), strides=(2, 2), kernel_initializer=glorot_uniform(seed=init_seed), activation='relu')))
model_target.add(TimeDistributed(Conv2D(64, (3, 3), kernel_initializer=glorot_uniform(seed=init_seed), padding='same', activation='relu')))
model_target.add(TimeDistributed(Reshape((-1,))))
model_target.add(LSTM(512, kernel_initializer=glorot_uniform(seed=init_seed), recurrent_initializer=orthogonal(seed=init_seed), input_shape=(params['frames_back'], low_dimensions_state), return_sequences=True, dropout=params['dropout'], recurrent_dropout=params['dropout']))
model_target.add(LSTM(512))
model_target.add(Dense(dimensions_actions, kernel_initializer=glorot_uniform(seed=init_seed), name='DenseLinear'))
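The `TimeDistributed(Reshape((-1,)))` step in the model above flattens each frame's conv feature map into a vector, so the LSTM receives a `(batch, time_steps, features)` sequence. A minimal NumPy sketch of that reshape (the feature-map shape here is an illustrative assumption, not taken from the model):

```python
import numpy as np

# Hypothetical conv output shape: (batch, time_steps, height, width, channels).
conv_out = np.zeros((1, 5, 6, 6, 64))

# TimeDistributed(Reshape((-1,))) flattens everything after the time axis,
# yielding the (batch, time_steps, features) sequence an LSTM expects.
lstm_in = conv_out.reshape(conv_out.shape[0], conv_out.shape[1], -1)
print(lstm_in.shape)  # (1, 5, 2304)
```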

This is how it can be used:

seed_img = Image.fromarray(env.getScreenRGB())
seed_img = seed_img.convert('L').convert('RGB')
seed_img_arr = np.asarray(seed_img).astype('uint8')

action_idx = np.argmax(raw_q_values)

# x_input.shape = (1, 5, 48, 48, 3) (batch_size, time_steps, pixels_width, pixels_height, pixel_channels)

heatmap = visualize_cam(model_target, layer_idx, [action_idx], seed_img_arr, alpha=0.3, input_data_rnn=x_input)
heatmap_img = Image.fromarray(np.transpose(np.array(heatmap), axes=[1, 0, 2]))
timestamp = time.time()
seed_img = Image.fromarray(np.transpose(np.array(env.getScreenRGB()), axes=[1, 0, 2]))

composite_img = Image.new("RGB", (seed_img.size[0] * 2, seed_img.size[1]))
composite_img.paste(heatmap_img, (0, 0))
composite_img.paste(seed_img, (seed_img.size[0], 0))

This is what the output looks like in a 3D maze where the agent focuses on red doors:

[image: Grad-CAM heatmap shown beside the game frame]

@raghakot
Copy link
Owner

Nice. Looks like the model is trained using reinforcement learning. It would be really cool to have an example for this in examples/ if your code is not confidential or proprietary.

So what's the difference between model.input and input_data_rnn? From the code, it appears that you are using it to do this:

model_input = input_data_rnn[-1]
heatmap = heatmap[-1]

I don't quite understand what that does. Also, there was an API change; you should rebase. The code no longer tries to overlay the heatmap, since folks can use this to find heatmaps on non-images or video frames as well.

With the new code, the heatmap will have the same shape as x_input and the overlaying part can be done outside.
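Since the heatmap now matches x_input's shape, the overlay step moves to the caller. A minimal NumPy sketch of blending the last time step's heatmap over a frame (the shapes and constant-valued arrays are assumptions for illustration):

```python
import numpy as np

# Hypothetical: heatmap returned with the same shape as x_input,
# (batch, time_steps, height, width, channels), values in 0..255.
heatmap = np.full((1, 5, 48, 48, 3), 200, dtype=np.uint8)
frame = np.full((48, 48, 3), 100, dtype=np.uint8)

# Blend the last time step's heatmap over the frame at 30% opacity,
# mirroring what visualize_cam's alpha parameter used to do internally.
alpha = 0.3
overlay = ((1 - alpha) * frame + alpha * heatmap[0, -1]).astype(np.uint8)
print(overlay.shape)  # (48, 48, 3)
```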

@keisen keisen requested a review from raghakot September 1, 2018 09:12