
Grad-CAM extended for ConvNet-RNN structures (optionally) #45

Open · wants to merge 1 commit into base: master

Conversation

evaldsurtans

No description provided.

@raghakot
Owner

Can you explain the use-case for this? I can't glean much info from the commit.

@evaldsurtans
Author

For example, a ConvNet-RNN looks like this:

model_target.add(TimeDistributed(Conv2D(32, (8, 8), strides=(4, 4), kernel_initializer=glorot_uniform(seed=init_seed), padding='same', activation='relu',
                                        input_shape=(high_dimensions_width, high_dimensions_height, high_dimensions_channels)), input_shape=(params['frames_back'], high_dimensions_width, high_dimensions_height, high_dimensions_channels)))
model_target.add(TimeDistributed(Conv2D(64, (4, 4), strides=(2, 2), kernel_initializer=glorot_uniform(seed=init_seed), activation='relu')))
model_target.add(TimeDistributed(Conv2D(64, (3, 3), kernel_initializer=glorot_uniform(seed=init_seed), padding='same', activation='relu')))
model_target.add(TimeDistributed(Reshape((-1,))))
model_target.add(LSTM(512, kernel_initializer=glorot_uniform(seed=init_seed), recurrent_initializer=orthogonal(seed=init_seed), input_shape=(params['frames_back'], low_dimensions_state), return_sequences=True, dropout=params['dropout'], recurrent_dropout=params['dropout']))
model_target.add(LSTM(512))
model_target.add(Dense(dimensions_actions, kernel_initializer=glorot_uniform(seed=init_seed), name='DenseLinear'))
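The `TimeDistributed(Reshape((-1,)))` step in the model above flattens each frame's conv feature map into a vector, so the LSTM receives a `(batch, time_steps, features)` sequence. A minimal NumPy sketch of that reshape (the feature-map shape here is an illustrative assumption, not taken from the model):

```python
import numpy as np

# Hypothetical conv output shape: (batch, time_steps, height, width, channels).
conv_out = np.zeros((1, 5, 6, 6, 64))

# TimeDistributed(Reshape((-1,))) flattens everything after the time axis,
# yielding the (batch, time_steps, features) sequence an LSTM expects.
lstm_in = conv_out.reshape(conv_out.shape[0], conv_out.shape[1], -1)
print(lstm_in.shape)  # (1, 5, 2304)
```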

This is how it can be used:

seed_img = Image.fromarray(env.getScreenRGB())
seed_img = seed_img.convert('L').convert('RGB')
seed_img_arr = np.asarray(seed_img).astype('uint8')

action_idx = np.argmax(raw_q_values)

# x_input.shape = (1, 5, 48, 48, 3) (batch_size, time_steps, pixels_width, pixels_height, pixel_channels)

heatmap = visualize_cam(model_target, layer_idx, [action_idx], seed_img_arr, alpha=0.3, input_data_rnn=x_input)
heatmap_img = Image.fromarray(np.transpose(np.array(heatmap), axes=[1, 0, 2]))
timestamp = time.time()
seed_img = Image.fromarray(np.transpose(np.array(env.getScreenRGB()), axes=[1, 0, 2]))

composite_img = Image.new("RGB", (seed_img.size[0] * 2, seed_img.size[1]))
composite_img.paste(heatmap_img, (0, 0))
composite_img.paste(seed_img, (seed_img.size[0], 0))

This is what the output looks like in a 3D maze where the agent focuses on red doors:

[image: Grad-CAM heatmap shown beside the game frame]

@raghakot
Copy link
Owner

Nice. Looks like the model is trained using reinforcement learning. It would be really cool to have an example for this in examples/ if your code is not confidential or proprietary.

So what's the difference between model.input and input_data_rnn? From the code, it appears that you are using it to do this:

model_input = input_data_rnn[-1]
heatmap = heatmap[-1]

I don't quite understand what that does. Also, there was an API change; you should rebase. The code no longer tries to overlay the heatmap, since folks can use this to find heatmaps on non-images or video frames as well.

With the new code, the heatmap will have the same shape as x_input and the overlaying part can be done outside.
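Since the heatmap now matches x_input's shape, the overlay step moves to the caller. A minimal NumPy sketch of blending the last time step's heatmap over a frame (the shapes and constant-valued arrays are assumptions for illustration):

```python
import numpy as np

# Hypothetical: heatmap returned with the same shape as x_input,
# (batch, time_steps, height, width, channels), values in 0..255.
heatmap = np.full((1, 5, 48, 48, 3), 200, dtype=np.uint8)
frame = np.full((48, 48, 3), 100, dtype=np.uint8)

# Blend the last time step's heatmap over the frame at 30% opacity,
# mirroring what visualize_cam's alpha parameter used to do internally.
alpha = 0.3
overlay = ((1 - alpha) * frame + alpha * heatmap[0, -1]).astype(np.uint8)
print(overlay.shape)  # (48, 48, 3)
```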

@keisen keisen requested a review from raghakot September 1, 2018 09:12