Imitation Learning

Architecture:

Three convolutional layers extract hierarchical features from the input data.
Three fully connected layers perform the final classification.
Batch normalization layers accelerate training.
ReLU and sigmoid activation functions introduce non-linearity between the layers.
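
A minimal sketch of a network with this layout is shown here, assuming PyTorch. The class name PolicyNet, the channel counts, kernel sizes, 96x96 RGB input, four outputs, and the placement of the sigmoid on the output layer are all assumptions made for illustration; the actual definition lives in network.py and may differ.

```python
import torch
import torch.nn as nn


def conv_out(size, kernel, stride):
    """Spatial size after an unpadded convolution."""
    return (size - kernel) // stride + 1


class PolicyNet(nn.Module):
    """Hypothetical layout: 3 conv layers (each with batch norm) + 3 FC layers."""

    def __init__(self, in_channels=3, num_outputs=4, image_size=96):
        super().__init__()
        # Assumed channel counts and kernel sizes, not taken from the repository.
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=5, stride=2),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2),
            nn.BatchNorm2d(64),
            nn.ReLU(),
        )
        # Work out the flattened feature size from the input resolution.
        size = image_size
        for kernel, stride in ((5, 2), (3, 2), (3, 2)):
            size = conv_out(size, kernel, stride)
        n_flat = 64 * size * size

        self.head = nn.Sequential(
            nn.Linear(n_flat, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, num_outputs),
            nn.Sigmoid(),  # assumption: squashes each output into [0, 1]
        )

    def forward(self, x):
        # x: (batch, channels, height, width)
        return self.head(self.features(x).flatten(1))


model = PolicyNet()
model.eval()  # use running batch-norm statistics for inference
with torch.no_grad():
    scores = model(torch.zeros(2, 3, 96, 96))
```

With a sigmoid on the output, each of the outputs can be read as an independent score in [0, 1], which fits a setup where several actions may be active at once. If the actions are mutually exclusive classes, a softmax over raw logits would be the more usual choice.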

After training the regression models, I used DAgger (Dataset Aggregation) to improve the reward.

My DAgger iteration algorithm:

Policy Execution: The learned policy (infer_action) is executed in the environment, generating an episode of behavior.
Expert Correction: An expert provides a corrective action for each state encountered during policy execution. All states in the episode are saved. Expert data is collected using a timer.
Data Aggregation: The new observations and expert actions are added to the training data.
Policy Refinement: The policy is re-trained using the aggregated dataset.
Model Saving: The updated model is saved after each iteration.
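
The five steps above can be sketched on a toy problem. This is not the code in dagger.py: the one-dimensional environment, the scripted expert, the tabular policy, and the in-memory checkpoints list are all hypothetical stand-ins, and infer_action here only borrows the name of the real function.

```python
import random
from collections import Counter, defaultdict

TARGET = 5  # toy goal position on a line of states 0..10


def expert_action(state):
    """Toy expert: step toward TARGET (-1, 0, or +1)."""
    return (TARGET > state) - (TARGET < state)


def train(dataset):
    """Policy refinement: majority expert label per state (tabular policy)."""
    votes = defaultdict(Counter)
    for state, action in dataset:
        votes[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in votes.items()}


def infer_action(policy, state):
    """Act with the learned policy; stay put in states never seen."""
    return policy.get(state, 0)


def run_episode(policy, start, steps=10):
    """Policy execution: roll out the learner and record visited states."""
    state, visited = start, []
    for _ in range(steps):
        visited.append(state)
        state = max(0, min(10, state + infer_action(policy, state)))
    return visited


def dagger(iterations=5, seed=0):
    rng = random.Random(seed)
    # A few initial expert demonstrations near the target.
    dataset = [(s, expert_action(s)) for s in (4, 5, 6)]
    policy = train(dataset)
    checkpoints = []  # stands in for saving the model to disk
    for _ in range(iterations):
        # 1. Policy execution: the learner, not the expert, picks the states.
        states = run_episode(policy, start=rng.randint(0, 10))
        # 2-3. Expert correction and data aggregation.
        dataset.extend((s, expert_action(s)) for s in states)
        # 4. Policy refinement on the aggregated dataset.
        policy = train(dataset)
        # 5. Model saving after each iteration.
        checkpoints.append(dict(policy))
    return policy, dataset, checkpoints


policy, dataset, checkpoints = dagger()
```

The point of the loop is that states are gathered by running the learner, so the expert ends up labelling the states the learned policy actually reaches, including ones the initial demonstrations never covered.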

The DAgger implementation can be found in the dagger.py file.

Packages required:
