Naming convention - Policy #40

kkshyu · 2016-10-23T21:09:12Z

Environment and Policy both contain act method, but they do quite different things.
In my opinion, act is a verb to perform sth. Therefore, in Policy abstract class, it should be a noun action just like bestAction. However, chooseAction and chooseBestAction are good, too.

class Policy(object):
    """Abstract class for all policies, i.e. objects that can take any space as input, and output an action.
    """

    def __init__(self, q_network, n_actions,random_state):
        self.q_network = q_network
        self.n_actions = n_actions
        self.random_state = random_state

        pass

    def bestAction(self, state):
        """ Returns the best Action
        """
        action = self.q_network.chooseBestAction(state)
        V = max(self.q_network.qValues(state))
        return action, V

    def act(self, state):
        """Main method of the Policy class. It can be called by agent.py, given a state,
        and should return a valid action w.r.t. the environment given to the constructor.
        """
        raise NotImplementedError()

The text was updated successfully, but these errors were encountered:

VinF · 2016-10-24T07:35:13Z

I agree that chooseAction may be more appropriate so as to avoid the confusion with the method in Environment(however I don't think chooseBestAction would be appropriate because the policies have to deal with the exploration/exploitation dilemma which is done within that method).

kkshyu · 2016-10-25T02:28:24Z

Did you mean chooseBestAtction have already dealt with exploration/exploitation dilemma?
I thought the dilemma should be implemented in act(chooseAction).
In my opinion, the method bestAction in policies is similar with chooseBestAction in q-network. Actually I don't understand the purpose of this method.

VinF · 2016-10-25T07:07:08Z

Ok, I think I misunderstood your first message. Either we have act--->chooseAction and bestAction--->chooseBestAction or we keep bestAction and we change act--->action. (of course, chooseBestAction or bestAction has already dealt with exploration/exploitation)
And indeed bestAction in policies returns currently directly chooseBestAction in q-network. It's an additional encapsulation but if it's not required in the future, the best is probably to remove it, indeed.
Thanks for the feedback! Either you can do a PR or I'll do the changes.

kkshyu · 2016-10-26T03:50:31Z

#41

kkshyu closed this as completed Oct 26, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Naming convention - Policy #40

Naming convention - Policy #40

kkshyu commented Oct 23, 2016

VinF commented Oct 24, 2016

kkshyu commented Oct 25, 2016

VinF commented Oct 25, 2016

kkshyu commented Oct 26, 2016

Naming convention - Policy #40

Naming convention - Policy #40

Comments

kkshyu commented Oct 23, 2016

VinF commented Oct 24, 2016

kkshyu commented Oct 25, 2016

VinF commented Oct 25, 2016

kkshyu commented Oct 26, 2016