Abstract: For large-scale problems, Q-Learning often suffers from the curse of dimensionality because of the large number of possible state-action pairs. This paper develops a multiresolution state-space discretization method for Q-Learning, an episodic unsupervised learning method, in which the state space is adaptively discretized by progressively finer grids around the areas of interest within the state (learning) space. Optimality of the learning algorithm is addressed by a cost function. The method is applied to a morphing airfoil with two morphing parameters (two state variables). When the area of interest is defined by the goal the agent seeks, the method learns a specific goal to within ±0.002 while reducing the total number of state-action pairs needed to achieve this level of specificity by almost 90%.
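
The idea of discretizing with progressively finer grids around a goal can be illustrated with a minimal sketch. This is not the paper's algorithm: the refinement rule, the function names (`nested_grids`, `count_pairs`, `finest_spacing`), and the parameters `points_per_axis`, `levels`, `shrink`, and `n_actions` are all assumptions chosen for illustration. Each level covers a smaller box centered on the goal with the same number of grid points, so the spacing near the goal shrinks while the number of stored state-action pairs grows only linearly with the number of levels.

```python
def linspace(lo, hi, n):
    """Return n evenly spaced points from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def nested_grids(lower, upper, goal, points_per_axis=5, levels=3, shrink=0.25):
    """Build `levels` grids; each one covers a box centered on `goal` whose
    width is `shrink` times the previous box (hypothetical refinement rule)."""
    lower, upper = list(lower), list(upper)
    grids = []
    for _ in range(levels):
        # One list of grid coordinates per state variable at this level.
        grids.append([linspace(lo, hi, points_per_axis)
                      for lo, hi in zip(lower, upper)])
        # Shrink the box around the goal, clipped to the current box.
        new_lower, new_upper = [], []
        for lo, hi, g in zip(lower, upper, goal):
            half = shrink * (hi - lo) / 2.0
            new_lower.append(max(lo, g - half))
            new_upper.append(min(hi, g + half))
        lower, upper = new_lower, new_upper
    return grids


def count_pairs(grids, n_actions):
    """Upper bound on state-action pairs: grid points summed over all levels
    (points shared between levels are counted more than once) times actions."""
    total = 0
    for axes in grids:
        n_states = 1
        for axis in axes:
            n_states *= len(axis)
        total += n_states
    return total * n_actions


def finest_spacing(grids):
    """Grid spacing of the last (finest) level."""
    return min(axis[1] - axis[0] for axis in grids[-1])


if __name__ == "__main__":
    # Two state variables on the unit square, goal at the center, and an
    # assumed action set of size 4.
    grids = nested_grids((0.0, 0.0), (1.0, 1.0), (0.5, 0.5))
    print("state-action pairs:", count_pairs(grids, n_actions=4))
    print("finest spacing:", finest_spacing(grids))
```

A single uniform grid with the same finest spacing over the whole domain would need far more points per axis, which is the source of the savings in state-action pairs; the exact reduction depends on the assumed parameters and need not match the figure reported in the paper.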