O. Abul, F. Polat, and R. Alhajj, "Multiagent reinforcement learning
using function approximation," IEEE Trans. Syst., Man, Cybern. C, Appl.
Rev., vol. 30, no. 4, pp. 485-497, Nov. 2000.
 J. Allen, D. Byron, M. Dzikovska, G. Ferguson, L. Galescu, and
A. Stent, "Towards conversational human-computer interaction," AI Magazine,
vol. 22, no. 4, pp. 27-38, 2001.
 L. Busoniu, B. De Schutter, and R. Babuska, "Decentralized reinforcement-learning
control of a robotic manipulator," in Proc. 9th Int. Conf.
Control Autom. Robot. Vis. (ICARCV-06), Singapore, 2006, pp. 1347-
 H. Cuayahuitl, S. Renals, O. Lemon, and H. Shimodaira, "Reinforcement
learning of dialogue strategies using hierarchical abstract machines," in
Proc. of IEEE/ACL SLT, 2006.
 L. V. de Wege, "Learning automata as a framework for multi-agent
reinforcement learning," Master's thesis, Vrije Universiteit Brussel,
 S. Dzeroski, L. De Raedt, and K. Driessens, "Relational reinforcement
learning," Machine Learning, vol. 43, no. 1-2, pp. 7-52, 2001.
 E. Levin, R. Pieraccini, and W. Eckert, "A stochastic model of human-machine
interaction for learning dialog strategies," IEEE Trans. Speech Audio
Processing, vol. 8, no. 1, pp. 11-23, 2000.
 F. Fernandez and L. E. Parker, "Learning in large cooperative multirobot
systems," International Journal of Autonomous Robots, vol. 16,
no. 4, pp. 217-226, 2001.
 D. Goddeau, H. Meng, J. Polifroni, S. Seneff, and I. S. Busayapongcha,
"A form-based dialogue manager for spoken language applications," in
Proc. of ICSLP, Philadelphia, USA, 1996, pp. 701-704.
 J. Henderson, O. Lemon, and K. Georgila, "Hybrid reinforcement/
supervised learning for dialogue policies from communicator data,"
in Workshop on Knowledge and Reasoning in Practical Dialogue
Systems (IJCAI), 2005.
 Y. Ishiwaka, T. Sato, and Y. Kakazu, "An approach to the pursuit
problem on a heterogeneous multiagent system using reinforcement
learning," Robot. Auton. Syst., vol. 43, no. 4, pp. 245-256, 2003.
 M. McTear, "Modelling spoken dialogues with state transition diagrams:
Experiences with the CSLU toolkit," in Proc. of ICSLP, Sydney, Australia,
1998, pp. 1223-1226.
 K. S. Narendra and S. Lakshmivarahan, "Learning automata - a critique,"
Journal of Cybernetics and Information Sciences, pp. 53-66,
 K. S. Narendra and M. A. L. Thathachar, "Learning automata - a survey,"
IEEE Transactions on Systems, Man and Cybernetics-SMC, vol. 4, no. 8,
pp. 323-334, 1974.
 K. S. Narendra and M. A. L. Thathachar, Learning Automata: An Introduction. Englewood Cliffs, NJ:
 A. Nowé, K. Verbeeck, and M. Peeters, "Learning automata as a basis
for multi agent reinforcement learning," Learning and Adaption in Multi-Agent
Systems, pp. 71-85, 2006, ISSN 0302-9743.
 B. J. Oommen and M. Agache, "Continuous and discretized pursuit
learning schemes: Various algorithms and their comparison," IEEE
Transactions on Systems, Man and Cybernetics, Part B, vol. 32, pp.
 T. Paek and D. Chickering, "The Markov assumption in spoken dialogue
management," in 6th SIGDial Workshop on Discourse and Dialogue,
 T. Paek and R. Pieraccini, "Automating spoken dialogue management
design using machine learning: an industry perspective," Speech Communication,
vol. 50, pp. 716-729, 2008.
 O. Pietquin, A Framework for Unsupervised Learning of Dialogue
Strategies. Presses Universitaires de Louvain, 2004.
 O. Pietquin and T. Dutoit, "A probabilistic framework for dialog
simulation and optimal strategy learning," IEEE Transactions on Audio,
Speech and Language Processing, vol. 14, no. 2, pp. 589-599, 2006.
 A. S. Poznyak and K. Najim, Learning Automata and Stochastic
Optimization. Springer, 1997.
 S. Singh, D. Litman, M. Kearns, and M. Walker, "Optimizing dialogue
management with reinforcement learning: experiments with the NJFun
system," Journal of Artificial Intelligence Research, vol. 16, pp. 105-
 J. Schatzmann, K. Weilhammer, M. N. Stuttle, and S. Young, "A survey
of statistical user simulation techniques for reinforcement-learning of
dialogue management strategies," The Knowledge Engineering Review,
vol. 21, pp. 97-126, 2006.
 K. Scheffler and S. Young, "Corpus-based dialogue simulation for
automatic strategy learning and evaluation," in Proc. of the NAACL
Workshop on Adaptation in Dialogue Systems, 2001.
 R. Sutton and A. Barto, Reinforcement Learning: An Introduction. MIT
 H. Tamakoshi and S. Ishii, "Multiagent reinforcement learning applied
to a chase problem in a continuous world," Artif. Life Robot, vol. 5,
no. 4, pp. 202-206, 2001.
 M. A. L. Thathachar and P. S. Sastry, Networks of Learning Automata:
Techniques for Online Stochastic Optimization. Norwell, MA: Kluwer,
 K. Tuyls and A. Nowé, "Evolutionary game theory and multi-agent
reinforcement learning," The Knowledge Engineering Review, vol. 20,
pp. 63-90, 2005, ISSN 0269-8889.
 K. Verbeeck, A. Nowé, P. Vrancx, and M. Peeters, "Multi-automata
learning," in Reinforcement Learning: Theory and Applications. I-Tech
Education and Publishing, 2008.
 K. Verbeeck, P. Vrancx, and A. Nowé, "Networks of learning automata
and limiting games," in Proc. of the 7th ALAMAS Symposium, 2007, pp.
171-182, ISSN 0922-8721.
 M. Walker, D. Litman, C. Kamm, and A. Abella, "PARADISE: A framework
for evaluating spoken dialogue agents," in Proc. of the 35th Annual
Meeting of the Association for Computational Linguistics (ACL-97), 1997,
 C. J. C. H. Watkins and P. Dayan, "Q-learning," Machine Learning,
vol. 8, no. 3, pp. 279-292, 1992.
 R. M. Wheeler and K. S. Narendra, "Decentralized learning in finite
Markov chains," IEEE Transactions on Automatic Control, vol. AC-31,
pp. 519-526, 1986.
 M. Wiering, R. Salustowicz, and J. Schmidhuber, "Reinforcement learning
soccer teams with incomplete world models," Autonomous Robots,
vol. 7, no. 1, pp. 77-88, 1999.