Abstract
In many e-learning environments, the system’s behavior can be viewed as a sequential decision process in which, at each discrete step, the system must select the next action to take. Pedagogical strategies are policies for deciding which action the system takes next when several are available. We present a Reinforcement Learning (RL) approach for inducing effective pedagogical strategies, together with empirical evaluations of the induced strategies. This paper addresses the technical challenges in applying RL to Cordillera, a Natural Language Tutoring System that teaches students introductory college physics. The algorithm chosen for this project is a model-based RL approach, Policy Iteration, and the training corpus is an exploratory corpus, collected by letting the system make random decisions while interacting with real students. Overall, our results show that, even with a rather small training corpus, the RL-induced policies measurably improved the effectiveness of Cordillera: they significantly improved students’ learning gains.
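As a rough illustration of the pipeline the abstract describes, the sketch below estimates a Markov decision process from logged (state, action, reward, next state) tuples and then runs Policy Iteration on the estimated model. It is a minimal sketch under stated assumptions, not the authors' implementation: the state names, the action labels, the rewards, the four-tuple corpus, the discount factor, and the treatment of unseen state-action pairs are all invented for the example.

```python
from collections import defaultdict


def estimate_mdp(corpus, states, actions):
    """Estimate transition probabilities P and mean rewards R from logged tuples."""
    counts = {(s, a): defaultdict(int) for s in states for a in actions}
    reward_sum = {(s, a): 0.0 for s in states for a in actions}
    for s, a, r, s_next in corpus:
        counts[(s, a)][s_next] += 1
        reward_sum[(s, a)] += r
    P, R = {}, {}
    for key, nexts in counts.items():
        n = sum(nexts.values())
        if n == 0:
            # Assumption of this sketch: an unseen (state, action) pair
            # is modeled as a zero-reward self-loop.
            P[key] = {key[0]: 1.0}
            R[key] = 0.0
        else:
            P[key] = {s2: c / n for s2, c in nexts.items()}
            R[key] = reward_sum[key] / n
    return P, R


def q_value(P, R, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return R[(s, a)] + gamma * sum(p * V[s2] for s2, p in P[(s, a)].items())


def policy_iteration(P, R, states, actions, gamma=0.9, tol=1e-10):
    """Alternate policy evaluation and greedy improvement until the policy is stable."""
    policy = {s: actions[0] for s in states}
    V = {s: 0.0 for s in states}
    while True:
        # Policy evaluation: iterate the Bellman backup for the current policy.
        while True:
            delta = 0.0
            for s in states:
                v_new = q_value(P, R, V, s, policy[s], gamma)
                delta = max(delta, abs(v_new - V[s]))
                V[s] = v_new
            if delta < tol:
                break
        # Policy improvement: switch only when another action is strictly better.
        stable = True
        for s in states:
            best = max(actions, key=lambda a: q_value(P, R, V, s, a, gamma))
            if q_value(P, R, V, s, best, gamma) > q_value(P, R, V, s, policy[s], gamma) + 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


# Hypothetical toy corpus: two student states and two tutor actions.
states = ["low", "high"]
actions = ["elicit", "tell"]
corpus = [
    ("low", "elicit", 0.0, "low"),
    ("low", "tell", 1.0, "high"),
    ("high", "elicit", 2.0, "high"),
    ("high", "tell", 0.0, "low"),
]
P, R = estimate_mdp(corpus, states, actions)
policy, V = policy_iteration(P, R, states, actions)
print(policy)
```

Because the model is estimated from counts, a state-action pair that the exploratory corpus rarely visits gets an unreliable estimate; collecting the corpus with random decisions, as the abstract describes, is one way to make sure every pair is sampled.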
Cite this article
Chi, M., VanLehn, K., Litman, D. et al. Empirically evaluating the application of reinforcement learning to the induction of effective and adaptive pedagogical strategies. User Model User-Adap Inter 21, 137–180 (2011). https://doi.org/10.1007/s11257-010-9093-1
DOI: https://doi.org/10.1007/s11257-010-9093-1