Authorable Critiquing for Intelligent Educational Systems

Christopher K. Riesbeck
The Institute for the Learning Sciences
& Department of Computer Science
Northwestern University
Evanston, IL 60201 USA
+1 847 491 3500
riesbeck@ils.nwu.edu

Wolff Dobson
The Institute for the Learning Sciences
& Department of Computer Science
Northwestern University
Evanston, IL 60201 USA
+1 847 491 3500
wolff@cs.nwu.edu

ABSTRACT

An important issue in intelligent interfaces is making them as authorable as non-intelligent interfaces. In this paper, we describe Indie, an authoring tool for intelligent interactive education and training environments, with particular emphasis on how authors create knowledge bases for critiquing student arguments. A central problem was providing authors with tools that supported the entire development process from mock-up to final product. Two key ideas are (1) MVC-based event-action triggers to support a gradual migration from interface-based to model-driven interactions, and (2) rule-based evidence assessment events.

Keywords

Intelligent learning environments, educational systems, authoring tools, goal-based scenarios

INTRODUCTION

Since 1989, the Institute for the Learning Sciences has been developing a number of intelligent interactive environments for education and training, called goal-based scenarios (GBS's) [4, 5]. A GBS is a simulated world in which students learn by doing under the watchful eye of coaches, critics, and experts. In a GBS, a student is given a specific role, a set of goals, an initial scenario, and tools for collecting information, taking action, and making arguments.

As the student acts and recommends, the system tracks what is happening. As needed or as requested by the student, the system uses experts captured on video to guide, coach, critique, and give real world examples of similar situations.

A GBS system is not intended to model a teacher or tutor and is not supposed to be like a classroom. A GBS system is supposed to be like learning on the job, only better because experts are available all the time for help and review.

Intelligent critiquing of the student's actions and conclusions is an important element of a GBS. Herein lies the rub. On the one hand, it's important to give the student the same kinds of options that the real-world task would have. On the other hand, the more complex and varied the student's choices, the more complex the critiquing module has to be, and hence the more complex the knowledge engineering needs.

We believe GBS systems for thousands of domains will need to be built and maintained. Therefore, an important goal for us is the development of authoring tools to let content experts create GBS's with complex student activities with modest knowledge engineering costs and little or no programming.

This paper describes a tool, called Indie, for authoring Investigate and Decide GBS's. In particular, we focus on how Indie is used to author knowledge bases, with special emphasis on the knowledge base used to critique evidence-based arguments.

INVESTIGATE AND DECIDE

In a particular subclass of GBS's, called Investigate and Decide, students have to make a decision based on information gathered from performing experimental tests, interviews, document review, inspections, and so on. When students feel ready, they submit a recommendation, along with whatever evidence they've gathered so far that they feel supports their conclusion.

Along the way, students can ask for help and browse a richly-indexed multimedia hypertext network called an ASK system [1]. The central idea in ASK systems is that all information is linked to other information via follow-up questions. For example, if an expert in a video clip mentions how an immunofluorescence test can be used to detect immune system problems, one follow-up question might be "How does the immunofluorescence test work?"

Several examples of Investigate and Decide systems that have been built with Indie are:

* Immunology Consultant: Medical school students have to determine what has gone wrong with a patient's immune system by interviewing the patient, running lab tests, and collecting information on how the immune system works and sometimes malfunctions.

* Is It A Rembrandt?: Art history students have to determine whether a painting is actually by Rembrandt or a forgery, by inspecting the style of the painting, its materials, the signature, and so on.

* Volcano Investigator: High school students learning geology run experiments in order to estimate the likelihood of a Mt. St. Helens-like volcano erupting, in order to decide whether a nearby town has to evacuate immediately or not.

* Nutrition Clinician: Medical school students have to determine what nutritional deficiencies a patient has, what the medical implications are, and what needs to be done to remove the deficiencies.

For brevity, in this paper we'll refer to these systems as Immunology, Rembrandt, Volcano, and Nutrition.

INDIE APPLICATION ARCHITECTURE

Here is a simplified diagram of Indie-based GBS systems:

The student works with four important interface elements:

* The lab screens, where the student gathers facts about the scenario by interacting with the simulated world. "Lab" is a very broad term here, including screens for running experiments, interviewing patients, reading documents, and so on.

* The notebook, which contains all evidence that has been collected so far. Evidence is added automatically for the student.

* The report screen, where the student constructs an argument for or against one or more of the possible choices, using the evidence gathered in the notebook.

* The ASK browser screens, where the student talks to experts, gets background information about the task and domain, asks follow-up questions about critiques and coaching advice, and so on.

The notebook is usually on-screen all the time. The student switches between the lab, report, and browser screens with a mouse click or two.

Internally, there are three important modules:

* The Simulator produces responses for student actions on the lab screens. These responses include not only interface events such as movies and graphics, but also the pieces of evidence that such actions reveal and that get stored in the student's notebook.

* The Critiquer analyzes the arguments made by the student in the report. Problems found are used to retrieve the relevant responses in the ASK network.

* The ASK system retrieves information from the ASK network and supports browsing through that network. It keeps track of what the student has seen so far so that the same information is not shown twice unless the student asks to review it.

Each module has a knowledge base:

* The domain model holds facts about the particular scenario, e.g., "Mary has sickle cell anemia," and facts and rules about the domain in general, such as "if a patient has sickle cell anemia, the microscope test will show sickle-shaped blood cells."

* The argument models describe what makes good and bad arguments for each possible decision.

* The ASK network links questions to answers (in video or text) and answers to follow-up questions in a large graph. In Indie systems, ASK systems hold any information that should be presented with follow-up questions. This includes not only background reference material, but also critiques.

Indie provides tools for authoring:

* the interface screens

* the domain model

* the argument models

* the ASK network

In this paper, we will focus on how argument models are authored and used for critiquing. More specifically, we'll present the first approach we designed and implemented, its strengths and weaknesses, then the approach we're currently using.

ARGUMENT CRITIQUING IN INDIE

Our first model of how to represent and critique evidence-based arguments in Indie was very simple. A student's argument consisted of

* a claim about the scenario, e.g., "Mary has acute rheumatic fever,"

* a set of presented evidence, consisting of scenario facts (usually test results) supporting the claim, e.g., "Mary has a high fever, and Mary had strep throat recently."

When a student submitted an argument, it was compared against an argument model. Every claim had an argument model, consisting of

* the claim

* one or more proof sets, each a set of scenario facts

* one or more disproof sets, each a set of scenario facts

* a set of relevant fact types (usually types of tests, e.g., "take temperature")

For an argument to be acceptable, the set of presented evidence had to

* be a superset of at least one proof set in the argument model

* not be a superset of any disproof set

* include only facts of the relevant fact types
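The three acceptability conditions above amount to a few set comparisons. The following is a minimal Python sketch for illustration only: Indie itself is written in Macintosh Common LISP, and the names used here (Fact, ArgumentModel, is_acceptable), the fact types "medical history" and "patient age", and the sample disproof sets are assumptions made for the example, not Indie's actual representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """A scenario fact, usually a test result (field names are hypothetical)."""
    fact_type: str  # e.g. "take temperature"
    value: str      # e.g. "high fever"


@dataclass
class ArgumentModel:
    claim: str
    proof_sets: list      # each a frozenset of Fact
    disproof_sets: list   # each a frozenset of Fact
    relevant_types: set   # fact types allowed in an argument for this claim


def is_acceptable(evidence, model):
    """Apply the three acceptability conditions to the presented evidence."""
    evidence = frozenset(evidence)
    covers_a_proof = any(evidence >= proof for proof in model.proof_sets)
    covers_a_disproof = any(evidence >= dis for dis in model.disproof_sets)
    all_relevant = all(f.fact_type in model.relevant_types for f in evidence)
    return covers_a_proof and not covers_a_disproof and all_relevant


# Illustrative model; the facts and fact types are invented for the example.
fever = Fact("take temperature", "high fever")
strep = Fact("medical history", "recent strep throat")
young = Fact("patient age", "young")

model = ArgumentModel(
    claim="Mary has acute rheumatic fever",
    proof_sets=[frozenset({fever, strep})],
    disproof_sets=[
        frozenset({Fact("take temperature", "normal temperature")}),
        frozenset({Fact("medical history", "no recent strep throat")}),
    ],
    relevant_types={"take temperature", "medical history"},
)

print(is_acceptable([fever, strep], model))         # proof set covered
print(is_acceptable([fever], model))                # no proof set covered
print(is_acceptable([fever, strep, young], model))  # irrelevant fact type used
```

Representing proof and disproof sets as frozensets keeps each condition a single superset test, which matches the "superset of at least one proof set" wording directly.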

A typical model would have

* one proof set with several facts that needed to be true for the claim to be true.

* a default set of disproof sets, generated automatically by making a singleton set of the negation of each fact in the proof set.

Disproof sets were intended to give authors some control over how much evidence it took to disprove a claim.
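As an illustration of the default described above, the sketch below builds one singleton disproof set per fact in a proof set. It assumes, purely for the example, that a fact is a (statement, truth value) pair so that negation is a flip of the truth value; the function name default_disproof_sets is hypothetical, and the paper does not say how Indie 1.0 represented negation.

```python
def default_disproof_sets(proof_set):
    """Return one singleton disproof set per negated fact of the proof set.

    A fact is modelled here as a (statement, truth_value) pair; this
    encoding is an assumption made for the sketch.
    """
    return [frozenset({(statement, not truth)})
            for statement, truth in sorted(proof_set)]


proof_set = frozenset({
    ("Mary has a high fever", True),
    ("Mary had strep throat recently", True),
})

defaults = default_disproof_sets(proof_set)
for disproof in defaults:
    print(sorted(disproof))

# An author who wants disproof to require more evidence could replace the
# defaults with a single larger set that needs both negated facts.
stricter = [frozenset((statement, False) for statement, _ in proof_set)]
```

With the defaults, any single negated fact in the student's evidence disproves the claim; with the stricter variant, both negated facts must be present.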

Given an argument, argument model, and the scenario facts currently available to the student in the notebook, the Critiquer looked for

* Contradictions, i.e., evidence used to support a claim when in fact it argues against it, or vice versa.

* Overlooked necessary evidence, in the notebook but not used in the argument.

* Missing necessary evidence, not in the notebook because some tests had not yet been run.

* Irrelevancies, i.e., evidence used for or against a claim not in the list of relevant fact types.
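The four categories above can be read as set operations over the evidence used, the notebook, and the argument model. The Python sketch below is one possible reading, not Indie's implementation: facts are modelled as (fact type, value) pairs, "necessary" evidence is taken to mean the facts of a single proof set, a contradiction is taken to be any used fact that appears in a disproof set, and all fact names are invented for the example.

```python
def critique(used, notebook, proof_set, disproof_sets, relevant_types):
    """Sort an argument's problems into the four critique categories.

    used           -- facts the student cited in the argument
    notebook       -- facts the student has gathered so far
    proof_set      -- the facts needed to prove the claim (one proof set)
    disproof_sets  -- sets of facts that argue against the claim
    relevant_types -- fact types that bear on the claim
    """
    used, notebook = set(used), set(notebook)
    against = set().union(*disproof_sets)
    return {
        "contradictions": sorted(used & against),
        "overlooked": sorted((proof_set & notebook) - used),
        "missing": sorted(proof_set - notebook),
        "irrelevant": sorted(f for f in used if f[0] not in relevant_types),
    }


# Hypothetical scenario data.
fever = ("temperature", "high fever")
aso = ("ASO titer", "high")
strep = ("strep history", "recent strep throat")
age = ("age", "young")

result = critique(
    used=[fever, age],
    notebook=[fever, aso, age],
    proof_set=frozenset({fever, aso, strep}),
    disproof_sets=[frozenset({("temperature", "normal")})],
    relevant_types={"temperature", "ASO titer", "strep history"},
)
for category, facts in result.items():
    print(category, facts)
```

In this run the high ASO titer is in the notebook but unused (overlooked), the strep history has not been gathered (missing), and the patient's age is not a relevant fact type (irrelevant).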

Our model of critiquing had the following steps:

* The student constructed an argument.

* The Critiquer analyzed the argument and created a set of categorized critiques, such as "Overlooked: High ASO Titer; Irrelevant: Age is young"

* The ASK system retrieved the most relevant responses for each critique.

Our intent was that authors would write responses for the top-level critique categories, such as

* Contradictions exist: "I'm confused. Are you sure all the evidence implies what you say it does?"

* Omissions exist: "Maybe, but it seems like you need a stronger case."

* Irrelevancies exist: "Seems right but I don't see why you mentioned some of the things you did."

* None of the above: "Makes sense. Good job!"

In addition, authors could also write more specific responses for particular problems, such as "You seem to be confused about how the ASO titer test works..." The ASK system would take care of finding the most relevant of the authored responses for each critique.
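The idea of preferring a specific authored response over a category-level one can be sketched as a two-level lookup. This is only a stand-in: the paper does not detail how the ASK system ranks relevance, and the table layout, the retrieve function, the category keys, and the choice to file the ASO titer response under omissions are all assumptions made for the example.

```python
# Authored responses, keyed by (critique category, specific point or None).
RESPONSES = {
    ("contradiction", None):
        "I'm confused. Are you sure all the evidence implies what you say it does?",
    ("omission", None):
        "Maybe, but it seems like you need a stronger case.",
    ("irrelevancy", None):
        "Seems right but I don't see why you mentioned some of the things you did.",
    ("none", None):
        "Makes sense. Good job!",
    # A more specific response for one particular problem (placement assumed).
    ("omission", "High ASO Titer"):
        "You seem to be confused about how the ASO titer test works...",
}


def retrieve(category, point=None, responses=RESPONSES):
    """Return the most specific authored response for a critique."""
    return responses.get((category, point), responses[(category, None)])


print(retrieve("omission", "High ASO Titer"))   # specific response exists
print(retrieve("irrelevancy", "Age is young"))  # falls back to the category
```

Because every top-level category has a response, any critique the Critiquer produces gets some answer, which is the robustness property claimed for the initial model.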

Advantages Of The Initial Model

This initial model was:

* Simple: Since, for pedagogical reasons, there are usually only a handful of claims in an Indie GBS and fewer than a dozen tests necessary to prove any particular claim, the amount of knowledge engineering was fairly small.

* Robust: Using the critique taxonomy to organize responses meant that all student arguments would be handled gracefully, even silly ones.

* Tailorable: Authors could easily add remediation responses for very specific argument errors.

Problems With The Initial Model

The above model, implemented in Indie, version 1.0, was used to build Immunology Consultant. We spent a fair amount of time early on showing the content team how to represent and connect claims and arguments. Our hope was that Immunology would be an exemplar for later projects.

Problems arose almost at once, however, when the authors tried to adapt the Immunology examples to Volcano, Rembrandt, and Nutrition. They felt that the Indie 1.0 model didn't support

* the simple things they wanted to do

* the complicated things they wanted to do

SUPPORTING THE SIMPLE THINGS

Of great concern to us was the mismatch between what the authors wanted to write and what we, the developers, thought they needed to write. The authors wanted to write rules like

* IF the student says that a high fever is a sign of acute rheumatic fever, THEN play the movie that says fever can be a symptom of many things.

or even, while building a mock-up to demonstrate the interface, a rule like

* IF the student clicks on the "submit report" button, THEN play the movie that says "This isn't enough evidence. You need to collect some real data."

Instead, to make a movie play in response to clicking on submit report, authors had to

* determine the appropriate critique category for the mock-up student argument

* develop an argument model that would generate that critique category

* index the desired movie in the ASK network under that critique category

This is a lot of steps (and thinking) compared to what you need to do to play a movie in a typical interface authoring tool. Of course, such tools provide no support for intelligent critiquing.

There was clearly a serious conflict between our knowledge engineering approach and how goal-based scenario systems were actually being developed. The challenge was making Indie usable for authors without compromising the needs of the final GBS.

PHASES OF AUTHORING

Authors of scenario-based systems go through the following phases of development:

1. Single scenario mock-up: In this initial phase, authors develop key segments for one scenario, for design review purposes. They want to specify as simply and directly as possible a (mostly) linear sequence of events, triggered by button clicks. Interface concerns dominate the design and implementation process.

2. Single scenario run-through: In this phase, all the scenes for the scenario are specified, to make sure all functionality is available, feasible and consistent. Enough branches are defined to do some brief usability assessments with test users. The authors need to be able to specify a few conditional responses, especially for remediation, but they want to be able to leave other transitions "hard-wired." Interface work slowly gives way to scenario building.

3. Single scenario completion: In this phase, the authors finish all the branches and the ASK system content for the scenario. The authors need bookkeeping tools to check for consistency, completeness, redundancies, and so on. Some systems stop at this phase. Scenario building and ASK content work dominate.

4. Multiple scenario development: In this phase, the authors specify sequel scenarios. Most of the interface remains the same, as well as some of the artwork, but many of the system responses, especially remedial, have to be significantly changed. Authors need support for replacement, generalization, and reuse of response rules. Scenario building and support content work dominate.

This prototype-based development sequence is very typical with modern interactive systems. A key point for intelligent systems is that

* In the early phases, authors want total control over what happens. The system isn't intended to stand on its own yet. The auctorial attitude is "I want it to do this!"

* In the later phases, authors want the system to be intelligent, or, more accurately, not stupid [3]. The auctorial attitude is "I want it to be able to handle this new stuff, along with the old stuff."

Our knowledge engineering approach did not support the direct control of system behavior needed in the first two phases. It only supported the robust response handling needed in the latter phases. Though we thought our critiquing model was a "good value" in terms of intelligence gained for work required, it was still more work to make something happen than, say, simply attaching "play movie" to a button.

Not surprisingly, it doesn't matter if an authoring system supports Phases 3 and 4 if authors don't want to use it in Phases 1 and 2. An author wants to say "when I click here, it does this." That's it. Given how rapidly things change in the first phase, doing more work than this is simply not in the author's interest.

The problem with giving our authors exactly what they want is that what works in the first two phases falls apart in the last two phases. In Phase 1, when building a mock-up, it's nice to be able to just attach the command "play the 'needs more work' movie" to the button labelled "Submit report." Unfortunately, by the start of Phase 3, which movie to play depends in a non-trivial way on what the report being submitted actually contains. By the end of Phase 3, interface issues are mostly irrelevant to what the authors are trying to specify.

TRIGGERS AND INCREMENTAL AUTHORING

Indie, version 2.0, was designed to allow authors to start with interface-based authoring and to support (and encourage) a gradual migration to model-based authoring. A key addition was the concept of triggers. A trigger connects events to actions. Events and actions can occur in the interface or in the underlying model. By interface and model we mean the same distinction found in the Model-View-Controller paradigm [2]. Some examples of each are:

          Interface                    Model
Event     button click,                rule fired,
          item dragged onto a list     test result generated
Action    play movie,                  fill test tube, critique argument,
          update button state          set current topic

Authors create triggers using a form-based editor that lets them select from lists of available actions and events. Triggers for events on interface objects can be edited by simply clicking on the interface object.

In Phase 1, an author can make the "submit report" button play a particular movie by creating a trigger that goes from interface event to interface action. Schematically, it looks like this:

Later, in Phase 2 or 3, when the author wants the movie that gets played to be selected based on some property of the student argument, the author

* changes the trigger on the submit button to call the model action "critique report" rather than "play movie,"

* creates a critiquing rule (as described below) to catch the relevant property,

* creates a new trigger that goes from the rule firing to "play movie."

Schematically, the new triggers look like this:

If, later, the authors want to add follow-up questions to the movie, they

* link the movie into the ASK network

* link the appropriate follow-up questions and answers to the movie in the network

* add a "set topic" model action to the second trigger

Schematically, the second trigger above becomes:

The ASK Browser interface then takes care of presenting the follow-up questions after the movie plays.

In this way, Indie supports migration from hard-wired button responses to full-fledged critiquing.
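The migration described in this section can be sketched as a small event-to-action table. The Python below is an illustration under stated assumptions, not Indie's trigger mechanism: the TriggerTable class, the tuple encoding of events, the log used in place of real movies and topics, and the "no evidence given" rule are all hypothetical.

```python
from collections import defaultdict


class TriggerTable:
    """Maps events to actions; each event is tagged 'interface' or 'model'."""

    def __init__(self):
        self._triggers = defaultdict(list)
        self.log = []  # stands in for real effects (movies shown, topics set)

    def on(self, event, action):
        self._triggers[event].append(action)

    def replace(self, event, action):
        self._triggers[event] = [action]

    def fire(self, event, **context):
        for action in list(self._triggers[event]):
            action(self, **context)


def play_movie(name):  # interface action
    return lambda table, **ctx: table.log.append(("play movie", name))


def set_topic(topic):  # model action
    return lambda table, **ctx: table.log.append(("set topic", topic))


SUBMIT = ("interface", "click: submit report")
NO_EVIDENCE = ("model", "rule fired: no evidence given")  # hypothetical rule


def critique_report(table, report, **ctx):  # model action
    """Fire the rule event when the report's BECAUSE list is empty."""
    if not report["because"]:
        table.fire(NO_EVIDENCE, report=report)


triggers = TriggerTable()

# Phase 1: interface event wired straight to an interface action.
triggers.on(SUBMIT, play_movie("needs more work"))
triggers.fire(SUBMIT, report={"because": ["high fever"]})
phase1_log = list(triggers.log)

# Phase 2 or 3: the button now calls the critiquer; a rule firing plays the movie.
triggers.log.clear()
triggers.replace(SUBMIT, critique_report)
triggers.on(NO_EVIDENCE, play_movie("needs more work"))
triggers.fire(SUBMIT, report={"because": ["high fever"]})  # rule does not fire
triggers.fire(SUBMIT, report={"because": []})              # rule fires
phase2_log = list(triggers.log)

# Later: add a "set topic" model action so follow-up questions can be shown.
triggers.log.clear()
triggers.on(NO_EVIDENCE, set_topic("needs more work"))
triggers.fire(SUBMIT, report={"because": []})
phase3_log = list(triggers.log)

print(phase1_log, phase2_log, phase3_log, sep="\n")
```

The point of the design is that each step changes one trigger at a time: the Phase 1 mock-up keeps working until the author chooses to reroute the button through the model.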

SUPPORTING THE COMPLICATED THINGS

The other area in which Indie 1.0 fell short was in how complicated arguments and argument analysis could be. Our authors wanted:

* more in a student argument than just claim plus evidence,

* more complex analysis of arguments than the proof and disproof set model could provide

In particular, authors wanted students to be able to include in their reports:

* contrary evidence, e.g., in Volcano and Rembrandt, students needed to be able to show that they were aware of test results that didn't fit the claim, e.g., "X is true, because ..., despite the fact that ..."

* categorized evidence, e.g., in Nutrition some evidence is from scenario-dependent test results and some is from scenario-independent background information

* non-evidence, e.g., in Nutrition, the argument for a particular nutritional deficiency is part of a bigger report that also includes medical implications and recommended actions, all of which need critiquing.

Furthermore, authors wanted more control over the argument analysis. In Rembrandt, where evidence about a painting's authorship can be quite fuzzy and subjective, a student's argument has to be based on a preponderance of evidence, not a simple all-or-none logic. In addition, pieces of evidence can interact in complex ways with other points. For example, one of the claims in Rembrandt had the following relationships between its evidence points A, B, C, D, E, F and G:

* Necessary: Any two of the following groups: (A and B), (C and D), E, or F.

* Irrelevant: If (A and B) are present, then E is irrelevant. If (C and D) are present then F is irrelevant.

* Conflicting: G conflicts with B, so if B is mentioned G shouldn't be.

ARGUMENT CRITIQUING RULES

In response to these needs, in Indie, version 2.0, we

* added support for multiple lists of evidence in arguments, and

* replaced proof and disproof sets with critiquing rules

Multiple Lists of Evidence

In Indie 2.0, student arguments can have one or more BECAUSE lists, zero or more DESPITE lists, and zero or more other labelled lists. BECAUSE lists should contain evidence supporting the claim, while DESPITE lists should contain evidence against the claim. Critiquer rules have predefined slots for dealing with BECAUSE and DESPITE lists. The other labelled lists can contain anything. They are used for things like Nutrition's lists of medical implications and recommendations.

Critiquing Rules

Critiquing rules check for the presence or absence of different evidence points in an argument or in the notebook. A critiquing rule can check if

* at least (or at most) M points from a set of N possible points

* are (or are not) in a BECAUSE list, a DESPITE list, some other labelled list, or the notebook

For example, a rule in Volcano Investigator is:

CLAIM: the volcano will erupt in the next 24 hours,

CHECK IF: BECAUSE does NOT include data from either of two ground deformation tests OR either of the strainmeter results

Since many critiques are based on missing evidence, many rules check for the absence of evidence. Checks for presence of evidence are usually to catch common errors, e.g., "If the students said that a high blood pressure is associated with underweight patients, show them this movie about causes of high BP."

Conditions

Checks on evidence are nested in groups called conditions. A condition is recursively defined as either

* an evidence point,

* at least M of N conditions being true, or

* at most M of N conditions being true.

In the Volcano example above, the rule says "CHECK IF: BECAUSE does NOT include" and the condition says

* at least 1 of

* at least 1 of 2 ground deformation test results

* at least 1 of 2 strainmeter results

Conditions can simulate various logical connectives:

* OR is "at least 1 of N conditions"

* AND is "at least N of N conditions."

* NOT is "at most 0 of N conditions."

The use of "at least" and "at most" is similar in approach to SNePS [6]. SNePS is more powerful, because it lets you specify at least and at most simultaneously, but this hasn't been needed by our authors. On the other hand, Indie authors do frequently go beyond AND and OR by asking for 2 of 6 possible conditions to be true or 3 of 7 possible points to be present.

In Indie, the rule specifies what evidence lists are being checked and whether the check is for presence or absence. The conditions specify the logic of the check.
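The recursive definition of conditions, and the split between what the rule specifies and what the condition specifies, can be sketched as follows. This is an illustrative Python reading rather than Indie's code: the class names AtLeast, AtMost, and Rule, the helper functions, and the names of the individual evidence points are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class AtLeast:
    m: int
    of: tuple  # sub-conditions


@dataclass(frozen=True)
class AtMost:
    m: int
    of: tuple  # sub-conditions


Condition = Union[str, AtLeast, AtMost]  # a bare string is an evidence point


def holds(cond, points):
    """Evaluate a condition against the set of points in one evidence list."""
    if isinstance(cond, str):
        return cond in points
    count = sum(holds(sub, points) for sub in cond.of)
    return count >= cond.m if isinstance(cond, AtLeast) else count <= cond.m


# Logical connectives expressed with "at least" and "at most".
def OR(*conds): return AtLeast(1, conds)
def AND(*conds): return AtLeast(len(conds), conds)
def NOT(*conds): return AtMost(0, conds)


@dataclass
class Rule:
    claim: str
    list_name: str  # "BECAUSE", "DESPITE", another label, or "NOTEBOOK"
    present: bool   # True: fire if the condition holds; False: if it does not
    condition: Condition

    def fires(self, claim, lists):
        if claim != self.claim:
            return False
        points = set(lists.get(self.list_name, ()))
        return holds(self.condition, points) == self.present


# The Volcano rule: BECAUSE does NOT include deformation or strainmeter data.
volcano_rule = Rule(
    claim="the volcano will erupt in the next 24 hours",
    list_name="BECAUSE",
    present=False,
    condition=OR(
        AtLeast(1, ("ground deformation test 1", "ground deformation test 2")),
        AtLeast(1, ("strainmeter result 1", "strainmeter result 2")),
    ),
)

claim = "the volcano will erupt in the next 24 hours"
print(volcano_rule.fires(claim, {"BECAUSE": ["seismograph reading"]}))
print(volcano_rule.fires(claim, {"BECAUSE": ["strainmeter result 1"]}))

# The Rembrandt "any two of (A and B), (C and D), E, or F" requirement.
necessary = AtLeast(2, (AND("A", "B"), AND("C", "D"), "E", "F"))
print(holds(necessary, {"A", "B", "E"}), holds(necessary, {"A", "E"}))
```

Because AND, OR, and NOT are all special cases of the two counting forms, the evaluator needs only one recursive function, and requirements such as "2 of 6" need no extra machinery.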

Indie has form-based editors for rules and conditions. The rule editor looks like this:

and the condition editor looks like this:

Almost everything is selected from lists, rather than typed in. The only time something is typed is when something new is created, e.g., a new kind of test result. Anything that's created automatically becomes available for later re-use.

Controlling Rules

Authors have three simple mechanisms to control how rules fire:

* Rules can be marked "once only," which means they fire at most once.

* Rules can be collected into rule sets. Rules in a rule set are checked in order and checking stops when an author-specified number of rules has fired.

* Each scenario has its own rule collection. It's easy to share rule sets across collections.

Rembrandt's authors used once-only rules and rule sets to give different hints on different rounds of critiquing. The first time a student forgot to analyze the signature on the painting, the first rule in a rule set fired and said the report was incomplete. That rule was once-only and the rule set allowed only one rule to fire. Therefore, the second time the student submitted a report with the same mistake, the second (once-only) rule in that set fired and suggested looking at the signature. If this happened a third time, the third rule said the signature was atypical and the student needed to analyze it.
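The interplay of once-only rules and a rule set that stops after one firing can be sketched as below. The Rule and RuleSet classes, the report encoding, and the hint names are hypothetical, and the sketch assumes the third rule is not once-only, which the description above leaves open.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    test: Callable[[dict], bool]
    once_only: bool = False
    fired: bool = False  # remembers whether this rule has ever fired


@dataclass
class RuleSet:
    rules: list
    max_fires: int = 1  # author-specified number of rules allowed to fire

    def check(self, report):
        """Check rules in order; stop once max_fires rules have fired."""
        fired = []
        for rule in self.rules:
            if len(fired) >= self.max_fires:
                break
            if rule.once_only and rule.fired:
                continue
            if rule.test(report):
                rule.fired = True
                fired.append(rule.name)
        return fired


def no_signature(report):
    """True when the report never mentions the signature analysis."""
    return "signature analysis" not in report["because"]


signature_hints = RuleSet(
    rules=[
        Rule("report is incomplete", no_signature, once_only=True),
        Rule("look at the signature", no_signature, once_only=True),
        Rule("signature is atypical, analyze it", no_signature),
    ],
    max_fires=1,
)

incomplete = {"because": ["brushwork matches"]}
rounds = [signature_hints.check(incomplete) for _ in range(4)]
for number, hints in enumerate(rounds, start=1):
    print(number, hints)
```

Each resubmission with the same mistake skips the once-only rules that have already fired, so the hints escalate from vague to explicit without the author writing any counters.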

Non-exclusive claims

In Nutrition, several claims can apply in one scenario. A patient might be overweight, folate deficient, and calcium deficient. Supporting this involved several classes of change to Indie:

* Adding an interface by which the student can specify a set of claims.

* Allowing two kinds of claims:

  * The usual kind of claim and argument, e.g., an argument for calcium deficiency.

  * The claim that all problems have been found.

The second kind of claim (usually labelled "I'm done") leads to three possible categories of critique:

* Yes, you're done.

* At least one of your arguments still has problems.

* The arguments you've given are OK, but there's at least one more claim that can be made.
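A minimal Python sketch of how an "I'm done" claim might be sorted into these three categories follows. The function name, its arguments, and the order in which the two problem cases are checked are assumptions; the paper does not say which critique takes priority when both apply.

```python
def critique_done(submitted, applicable, has_problems):
    """Classify an "I'm done" claim.

    submitted    -- set of claims the student has argued for
    applicable   -- set of claims that actually hold in the scenario
    has_problems -- dict mapping each submitted claim to True when the
                    critiquer found problems with its argument
    """
    if any(has_problems[claim] for claim in submitted):
        return "At least one of your arguments still has problems."
    if applicable - submitted:
        return ("The arguments you've given are OK, but there's at least "
                "one more claim that can be made.")
    return "Yes, you're done."


applicable = {"overweight", "folate deficient", "calcium deficient"}

print(critique_done({"overweight"}, applicable, {"overweight": False}))
print(critique_done({"overweight"}, applicable, {"overweight": True}))
print(critique_done(applicable, applicable, dict.fromkeys(applicable, False)))
```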

Inverse sets

In Nutrition, there are many more irrelevant points for a given claim than relevant ones. Writing and maintaining rules to catch irrelevancies was tedious because of the length of the list of points. This was solved by allowing authors to indicate for each list of points in a condition whether they want the point list itself or its inverse, i.e., all the points in the system besides the ones listed.
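The inverse option amounts to a set complement taken against every point in the system. The sketch below illustrates this in Python; the function names and the point names are invented for the example and are not taken from Nutrition Clinician.

```python
def resolve_points(listed, all_points, inverse=False):
    """Return the listed points, or every other point when inverse is set."""
    listed = frozenset(listed)
    return frozenset(all_points) - listed if inverse else listed


def includes_at_least(m, points, evidence_list):
    """True when at least m of the given points appear in the evidence list."""
    return len(points & set(evidence_list)) >= m


# Hypothetical points; a real system would have hundreds.
ALL_POINTS = {
    "low calcium intake", "low bone density",
    "high body mass index", "low folate level", "smoker",
}
relevant_to_calcium = {"low calcium intake", "low bone density"}

# The author lists only the short relevant set and asks for its inverse.
irrelevant = resolve_points(relevant_to_calcium, ALL_POINTS, inverse=True)

print(sorted(irrelevant))
print(includes_at_least(1, irrelevant, ["low calcium intake", "smoker"]))
print(includes_at_least(1, irrelevant, ["low calcium intake"]))
```

The author maintains only the short list of relevant points; the irrelevancy rule stays correct as new points are added to the system, since the complement is computed when the rule is resolved.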

Special conditions

In Nutrition Clinician, the authors wanted to have their students also choose risks and treatments that are associated with the conditions that they were proving about their patients. For example, overweight patients are at risk of high blood pressure, which can be treated with medication and regular exercise.

They wanted to represent these risks and treatments as points that can be dragged from predefined notebooks into evidence lists associated with an argument. These evidence lists really aren't BECAUSE or DESPITE, or even NOTEBOOKS, exactly. Rather than remove the BECAUSE and DESPITE formalisms from the rules, which saved authoring effort in more traditional critiquers, we added "special conditions" as a final slot on the rule editor which is hidden on a twist-down menu so as not to confuse novice authors.

A special condition is just like a normal condition except it relates directly to some evidence list that may or may not be part of the argument.

STATUS AND ASSESSMENT

Indie is a complete tool, including

* the Critiquer focussed on here and the various rule editors

* an interface editor

* an ASK network editor and browser

* a very lightweight experiment simulator

Indie is implemented in Digitool's Macintosh Common LISP 4.1, and generates stand-alone MCL applications.

In terms of complexities of the Indie systems built so far:

             Immun.   Volcano  Nutrition  Remb.
Points       120      36       1000       514
Rules        15       30       150        77
ASK nodes    217      150      600        620

Indie application sizes are dominated by graphics and video. Rembrandt, for example, has around 60MB of pictures (largely uncompressed) and nearly 4 gigabytes of video consisting of nearly 500 clips of experts talking about Rembrandt. Nutrition has 3 gigabytes of video.

All of these systems have at least 15 different screens, ranging from introductions, tests, interviews, ASK zoomers, ASK browsers, report-building, and feedback. Each project took a team of 2 to 3 content analysts about 5 months to complete, with guidance and tool support by two graduate students. Immunology took almost twice as long, largely because it was the first Indie project, and had a programmer whose main role was to work around gaps in the first interface editor.

On average, the Indie team spent less than 3 hours a week communicating with each team, though more at the beginning or end of each project. Most of the interactions after the first week working with the tool were emails with suggestions, bug reports, or questions about the best way to "Indie-engineer" a critiquer rule or interface interaction.

The Indie tool has also been used by several groups of graduate students, both PhD and masters, in course projects. These projects go through Phase 2, building at least half of a complete scenario, including video and artwork. Volcano Investigator is one of the more successful student projects, built by first-year masters students in an intensive project in 4 months. A similar MS project underway now is Clinical Monitor (drug testing). Two recent class projects were Car Repair and KERMIT (the ecology of polluted ponds).

These many projects have helped us explore the "space" of Investigate and Decide GBS's. Encouragingly, Indie did not need any major change for the most recent student projects. It seems to have reached a stable point where there are enough options to satisfy typical needs and enough concrete examples to show how to use those options.

RELATED WORK

In the space available, we'd like to bracket Indie with two argument evaluation systems, one relatively early and AI-intensive, and the other relatively new and AI-lite.

ACE [7] was an early interesting effort to apply natural language understanding and argument interpretation to the analysis of student explanations of Nuclear Magnetic Resonance (NMR) spectra. The focus was on finding incorrect and incomplete arguments with the pedagogical goal of making the student resolve the problems and re-articulate the argument. Conceptually, ACE is similar to the Indie 1.0 Critiquer, in that it matched the student argument against correct arguments. ACE used a great deal of domain knowledge in a narrow area. No attention was given to authoring such knowledge for new domains.

Belvedere [8] is a very recent effort to allow students to collaboratively articulate arguments about scientific issues using a graphical argument tool. Like Indie 2.0, Belvedere uses rules to analyze student arguments in order to suggest areas where the student needs to flesh things out or repair logical problems. Unlike an Indie GBS and ACE, Belvedere has no domain knowledge and doesn't try to understand the propositions in the arguments. On the one hand, this means Belvedere needs little or no knowledge engineering. On the other hand, it can only critique structural problems, such as missing support links or circular reasoning chains.

Indie GBS's in particular, and GBS's in general, sit between these two approaches to educational software. GBS's are neither as knowledge-intensive and closed as AI-based systems like ACE (and many other early systems), nor as knowledge-free and open-ended as Belvedere (and many other recent educational systems). The purpose of GBS tools is to make it possible to easily author large numbers of scenarios in many domains with a cost-effective level of knowledge engineering.

SUMMARY AND FUTURE WORK

The history of the development of the Indie tool illustrates what appears to be a common phenomenon:

The usefulness of a tool is inversely proportional to its intelligence.

Authors don't want smart tools, they want tools that aren't stupid [3]. Stupidity can come from missing knowledge, but it can also come from tools that require knowledge engineering at the wrong time. In some important ways, the current Indie Critiquer is stupider than its predecessor. It's less robust and more prone to allowing logical inconsistencies and gaps. But it seems less stupid because it lets authors do what they want to do. It doesn't require knowledge to be authored until the need for that knowledge is clear.

Indie 2.0 lets authors move at their own pace from interface authoring to model authoring. It is our hope to be able to develop a version of the tool which will support gradual on-demand migration from the current rule model, which gives control but not robustness, to an argument model similar to Indie 1.0.

ACKNOWLEDGMENTS

We'd like to acknowledge the significant contributions of Seth Tisue (ASK network tools), Steven Silverstein (Immunology), Brendon Towle and Joe Herman (editor tool kit), and Brian Davies (graphic tools).

This work has been supported in part by the Defense Advanced Research Projects Agency, monitored by the Office of Naval Research, under contracts N00014-90-J-4117 and N00014-91-J-4092. The Institute for the Learning Sciences was established in 1989 with the support of Andersen Consulting.

REFERENCES

1. Ferguson, W., Bareiss, R., Birnbaum, L., and Osgood, R. ASK Systems: An approach to story-based teaching. In Proceedings of the 1991 International Conference on the Learning Sciences, L. Birnbaum, Ed. (Evanston, IL, Aug. 1991), 158-164.

2. Goldberg, A. Information Models, Views, and Controllers. Dr. Dobb's Journal (July 1990), 54-60.

3. Riesbeck, C. What Next? The Future of Case-Based Reasoning in Postmodern AI. In Case-Based Reasoning: Experiences, Lessons, and Future Directions, D. Leake, Ed. AAAI Press/The MIT Press, Menlo Park, CA., 1996, 371-388.

4. Schank, R. Goal-based scenarios: A radical look at education. Journal of the Learning Sciences 3, 4 (1994), 429-453.

5. Schank, R., Fano, A., Jona, M., and Bell, B. The design of goal-based scenarios. Journal of the Learning Sciences 3, 4 (1994), 305-345.

6. Shapiro, S. The SNePS semantic network processing system. In Associative Networks: The Representation and Use of Knowledge by Computers, N. V. Findler, Ed. Academic Press, New York, 1979, 179-203.

7. Sleeman, D., and Hendley, R. ACE: A system which Analyzes Complex Explanations. In Intelligent Tutoring Systems, D. Sleeman and J. Brown, Eds. Academic Press, London, 1982, 99-118.

8. Suthers, D., Weiner, A., Connelly, J. and Paolucci, M. Belvedere: Engaging students in critical discussion of science and public policy issues. In Proceedings AI-Ed 95, the 7th World Conference on Artificial Intelligence in Education (Washington DC, August 16-19, 1995) 266-273.