The first thing you need to do is to download this file: mp07.zip. It has the following content:

- submitted.py: Your homework. Edit, and then submit to Gradescope.
- mp07_notebook.ipynb: This is a Jupyter notebook to help you debug. You can completely ignore it if you want, although you might find that it gives you useful instructions.
- grade.py: Once your homework seems to be working, you can test it by typing python grade.py, which will run the tests in tests/test_visible.py.
- tests/test_visible.py: This file contains about half of the unit tests that Gradescope will run in order to grade your homework. If you can get a perfect score on these tests, then you should also get a perfect score on the additional hidden tests that Gradescope uses.
- data: This directory contains the data.
- reader.py: This is an auxiliary program that you can use to read the data.
- requirements.txt: This tells you which python packages you need to have installed in order to run grade.py. You can install all of those packages by typing pip install -r requirements.txt or pip3 install -r requirements.txt.

This file (mp07_notebook.ipynb) will walk you through the whole MP, giving you instructions and debugging tips as you go.
There are two types of data: visible data (provided to you), and hidden data (available only to the autograder on Gradescope). If you get your code working for the visible data, it should also work for the hidden data.
The data for this MP are a set of closed-world reasoning environments that were developed for the 2020 paper Transformers as Soft Reasoners over Language by Clark, Tafjord and Richardson. Each world includes a set of facts, and a set of questions that can be asked on the basis of those facts.
In order to help you load the data, we provide you with a utility function called reader.py
. Since its methods are correctly documented by docstrings, you can find information about each function by using help
:
import reader, importlib
importlib.reload(reader)
help(reader)
Help on module reader: NAME reader DESCRIPTION Read in reasoning environments from one of the RuleTaker metadata jsonl files distributed by https://allenai.org/data/ruletaker. FUNCTIONS load_datafile(filename) Load a RuleTaker jsonl file in a format suitable for forward-chaining. @param filename (str): the file containing the data. Must be in jsonl format. @return worlds (dict): a dict mapping world IDs to worlds Each world is a dict containing two entries: world['rules'] - a dict mapping rule IDs to rules. Each rule is a dict: rule['text'] is the natural language text of the rule rule['antecedents'] contains the rule antecedents (a list of propositions) rule['consequent'] contains the rule consequent (a proposition). world['questions'] - a dict mapping question IDs to questions. Each question is a dict: question['text'] is the natural language text of the question question['proofs'] is a string specifying the reference proof of the provided answer question['query'] is a list specifying the rule in standard proposition format question['answer'] indicates the correct answer, which may be one of: True: query is true False: query is false "NAF": supported by a negative statement which wasn't disproven "CWA": search for a proof exceeded maximum depth Standard proposition format is a list of length 4: [head, predicate, tail, negation] where negation is True or False. parse_rule(s) Parse the representation of a rule or rule @param s (str) - string of the form ((ante1, ante2, ...) -> cons) where ante1, ... and cons are all strings representing logical triples @return antecedents (list) - list of antecedent propositions @return consequent (list) - consequent proposition where each proposition is a list of the form [ head, predicate, tail, negation ] parse_triple(s) Parse the representation of a logical triple. 
@param s (str) - a string of the form '("head" "predicate" "tail" "negation")' @return t (list) - a proposition, a list of the form [head, predicate, tail, negation] FILE /Users/jhasegaw/Dropbox/mark/teaching/ece448/ece448labs/spring23/mp07/src/reader.py
Well, that's pretty straightforward. Let's use it to load the data
directory.
importlib.reload(reader)
data_worlds = reader.load_datafile('data/meta-train.jsonl')
print("There were",len(data_worlds),"logical worlds loaded.")
ids = list(data_worlds.keys())
print("Some of the world IDs are:",ids[:5])
There were 6314 logical worlds loaded. Some of the world IDs are: ['RelNeg-D2-1742', 'RelNoneg-D2-2133', 'AttNoneg-D2-163', 'RelNoneg-D2-820', 'RelNoneg-D2-1342']
import pprint
id = 'RelNoneg-D2-2133'
print("data_worlds[%s] contains the following rules:"%(id))
pprint.PrettyPrinter(indent=1).pprint(data_worlds[id]['rules'])
print("")
print("... and the following questions:")
pprint.PrettyPrinter(indent=1).pprint(data_worlds[id]['questions'])
data_worlds[RelNoneg-D2-2133] contains the following rules: {'rule1': {'antecedents': [['something', 'eats', 'tiger', True]], 'consequent': ['something', 'sees', 'tiger', True], 'text': 'If something eats the tiger then it sees the tiger.'}, 'rule2': {'antecedents': [['something', 'visits', 'bald eagle', True]], 'consequent': ['something', 'eats', 'tiger', True], 'text': 'If something visits the bald eagle then it eats the ' 'tiger.'}, 'rule3': {'antecedents': [['something', 'is', 'rough', True]], 'consequent': ['something', 'eats', 'tiger', True], 'text': 'If something is rough then it eats the tiger.'}, 'triple1': {'antecedents': [], 'consequent': ['bald eagle', 'eats', 'tiger', True], 'text': 'The bald eagle eats the tiger.'}, 'triple10': {'antecedents': [], 'consequent': ['tiger', 'is', 'green', True], 'text': 'The tiger is green.'}, 'triple11': {'antecedents': [], 'consequent': ['tiger', 'is', 'kind', True], 'text': 'The tiger is kind.'}, 'triple12': {'antecedents': [], 'consequent': ['tiger', 'is', 'rough', True], 'text': 'The tiger is rough.'}, 'triple13': {'antecedents': [], 'consequent': ['tiger', 'is', 'young', True], 'text': 'The tiger is young.'}, 'triple14': {'antecedents': [], 'consequent': ['tiger', 'sees', 'bald eagle', True], 'text': 'The tiger sees the bald eagle.'}, 'triple15': {'antecedents': [], 'consequent': ['tiger', 'visits', 'bald eagle', True], 'text': 'The tiger visits the bald eagle.'}, 'triple2': {'antecedents': [], 'consequent': ['bald eagle', 'is', 'blue', True], 'text': 'The bald eagle is blue.'}, 'triple3': {'antecedents': [], 'consequent': ['bald eagle', 'is', 'green', True], 'text': 'The bald eagle is green.'}, 'triple4': {'antecedents': [], 'consequent': ['bald eagle', 'is', 'kind', True], 'text': 'The bald eagle is kind.'}, 'triple5': {'antecedents': [], 'consequent': ['bald eagle', 'is', 'rough', True], 'text': 'The bald eagle is rough.'}, 'triple6': {'antecedents': [], 'consequent': ['bald eagle', 'sees', 'tiger', True], 
'text': 'The bald eagle sees the tiger.'}, 'triple7': {'antecedents': [], 'consequent': ['bald eagle', 'visits', 'tiger', True], 'text': 'The bald eagle visits the tiger.'}, 'triple8': {'antecedents': [], 'consequent': ['tiger', 'eats', 'bald eagle', True], 'text': 'The tiger eats the bald eagle.'}, 'triple9': {'antecedents': [], 'consequent': ['tiger', 'is', 'blue', True], 'text': 'The tiger is blue.'}} ... and the following questions: {'Q1': {'answer': True, 'proofs': '[(triple9)]', 'query': ['tiger', 'is', 'blue', True], 'text': 'The tiger is blue?'}, 'Q10': {'answer': 'CWA', 'proofs': '[@0: The bald eagle visits the bald eagle.[CWA. Example of ' 'deepest failure = (FAIL)]]', 'query': ['bald eagle', 'visits', 'bald eagle', True], 'text': 'The bald eagle visits the bald eagle?'}, 'Q2': {'answer': False, 'proofs': '[(triple12)]', 'query': ['tiger', 'is', 'rough', False], 'text': 'The tiger is not rough?'}, 'Q3': {'answer': True, 'proofs': '[(((triple12) -> rule3) OR ((triple15) -> rule2))]', 'query': ['tiger', 'eats', 'tiger', True], 'text': 'The tiger eats the tiger?'}, 'Q4': {'answer': False, 'proofs': '[(((triple12) -> rule3) OR ((triple15) -> rule2))]', 'query': ['tiger', 'eats', 'tiger', False], 'text': 'The tiger does not eat the tiger?'}, 'Q5': {'answer': True, 'proofs': '[(((((triple12) -> rule3)) -> rule1) OR ((((triple15) -> ' 'rule2)) -> rule1))]', 'query': ['tiger', 'sees', 'tiger', True], 'text': 'The tiger sees the tiger?'}, 'Q6': {'answer': False, 'proofs': '[(((((triple12) -> rule3)) -> rule1) OR ((((triple15) -> ' 'rule2)) -> rule1))]', 'query': ['tiger', 'sees', 'tiger', False], 'text': 'The tiger does not see the tiger?'}, 'Q7': {'answer': 'CWA', 'proofs': '[@0: The tiger visits the tiger.[CWA. Example of deepest ' 'failure = (FAIL)]]', 'query': ['tiger', 'visits', 'tiger', False], 'text': 'The tiger does not visit the tiger?'}, 'Q8': {'answer': 'CWA', 'proofs': '[@0: The bald eagle eats the bald eagle.[CWA. 
Example of ' 'deepest failure = (FAIL)]]', 'query': ['bald eagle', 'eats', 'bald eagle', True], 'text': 'The bald eagle eats the bald eagle?'}, 'Q9': {'answer': 'CWA', 'proofs': '[@0: The bald eagle is young.[CWA. Example of deepest ' 'failure = (FAIL)]]', 'query': ['bald eagle', 'is', 'young', False], 'text': 'The bald eagle is not young?'}}
As you can see, each logical world consists of a set of rules and a set of questions.

Each rule consists of a consequent that is true if all of its antecedents are true. Both the antecedents and the consequent are propositions. If there are no antecedents, then the consequent is always true. The text of the rule is provided only to help you understand the formal proposition notation; no part of the MP will ever use the text for any other reason.

Each question consists of a query, which is a proposition that may or may not be true in this world. If a query is not provably true, then it is considered to be false (the so-called closed-world assumption).
In our notation, a proposition is a list of length 4 consisting of a head (the subject of the verb), a predicate (usually a verb), a tail (the object of the verb), and a negation indicator. The negation indicator specifies whether the sentence "head predicate tail" should be considered True or False.
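For concreteness, here is what two propositions look like in Python (these particular propositions are just illustrative examples, not drawn from any specific world in the data):

```python
# A proposition is a list of length 4: [head, predicate, tail, negation].

# "The bald eagle eats the tiger."  -> negation indicator True
eats = ['bald eagle', 'eats', 'tiger', True]

# "The tiger is not cold."  -> negation indicator False
not_cold = ['tiger', 'is', 'cold', False]

head, predicate, tail, negation = eats
print(head, predicate, tail, negation)  # bald eagle eats tiger True
```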
Notice that many of the rules we just loaded include the generic variable something. Consider these rules for example:
'rule1': {'antecedents': [['something', 'needs', 'bald eagle', True],
['something', 'is', 'cold', False]],
'consequent': ['something', 'visits', 'bald eagle', True],
'text': 'If something needs the bald eagle and it is not cold then '
'it visits the bald eagle.'},
'rule2': {'antecedents': [['something', 'is', 'nice', True],
['something', 'eats', 'squirrel', False]],
'consequent': ['squirrel', 'eats', 'bald eagle', True],
'text': 'If something is nice and it does not eat the squirrel then '
'the squirrel eats the bald eagle.'},
In logical notation, we could write these rules as

$$\forall x:\text{needs}(x,\text{bald eagle})\wedge\neg\text{is}(x,\text{cold})\Rightarrow\text{visits}(x,\text{bald eagle})$$

$$\forall x:\text{is}(x,\text{nice})\wedge\neg\text{eats}(x,\text{squirrel})\Rightarrow\text{eats}(\text{squirrel},\text{bald eagle})$$

In our corpus, the only variable you will ever see is the word something. In order to make it clear that the somethings in different rules refer to possibly different objects, the first thing we need to do is to standardize variables. This is your first task.
At this point, we'll load the file submitted.py
.
The file submitted.py
is the only part of your work that the autograder will see. The only purpose of this notebook is to help you debug submitted.py
. Once you have revised submitted.py
enough to make this notebook work, then you should go to the command line, and type python grade.py
. Once that command returns without errors, then you can go ahead and submit your file submitted.py
to the autograder. You can submit to the autograder as often as you want, but it will save you trouble if you debug as much as you can on your local machine, before you submit to the autograder.
We will use importlib
in order to reload your submitted.py
over and over again. That way, every time you make a modification in submitted.py
, you can just re-run the corresponding block of this notebook, and it will reload submitted.py
with your modified code.
Since the file is called submitted.py
, python considers it to contain a module called submitted
. As shown, you can read the module's docstring by printing submitted.__doc__
. You can also type help(submitted)
to get a lot of information about the module, including its docstring, a list of all the functions it defines, and all of their docstrings. For more about docstrings, see, for example, https://www.python.org/dev/peps/pep-0257/.
import submitted
import importlib
importlib.reload(submitted)
print(submitted.__doc__)
This is the module you'll submit to the autograder. There are several function definitions, here, that raise RuntimeErrors. You should replace each "raise RuntimeError" line with a line that performs the function specified in the function's docstring.
Now it's time for you to open submitted.py
, and start editing it. You can open it in another Jupyter window by choosing "Open from Path" from the "File" menu, and then typing submitted.py
. Alternatively, you can use any text editor.
Once you have it open, try editing the function standardize_variables
so that its functionality matches its docstring. Here is what its docstring says:
importlib.reload(submitted)
help(submitted.standardize_variables)
Help on function standardize_variables in module submitted: standardize_variables(nonstandard_rules) @param nonstandard_rules (dict) - dict from ruleIDs to rules Each rule is a dict: rule['antecedents'] contains the rule antecedents (a tuple of propositions) rule['consequent'] contains the rule consequent (a proposition). @return standardized_rules (dict) - an exact copy of nonstandard_rules, except that the antecedents and consequent of every rule have been changed to replace the word "something" with some variable name that is unique to the rule, and not shared by any other rule. @return variables (list) - a list of the variable names that were created. This list should contain only the variables that were used in rules.
Edit standardize_variables so that it does the task specified in its docstring. When you get the code working, check to make sure that standardize_variables(data_worlds[id]['rules']) works for every world id.
Note: your version of standardize_variables does not need to produce the same variable names as those shown here. You just need to make sure that (1) in any given rule, there is only one unique variable name, and (2) no two rules have instances of the same variable name.
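As a concrete illustration, here is one minimal sketch of standardize_variables. This is just one of many possible approaches, not the reference solution; the x%04d naming scheme is an arbitrary choice.

```python
import copy

def standardize_variables(nonstandard_rules):
    """Replace 'something' with a variable name unique to each rule."""
    standardized_rules = copy.deepcopy(nonstandard_rules)
    variables = []
    for k, rule in enumerate(standardized_rules.values()):
        var = 'x%04d' % k  # a name no other rule will use
        used = False
        # The consequent and every antecedent share the same inner lists,
        # so mutating them in place updates the rule.
        for prop in rule['antecedents'] + [rule['consequent']]:
            for i in (0, 2):  # head and tail positions
                if prop[i] == 'something':
                    prop[i] = var
                    used = True
        if used:
            variables.append(var)
    return standardized_rules, variables

demo = {'rule1': {'antecedents': [['something', 'eats', 'tiger', True]],
                  'consequent': ['something', 'sees', 'tiger', True],
                  'text': 'If something eats the tiger then it sees the tiger.'}}
std, variables = standardize_variables(demo)
print(variables)  # ['x0000']
```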
import pprint
import copy, importlib
importlib.reload(submitted)
worlds = {}
for id in data_worlds.keys():
    worlds[id] = copy.deepcopy(data_worlds[id])
    rules, variables = submitted.standardize_variables(data_worlds[id]['rules'])
    worlds[id]['variables'] = variables
    worlds[id]['rules'] = rules
id = 'RelNoneg-D2-2133'
print('The variables for world %s are:'%(id))
print(worlds[id]['variables'])
print("")
print("... and the standardized rules are:")
pprint.PrettyPrinter(indent=1).pprint(worlds[id]['rules'])
The variables for world RelNoneg-D2-2133 are: ['x0000', 'x0001', 'x0002'] ... and the standardized rules are: {'rule1': {'antecedents': [['x0000', 'eats', 'tiger', True]], 'consequent': ['x0000', 'sees', 'tiger', True], 'text': 'If something eats the tiger then it sees the tiger.'}, 'rule2': {'antecedents': [['x0001', 'visits', 'bald eagle', True]], 'consequent': ['x0001', 'eats', 'tiger', True], 'text': 'If something visits the bald eagle then it eats the ' 'tiger.'}, 'rule3': {'antecedents': [['x0002', 'is', 'rough', True]], 'consequent': ['x0002', 'eats', 'tiger', True], 'text': 'If something is rough then it eats the tiger.'}, 'triple1': {'antecedents': [], 'consequent': ['bald eagle', 'eats', 'tiger', True], 'text': 'The bald eagle eats the tiger.'}, 'triple10': {'antecedents': [], 'consequent': ['tiger', 'is', 'green', True], 'text': 'The tiger is green.'}, 'triple11': {'antecedents': [], 'consequent': ['tiger', 'is', 'kind', True], 'text': 'The tiger is kind.'}, 'triple12': {'antecedents': [], 'consequent': ['tiger', 'is', 'rough', True], 'text': 'The tiger is rough.'}, 'triple13': {'antecedents': [], 'consequent': ['tiger', 'is', 'young', True], 'text': 'The tiger is young.'}, 'triple14': {'antecedents': [], 'consequent': ['tiger', 'sees', 'bald eagle', True], 'text': 'The tiger sees the bald eagle.'}, 'triple15': {'antecedents': [], 'consequent': ['tiger', 'visits', 'bald eagle', True], 'text': 'The tiger visits the bald eagle.'}, 'triple2': {'antecedents': [], 'consequent': ['bald eagle', 'is', 'blue', True], 'text': 'The bald eagle is blue.'}, 'triple3': {'antecedents': [], 'consequent': ['bald eagle', 'is', 'green', True], 'text': 'The bald eagle is green.'}, 'triple4': {'antecedents': [], 'consequent': ['bald eagle', 'is', 'kind', True], 'text': 'The bald eagle is kind.'}, 'triple5': {'antecedents': [], 'consequent': ['bald eagle', 'is', 'rough', True], 'text': 'The bald eagle is rough.'}, 'triple6': {'antecedents': [], 'consequent': ['bald eagle', 
'sees', 'tiger', True], 'text': 'The bald eagle sees the tiger.'}, 'triple7': {'antecedents': [], 'consequent': ['bald eagle', 'visits', 'tiger', True], 'text': 'The bald eagle visits the tiger.'}, 'triple8': {'antecedents': [], 'consequent': ['tiger', 'eats', 'bald eagle', True], 'text': 'The tiger eats the bald eagle.'}, 'triple9': {'antecedents': [], 'consequent': ['tiger', 'is', 'blue', True], 'text': 'The tiger is blue.'}}
The autograder checks to make sure that your standardized rules are identical to the input rules except that (1) in any given rule, there is only one unique variable name, and (2) no two rules have instances of the same variable name.
You can test it by running grade.py
, and looking at the output for test_standardization
:
!python grade.py --json
{ "tests": [ { "name": "test_apply (test_visible.TestStep)", "score": 12.5, "max_score": 12.5 }, { "name": "test_backward_chain (test_visible.TestStep)", "score": 12.5, "max_score": 12.5 }, { "name": "test_standardization (test_visible.TestStep)", "score": 12.5, "max_score": 12.5 }, { "name": "test_unify (test_visible.TestStep)", "score": 12.5, "max_score": 12.5 } ], "leaderboard": [], "visibility": "visible", "execution_time": "0.01", "score": 50.0 }
Unification tests two propositions, a query
and a datum
, to see whether there is any third proposition (the unification
) that would imply both. For example, consider these two propositions:

$$A: \exists x: \text{visits}(x,\text{bald eagle})$$

$$B: \exists y: \text{visits}(\text{squirrel},y)$$
These two propositions are both satisfied by the claim that squirrel visits bald eagle. The unification of these two propositions is
$$C: \text{visits}(\text{squirrel},\text{bald eagle})$$

Notice the direction of implication. Unification means that $C\Rightarrow A$ and $C\Rightarrow B$. The opposite is not true: $A\wedge B\not\Rightarrow C$!
If a pair of propositions cannot be unified, then the unify
function should just return None
, to indicate that no unification is possible. For example, proposition $A$ says that something visits the bald eagle, but suppose we want to test whether or not there is something that does not visit the bald eagle:

$$D: \exists w: \neg\text{visits}(w,\text{bald eagle})$$
The unification of statements $D$ and $B$ is None
, because there is no single proposition we could provide that would prove both $D$ and $B$.
It is possible to unify two statements without resolving all variables. For example, consider:
$$E: \exists y,z: \text{visits}(y,z)$$

The unification of statements $A$ and $E$ is:
$$F: \exists y:\text{visits}(y,\text{bald eagle})$$

Statement $F$ has exactly the same meaning as statement $A$. Nevertheless, it is a valid unification of statements $A$ and $E$: notice that $F\Rightarrow A$ and $F\Rightarrow E$, so we know that the statements are equivalent in that very precise sense.
importlib.reload(submitted)
help(submitted.unify)
Help on function unify in module submitted: unify(query, datum, variables) @param query: proposition that you're trying to match. The input query should not be modified by this function; consider deepcopy. @param datum: proposition against which you're trying to match the query. The input datum should not be modified by this function; consider deepcopy. @param variables: list of strings that should be considered variables. All other strings should be considered constants. Unification succeeds if (1) every variable x in the unified query is replaced by a variable or constant from datum, which we call subs[x], and (2) for any variable y in datum that matches to a constant in query, which we call subs[y], then every instance of y in the unified query should be replaced by subs[y]. @return unification (tuple): unified query, or None if unification fails. @return subs (dict): mapping from variables to values, or None if unification fails. If unification is possible, then answer already has all copies of x replaced by subs[x], thus the only reason to return subs is to help the calling function to update other rules so that they obey the same substitutions. Examples: unify(['x', 'eats', 'y', False], ['a', 'eats', 'b', False], ['x','y','a','b']) unification = [ 'a', 'eats', 'b', False ] subs = { "x":"a", "y":"b" } unify(['bobcat','eats','y',True],['a','eats','squirrel',True], ['x','y','a','b']) unification = ['bobcat','eats','squirrel',True] subs = { 'a':'bobcat', 'y':'squirrel' } unify(['x','eats','x',True],['a','eats','a',True],['x','y','a','b']) unification = ['a','eats','a',True] subs = { 'x':'a' } unify(['x','eats','x',True],['a','eats','bobcat',True],['x','y','a','b']) unification = ['bobcat','eats','bobcat',True], subs = {'x':'a', 'a':'bobcat'} When the 'x':'a' substitution is detected, the query is changed to ['a','eats','a',True]. 
Then, later, when the 'a':'bobcat' substitution is detected, the query is changed to ['bobcat','eats','bobcat',True], which is the value returned as the answer. unify(['a','eats','bobcat',True],['x','eats','x',True],['x','y','a','b']) unification = ['bobcat','eats','bobcat',True], subs = {'a':'x', 'x':'bobcat'} When the 'a':'x' substitution is detected, the query is changed to ['x','eats','bobcat',True]. Then, later, when the 'x':'bobcat' substitution is detected, the query is changed to ['bobcat','eats','bobcat',True], which is the value returned as the answer. unify([...,True],[...,False],[...]) should always return None, None, regardless of the rest of the contents of the query or datum.
Modify the unify
function so that it performs as specified by the docstring. Once you have modified it, you can test it with some of the examples provided in the docstring:
importlib.reload(submitted)
unification, subs = submitted.unify(['x','eats','y',False],['a','eats','b',False],['x','y','a','b'])
print('Unification is:',unification)
print('Subs is:',subs)
Unification is: ['a', 'eats', 'b', False] Subs is: {'x': 'a', 'y': 'b'}
importlib.reload(submitted)
unification, subs = submitted.unify(['bobcat','eats','y',True],['a','eats','squirrel',True],['x','y','a','b'])
print('Unification is:',unification)
print('Subs is:',subs)
Unification is: ['bobcat', 'eats', 'squirrel', True] Subs is: {'a': 'bobcat', 'y': 'squirrel'}
importlib.reload(submitted)
unification, subs = submitted.unify(['x','eats','x',True],['a','eats','bobcat',True],['x','y','a','b'])
print('Unification is:',unification)
print('Subs is:',subs)
Unification is: ['bobcat', 'eats', 'bobcat', True] Subs is: {'x': 'a', 'a': 'bobcat'}
importlib.reload(submitted)
unification, subs = submitted.unify(['a','eats','bobcat',True],['x','eats','x',True],['x','y','a','b'])
print('Unification is:',unification)
print('Subs is:',subs)
Unification is: ['bobcat', 'eats', 'bobcat', True] Subs is: {'a': 'x', 'x': 'bobcat'}
importlib.reload(submitted)
unification, subs = submitted.unify(['a','eats','bobcat',True],['x','eats','x',False],['x','y','a','b'])
print('Unification is:',unification)
print('Subs is:',subs)
Unification is: None Subs is: None
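To make the examples above concrete, here is one possible implementation consistent with the docstring. This is a sketch under the assumption that substitutions may chain (e.g., 'x':'a' followed by 'a':'bobcat'); it is not the reference solution, and your submitted.py may take any approach that matches the specification.

```python
import copy

def unify(query, datum, variables):
    # Propositions with opposite negation flags can never be unified.
    if query[3] != datum[3]:
        return None, None
    q, d = copy.deepcopy(query), copy.deepcopy(datum)
    subs = {}
    for i in (0, 1, 2):
        # Resolve substitutions already made before comparing this position.
        while q[i] in subs:
            q[i] = subs[q[i]]
        while d[i] in subs:
            d[i] = subs[d[i]]
        if q[i] == d[i]:
            continue
        if q[i] in variables:
            subs[q[i]] = d[i]
        elif d[i] in variables:
            subs[d[i]] = q[i]
        else:
            return None, None  # two different constants cannot match
    # Apply the (possibly chained) substitutions to a copy of the query.
    unification = copy.deepcopy(query)
    for i in (0, 1, 2):
        while unification[i] in subs:
            unification[i] = subs[unification[i]]
    return unification, subs

print(unify(['x', 'eats', 'y', False], ['a', 'eats', 'b', False], ['x', 'y', 'a', 'b']))
# (['a', 'eats', 'b', False], {'x': 'a', 'y': 'b'})
```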
In this MP we will be implementing backward-chaining. When we apply a rule, therefore, we will apply it backward. Suppose we have a set of propositions that we are trying to prove; call those the goals
. We test whether or not a rule is useful to us by the following procedure:

1. Try to unify the rule's consequent with one of the goals.
2. If unification succeeds, remove that goal from the list, and add the rule's antecedents (with the unifying substitutions applied) in its place.
For example, suppose we have the following rule:
$$\forall x: \text{is}(\text{squirrel},\text{nice})\wedge\text{is}(x,\text{cold})\Rightarrow\text{visits}(x,\text{squirrel})$$

Suppose that we are trying to prove the following list of goal statements:
$$\text{goals}[0]=\text{visits}(\text{bobcat},\text{bald eagle})$$

$$\text{goals}[1]=\text{visits}(\text{bobcat},\text{squirrel})$$

We can't apply the rule to $\text{goals}[0]$, because the rule talks about visiting the squirrel. We can apply the rule to $\text{goals}[1]$, however. Once we have completed that application, the list of goals now reads:
$$\text{newgoals}[0]=\text{visits}(\text{bobcat},\text{bald eagle})$$

$$\text{newgoals}[1]=\text{is}(\text{squirrel},\text{nice})$$

$$\text{newgoals}[2]=\text{is}(\text{bobcat},\text{cold})$$

If we can find a way to prove newgoals
, that will constitute a proof of goals
.
importlib.reload(submitted)
help(submitted.apply)
Help on function apply in module submitted: apply(rule, goals, variables) @param rule: A rule that is being tested to see if it can be applied This function should not modify rule; consider deepcopy. @param goals: A list of propositions against which the rule's consequent will be tested This function should not modify goals; consider deepcopy. @param variables: list of strings that should be treated as variables Rule application succeeds if the rule's consequent can be unified with any one of the goals. @return applications: a list, possibly empty, of the rule applications that are possible against the present set of goals. Each rule application is a copy of the rule, but with both the antecedents and the consequent modified using the variable substitutions that were necessary to unify it to one of the goals. Note that this might require multiple sequential substitutions, e.g., converting ('x','eats','squirrel',False) based on subs=={'x':'a', 'a':'bobcat'} yields ('bobcat','eats','squirrel',False). The length of the applications list is 0 <= len(applications) <= len(goals). If every one of the goals can be unified with the rule consequent, then len(applications)==len(goals); if none of them can, then len(applications)=0. @return goalsets: a list of lists of new goals, where len(newgoals)==len(applications). goalsets[i] is a copy of goals (a list) in which the goal that unified with applications[i]['consequent'] has been removed, and replaced by the members of applications[i]['antecedents']. 
Example: rule={ 'antecedents':[['x','is','nice',True],['x','is','hungry',False]], 'consequent':['x','eats','squirrel',False] } goals=[ ['bobcat','eats','squirrel',False], ['bobcat','visits','squirrel',True], ['bald eagle','eats','squirrel',False] ] variables=['x','y','a','b'] applications, newgoals = submitted.apply(rule, goals, variables) applications==[ { 'antecedents':[['bobcat','is','nice',True],['bobcat','is','hungry',False]], 'consequent':['bobcat','eats','squirrel',False] }, { 'antecedents':[['bald eagle','is','nice',True],['bald eagle','is','hungry',False]], 'consequent':['bald eagle','eats','squirrel',False] } ] newgoals==[ [ ['bobcat','visits','squirrel',True], ['bald eagle','eats','squirrel',False] ['bobcat','is','nice',True], ['bobcat','is','hungry',False] ],[ ['bobcat','eats','squirrel',False] ['bobcat','visits','squirrel',True], ['bald eagle','is','nice',True], ['bald eagle','is','hungry',False] ] ]
Edit your copy of apply
, and then test it using the example from the docstring:
import pprint
importlib.reload(submitted)
rule={
    'antecedents':[['x','is','nice',True],['x','is','hungry',False]],
    'consequent':['x','eats','squirrel',False]
}
goals=[
    ['bobcat','eats','squirrel',False],
    ['bobcat','visits','squirrel',True],
    ['bald eagle','eats','squirrel',False]
]
variables=['x','y','a','b']
applications, goalsets = submitted.apply(rule, goals, variables)
print('The possible applications of this rule are:')
pprint.PrettyPrinter(indent=1).pprint(applications)
print('The resulting modified goalsets are:')
pprint.PrettyPrinter(indent=1).pprint(goalsets)
The possible applications of this rule are: [{'antecedents': [['bobcat', 'is', 'nice', True], ['bobcat', 'is', 'hungry', False]], 'consequent': ['bobcat', 'eats', 'squirrel', False]}, {'antecedents': [['bald eagle', 'is', 'nice', True], ['bald eagle', 'is', 'hungry', False]], 'consequent': ['bald eagle', 'eats', 'squirrel', False]}] The resulting modified goalsets are: [[['bobcat', 'visits', 'squirrel', True], ['bald eagle', 'eats', 'squirrel', False], ['bobcat', 'is', 'nice', True], ['bobcat', 'is', 'hungry', False]], [['bobcat', 'eats', 'squirrel', False], ['bobcat', 'visits', 'squirrel', True], ['bald eagle', 'is', 'nice', True], ['bald eagle', 'is', 'hungry', False]]]
Backward-chaining is the method of proving a goal statement by finding rules that could be applied in order to prove it.
Backward-chaining is a search algorithm:

- The starting state is a goalset containing just the query.
- Each transition applies some rule backward, replacing one goal with that rule's antecedents.
- The goal state is an empty goalset: every goal has been reduced to known facts.
In backward-chaining, it is possible that your frontier will empty out before you ever reach a goal state. If that happens, it means that it is impossible to prove the query based on the propositions that are known in this world. Because of our closed-world assumption, we pretend that our inability to prove the query is proof of its falsehood, so in that case, your backward_chain
method should return the value proof==None
.
importlib.reload(submitted)
help(submitted.backward_chain)
Help on function backward_chain in module submitted: backward_chain(query, rules, variables) @param query: a proposition, you want to know if it is true @param rules: dict mapping from ruleIDs to rules @param variables: list of strings that should be treated as variables @return proof (list): a list of rule applications that, when read in sequence, conclude by proving the truth of the query. If no proof of the query was found, you should return proof=None.
The goal of this part of the MP is just to come up with an algorithm that can correctly distinguish between questions in which worlds[i]['questions'][j]['answer']==True
and those in which worlds[i]['questions'][j]['answer']==False
. You should feel pretty free to implement this in whatever way you like, e.g., you can use BFS or A* search, you can choose to re-expand previously expanded nodes or not, your frontier can use tuples or you can define a Node class for the frontier, and so on. The key is just to get the right answers.
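To make the search concrete, here is one possible BFS sketch. It inlines minimal versions of unify and apply so that it is self-contained; in your submitted.py these would be your own implementations. Note that this sketch has no depth limit, which you may need in order to guarantee termination on larger worlds, and it is not the reference solution.

```python
import copy
from collections import deque

def unify(query, datum, variables):
    # Minimal unify: fail on negation mismatch, chain substitutions.
    if query[3] != datum[3]:
        return None, None
    q, d, subs = copy.deepcopy(query), copy.deepcopy(datum), {}
    for i in (0, 1, 2):
        while q[i] in subs:
            q[i] = subs[q[i]]
        while d[i] in subs:
            d[i] = subs[d[i]]
        if q[i] == d[i]:
            continue
        if q[i] in variables:
            subs[q[i]] = d[i]
        elif d[i] in variables:
            subs[d[i]] = q[i]
        else:
            return None, None
    unification = copy.deepcopy(query)
    for i in (0, 1, 2):
        while unification[i] in subs:
            unification[i] = subs[unification[i]]
    return unification, subs

def substitute(prop, subs):
    # Apply chained substitutions to one proposition.
    new = list(prop)
    for i in (0, 1, 2):
        while new[i] in subs:
            new[i] = subs[new[i]]
    return new

def apply(rule, goals, variables):
    # Try to unify the rule's consequent with each goal in turn.
    applications, goalsets = [], []
    for i, goal in enumerate(goals):
        unification, subs = unify(rule['consequent'], goal, variables)
        if unification is None:
            continue
        application = copy.deepcopy(rule)
        application['consequent'] = unification
        application['antecedents'] = [substitute(a, subs) for a in rule['antecedents']]
        applications.append(application)
        # Replace the matched goal with the substituted antecedents.
        goalsets.append(copy.deepcopy(goals[:i] + goals[i+1:]) +
                        copy.deepcopy(application['antecedents']))
    return applications, goalsets

def backward_chain(query, rules, variables):
    # BFS over goalsets: a state is (remaining goals, proof so far).
    frontier = deque([([list(query)], [])])
    while frontier:
        goals, proof = frontier.popleft()
        if not goals:
            return proof  # every goal reduced to known facts
        for rule in rules.values():
            applications, goalsets = apply(rule, goals, variables)
            for application, newgoals in zip(applications, goalsets):
                # Prepend, so facts come first and the query's rule comes last.
                frontier.append((newgoals, [application] + proof))
    return None  # frontier emptied: unprovable, hence False under the CWA
```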
import pprint
importlib.reload(submitted)
world = worlds['RelNoneg-D2-2133']
question = world['questions']['Q3']
print(question['text'])
print('Reference answer is:',question['answer'])
print('Reference proof is:',question['proofs'])
proof = submitted.backward_chain(question['query'],world['rules'], world['variables'])
if proof==None:
    print('My code says the answer is: False')
else:
    print('My code says the answer is: True')
    pprint.PrettyPrinter(indent=1).pprint(proof)
The tiger eats the tiger? Reference answer is: True Reference proof is: [(((triple12) -> rule3) OR ((triple15) -> rule2))] My code says the answer is: True [{'antecedents': [], 'consequent': ['tiger', 'visits', 'bald eagle', True], 'text': 'The tiger visits the bald eagle.'}, {'antecedents': [['tiger', 'visits', 'bald eagle', True]], 'consequent': ['tiger', 'eats', 'tiger', True], 'text': 'If something visits the bald eagle then it eats the tiger.'}]
You will be graded only on the questions whose answers are either True
or False
. The questions whose answers are NAF
or CWA
are considered to be technically true, under the closed-world assumption, but the methods for proving their truth are beyond the scope of this MP.
importlib.reload(submitted)
world = worlds['RelNoneg-D2-2133']
for (qid, question) in world['questions'].items():
if question['answer']==True or question['answer']==False:
print(qid, ": ",question['text'])
print(" Reference proof is:",question['proofs'])
print(' Reference answer is:',question['answer'])
proof = submitted.backward_chain(question['query'],world['rules'],world['variables'])
if proof is None:
print(' My code says the answer is: False')
else:
print(' My code says the answer is: True')
pprint.pp(proof,indent=1)
Q1 : The tiger is blue?
  Reference proof is: [(triple9)]
  Reference answer is: True
  My code says the answer is: True
[{'text': 'The tiger is blue.',
  'antecedents': [],
  'consequent': ['tiger', 'is', 'blue', True]}]
Q2 : The tiger is not rough?
  Reference proof is: [(triple12)]
  Reference answer is: False
  My code says the answer is: False
Q3 : The tiger eats the tiger?
  Reference proof is: [(((triple12) -> rule3) OR ((triple15) -> rule2))]
  Reference answer is: True
  My code says the answer is: True
[{'text': 'The tiger visits the bald eagle.',
  'antecedents': [],
  'consequent': ['tiger', 'visits', 'bald eagle', True]},
 {'text': 'If something visits the bald eagle then it eats the tiger.',
  'antecedents': [['tiger', 'visits', 'bald eagle', True]],
  'consequent': ['tiger', 'eats', 'tiger', True]}]
Q4 : The tiger does not eat the tiger?
  Reference proof is: [(((triple12) -> rule3) OR ((triple15) -> rule2))]
  Reference answer is: False
  My code says the answer is: False
Q5 : The tiger sees the tiger?
  Reference proof is: [(((((triple12) -> rule3)) -> rule1) OR ((((triple15) -> rule2)) -> rule1))]
  Reference answer is: True
  My code says the answer is: True
[{'text': 'The tiger visits the bald eagle.',
  'antecedents': [],
  'consequent': ['tiger', 'visits', 'bald eagle', True]},
 {'text': 'If something visits the bald eagle then it eats the tiger.',
  'antecedents': [['tiger', 'visits', 'bald eagle', True]],
  'consequent': ['tiger', 'eats', 'tiger', True]},
 {'text': 'If something eats the tiger then it sees the tiger.',
  'antecedents': [['tiger', 'eats', 'tiger', True]],
  'consequent': ['tiger', 'sees', 'tiger', True]}]
Q6 : The tiger does not see the tiger?
  Reference proof is: [(((((triple12) -> rule3)) -> rule1) OR ((((triple15) -> rule2)) -> rule1))]
  Reference answer is: False
  My code says the answer is: False
If you've reached this point, and all of the above sections work, then you're ready to try grading your homework! Before you submit it to Gradescope, try grading it on your own machine. This will run some visible test cases (which you can read in tests/test_visible.py), and compare the results to the solutions (which you can read in solution.json).
The exclamation point (!) tells Jupyter to run the following as a shell command. Obviously you don't need to run the code this way -- this usage is here just to remind you that you can also, if you wish, run this command in a terminal window.
!python grade.py
....
----------------------------------------------------------------------
Ran 4 tests in 0.004s

OK
If you got any 'E' marks, it means that your code generated some runtime errors, and you need to debug those.
If you got any 'F' marks, it means that your code ran without errors, but that it generated results that did not pass the test assertions listed in tests/test_visible.py. Try debugging those.
If neither of those things happened, and your result was a series of dots, then your code works perfectly.
If you're not sure, you can try running grade.py with the -j option. This will produce a JSON results file, in which the best score you can get is 50.
!python grade.py -j
{
  "tests": [
    {"name": "test_apply (test_visible.TestStep)", "score": 12.5, "max_score": 12.5},
    {"name": "test_backward_chain (test_visible.TestStep)", "score": 12.5, "max_score": 12.5},
    {"name": "test_standardization (test_visible.TestStep)", "score": 12.5, "max_score": 12.5},
    {"name": "test_unify (test_visible.TestStep)", "score": 12.5, "max_score": 12.5}
  ],
  "leaderboard": [],
  "visibility": "visible",
  "execution_time": "0.01",
  "score": 50.0
}
Now you should try uploading submitted.py to Gradescope.
Gradescope will run the same visible tests that you just ran on your own machine, plus some additional hidden tests. It's possible that your code passes all the visible tests, but fails the hidden tests. If that happens, then it probably means that you hard-coded a number into your function definition, instead of using the input parameter that you were supposed to use. Debug by running your function with a variety of different input parameters, and see if you can get it to respond correctly in all cases.
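One way to hunt for such bugs before uploading is to sweep every True/False question in every visible world and collect the cases where your answer disagrees with the reference. The helper below is a sketch: it assumes the worlds dict loaded via reader.py and a backward_chain with the signature from submitted.py, and the function name find_failures is hypothetical.

```python
def find_failures(worlds, backward_chain):
    """Return (world_name, question_id) pairs where backward_chain
    disagrees with the reference answer on a True/False question."""
    failures = []
    for name, world in worlds.items():
        for qid, q in world['questions'].items():
            if q['answer'] in (True, False):
                proof = backward_chain(q['query'], world['rules'],
                                       world['variables'])
                # A found proof means True; no proof means False.
                if (proof is not None) != q['answer']:
                    failures.append((name, qid))
    return failures
```

An empty return value means your code matches the reference on every gradable visible question; any tuples it returns tell you exactly which world and question to debug next.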
Once your code works perfectly on Gradescope, with no errors, then you are done with the MP. Congratulations!