the making of phase 1 of Prolog task due in course git repository by end of day on 7 Mar 2019
1. 28 Feb 2019: WARNING WARNING WILL ROBINSON. In Prolog, there is a difference between writing X = Y + 1 and X is Y + 1. This difference changes the amount of memory you are using. The X = Y + 1 version uses more memory as it is storing a functor + with arguments Y and 1, whereas X is Y + 1 is just storing the value that results from actually doing the edition. This was mentioned briefly before reading week. For a more detailed explanation, see http://www.learnprolognow.org/slides/official/LPNchapter5.pdf . To avoid stack overflow problems, use X is Y + 1 when you are doing something where actual addition makes sense.
Below I show the testing code for phase 1 of the prolog task. Like the Ruby assignment, my sample testing code is in app/tester.pl and your solution code is expected to be in lib/due190307.pl .
TOC For this document:
1) Running the assignment
2) Automating the testing of your prolog
3) New Regular Expression Notation
4) Phase 1 tester.pl code
4.1) Preliminary Comments on Task
4.2) The Testing Code
5) PHASE 1 LAUNDRY LIST:
6) Hint on Using Prolog with Phase 1 of this assignment
1) Running the assignment
Unlike Ruby, instead of running the testing code from the app directory, one brings together the testing and solution by typing:
which puts you in the prolog interpreter where you can explore your code. At the final stage of checking my phase 1 solution, a session with the interpreter might look like:
gprolog consult lib/due190307.pl consult app/tester.pl
| ?- test_count_successful_tests(N).
test_count_successful_tests(N).
N= 2
(3 ms) yes
| ?- test_example_run_tests(S).
test_example_run_tests(S).
S = [[97],letter concat (letter star),my_succeed] ? ;
;
S = [[97],letter concat (letter star),my_succeed] ?
(2 ms) yes
| ?- exit.
exit.
uncaught exception: error(existence_error(procedure,exit/0),top_level/0)
| ?- quit.
quit.
uncaught exception: error(existence_error(procedure,quit/0),top_level/0)
| ?- C-c C-c
Prolog interruption (h for help) ? e
e
Sigh, for some reason I frequently forget how to exit prolog it turns out if you hit cntrl-c twice it brings up the interruption prompt to which e will exit (and h will remind you when you forget e).
2) Automating the testing of your prolog
Of course, hopefully by now you know the benefits of being able to automate your testing rather than retype it over and over. The following works:
echo test_count_successful_tests(A). | gprolog consult lib/due190307.pl consult app/tester.pl
I found the following predicates useful while testing and so provide them to you:
test_count_successful_tests(N) which tells you how many of your tests succeeded
test_example_run_tests(Status) which shows you which tests succeeded
test_debug_run_tests(Status) which shows you which tests succeeded and what the NFA was that you built as well as what the input string was transformed to (this latter being done by my provided code).
As with the Ruby code, I had to write something that would match a string against an NFA.
test_nfa_match(String, nfa(start(State),X,Y)) matches a string against an NFA
test_nfa_match_helper(String, nfa(start(State),X,Y), State) makes test_nfa_match easier to write by adding a specification of which state to start
the match from (so matching substrings is like matching the whole string, just with different starts)
test_categorize converts strings to lists of categories
test_category converts a character to a category
test_category_helper converts all the characters to categories except those going to the category undefined (Z in the Ruby notation)
The actual test cases are presented like:
In the Ruby assignment, this would have been equivalent to:
In the test_case , my_fail is used when the match should fail and my_succeed is used when the match should succeed.
3) New Regular Expression Notation
You will notice that the regular expression notation has changed from the Ruby assignment. This is because Prolog handles the conversion from infix notation to a tree for us. Consider the following:
test_case(a,letter concat (letter star), my_succeed).
test_case = (.L(|e(+L)))
target = a
test_case_tree =PrefixToTree.new(test_case).to_tree
test_case_nfa = TreeToNFA.new(test_case_tree).to_nfa
assert(NFA can match a against (.L(|e(+L))),
NFAScanner.new(test_case_nfa).match(target))
unix: cat directives.pl
:- op(300, yfx, concat).
:- op(500, yfx, or).
:- op(400, yf, plus).
:- op(400, yf, star).
:- op(400, yf, optional).
unix: gprolog consult directives.pl
GNU Prolog 1.4.5 (64 bits)
| ?- X = letter concat ( letter star ), X = concat(Y).
X = letter concat ( letter star ), X = concat(Y).
no
| ?- X = letter concat ( letter star ), X = concat(Y,Z).
X = letter concat ( letter star ), X = concat(Y,Z).
X = letter concat (letter star)
Y = letter
Z = letter star
yes
| ?- X = letter concat ( letter star ), X = concat(Y,star(Z)).
X = letter concat ( letter star ), X = concat(Y,star(Z)).
X = letter concat (letter star)
Y = letter
Z = letter
(1 ms) yes
| ?- X = letter concat ( letter star ), X = concat(Y,Z), Z = star(W).
X = letter concat ( letter star ), X = concat(Y,Z), Z = star(W).
W = letter
X = letter concat (letter star)
Y = letter
Z = letter star
yes
so, basically, using the ops directives in directives.pl (which are also in the provided tester.pl), when I type
prolog is automatically converting it into a tree notation of
The prolog tree notation looks a lot like the prefix lisp notation of the Ruby assignment except that the operator is just before the associated parenthesis contained parameters rather than being the first item in the parenthesis list. In prolog language this is a complex term. In our Ruby notation, . could take an unlimited number of parameters, but in our prolog notation being derived from infix, concat takes two parameters. However, prolog also handles associativity of binary operators so
letter concat ( letter star )
concat(letter, star(letter))
| ?-X = a concat b concat c, X = concat(Y,Z).
X = a concat b concat c, X = concat(Y,Z).
X = a concat b concat c Y = a concat b
Z= c
(1 ms) yes
| ?- X = a concat b concat c, X = concat(concat(A,B),C).
X = a concat b concat c, X = concat(concat(A,B),C).
A= a
B= b
C= c
X = a concat b concat c
yes | ?-
You can use parenthesis as you normally would in the infix version and they will disappear in the complex term version (although they will have, in the process, guided the shape of the complex term).
The operators to be supported in the assignment are:
The leaves of our regular expressions in this assignment will be
:- op(300, yfx, concat)this is . in the Ruby assignment restricted to binary form (binds tighter than other
operators)
:- op(500, yfx, or)
operators)
:- op(400, yf, plus)
:- op(400, yf, star)
:- op(400, yf, optional) this is ? from the textbook as a suffix operator binding between concat and or
this is | in the Ruby assignment restricted to binary form (binds looser than other
this is + in the Ruby assignment as a suffix operator binding between concat and or
this is * from the textbook as a suffix operator binding between concat and or
epsilon (e in Ruby assignment)
letter (L in Ruby assignment)
digit (D in Ruby assignment)
white_space (new handling and
)
question_mark (Q in Ruby assignment)
exclamation_mark (E in Ruby assignment)
period (P in Ruby assignment)
comma (C in Ruby assignment)
colon (K in Ruby assignment)
semicolon (new handlling 😉
minus_sign (M in Ruby assignment)
plus_operator (A in Ruby assignment)
binary_operators (B in Ruby assignment)
left_parenthesis (X in Ruby assignment)
right_parenthesis (Y in Ruby assignment)
equal (S in Ruby assignment)
less_than (T in Ruby assignment)
greater_than (U in Ruby assignment)
undefined (Z in Ruby assignment)
Prologs operator notation takes care of creating trees for regular expressions for us, so phase 1 starts with considering how to convert a regular expression tree to an NFA. As with the ruby assignment, testing is done by seeing if the resulting NFAs except the same strings as the regular expressions would have.
4) Phase 1 tester.pl code tester.pl contains:
4.1) Preliminary Comments on Task
% note: all the predicates defined in this file start with the prefix test_
% you are not allowed to reference any of these predicates in your own
% solution.if test_ appears anywhere in your lib/due190307.pl file
% (even in a comment or string constant), it will be assumed you are
% trying to compromise the testing process and marked accordingly.
% A scanner will be defined as a list of scanner terms
% % % % % %
where a scanner term is scanner(Token, RegularExpression)
where Token is the name the scanner returns if RegularExpression
matches an input string.if there are multiple such matches,
the Token in the first scanner term in the list to match is
the one that should match.
where RegularExpression is a regular expression term defined later.
% from NFA definition in text:
% corresponds to term nfa(Start, States, Transitions)
% Start is a state
% States is a list of state terms
% Transitions is a list of next terms
% a state term has the form state(Name, Status)
% where Name is a name given the state
% Status is either
%not_accepting if it is not an accepting state
%Token if it is an accepting state, when which Token
%
% a next term has the form next(From, To, Category)
% where From is a state term in States
% To is a state term in States
% Category is a character category or the term epsilon
% the categories will be
% letter
% digit
% white_space
% question_mark
% exclamation_mark
% period
% comma
% colon
% semicolon
% minus_sign
% plus_operator
% binary_operatorsfor* / % ^
% left_parenthesis
% right_parenthesis
% equal
% less_than
% greater_than
% undefined
%the regular expression operators will be:
%concat for concatenation as a binary infix operator
%plus for one or more copies of its parameter as a postfix operator
%star for zero or more copies of its parameter as a postfix operator
%opertional for zero or one copies of its parameter as a postfix operator
%or for from one or the other as a binary infix operator
%(they are suffixed with underscore to avoid conflicts with existing operators)
has been matched at this point.
As noted in Re: the making of phase 1 of Prolog task due in course git repository by end of day on 7 Mar 2019 ,
These were notes I wrote before doing the implementation and so have drifted a bit from the actual testing code, however, the nfa term still has
three parts:
The first part however, is not a State, but the term start(State) where State is filled in with the `name of the start state. This structure is used by my test_nfa_match predicate to find the start of the automata that is being matched against.
The States list maps nicely to the book definition of an NFA. However, my testing code never checks directly to see what you put in there. Keeping a list of state(Name,Status) values there would work fine, but it is over-specification in the sense that you could store other structures there that could also work fine. The main thing is that somewhere there has to be the information that allows your accepting predicate to tell if a given state is an accepting state or not. And in phase 2 (and 3), for the accepting states, you also need to know what token type they are accepting.
The transitions() structure matters and is used by test_nfa_match_helper to explore paths thru the nfa.
As to the structure of a state name, all it has to be is something that = works with. In an instance of an nfa, I would expect it to be grounded, i.e., having no free variables.
4.2) The Testing Code
:- op(300, yfx, concat).
:- op(500, yfx, or).
:- op(400, yf, plus).
:- op(400, yf, star).
:- op(400, yf, optional).
test_categorize([], []).
test_categorize([H_in|T_in], [H_result| T_result]) :- test_category(H_in, H_result), test_categorize(T_in,
T_result).
test_category(X,Y) :- test_category_helper(X,Y).
test_category(X,undefined) :- + test_category_helper(X,_).
test_category_helper(Char, letter) :- [ACode] = a, ACode @=< Char, [ZCode] = “z”, Char @=< ZCode.test_category_helper(Char, letter) :- [ACode] = “A”, ACode @=< Char, [ZCode] = “Z”, Char @=< ZCode.test_category_helper(Char, digit) :- [ZeroCode] = “0”, ZeroCode @=< Char, [NineCode] = “9”, Char @=< NineCode.test_category_helper(Char, white_space) :- [Char] = ” “.test_category_helper(Char, white_space) :- [Char] = ”
“.test_category_helper(Char, question_mark) :- [Char] = “?”.test_category_helper(Char, exclamation_mark) :- [Char] = “!”.test_category_helper(Char, period) :- [Char] = “.”.test_category_helper(Char, comma) :- [Char] = “,”.test_category_helper(Char, colon) :- [Char] = “:”.test_category_helper(Char, semicolon) :- [Char] = “;”.test_category_helper(Char, minus_sign) :- [Char] = “-“.test_category_helper(Char, plus_operator) :- [Char] = “+”.test_category_helper(Char, binary_operators) :- [Char] = “*”.test_category_helper(Char, binary_operators) :- [Char] = “/”.test_category_helper(Char, binary_operators) :- [Char] = “%”.test_category_helper(Char, binary_operators) :- [Char] = “^”.test_category_helper(Char, left_parenthesis) :- [Char] = “(“.test_category_helper(Char, right_parenthesis) :- [Char] = “)”.test_category_helper(Char, equal) :- [Char] = “=”.test_category_helper(Char, less_than) :- [Char] = “<“.test_category_helper(Char, greater_than) :- [Char] = “>.
test_count_successful_tests(N) :- setof(A, test_example_run_tests(A), L), length(L,N).
test_example_run_tests(Status) :-
test_case(InputString, RegularExpression, my_succeed),
Result = my_succeed,
regular_expression_to_nfa(RegularExpression, NFA),
test_categorize(InputString, TransformedString),
test_nfa_match(TransformedString, NFA),
Status = [InputString, RegularExpression, Result].
test_example_run_tests(Status) :-
test_case(InputString, RegularExpression, my_fail),
Result = my_fail,
regular_expression_to_nfa(RegularExpression, NFA),
test_categorize(InputString, TransformedString),
+ test_nfa_match(TransformedString, NFA),
Status = [InputString, RegularExpression, Result].
test_debug_run_tests(Status) :-
test_case(InputString, RegularExpression, my_succeed),
Result = my_succeed,
regular_expression_to_nfa(RegularExpression, NFA),
test_categorize(InputString, TransformedString),
test_nfa_match(TransformedString, NFA),
Status = [InputString, RegularExpression, TransformedString, NFA, Result].
test_debug_run_tests(Status) :-
test_case(InputString, RegularExpression, my_fail),
Result = my_fail,
regular_expression_to_nfa(RegularExpression, NFA),
test_categorize(InputString, TransformedString),
+ test_nfa_match(TransformedString, NFA),
Status = [InputString, RegularExpression, TransformedString, NFA, Result].
test_nfa_match(String, nfa(start(State),X,Y)) :- test_nfa_match_helper(String, nfa(start(State),X,Y), State).
test_nfa_match_helper(, NFA, State) :- epsilon_closure(State, NFA, Closure),
member(PossibleFinish, Closure),
accepting(NFA, PossibleFinish).
test_nfa_match_helper([H|T], nfa(X,Y,transitions(Transitions)), State) :-
member(next(State, NextState, H), Transitions),
test_nfa_match_helper(T, nfa(X,Y, transitions(Transitions)), NextState).
test_nfa_match_helper(String, nfa(X,Y,transitions(Transitions)), State) :-
epsilon_closure(State, nfa(X,Y,transitions(Transitions)), Closure),
member(NextState, Closure),
NextState = State,
member(next(State, NextState, epsilon), Transitions),
test_nfa_match_helper(String, nfa(X,Y,transitions(Transitions)), NextState).
test_case(a,letter concat (letter star), my_succeed).
test_case(a,letter concat letter star, my_fail).
% test_case(a,(letter concat letter) star, my_fail).
% identical toprevious test case after operator processing and so doesnt
% show up as distinct in count of tests passed.
5) PHASE 1 LAUNDRY LIST:
what you are writing in due190307.pl for phase 1 is the definition of the predicates
The value of RegularExpression will be bound before the predicate is invoked by the regular expression notation described earlier. Again what your definitions are supposed to do is to cause my code to do what it is supposed to do, i.e., successfully match on the NFA what would successfully match the RegularExpression and fail to match the NFA what would fail to match the RegularExpression.
accepting is true if State is an accepting state in NFA.
In the Ruby assignment, there were constraints on the NFA in terms of methods it had to respond to. Here the restrictions are on the shape of the complex term that represents an NFA. Specifically, it looks like:
where Transitions is a list of next terms so that I can do:
and Closure that epsilon_closure produces is also a list that supports:
regular_expression_to_nfa(RegularExpression, NFA)
epsilon_closure(State, NFA, Closure)
accepting(NFA, State)
nfa(start(State),Y,transitions(Transitions))
member(next(State, NextState, H), Transitions)
member(next(State, NextState, epsilon), Transitions)
member(NextState, Closure)
6) Hint on Using Prolog with Phase 1 of this assignment
Note, prolog builtins used in the test code are:
:-
,
op
=
=
[]
[|]
[,]
% comments
+
string
@=<setoflengthmember Note the prolog builtins used in my solution code for phase 1 were::-,_==[][,][|]is evaluates complex term+if functor of complex term evaluates to addition1evaluates to 1appendsortmemberfindall sortof that leaves in duplicates The builtins are defined in http://www.gprolog.org/manual/gprolog.html . A lot of other stuff is in there too. Most of it bad for an assignment like this one, but useful for other more complicated tasks. Remember prolog is about separating logic from control and in this assignment there should be very little reason to specify control (hence most of the classic impure parts of prolog are not listed among the builtins above. But, as with the Ruby assignment, badly written code that follows the assignment restrictions and passes the test data will be marked as if it were well written code.
Reviews
There are no reviews yet.