830 likes | 974 Vues
This document outlines the content from Week 5 of the Prolog for Linguists course, given by John Dowding at Stanford University on November 5, 2001. It includes information about office hours, reserved workstations, and the course schedule, alongside an introduction to red and green cuts in Prolog, input/output operations, stream management, character representation, and related built-in predicates. It also contains examples of defining predicates such as `delete_all`, `classify_term`, and `palindrome`, enhancing understanding of Prolog's symbolic systems.
E N D
Prolog for Linguists Symbolic Systems 139P/239P John Dowding Week 5, Novembver 5, 2001 jdowding@stanford.edu
Office Hours • We have reserved 4 workstations in the Unix Cluster in Meyer library, fables 1-4 • 4:30-5:30 on Thursday this week • Or, contact me and we can make other arrangements
Course Schedule • Oct. 8 • Oct. 15 • Oct. 22 • Oct. 29 • Nov. 5 (double up) • Nov. 12 • Nov. 26 (double up) • Dec. 3 No class on Nov. 19
More about cut! • Common to distinguish between red cuts and green cuts • Red cuts change the solutions of a predicate • Green cuts do not change the solutions, but effect the efficiency • Most of the cuts we have used so far are all red cuts %delete_all(+Element, +List, -NewList) delete_all(_Element, [], []). delete_all(Element, [Element|List], NewList) :- !, delete_all(Element, List, NewList). delete_all(Element, [Head|List], [Head|NewList]) :- delete_all(Element, List, NewList).
Green cuts • Green cuts can be used to avoid unproductive backtracking % identical(?Term1, ?Term2) identical(Var1, Var2):- var(Var1), var(Var2), !, Var1 == Var2. identical(Atomic1,Atomic2):- atomic(Atomic1), atomic(Atomic2), !, Atomic1 == Atomic2. identical(Term1, Term2):- compound(Term1), compound(Term2), functor(Term1, Functor, Arity), functor(Term2, Functor, Arity), identical_helper(Arity, Term1, Term2).
Input/Output of Terms • Input and Output in Prolog takes place on Streams • By default, input comes from the keyboard, and output goes to the screen. • Three special streams: • user_input • user_output • user_error • read(-Term) • write(+Term) • nl
Example: Input/Output • repeat/0 is a built-in predicate that will always resucceed % classifing terms classify_term :- repeat, write('What term should I classify? '), nl, read(Term), process_term(Term), Term == end_of_file.
Streams • You can create streams with open/3 open(+FileName, +Mode, -Stream) • Mode is one of read, write, or append. • When finished reading or writing from a Stream, it should be closed with close(+Stream) • There are Stream-versions of other Input/Output predicates • read(+Stream, -Term) • write(+Stream, +Term) • nl(+Stream)
Characters and character I/O • Prolog represents characters in two ways: • Single character atoms ‘a’, ‘b’, ‘c’ • Character codes • Numbers that represent the character in some character encoding scheme (like ASCII) • By default, the character encoding scheme is ASCII, but others are possible for handling international character sets. • Input and Output predicates for characters follow a naming convention: • If the predicate deals with single character atoms, it’s name ends in _char. • If the predicate deals with character codes, it’s name ends in _code. • Characters are character codes is traditional “Edinburgh” Prolog, but single character atoms were introduced in the ISO Prolog Standard.
Special Syntax I • Prolog has a special syntax for typing character codes: • 0’a is a expression that means the character codc that represents the character a in the current character encoding scheme.
Special Syntax II • A sequence of characters enclosed in double quote marks is a shorthand for a list containing those character codes. • “abc” = [97, 98, 99] • It is possible to change this default behavior to one in which uses single character atoms instead of character codes, but we won’t do that here.
Built-in Predicates: • atom_chars(Atom, CharacterCodes) • Converts an Atom to it’s corresponding list of character codes, • Or, converts a list of CharacterCodes to an Atom. • put_code(Code) and put_code(Stream, Code) • Write the character represented by Code • get_code(Code) and get_code(Stream, Code) • Read a character, and return it’s corresponding Code • Checking the status of a Stream: • at_end_of_file(Stream) • at_end_of_line(Stream)
Review homework problems: last/2 % last(?Element, ?List) last(Element, [Element]). last(Element, [_Head|Tail]):- last(Element, Tail). Or last(Element, List):- append(_EverthingElse, [Element], List).
evenlist/1 and oddlist/1 %evenlist(?List). evenlist([]). evenlist([_Head|Tail]):- oddlist(Tail). %oddlist(+List) oddlist([_Head|Tail]):- evenlist(Tail).
palindrome/1 %palindrome1(+List). palindrome1([]). palindrome1([_OneElement]). palindrome1([Head|Tail]):- append(Rest, [Head], Tail), palindrome1(Rest).
Or, palindrome/1 %palindrome(+List) palindrome(List):- reverse(List, List). %reverse(+List, -ReversedList) reverse(List, ReversedList):- reverse(List, [], ReversedList). %reverse(List, Partial, ReversedList) reverse([], Result, Result). reverse([Head|Tail], Partial, Result):- reverse(Tail, [Head|Partial], Result).
subset/2 %subset(?Set, ?SubSet) subset([], []). subset([Element|RestSet], [Element|RestSubSet]):- subset(RestSet, RestSubSet). subset([_Element|RestSet], SubSet):- subset(RestSet, SubSet).
union/3 %union(+Set1, +Set2, -SetUnion) union([], Set2, Set2). union([Element|RestSet1], Set2, [Element|SetUnion]):- union(RestSet1, Set2, SetUnion), \+ member(Element, SetUnion), !. union([_Element|RestSet1], Set2, SetUnion):- union(RestSet1, Set2, SetUnion).
intersect/3 %intersect(+Set1, +Set2, ?Intersection) intersect([], _Set2, []). intersect([Element|RestSet1], Set2, [Element|Intersection]):- member(Element, Set2), !, intersect(RestSet1, Set2, Intersection). intersect([_Element|RestSet1], Set2, Intersection):- intersect(RestSet1, Set2, Intersection).
split/4 %split(+List, +SplitPoint, -Smaller, -Bigger). split([], _SplitPoint, [], []). split([Head|Tail], SplitPoint, [Head|Smaller], Bigger):- Head =< SplitPoint, !, % green cut split(Tail, SplitPoint, Smaller, Bigger). split([Head|Tail], SplitPoint, Smaller, [Head|Bigger]):- Head > SplitPoint, split(Tail, SplitPoint, Smaller, Bigger).
merge/3 %merge(+List1, +List2, -MergedList) merge([], List2, List2). merge(List1, [], List1). merge([Element1|List1], [Element2|List2], [Element1|MergedList]):- Element1 =< Element2, !, merge(List1, [Element2|List2], MergedList). merge(List1, [Element2|List2], [Element2|MergedList]):- merge(List1, List2, MergedList).
Sorting: quicksort/2 % quicksort(+List, -SortedList) quicksort([], []). quicksort([Head|UnsortedList], SortedList):- split(UnsortedList, Head, Smaller, Bigger), quicksort(Smaller, SortedSmaller), quicksort(Bigger, SortedBigger), append(SortedSmaller, [Head|SortedBigger], SortedList).
Sorting: mergesort/2 % mergesort(+List, -SortedList). mergesort([], []). mergesort([_One], [_One]):- !. mergesort(List, SortedList):- break_list_in_half(List, FirstHalf, SecondHalf), mergesort(FirstHalf, SortedFirstHalf), mergesort(SecondHalf, SortedSecondHalf), merge(SortedFirstHalf, SortedSecondHalf, SortedList).
Merge sort helper predicates % break_list_in_half(+List, -FirstHalf, -SecondHalf) break_list_in_half(List, FirstHalf, SecondHalf):- length(List, L), HalfL is L /2, first_n(List, HalfL, FirstHalf, SecondHalf). % first_n(+List, +N, -FirstN, -Remainder) first_n([Head|Rest], L, [Head|Front], Back):- L > 0, !, NextL is L - 1, first_n(Rest, NextL, Front, Back). first_n(Rest, _L, [], Rest).
Lexigraphic Ordering • We can extending sorting predicates to sort all Prolog terms using a lexigraphic ordering on terms. • Defined recursively: • Variables @< Numbers @< Atoms @< CompoundTerms • Var1 @< Var2 if Var1 is older than Var2 • Atom1 @< Atom2 if Atom1 is alphabetically earlier than Atom2. • Functor1(Arg11, … Arg1N) @< Functor2(Arg21,…, Arg2M) if • Functor1 @< Functor2, or Functor1 = Functor2 and • N @< M, or Functor1=Functor2, N=M, and • Arg11 @< Arg21, or • Arg11 @= Arg21 and Arg12 @< Arg22, or …
Built-in Relations: • Less-than @< • Greater than @> • Less than or equal @=< • Greater than or equal @>= • Built-in predicate sort/2 sorts Prolog terms on a lexigraphic ordering.
Tokenizer • A token is a sequence of characters that constitute a single unit • What counts as a token will vary • A token for a programming language may be different from a token for, say, English. • We will start to write a tokenizer for English, and build on it in further classes
Homework • Read section in SICTus Prolog manual on Input/Output • This material corresponds to Ch. 5 in Clocksin and Mellish, but the Prolog manual is more up to date and consistent with the ISO Prolog Standard • Improve the tokenizer by adding support for contractions • can’t., won’t haven’t, etc. • would’ve, should’ve • I’ll, she’ll, he’ll • He’s, She’s, (contracted is and contracted has, and possessive) • Don’t hand this in, but hold on to it, you’ll need it later.
My tokenizer • First, I modified to turn all tokens into lower case • Then, added support for integer tokens • Then, added support for contraction tokens
Converting character codes to lower case % occurs_in_word(+Code, -LowerCaseCode) occurs_in_word(Code, Code):- Code >= 0'a, Code =< 0'z. occurs_in_word(Code, LowerCaseWordCode):- Code >= 0'A, Code =< 0'Z, LowerCaseWordCode is Code + (0'a - 0'A).
Converting to lower case % case for regular word tokens find_one_token([WordCode|CharacterCodes], Token, RestCharacterCodes):- occurs_in_word(WordCode, LowerCaseWordCode), find_rest_word_codes(CharacterCodes, RestWordCodes, RestCharacterCodes), atom_chars(Token, [LowerCaseWordCode|RestWordCodes]). find_rest_word_codes(+CharacterCodes, -RestWordCodes, -RestCharacterCodes) find_rest_word_codes([WordCode|CharacterCodes], [LowerCaseWordCode|RestWordCodes], RestCharacterCodes):- occurs_in_word(WordCode, LowerCaseWordCode), !, % red cut find_rest_word_codes(CharacterCodes, RestWordCodes, RestCharacterCodes). find_rest_word_codes(CharacterCodes, [], CharacterCodes).
Adding integer tokens % case for integer tokens find_one_token([DigitCode|CharacterCodes], Token, RestCharacterCodes):- digit(DigitCode), find_rest_digit_codes(CharacterCodes, RestDigitCodes, RestCharacterCodes), atom_chars(Token, [DigitCode|RestDigitCodes]). % find_rest_digit_codes(+CharacterCodes, -RestDigitCodes, -RestCharacterCodes) find_rest_digit_codes([DigitCode|CharacterCodes], [DigitCode|RestDigitCodes], RestCharacterCodes):- digit(DigitCode), !, % red cut find_rest_digit_codes(CharacterCodes, RestDigitCodes, RestCharacterCodes). find_rest_digit_codes(CharacterCodes, [], CharacterCodes).
Digits %digit(+Code) digit(Code):- Code >= 0'0, Code =< 0'9.
Contactions • Turned unambiguous contractions into the corresponding English word • Left ambiguous contractions contracted. • Handled 2 cases • Simple contractions: He’s => He + ‘s He’ll => He + will They’ve => They + have • Exceptions can’t => can + not won’t => will + not
Simple Contractions simple_contraction("'re", "are"). simple_contraction("'m", "am"). simple_contraction("'ll", "will"). simple_contraction("'ve", "have"). simple_contraction("'d", "'d"). % had, would simple_contraction("'s", "'s"). % is, has, possessive simple_contraction("n't", "not").
handle_contractions/2 % handle_contractions(+TokenChars, -FrontTokenChars, RestTokenChars) handle_contractions("can't", "can", "not"):- !. handle_contractions("won't", "will", "not"):- !. handle_contractions(FoundCodes, Front, NewCodes):- simple_contraction(Contraction, NewCodes), append(Front, Contraction, FoundCodes), Front \== [], !.
Modify find_one_token/3 % case for regular word tokens find_one_token([WordCode|CharacterCodes], Token, RestCharacterCodes):- occurs_in_word(WordCode, LowerCaseWordCode), find_rest_word_codes(CharacterCodes, RestWordCodes, TempCharacterCodes), handle_contractions([LowerCaseWordCode|RestWordCodes], FirstTokenCodes, CodesToAppend), append(CodesToAppend, TempCharacterCodes, RestCharacterCodes), atom_chars(Token, FirstTokenCodes).
Dynamic predicates and assert • Add or remove clauses from a dynamic predicate at run time. • To specify that a predicate is dynamic, add :- dynamic predicate/Arity. to your program. • assert/1 adds a new clause • retract/1 removes one or more clauses • retractall/1 removes all clauses for the predicate • Can’t modify compiled predicates at run time • Modifying a program while it is running is dangerous
assert/1, asserta/1, and assertz/1 • Asserting facts (most common) assert(Fact) • Asserting rules assert( (Head :- Body) ). • asserta/1 adds the new clause at the front of the predicate • assertz/1 adds the new clause at the end of the predicate • assert/1 leaves the order unspecified
Built-In: retract/1 • retract(Goal) removes the first clause that matches Goal. • On REDO, it will remove the next matching clause, if any. • Retract facts: retract(Fact) • Retract rules: retract( (Head :- Body) ).
Built-in: retractall/1 • retractall(Head) removes all facts and rules whose head matches. • Could be implemented with retract/1 as: retractall(Head) :- retract(Head), fail. retract(Head):- retract( (Head :- _Body) ), fail. retractall(_Head).
Built-In: abolish(Predicate/Arity) • abolish(Predicate/Arity) is almost the same as retract(Predicate(Arg1, …, ArgN)) except that abolish/1 removes all knowledge about the predicate, where retractall/1 only removes the clauses of the predicate. That is, if a predicate is declared dynamic, that is remembered after retractall/1, but not after abolish/1.
Example: Stacks & Queues :- dynamic stack_element/1. empty_stack :- retractall(stack_selement(_Element)). % push_on_stack(+Element) push_on_stack(Element):- asserta(stack_element(Element)). % pop_from_stack(-Element) pop_from_stack(Element):- var(Element), retract(stack_element(Element)), !.
Queues % dynamic queue_element/1. empty_queue :- retractall(queue_element(_Element)). %put_on_queue(+Element) put_on_queue(Element):- assertz(queue_element(Element)). %remove_from_queue(-Element) remove_from_queue(Element):- var(Element), retract(queue_element(Element)), !.
Example: prime_number. :- dynamic known_prime/1. find_primes(Prime):- retractall(known_prime(_Prime)), find_primes(2, Prime). find_primes(Integer, Integer):- \+ composite(Integer), assertz(known_prime(Integer)). find_primes(Integer, Prime):- NextInteger is Integer + 1, find_primes(NextInteger, Prime).
Example: prime_number (cont) %composite(+Integer) composite(Integer):- known_prime(Prime), 0 is Integer mod Prime, !.
Aggregation: findall/3. • findall/3 is a meta-predicate that collects values from multiple solutions to a Goal: findall(Value, Goal, Values) findall(Child, parent(james, Child), Children) • Prolog has other aggregation predicates setof/3 and bagof/3, but we’ll ignore them for now.
findall/3 and assert/1 • findall/3 and assert/1 both let you preserve information across failure. :- dynamic solutions/1. findall(Value, Goal, Solutions):- retractall(solutions/1), assert(solutions([])), call(Goal), retract(solutions(S)), append(S, [Value], NextSolutions), assert(solutions(NextSolutions)), fail. findall(_Value, Goal, Solutions):- solutions(Solutions).
Special Syntax III: Operators • Convenience in writing terms • We’ve seem them all over already: union([Element|RestSet1], Set2, [Element|SetUnion]):- union(RestSet1, Set2, SetUnion), \+ member(Element, SetUnion), !. This is just an easier way to write the term: ‘:-’(union([Element|RestSet],Set2,[Element|SetUnion]), ‘,’(union(RestSet1,Set2,SetUnion), ‘,’(‘\+’(member(Element, SetUnion), !)))
Operators (cont) • Operators can come before their arguments (prefix) • \+, dynamic • Or between their arguments (infix) • , + is < • Of after their arguments (postfix) • Prolog doesn’t use any of these (yet) • The same Operator can be more than one type • :-