This project specification is subject to change at any time for clarification. For this project you will be writing several Java classes to implement a programming language scanner and a CSV parser. The classes developed in this project will be used in subsequent projects. A Token class has been provided for you, and an interface the Scanner will rely on for input has also been provided. The Scanner class must take in a PeekableCharacterStream and a List of keywords, and must produce a stream of Tokens that are queried from either the peekNextToken or getNextToken methods. The rules for the tokens are defined below. You will also develop the CSVParser class that will parse CSV files into Maps, one for each row.

Your first task is to develop a class that implements the PeekableCharacterStream interface for a FileInputStream.

public interface PeekableCharacterStream{
    // Returns true if more characters are available, false otherwise
    public boolean moreAvailable();

    // Returns the next character that would be returned without consuming
    // the character. If no more characters are available -1 is returned.
    public int peekNextChar();

    // Returns the character ahead in the stream without consuming the
    // character. peekAheadChar(0) returns the same character as
    // peekNextChar(). If no more characters are available at that position
    // -1 is returned.
    public int peekAheadChar(int ahead);

    // Returns the next character and consumes it. If no more characters are
    // available -1 is returned.
    public int getNextChar();

    // Closes the stream.
    public void close();
}

Your second task is to develop the Scanner class that will rely on the PeekableCharacterStream interface and the Token class. The Scanner must have the following minimal interface.

public class Scanner{
    // Constructor that takes in a stream and a list of keywords.
    public Scanner(PeekableCharacterStream stream, List keywordlist);

    // Returns the next token without consuming it.
    // If no more tokens are available a None token is returned.
    public Token peekNextToken();

    // Returns the next token and consumes it. If no more tokens are
    // available a None token is returned.
    public Token getNextToken();
}

ECS140A FQ20 October 7, 2020
This content is protected and may not be shared, uploaded, or distributed.

Token Rules:

Identifier := ( _ | Alpha ) { ( _ | Digit | Alpha ) }
Operator := ( | , | ) | { | } | = | == | < | > | <= | >= | != | + | - | * | / | ;
IntConstant := [ - ] Digit { Digit }
FloatConstant := [ - ] Digit { Digit } [ . Digit { Digit } ]
StringConstant := " { ( CharacterLiteral | EscapedCharacter ) } "
Digit := 0 - 9
Alpha := A - Z | a - z
WhiteSpace := Space | Tab | CarriageReturn | NewLine
CharacterLiteral := Space - ! | # - [ | ] - ~
EscapedCharacter := b | n | r | t | \ | ' | "

Additional Tokenizing Rules:
• Keywords are identifiers that are in the List provided to the Scanner.
• Whitespace must be skipped during tokenizing.
• A negative sign immediately preceding an integer or float constant must be tokenized as an operator if the previous token was a constant or identifier. For example, "A -5" must be tokenized as Identifier "A", Operator "-", and IntConstant "5", not as Identifier "A" and IntConstant "-5".
• If an underscore or Alpha character immediately follows a constant, the token is considered Invalid. All Alpha, Digit, and underscore characters will be part of the Invalid token.
• If an invalid character is in a string constant, the characters are consumed until the next non-escaped " is reached or the end of stream is reached. The token type is considered Invalid.
• Any invalid character beginning a token will be considered an Invalid token by itself. For example, "@#4" must be tokenized as Invalid "@", Invalid "#", and IntConstant "4".

Implementation Requirements:
• You may use java.io.FileInputStream.
• You may use java.util.Set, java.util.HashSet, java.util.Arrays and similar containers.
• You may not use java.util.regex or similar packages.
• You may not use java.util.StringTokenizer or similar library classes.

Your final task is to develop the CSVParser class that will rely on the PeekableCharacterStream interface. The CSVParser must have the following minimal interface.

public class CSVParser{
    // Constructor that takes in a stream.
    public CSVParser(PeekableCharacterStream stream);

    // Returns the next row without consuming it. If no more rows are
    // available null is returned.
    public Map<String,String> peekNextRow();

    // Returns the next row and consumes it. If no more rows are
    // available null is returned.
    public Map<String,String> getNextRow();
}
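As an illustrative sketch of the first task, a PeekableCharacterStream implementation can buffer peeked characters in a queue so arbitrary look-ahead works over any InputStream. The class name, field names, and the compact copy of the interface below are mine, not part of the assignment; your required version must be backed by a FileInputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayDeque;

// Compact copy of the spec's interface so this sketch is self-contained.
interface PeekableCharacterStream {
    boolean moreAvailable();
    int peekNextChar();
    int peekAheadChar(int ahead);
    int getNextChar();
    void close();
}

// Illustrative sketch: buffers peeked characters so look-ahead never loses data.
class BufferedPeekStream implements PeekableCharacterStream {
    private final InputStream in;
    private final ArrayDeque<Integer> peeked = new ArrayDeque<>();

    BufferedPeekStream(InputStream in) { this.in = in; }

    private int readRaw() {
        try { return in.read(); } catch (IOException e) { return -1; }
    }

    // Fill the peek buffer until it holds at least n characters or EOF is hit.
    private void fill(int n) {
        while (peeked.size() < n) {
            int c = readRaw();
            if (c < 0) return;          // stop buffering at end of stream
            peeked.addLast(c);
        }
    }

    public boolean moreAvailable() { fill(1); return !peeked.isEmpty(); }

    public int peekNextChar() { return peekAheadChar(0); }

    public int peekAheadChar(int ahead) {
        fill(ahead + 1);
        if (peeked.size() <= ahead) return -1;
        int i = 0;
        for (int c : peeked) { if (i++ == ahead) return c; }
        return -1;
    }

    public int getNextChar() {
        fill(1);
        return peeked.isEmpty() ? -1 : peeked.removeFirst();
    }

    public void close() {
        try { in.close(); } catch (IOException e) { /* ignore on close */ }
    }
}
```

The one design point worth noting: getNextChar must drain the peek buffer before reading the underlying stream, otherwise characters seen by peekAheadChar would be skipped.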
CSV Format Rules
• CSV files must have a header row, and no column in the header may be repeated or empty.
• Each row is terminated by a newline character.
• Each column is terminated by a comma character.
• Any column containing a whitespace (space, tab, carriage return, or newline) character must be a double quoted (") column. The escape sequence for a double quote in a double quoted column is two double quotes in a row.
• Any empty columns or missing columns will return a value of null for the corresponding value in the returned Map.
• Valid CSV files are not allowed to have more columns in a data row than the header row but may have fewer.

Your Scanner and CSVParser classes must have a main function that takes in a filename as an argument and outputs the contents similar to the examples in /home/cjnitta/ecs140a/proj1 on the CSIF. You can run the provided solutions by running the shell scripts Scanner.sh or CSVParser.sh and providing a filename to open. Your code will be tested on the CSIF and is expected to compile and run on the CSIF.

You must submit the source file(s), a Makefile, and a README.txt file in a tgz archive. Do a make clean prior to zipping up your files so the size will be smaller. You will want to be in the parent directory of the project directory when creating the tgz archive. You can tar gzip a directory with the command:

tar -zcvf archive-name.tgz directory-name

You should avoid using existing source code that is currently available on the Internet as a primer. You must specify in your readme file any sources of code that you have viewed to help you complete this project. You must also provide the URL of any code sources in comments of your source code. All class projects will be submitted to MOSS to determine if students have excessively collaborated. Excessive collaboration, or failure to list external code sources, will result in the matter being referred to Student Judicial Affairs.
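A sketch of the quoting rule above: a quoted column may contain commas and whitespace, and a doubled quote inside a quoted column stands for one literal quote. The helper name below is hypothetical, not part of the required CSVParser interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits one CSV row into column strings, honoring the
// doubled-quote escape described in the format rules above.
class CsvRowSplitter {
    static List<String> split(String row) {
        List<String> cols = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean quoted = false;
        for (int i = 0; i < row.length(); i++) {
            char c = row.charAt(i);
            if (quoted) {
                if (c == '"' && i + 1 < row.length() && row.charAt(i + 1) == '"') {
                    cur.append('"'); i++;   // "" inside quotes -> one literal "
                } else if (c == '"') {
                    quoted = false;          // closing quote
                } else {
                    cur.append(c);
                }
            } else if (c == '"') {
                quoted = true;               // opening quote
            } else if (c == ',') {
                cols.add(cur.toString());    // column terminated by comma
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        cols.add(cur.toString());            // final column
        return cols;
    }
}
```

For example, the row a,"b,c","say ""hi""" splits into three columns: a, then b,c, then say "hi".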
Overview

This project specification is subject to change at any time for clarification. For this project you will be writing a Java program to parse language X. Language X is similar to the C language, but is simpler and slightly different. For this program you will be reading in an X language file and converting it to a syntax highlighted XHTML version. A CSV file will describe the font and color settings for the XHTML output file. The CSV file will have a .csv extension and the X language file will have a .x extension. You will be creating several classes that will solve the various portions of the problem, and reusing the classes developed in Project 1.

X Programming Language

The following EBNF describes the X language. The non-terminals on the right are identified by italics. Literal values are specified in bold. The operators not in bold or italics describe the options of the EBNF. These operators are {} for repetition, [] for optional, () for grouping, and | for or.

EBNF Rules:

Program := {Declaration} MainDeclaration {FunctionDefinition}
Declaration := DeclarationType (VariableDeclaration | FunctionDeclaration)
MainDeclaration := void main ( ) Block
FunctionDefinition := DeclarationType ParameterBlock Block
DeclarationType := DataType Identifier
VariableDeclaration := [= Constant] ;
FunctionDeclaration := ParameterBlock ;
Block := { {Declaration} [(Statement {Statement} {FunctionDefinition})] }
ParameterBlock := ( [Parameter {, Parameter}] )
DataType := IntegerType | FloatType
Constant := IntConstant | FloatConstant
Statement := Assignment | WhileLoop | IfStatement | ReturnStatement | (Expression ;)
Parameter := DataType Identifier
IntegerType := [unsigned] ( char | short | int | long )
FloatType := float | double
Assignment := Identifier = {Identifier =} Expression ;
WhileLoop := while ( Expression ) Block
IfStatement := if ( Expression ) Block
ReturnStatement := return Expression ;
Expression := SimpleExpression [ RelationOperator SimpleExpression ]
SimpleExpression := Term { AddOperator Term }
Term := Factor { MultOperator Factor }
Factor := ( ( Expression ) ) | Constant | (Identifier [ ( [ Expression {, Expression}] ) ] )
RelationOperator := ( == ) | < | > | ( <= ) | ( >= ) | ( != )
AddOperator := + | -
MultOperator := * | /

ECS140A FQ20 October 22, 2020
This content is protected and may not be shared, uploaded, or distributed.

Token Rules:

Identifier := ( _ | Alpha ) { ( _ | Digit | Alpha ) }
IntConstant := [ - ] Digit { Digit }
FloatConstant := [ - ] Digit { Digit } [ . Digit { Digit } ]
Digit := 0 - 9
Alpha := A - Z | a - z

The X programming language allows for nested function declarations. Variables and functions must be declared before they are referenced or called. Identifiers have the same scoping rules as C. An identifier may only be declared once per block but may be declared more than once per file.

Output XHTML

The output XHTML will output the X language file with fonts and colors specified by the CSV file. All token types that are not specified in the CSV file will be assumed to be the same as the browser defaults. The token classes are IntConstants, FloatConstants, Keywords, Operators, Variables, and Functions. The keywords of language X are: unsigned, char, short, int, long, float, double, while, if, return, void, and main. The operators of language X are: (, ,, ), {, }, =, ==, <, >, <=, >=, !=, +, -, *, /, and ;. All Variable and Function references must link back to the declaration of the Variable or Function. Each nested block should be indented another level; the amount of indentation will be 4 spaces.

Input CSV File

The input CSV description file describes the formatting to be used for the XHTML file. The default will be specified by the DEFAULT ELEMENT_TYPE. The attribute types are: BACKGROUND, FOREGROUND, and STYLE. The DEFAULT BACKGROUND, FOREGROUND, and STYLE attributes set the global background color, foreground color and font face for the output XHTML.
The indentation is 4 spaces for each nested block. The FOREGROUND and STYLE may be specified for the token types: FUNCTION, VARIABLE, FLOAT_CONSTANT, INT_CONSTANT, OPERATOR, and KEYWORD. Below is an example of a CSV formatting file.

ELEMENT_TYPE,BACKGROUND,FOREGROUND,STYLE,FONT
DEFAULT,navy,yellow,,"Courier New"
FUNCTION,,orange,,
VARIABLE,,yellow,,
FLOAT_CONSTANT,,aqua,b,
INT_CONSTANT,,aqua,b,
OPERATOR,,white,b,
KEYWORD,,white,b,

Suggested Approach

1. Write a class that will open and read the .csv file as specified on the command line. You should use your CSVParser class from Project 1. Make sure you can determine the formatting for each element type.

2. Use your Scanner class from Project 1 to write a recursive descent parser class that will parse the .x file by making repeated calls to the getNextToken() method. (You may need to use peekNextToken() to look ahead in some instances.) This class should implement one method per EBNF rule. It should also be able to validate that the .x file conforms to the grammar specified. If an error is found in the .x file, the line number of the error and the grammar rule where the error was found should be printed with an appropriate message. A working example is available on the CSIF to test against.

3. Using the classes written in steps 1 and 2, write a program that reads in a .x source file and a .csv file and outputs the XHTML formatting. The colors and styles of tokens in the XHTML file should match the specification in the .csv file. In addition, references to variables and functions in the source file should be hyperlinks back to their respective declarations. Do not worry about matching overloaded functions with the appropriate call; just link back to the function that would be in current scope as if it were a variable. The XHTML should be valid HTML that can be loaded into any web browser and viewed.
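The one-method-per-rule pattern in step 2 can be sketched as below. This stub parses only WhileLoop := while ( Expression ) Block against a flat token list, with Expression and Block reduced to trivial stubs; all names are illustrative, and a real parser would consume Tokens from your Scanner instead.

```java
import java.util.List;

// Illustrative recursive descent stub: one method per grammar rule, each
// consuming tokens left to right and reporting whether the rule matched.
class MiniParser {
    private final List<String> toks;
    private int pos = 0;
    MiniParser(List<String> toks) { this.toks = toks; }

    // Consume the expected literal token, or fail without consuming.
    private boolean expect(String t) {
        if (pos < toks.size() && toks.get(pos).equals(t)) { pos++; return true; }
        return false;
    }

    // Stub: treats any single non-")" token as an expression.
    boolean parseExpression() {
        if (pos >= toks.size() || toks.get(pos).equals(")")) return false;
        pos++;
        return true;
    }

    // Stub: an empty block only.
    boolean parseBlock() { return expect("{") && expect("}"); }

    // WhileLoop := while ( Expression ) Block
    boolean parseWhileLoop() {
        return expect("while") && expect("(") && parseExpression()
            && expect(")") && parseBlock();
    }
}
```

The real methods would record line numbers on failure so the error reporting required above can name the offending rule.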
You will probably need to begin your XHTML with an XML declaration and DOCTYPE line. You may also need to have attributes (such as xmlns) in your html element. Examples of valid .x and .csv input files are in /home/cjnitta/ecs140a/proj2 on the CSIF. Scripts that run the example classes can be found there too. All scripts take one argument, except for XFormatter.sh which takes two arguments: the format CSV file and then the X file to format.

Your code will be tested on the CSIF and is expected to compile and run on the CSIF. You must submit the source file(s), a Makefile, and a README.txt file in a tgz archive. Do a make clean prior to zipping up your files so the size will be smaller. You will want to be in the parent directory of the project directory when creating the tgz archive. You can tar gzip a directory with the command:

tar -zcvf archive-name.tgz directory-name

You should avoid using existing source code that is currently available on the Internet as a primer. You must specify in your readme file any sources of code that you have viewed to help you complete this project. You must also provide the URL of any code sources in comments of your source code. All class projects will be submitted to MOSS to determine if students have excessively collaborated. Excessive collaboration, or failure to list external code sources, will result in the matter being referred to Student Judicial Affairs.

Helpful Hints
• The use of peekNextToken() will be helpful in some instances where it isn't clear which rule should be parsed next. The language is designed so that only a single token of look ahead is necessary.
• Obviously, a hash table of some form should be used for the symbol table, but there are two clear options for handling of the scope blocks. Either a stack (or list) of hash tables can be used for the blocks, or a hash table of list values could be used.
Using a stack of hash tables makes the cleanup of the scope easy when the block ends, but requires traversing through the outer scope hash tables if the symbol is not found in the inner table. Using a hash table of list values makes finding the correct symbol easy, but complicates the scope cleanup when the block ends.
• In order to implement the X formatter there are two general approaches. One is to use an observer pattern (https://en.wikipedia.org/wiki/Observer_pattern) that observes the parsing of your class developed in step 2. You can create an observer interface that has methods to notify of the beginning/ending of parsing and rules, as well as the parsing of a token and whether there is an error. The formatter would then just need to implement the observer interface. Another option is to subclass the X parser and call the equivalent super method after any pre work is completed; any post work can be done after the method returns. There are advantages/disadvantages to both approaches.
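The stack-of-hash-tables option discussed above can be sketched as follows; the class name and API are illustrative, not a required interface.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;

// Illustrative stack-of-hash-tables symbol table: one HashMap per open block.
class SymbolTable {
    private final Deque<HashMap<String, String>> scopes = new ArrayDeque<>();

    void enterBlock() { scopes.push(new HashMap<>()); }
    void exitBlock()  { scopes.pop(); }       // scope cleanup is a single pop

    // Declare in the innermost scope; false means redeclaration in this block.
    boolean declare(String name, String info) {
        return scopes.peek().put(name, info) == null;
    }

    // Search from innermost to outermost scope, matching C's scoping rules.
    String lookup(String name) {
        for (HashMap<String, String> scope : scopes) {
            String info = scope.get(name);
            if (info != null) return info;
        }
        return null;
    }
}
```

ArrayDeque iterates from the most recently pushed scope outward, which is exactly the inner-to-outer search order the hint describes.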
Overview
This project specification is subject to change at any time for clarification. For this
programming assignment you will be programming in LISP. This entire project must be
purely functional; you are not allowed to use setq, loop or similar constructs. Points
may be docked for use of such constructs.
Fibonacci Sequence Less Than N
Write a function fibo-lt to return a list of the Fibonacci sequence such that all numbers
in the sequence are less than n (NOT the first n numbers of the sequence). For example:
> (fibo-lt '50)
(0 1 1 2 3 5 8 13 21 34)
Your function must run in sublinear time and may not hardcode results except for those at
the beginning of the sequence such as '(), '(0) and '(0 1).
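One purely functional shape for fibo-lt is a sketch like the following; the helper name is arbitrary and this is not the only acceptable answer.

```lisp
; Sketch: walks the sequence pair-by-pair, consing each value below n.
; Purely functional: no setq, no loop.
(defun fibo-lt (n)
  (labels ((collect (a b)
             (if (>= a n)
                 '()
                 (cons a (collect b (+ a b))))))
    (collect 0 1)))
```

Here (fibo-lt 50) evaluates to (0 1 1 2 3 5 8 13 21 34), matching the example above.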
Pattern Matching Program
Before we start building the pattern matching function, let us first build a set of routines
that will allow us to represent facts, called assertions. For instance, we can define the
following assertions:
(this is an assertion)
(color apple red)
(supports table block1)
Patterns are like assertions, except that they may contain certain special atoms not allowed
in assertions (the single ! character, for instance), or may have strings containing the *
character.
(color ! red)
(su*ts table block1)
Write a function match, which compares a pattern and an assertion. When a pattern
containing no special atoms is compared to an assertion, the two match only if they are
exactly the same, with each corresponding position occupied by the same atom.
> (match '(color apple red) '(color apple red))
T
> (match '(color apple red) '(color apple green))
NIL
ECS140A FQ20 September 28, 2020
This content is protected and may not be shared, uploaded, or distributed.
Project 3
The special symbol '!' expands the capability of match by matching zero or more atoms.
> (match '(! table !) '(this table supports a block))
T
Here, the first symbol '!' matches this, table matches table, and the second symbol '!'
matches supports a block.
> (match '(this table !) '(this table supports a block))
T
> (match '(! brown) '(green red brown yellow))
NIL
In the last example, the special symbol '!' matches 'green red'. However, the match
fails because yellow occurs in the assertion after brown, whereas it does not occur in
the pattern. However, the following example succeeds:
> (match '(! brown) '(green red brown brown))
T
In this example, '!' matches the list (green red brown), whereas the brown matches the
last element.
> (match '(red green ! blue) '(red green blue))
T
In this example, the '!' matches the empty list. The * character matches zero or more
characters inside a string.
> (match '(red gr*n blue) '(red green blue))
T
> (match '(t* table is *n) '(this table is blue))
NIL
In the first example the * matches ee. In the second example the first * matches his,
but the second one fails to match because of the n. The lone * will match any single atom.
> (match '(color apple *) '(color apple red))
T
> (match '(color * red) '(color apple red))
T
> (match '(color * red) '(color apple green))
NIL
In the last example, (color * red) and (color apple green) do not match because
red and green do not match.
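The zero-or-more behavior of '!' is the interesting recursion. Below is a sketch of just that case; it ignores the * string wildcard entirely, and the function name is deliberately different from the required match.

```lisp
; Sketch of the ! wildcard: first try letting ! match zero atoms, then let it
; consume one atom of the assertion and retry, backtracking via OR.
(defun match-bang (pattern assertion)
  (cond ((null pattern) (null assertion))
        ((eq (car pattern) '!)
         (or (match-bang (cdr pattern) assertion)           ; ! matches zero atoms
             (and assertion
                  (match-bang pattern (cdr assertion)))))   ; ! consumes one atom
        ((null assertion) nil)
        ((eq (car pattern) (car assertion))
         (match-bang (cdr pattern) (cdr assertion)))
        (t nil)))
```

On the examples above, (match-bang '(! brown) '(green red brown brown)) returns T while (match-bang '(! brown) '(green red brown yellow)) returns NIL.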
Erasure Code Reconstruction
Erasure codes are used when a value can be detected as incorrect (or erased). They are
commonly used in storage (like in RAID-5) as well as in communication. Write a function
to return a reconstructed message. Each message will be a set of words, each with a parity
bit, followed by a parity word. Each word is represented as a list of 0, 1 or NIL values. The
parity bit will be the final value in each word, the parity word will be the final word in the
list of words. An example output should be something like:
> (parity-correction '((0 1 1 NIL 0) (0 0 0 NIL 1) (NIL 1 1 1 0) (1 0 1 NIL 0) (0 NIL 1 0 1)))
(T ((0 1 1 0 0) (0 0 0 1 1) (1 1 1 1 0) (1 0 1 0 0) (0 0 1 0 1)))
This would be equivalent to the following message.
D0 D1 D2 D3 P
W0 0 1 1 0 0
W1 0 0 0 1 1
W2 1 1 1 1 0
W3 1 0 1 0 0
WP 0 0 1 0 1
If the message is incorrect or can't be solved, NIL should be returned with the original
message. For example:
> (parity-correction '((0 1 1 NIL 0) (0 0 0 NIL 1) (NIL 1 1 1 0) (1 0 1 NIL 0) (0 NIL 1 0 0)))
(NIL ((0 1 1 NIL 0) (0 0 0 NIL 1) (NIL 1 1 1 0) (1 0 1 NIL 0) (0 NIL 1 0 0)))
The parity-correction function should return results for modest size messages
within a second or two. Submissions that take a significant amount of time may lose points.
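Note that the example message uses even parity: every row and every column of the corrected table sums to an even number of 1s. Assuming that convention (it matches the example, though the spec does not state it outright), a per-word validity check can be tiny; the helper name is hypothetical.

```lisp
; Assumes even parity, consistent with the example message above: a word
; (data bits followed by its parity bit) is valid when its count of 1s is
; even. NIL entries are erasures and must be reconstructed before checking.
(defun word-parity-ok (word)
  (evenp (count 1 word)))
```

For example, (word-parity-ok '(0 1 1 0 0)) is T. A word still containing NIL has unknown parity until its erasure is filled in from the corresponding column parity.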
Scripts of working solutions are in /home/cjnitta/ecs140a/proj3 on the CSIF.
Passing the arguments may require care on bash as a single quote may need to be passed
in to the test script.
You must submit the source file(s) and a README.txt file in a tgz archive. Do a make
clean prior to zipping up your files so the size will be smaller. You will want to be in the
parent directory of the project directory when creating the tgz archive. You can tar gzip a
directory with the command:
tar -zcvf archive-name.tgz directory-name
You should avoid using existing source code that is currently available on the Internet
as a primer. You must specify in your readme file any sources of code that you have viewed
to help you complete this project. You must also provide the URL of any code sources in
comments of your source code. All class projects will be submitted to MOSS to determine
if students have excessively collaborated. Excessive collaboration, or failure to list external
code sources will result in the matter being referred to Student Judicial Affairs.
Helpful Hints
• The command to use Common LISP is clisp.
• Appendix A of LISPcraft summarizes LISP’s built-in functions. Each function is
explained briefly. You will find this a very useful reference as you write and debug
your programs. Also, you can get help about clisp by typing: man clisp
• You may define additional helper functions that your main functions use. Be sure,
though, to name the main functions as specified since the test program uses those
names.
• If you place an init.lsp file in the directory in which you execute LISP (or your home
directory), LISP will load that file automatically when it starts execution. Such a file is
useful to define your own environment. For instance, you will probably want to put the
following command in that file:
(setq *print-case* :downcase)
• When developing your program, you might find it easier to test your functions first
interactively before using the test program. You might find trace, step, print, etc.
functions useful in debugging your functions.
• A few points to help the novice LISP programmer:
• Watch your use of (, ), ", and '. Be sure to quote things that need to be quoted.
• To see how Lisp reads your function, use pretty printing. For example,
> (pprint (symbol-function 'foo))
will print out the definition of function foo, using indentation to show nesting. This
is useful to locate logically incorrect nesting due to, e.g., wrong parenthesizing.
• If you cause an error, Common Lisp places you into a mode in which debugging
can be performed (LISPcraft section 11.2). To exit any level, except the top level,
type :q. To exit the top level, type
> (bye)
Overview
This project specification is subject to change at any time for clarification. For this project you will
be writing Datalog queries and will be developing a non-recursive Datalog interpreter. The EBNF
rules that will be used for the dialect of Datalog are in the following section. For this project
Datalog queries will be constructed of a set of rules that will each have a rule head. Each rule head
specifies the rule name and a set of variables. Each rule that does not have a rule body is considered
a fact (also known as an external rule); these facts will be backed by CSV data files that have the
tuples that specify when the rule is true. Any other tuples do not satisfy the fact rules. Rules that
have a rule body are to be calculated and attempt to find the values that will satisfy all of the
subgoals. If multiple rules exist with the same name, then rule invocations of that name are
considered to be a union of the multiple rules. The goal of the Datalog interpreter is to find all
tuples that satisfy all of the rules specified, with the base values coming from the fact rules.
Non-Recursive Datalog Programming Language
The following EBNF describes the non-recursive Datalog language. The Non-terminals on the
right are identified by italics. Literal values are specified in bold. The operators not in bold or
italics describe the options of the EBNF. These operators are {} for repetition, [] for optional, ()
for grouping, and | for or.
EBNF Rules:
Query := Rule { Rule }
Rule := [ RuleHead [ := RuleBody ] ] NewLine
RuleHead := RuleName ( HeadVariableList )
RuleBody := SubGoal { AND SubGoal }
RuleName := Identifier
HeadVariableList := Identifier { , Identifier }
SubGoal := RuleInvocation | NegatedRuleInvocation | EqualityRelation
RuleInvocation := RuleName ( BodyVariableList )
NegatedRuleInvocation := NOT RuleInvocation
EqualityRelation := InequalityRelation { EQOperator InequalityRelation }
BodyVariableList := InvocationVariable { , InvocationVariable }
InequalityRelation := Term { IEQOperator Term }
EQOperator := != | =
InvocationVariable := Identifier | EmptyIdentifier
Term := SimpleTerm { AddOperator SimpleTerm }
IEQOperator := < | > | <= | >=
SimpleTerm := UnaryExpression { MultOperator UnaryExpression }
AddOperator := + | -
UnaryExpression := ( UnaryOperator UnaryExpression ) | PrimaryExpression
MultOperator := * | / | %
UnaryOperator := ! | – | +
PrimaryExpression:= ( EqualityRelation ) | Constant | Identifier
Constant := IntConstant | FloatConstant | StringConstant
ECS140A FQ20 December 10, 2020
This content is protected and may not be shared, uploaded, or distributed.
Project 4
Token Rules:
Identifier := Alpha { ( Digit | Alpha ) }
EmptyIdentifier := _
IntConstant := Digit { Digit }
FloatConstant := Digit { Digit } [ . Digit { Digit } ]
StringConstant := " { ( CharacterLiteral | EscapedCharacter ) } "
Comment := # { ( WhiteSpace | Printable ) } NewLine
Digit := 0 - 9
Alpha := A - Z | a - z
WhiteSpace := Tab | CarriageReturn | NewLine | Space
Printable := Space - ~
CharacterLiteral := Space - ! | # - [ | ] - ~
EscapedCharacter := b | n | r | t | \ | ' | "
Semantics:
There are several requirements for the non-recursive Datalog queries that cannot be expressed in
the EBNF. The following must be true of all valid queries:
• All subgoals must be defined prior to their use
• All repeated rule names must be defined contiguously
• All repeated rule names must have the same number of variables
• Any variable in the rule head or rule body must appear in a non-negated rule invocation
• A rule invocation may not appear in its own rule definition (i.e. recursive definition)
• Fact rules may only appear once
• Every rule invocation must provide the same number of variables as its definition
CSV Data Files
The facts can be stored in CSV files. The name of the file without the .csv extension will be the
name of the facts loaded into the environment. Each column is a different attribute, with the header
row containing the attribute name. Four data types are allowed in the CSV files: string, float,
integer, and boolean. Rows two and on each contain one tuple.
The following is an example of fact R that would be in the file R.csv:
"a","b","c","d"
3,"Hello",3.4,true
4,"World",1.1,false
6,"Goodbye",8.8,false
7,"None",9.3,true
Datalog Examples
The following are a few Datalog query examples using the grammar described in the previous
sections using the R fact described.
Simple fact query:
R(a,b,c,d)
This will result in the following output:
a b c d
3 Hello 3.4 true
4 World 1.1 false
6 Goodbye 8.8 false
7 None 9.3 true
Simple filtering for a being greater than 5:
R(a,b,c,d)
S(x) := R(x,_,_,_) AND x > 5
This will result in the following output:
x
6
7
Simple filtering for c being greater than 8.0 or less than 3.0:
R(a,b,c,d)
S(x) := R(_,_,x,_) AND x > 8.0
S(x) := R(_,_,x,_) AND x < 3.0
This will result in the following output:
x
1.1
8.8
9.3
Find all b associated with c not being the smallest in R:
R(a,b,c,d)
S(x) := R(_,x,c1,_) AND R(_,_,c2,_) AND c1 > c2
This will result in the following output:
x
Hello
Goodbye
None
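The simple filtering example above, S(x) := R(x,_,_,_) AND x > 5, amounts to selecting tuples from the fact and projecting one column. As an illustrative Java sketch over in-memory tuples (class and method names hypothetical; a real interpreter would build this from the execution tree):

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: evaluates S(x) := R(x,_,_,_) AND x > 5 by filtering the
// fact's tuples on the subgoal, then projecting the bound column.
class FilterProjectDemo {
    static List<Object> evalS(List<Object[]> rTuples) {
        return rTuples.stream()
                .filter(t -> ((Integer) t[0]) > 5)   // subgoal: x > 5
                .map(t -> t[0])                      // project column a as x
                .collect(Collectors.toList());
    }
}
```

Run against the R fact shown earlier, this yields the tuples 6 and 7, matching the expected output.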
Datalog Queries
Write Datalog queries for the questions posed below. Name your query files query_X.nrdl
where X is the number of the question below. You may assume the following facts will be
available:
Product(maker, model, year)
Car(model, city, highway, style, passengers, trunk, msrp)
Pickup(model, city, highway, passengers, cargo, towing, msrp)
EV(model, range, battery, passengers, msrp)
1) What Car models have a highway fuel economy less than 35 MPG? The final rule head should
be Answer(model).
2) Find all of the Pickup models that have a cargo capacity of at least 75cu ft. and a highway
fuel economy less than 25MPG. The final rule head should be Answer(model).
3) Find all automakers that sell at least one vehicle with an msrp less than $27,000 and at least
one vehicle with an msrp greater than $55,000. The final rule head should be Answer(maker).
4) Find the passenger capacities that exist for three or more vehicles. That is three or more
different vehicles should have that passenger capacity. The final rule head should be
Answer(passengers).
5) Find the automaker(s) with the highest combined fuel economy (55% city, 45% highway)
of conventional vehicles (cars and pickups). The final rule head should be
Answer(maker).
6) Find the vehicle model with the highest miles per gallon gasoline equivalent (MPGGE).
For this problem assume combined fuel economy formula from above, and that a gallon
of gasoline is equivalent to 33.1kWh. The final rule head should be Answer(model).
7) Find automaker(s) that sell a pickup with a highway fuel economy lower than all the cars
it sells. The final rule head should be Answer(maker).
Datalog Interpreter
The Datalog interpreter will be constructed from several classes. A given NRDatalog class
is provided that can parse command line arguments and instantiate and call the appropriate
classes. The following classes should be created with at least the specified interface.
// Class that parses the non-recursive Datalog language
public class NRDatalogParser{
// Constructor for the parser
public NRDatalogParser(PeekableCharacterStream stream);
// Parses a query and returns true if valid query
public boolean parseQuery();
// Prints out the error if one has occurred
public void printError(PrintStream ostream);
}
// Class that will construct the parse tree from
public class NRDatalogParseTree extends NRDatalogParser{
// Constructor for the parse tree
public NRDatalogParseTree(PeekableCharacterStream stream);
// Parses a query and returns true if valid query
public boolean parseQuery();
// Prints out the error if one has occurred
public void printError(PrintStream ostream);
// Outputs a tree of the parsed query
public void outputParseTree(PrintStream ostream);
}
// Class that will construct an execution tree from the parse
// tree. Executes the query and displays the results.
public class NRDatalogExecutionTree extends NRDatalogParseTree{
// Constructor for the execution tree
public NRDatalogExecutionTree(PeekableCharacterStream stream);
// Parses a query and returns true if valid query
public boolean parseQuery();
// Prints out the error if one has occurred
public void printError(PrintStream ostream);
// Outputs a tree of how execution would occur for queries
public void outputExecutionTree(PrintStream ostream);
// Sets the verbosity setting for execution
public void setVerbose(boolean verb);
// Sets the number of threads during execution
public void setThreadCount(int threadcount);
// Sets the path to the data files
public void setDataPath(String datapath);
// Executes the parsed query
public boolean executeQuery();
}
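The NRDatalog driver class is provided, but as a rough illustration of the kind of command-line handling it performs, a sketch of flag parsing might look like the following (the class name, flag names, and defaults here are hypothetical, not the actual script's options):

```java
// Hypothetical sketch of flag handling a driver like NRDatalog might do.
// Flag names and defaults are assumptions for illustration only.
public class DriverArgs {
    boolean verbose = false;   // --verbose
    int threadCount = 1;       // --threads N
    String dataPath = ".";     // --datapath P
    String queryFile = null;   // positional argument

    static DriverArgs parse(String[] args) {
        DriverArgs a = new DriverArgs();
        for (int i = 0; i < args.length; i++) {
            switch (args[i]) {
                case "--verbose":  a.verbose = true; break;
                case "--threads":  a.threadCount = Integer.parseInt(args[++i]); break;
                case "--datapath": a.dataPath = args[++i]; break;
                default:           a.queryFile = args[i]; break;
            }
        }
        return a;
    }
}
```

The parsed values would then feed the setters on NRDatalogExecutionTree (setVerbose, setThreadCount, setDataPath) before parseQuery and executeQuery are called.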
Suggested Approach
1. Work on writing the Datalog queries first. An example program is available to execute the
queries. This will provide you with a basic understanding of how the operations should work
from the query-writing perspective.
2. Update your Scanner to accommodate the new set of operators. Make sure to change the
newline to an operator; if a whitespace newline is desired, it must be preceded with a
backslash. There are only two keywords: AND and NOT.
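The escaped-newline rule can be sketched as a small helper (the class and method names are assumptions; in the real Scanner this check would live wherever newlines are consumed):

```java
// Sketch of the escaped-newline rule: a newline becomes an operator token
// unless the character immediately before it is a backslash, in which case
// it is treated as ordinary whitespace. Names are illustrative only.
public class NewlineRule {
    static boolean newlineIsOperator(String text, int newlineIndex) {
        return !(newlineIndex > 0 && text.charAt(newlineIndex - 1) == '\\');
    }
}
```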
3. Write the NRDatalogParser class to recognize the non-recursive Datalog language. As in
project 2, you will want one method per rule. Creating an enum class that represents each
rule, along with an isFirst method, may aid in the development of the rules.
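The enum-plus-isFirst idea might be sketched as follows (the rule names and FIRST sets below are placeholders keyed on raw token text, not the actual grammar, which would use the project's Token class):

```java
// Hypothetical sketch: an enum of rule names with an isFirst helper.
// Rule names and FIRST sets are placeholders, not the real grammar.
public enum Rule {
    RULE_INVOCATION, NEGATED_INVOCATION;

    // Returns true if the given token text can begin this rule.
    public boolean isFirst(String tokenText) {
        switch (this) {
            case NEGATED_INVOCATION:
                // NOT is one of the language's two keywords.
                return tokenText.equals("NOT");
            case RULE_INVOCATION:
                return !tokenText.isEmpty() && Character.isLetter(tokenText.charAt(0));
            default:
                return false;
        }
    }
}
```

Each parse method can then call isFirst on a peeked token to decide which rule to descend into.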
4. Write the NRDatalogParseTree class to construct the parsed query as a tree. You will likely
want to create some type of node class within NRDatalogParseTree that holds the rule
type, associated token (if any), parent, and children. You may find a stack data
member helpful in the recursive descent parsing; the top of the stack can hold the
current parent node.
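One possible shape for the node class and parent stack (names assumed, with plain strings standing in for the project's rule enum and Token class):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch of a node class plus parent stack for recursive-descent tree
// building. Strings stand in for the rule enum and Token class.
public class TreeBuilder {
    static class Node {
        String ruleType;   // which grammar rule produced this node
        String token;      // associated token text, if any
        Node parent;
        List<Node> children = new ArrayList<>();
        Node(String ruleType, String token) { this.ruleType = ruleType; this.token = token; }
    }

    private final Deque<Node> stack = new ArrayDeque<>();
    final Node root = new Node("Query", null);

    public TreeBuilder() { stack.push(root); }

    // On entering a rule: attach a new node under the current parent.
    void enterRule(String ruleType) {
        Node n = new Node(ruleType, null);
        n.parent = stack.peek();
        stack.peek().children.add(n);
        stack.push(n);
    }

    // On leaving a rule, pop back to the enclosing parent.
    void exitRule() { stack.pop(); }

    // Attach a consumed token under the current parent.
    void addToken(String token) {
        Node n = new Node("Token", token);
        n.parent = stack.peek();
        stack.peek().children.add(n);
    }
}
```

Each parse method would call enterRule on entry, addToken as tokens are consumed, and exitRule on return.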
5. Write a class to hold the data sets. Each data set should hold a set of lists of objects; each
list of objects acts as a data tuple. It should also have a list of column names. This class
should be constructible either from a PeekableCharacterStream (so it can be loaded from CSV)
or from a list of column names and a set of lists of objects. The data set class should support:
a. Appending tuples, at least privately (this will help when constructing new data sets)
b. Filtering (or selecting) tuples, given an object with a function that returns whether
the tuple belongs in the new data set (this essentially removes rows from the data
set)
c. Projecting specified columns to a new data set (this is similar to cutting off columns
that are not desired; consider supporting renaming of the columns with
this as well)
d. Cartesian product with another data set to create a new data set
e. Natural join with another data set to create a new data set
f. Filtering (or selecting) tuples that do not appear in another data set (a tuple
should be kept if its combination of the specified columns does not appear in the other
data set)
g. Union with another data set to create a new data set (it is assumed that both have
the same column names)
h. Reordering columns to create a new data set (may be helpful when projecting
temporary work down into the final rule)
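A few of these operations might be sketched as follows. This is a minimal illustration, not the required interface: only appending (a), filtering (b), projecting (c), and cartesian product (d) are shown, and all names are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Minimal data-set sketch: a list of column names plus a set of tuples.
public class DataSet {
    final List<String> columns;
    final Set<List<Object>> tuples = new LinkedHashSet<>();

    public DataSet(List<String> columns) { this.columns = new ArrayList<>(columns); }

    // (a) Append a tuple; kept package-private per the hint.
    void append(List<?> tuple) { tuples.add(new ArrayList<Object>(tuple)); }

    // (b) Keep only the tuples the predicate accepts.
    DataSet filter(Predicate<List<Object>> keep) {
        DataSet out = new DataSet(columns);
        for (List<Object> t : tuples) if (keep.test(t)) out.append(t);
        return out;
    }

    // (c) Project the named columns into a new data set.
    DataSet project(List<String> keepCols) {
        DataSet out = new DataSet(keepCols);
        for (List<Object> t : tuples) {
            List<Object> row = new ArrayList<>();
            for (String c : keepCols) row.add(t.get(columns.indexOf(c)));
            out.append(row);
        }
        return out;
    }

    // (d) Cartesian product with another data set.
    DataSet product(DataSet other) {
        List<String> cols = new ArrayList<>(columns);
        cols.addAll(other.columns);
        DataSet out = new DataSet(cols);
        for (List<Object> a : tuples)
            for (List<Object> b : other.tuples) {
                List<Object> row = new ArrayList<>(a);
                row.addAll(b);
                out.append(row);
            }
        return out;
    }
}
```

Natural join, union, the anti-join in (f), and column reordering follow the same pattern of building and returning a new DataSet.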
6. Write the NRDatalogExecutionTree class to construct the tree so that it can easily be
executed. You will likely want to create some type of node class within the
NRDatalogExecutionTree that holds the rule type, associated token (if any), parent,
children, and possibly a node index (which may be helpful for the sorting step).
a. Translate the parse tree into an execution tree. The difference is that infix operators
are migrated up to become the parent of their operands instead of being children of the
larger rule. You will likely want to write a function that recursively visits the nodes
of the parse tree and returns an equivalent execution tree node. Tokens
such as open/close parens, assignment, and commas should be omitted. The body
subgoals should also be sorted for simplicity of execution: rule invocations
first, followed by negated rule invocations, and finally equality relations.
b. Write a method to validate the semantic rules for the execution tree. This will make
sure that execution should be able to proceed.
c. Write a method to execute a rule without a rule body. This will essentially just load
a data set and add it to the calculated rules.
d. Write a method to execute a rule with a rule body. This will create a temporary
calculated data set that will either be added to the calculated rules or unioned with
the previously calculated rule of the same name.
i. Create the ability to invoke rules by projecting them based upon the
invocation. This projected invocation will then be either naturally joined or
combined via cartesian product with the temporary calculated data set, depending
on whether there is any overlap in column names.
ii. Create the ability to do a negated invocation. This is similar to invoking
rules; however, it will potentially remove tuples from the temporary
data set instead of adding them.
iii. Create the ability to filter the tuples based upon the equality relations. A
filter object should be created from the subgoal and should be able to
determine whether a tuple should be kept or omitted.
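The filter object of step (iii) might be sketched as follows (the class name and constructor shape are assumptions; the real object would be built from an equality-relation subgoal node and may also need to compare two columns rather than a column and a constant):

```java
import java.util.List;

// Hypothetical filter object built from an equality subgoal: it decides,
// per tuple, whether the tuple should be kept.
public class EqualityFilter {
    private final int columnIndex;
    private final Object constant;
    private final boolean negated; // true for !=, false for ==

    public EqualityFilter(List<String> columns, String column,
                          Object constant, boolean negated) {
        this.columnIndex = columns.indexOf(column);
        this.constant = constant;
        this.negated = negated;
    }

    // Returns true if the tuple satisfies the (possibly negated) equality.
    public boolean keep(List<?> tuple) {
        boolean eq = constant.equals(tuple.get(columnIndex));
        return negated ? !eq : eq;
    }
}
```

An object like this pairs naturally with the data set's filtering operation from step 5b.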
Examples of valid and invalid nrdl files are in /home/cjnitta/ecs140a/proj4 on the
CSIF. A script that runs the NRDatalog class can be found there too. Running the script without
options, or with the --help option, will print the help. Your code will be tested on the CSIF and
is expected to compile and run on the CSIF.
You must submit the source file(s), a Makefile, and a README.txt file in a tgz archive. Do a
make clean prior to zipping up your files so the archive will be smaller. You will want to be in the
parent directory of the project directory when creating the tgz archive. You can tar-gzip a directory
with the command:
tar -zcvf archive-name.tgz directory-name
You should avoid using existing source code currently available on the Internet as a primer.
You must specify in your README file any sources of code that you have viewed to help you
complete this project. You must also provide the URL of any code sources in the comments of your
source code. All class projects will be submitted to MOSS to determine if students have
excessively collaborated. Excessive collaboration, or failure to list external code sources, will result
in the matter being referred to Student Judicial Affairs.
Helpful Hints
• The use of peekNextToken() will be helpful in instances where it isn't clear which
rule should be parsed next. The language is designed so that only a single token of
lookahead is necessary.
• Creating an enum class for the rule names will be helpful at multiple levels and is highly
encouraged.
• Use the working example to output the parse tree and the execution tree. Being able to
match your trees against its output will help before executing the queries.
• Using the Integer, Float, String, and Boolean types will allow the data sets to hold lists of
Objects. The instanceof operator will be helpful for detecting the type during
evaluation.
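For example, an instanceof-based comparison during evaluation might look like this (helper name assumed, and only a less-than comparison is shown):

```java
// Sketch of instanceof-based dispatch during evaluation, assuming tuple
// values are boxed as Integer, Float, String, or Boolean.
public class ValueOps {
    // Returns true if a < b for types where ordering makes sense.
    static boolean lessThan(Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer)
            return (Integer) a < (Integer) b;
        if (a instanceof Float && b instanceof Float)
            return (Float) a < (Float) b;
        if (a instanceof String && b instanceof String)
            return ((String) a).compareTo((String) b) < 0;
        throw new IllegalArgumentException("incomparable types");
    }
}
```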
Extra Credit
After successfully creating a Datalog interpreter, convert the NRDatalogExecutionTree into a
multithreaded version. The executeQuery method needs to use multiple threads to execute the
query in parallel on systems with multiple cores. Larger data sets with a query that requires
significant time will be provided later.
