Takehome Exam in Advanced Programming
Deadline: Friday, November 8, 16:00 Version 1.0
Preamble
This is the exam set for the individual, written takehome exam on the course Advanced Programming, B12019. This document consists of 20 pages; make sure you have them all. Please read the entire preamble carefully.
The exam consists of 2 questions. Your solution will be graded as a whole, on the 7point grading scale, with an external examiner. The questions each count for 50. However, note that you must have both some nontrivial working Haskell and Erlang code to get a passing grade.
In the event of errors or ambiguities in an exam question, you are expected to state your assumptions as to the intended meaning in your report. You may ask for clarifications in the discussion forum on Absalon, but do not expect an immediate reply. If there is no time to resolve a case, you should proceed according to your chosen documented interpretation.
What To Hand In
To pass this exam you must hand in both a report and your source code:
The report should be around 510 pages, not counting appendices, presenting at least your solutions, reflections, and assumptions, if any. The report should contain all your source code in appendices. The report must be a PDF document.
Thesourcecodeshouldbeina.ZIPfilecalledcode.zip,archivingonedirectorycalled code, and following the structure of the handout skeleton files.
Make sure that you follow the format specifications PDF and .ZIP. If you dont, the hand in will not be assessed and treated as a blank hand in. The hand in is done via the Digital Exam system eksamen.ku.dk.
Learning Objectives
To get a passing grade you must demonstrate that you are both able to program a solution using the techniques taught in the course and write up your reflections and assessments of
1
Advanced Programming DIKU, B120192020
your own work.
For each question your report should give an overview of your solution, including an assessment of how good you think your solution is and on which grounds you base your assessment. Likewise, it is important to document all relevant design decisions you have made.
Inyourprogrammingsolutionsemphasisshouldbeoncorrectness,ondemonstrating that your have understood the principles taught in the course, and on clear separation of concerns.
It is important that you implement the required API, as your programs might be subjected to automated testing as part of the grading. Failure to implement the correct API may influence your grade.
To get a passing grade, you must have some nontrivial working code in both Haskell and Erlang.
Exam Fraud
This is a strictly individual exam, thus you are not allowed to discuss any part of the exam with anyone on, or outside the course. Submitting answers code andor text you have not written entirely by yourself, or sharing your answers with others, is considered exam fraud.
You are allowed to ask not answer how an exam question is to be interpreted on the course discussion forum on Absalon. That is, you may ask for official clarification of what constitutes a proper solution to one of the exam problems, if this seems either underspecified or inconsistently specified in the exam text. But note that this permission does not extend to discussion of any particular solution approaches or strategies, whether concrete or abstract.
This is an openbook exam, and so you are welcome to make use of any reading material from the course, or elsewhere. However, make sure to use proper and specific citations for any material from which you draw considerable inspirationincluding what you may find on the Internet, such as snippets of code. Similarly, if you reuse any significant amount of code from the course assignments that you did not develop entirely on your own, remember to clearly identify the extent of any such code by suitable comments in the source.
Also note that it is not allowed to copy any part of the exam text or supplementary skeleton files and publish it on forums other than the course discussion forum e.g., StackOverflow, IRC, exam banks, chatrooms, or suchlike, whether during or after the exam, without explicit permission of the authors.
During the exam period, students are not allowed to answer questions on the discussion forum; only teachers and teaching assistants are allowed to answer questions.
Breaches of the above policy will be handled in accordance with the Faculty of Sciences disciplinary procedures.
2
Advanced Programming DIKU, B120192020
Question 1: Implementing Configuraptor
Background
In madetoorder manufacturing, a customer can specify the desired configuration of a product by choosing from a variety of options for individual pieces of functionalitymore than can reasonably be accommodated by selecting one of a fixed range of preconfigured product models. A prototypical example of such a situation occurs when building a desktop PC from scratch, using offtheshelf components case, power supply, motherboard, processor, RAM, disk, graphics card, monitor, operating system, etc.
A complicating factor is that not all combinations of component selections are feasible, or even meaningful. For example, a powerful graphics card may require a correspondingly heavyduty power supply; or a motherboard will only work with a specific processor family, and accommodate only a limited number of RAM modules. Also, in almost all cases, the total system price is an important constraint, precluding simultaneous topoftheline choices for all components.
To assist both buyers and sellers of madetoorder products, a configurator system is often used to keep track of the various options and the dependencies between them; then the system may be able to suggest particular configurations satisfying the customers functionality and performance needs, as well as their budget. In this task, you will implement a simple such system, Configuraptor.
Informal description of Configuraptor
Configuraptor is a generic configurator system, not inherently tied to any particular application domain, but particularly suited for situations when products are built up from a relatively small number typically 520 of discrete components. For concreteness, our examples will be mainly taken from the abovementioned domain of PC configuration.
The two key Configuraptor concepts are resources and components. A resource represents some key characteristic of a configuration or component choice, such as available RAM or disk space, USB or video ports, ormost importantlyavailable funds. All resources must be explicitly declared, e.g., resource GBRAMspace. Note that resource names are caseinsensitive, so gbramspace or even gBRaMSpace would also refer to the same resource.
A component whether hardware or software then provides some resources, often in exchange for others. For example, an external disk will contribute some amount of storage space, but will typically use up a free USB port; whereas an internal disk may instead use a SATA port, as well as a 3.5 drive bay; and both will naturally cost money, generally increasing with the disk size. Or some piece of software may provide a needed functionality, but require a particular operating system, as well as a set of shared libraries, to also be installed.
When talking about the resource needs of components, it is useful to distinguish between exclusive and shared uses: a component say, a RAM module may completely monopolize some resource a DIMM slot, making said resource in practice unavailable for any other
3
Advanced Programming DIKU, B120192020
components. Or it may merely require some resource to be present say, a shared library, or a sufficiently powerful graphics card, without precluding other components from utilizing it though perhaps not at the same time. For example, we may have the following component declaration:
component MyExtDisk8TB:
provides 8000 GBdiskspace;
requires NTFSsupport;
uses 1 USBport;
uses 1500 DKK.
That is, this particular disk provides 8000 GB of disk space, but requires the operating system to support NTFSformatted disks. I.e., whatever component is chosen for the OS, must include a clause provides NTFSsupport. It also needs a free USB port and costs 1500 DKK.
Often, the resource needs of a component can be satisfied in alternative ways. For example, connecting a display screen may use either a free VGA or HDMI port but not both. Conversely, a component may provide a choice between two or more resources, only one of which can be utilized in any particular configuration: for example, a drive bay may accomodate either one 3.5 disk magnetic, or two 2.5 ones SSDs, but not both at the same time. For example,
component SmallTrueView:
provides 22 inchdisplay, 768 linesresolution;
uses VGAportHDMIport;
uses monitorspace, 1000 DKK.
Here, alternatives are separated by vertical bars , while conjunctive needs or availabilities are separated by commas ,. The example also shows that this particular monitor provides a 22, 768p display. However since display sizes do not in general behave additively, we dont want, say, a requirement for a 32 monitor to be satisifed by buying two smaller ones, even if that may be a cheaper solution; this is accomplished by stipulating that all monitors compete for 1 unit of monitor space, which we must ensure is initially provided, along with sufficient funds.
All available components, with their associated resource needs and offerings are collected in a database. A user may then query said database, by describing their resource needs and budget including available physical space, etc., and getting an answer consisting of a suggested system configuration, satisfying all the requirements or an indication that no such configuration is possible.
System structure The Configuraptor system is divided into four main modules:
A Parser, which converts the textual representation of a database as seen in the examples above into an intermediate database, with essentially the same structure as the textual one.
4
Advanced Programming DIKU, B120192020
An Elaborator, which checks that the intermediate database is well formed, consol idates the offered and requested resources for each each component into a resource profile, and desugars alternatives into virtual resources and components, giving a final database.
A Solver which tries to construct a satisfactory system configuration by selecting components from the final database in a goaldirected way.
A Main driver program that processes commandline arguments and invokes the appropriate functionality from the other modules.
Your task
We have provided a suitable driver program. Thus, you will need to implement the other three modules, which are more precisely specified below. All are weighed roughly equally, but you should aim to solve each of them at least partially.
Question 1.1: Parser
The concrete grammar of a Configuraptor database is shown in Figure 1.
Lexical definitions
A name is one or more words, separated by single hyphensA word is one or more ASCII letters az and AZ andor digits 09, containing at least one letter. The total length of name including any hyphens may be at most 32 characters. Examples: AB, 9innail; nonexamples: foo, bar, 2by4, ahha.
Keywords are not reserved, i.e., a keyword e.g., uses may in principle also be used as a name.
A num is one or more ASCII digits. Its numeric value must be no more than 999999 1061. Here, and elsewhere, dont worry about Int overflowing.
Whitespace and comments All tokens may be surrounded by whitespace spaces, tabs, and newlines. Whitespace is generally ignored, but may not occur inside multicharacter tokens keywords, names, or numbers. Some whitespace is necessary where a keyword, name, or number would otherwise run into an adjacent letter, digit, or hyphen.
Comments are written between curly braces, this is a comment. Any characters except curly braces may occur without restrictions inside a comment. Additionally, comments may be nested: a deeply nested comment. All levels of comments must be explicitly closed before the end of the input. Comments count as whitespace for token separation.
Keywords are caseinsensitive. Semantically, so are resource and component names, but the parser should just pass them on as written.
5
Advanced Programming
DIKU, B120192020
Database :: Declz ::
Decl ::
RNames ::
Clauses ::
Clause ::
RSpec ::
RName :: CName ::
Declz
Decl Declz
resource RNames .
component CName : Clauses . RName
RName , RNames Clause
Clause ; Clauses
provides RSpec uses RSpec requires RSpec
RName
num RSpec RSpec , RSpec RSpecRSpecRSpec
name name
Figure 1: Concrete grammar of Configuraptor databases
6
Advanced Programming DIKU, B120192020
Disambiguation In resource specifications RSpec, the implicit scaling operator pre fixing by a number groups tighter than ,, which again groups tighter than , so that a4 b , c would parse as if parenthesized like a4 b , c. Further, , andassociate to the left, while scaling necessarily associates to the right, e.g., a, 3 b, 4 5 c
parses like a, 3 b, 4 5 c.
The abstract syntax of the intermediate database is defined in the module Absyn omitting
the derives Eq, Show, Read for all datatypes: type IDBRName, IComp
data ICompIC CName Clause
type ClauseCKind, RSpec
data CKindCKProvidesCKUsesCKRequires
data RSpec
RSName RName
RSNum Int RSPec
RSAnd RSpec RSpec
RSOr RSpec RSpec
type RNameString
type CNameString
type ErrMsgStringgeneral type of humanreadable messages
API Your Parser module should export one function, for parsing a full database:
parseString :: StringEither ErrMsg IDB
The function should parse a database into an IDB. All resources declared in the input whether individually or jointly with others, and no matter where should be collected into the RName, while component declarations should be collected as Comp, with the evident correspondence between concrete and abstract syntax.
If parsing fails, the function should instead return a suitable error message which doesnt have to be specific. Only actual syntax errors should be reported at this stage; semantic errors e.g., using an undeclared resource in a component, or declaring the same resource twice, will be detected during elaboration.
You must use either the ReadP or Parsec parsercombinator library as supplied with LTS14.1. If you use Parsec, then only plain Parsec is allowed, namely the following submodules of Text.Parsec: Prim, Char, Error, String, and Combinator or the compat ibility modules in Text.ParserCombinators.Parsec; in particular you are disallowed to use Text.Parsec.Token, Text.Parsec.Language, and Text.Parsec.Expr.
7
Advanced Programming DIKU, B120192020
Question 1.2: Elaborator
The abstract syntax of the final database is roughly similar to the intermediate one, but simpler:
type DBResource, CName, RProf
type RProfResource, Int, Int
newtype ResourceR RName deriving Eq, Ord, Show, Read
The database gives the complete list of resources with their canonical capitalizations, and, for each component, its resource profile. Such a profile says, for each relevant resource, how many units of that resource the component contributes in net i.e., what it provides minus what it uses, which could be positive, negative, or zero, as well as how many units it requires to be available in the complete configuration i.e., after summing the net contributions of all components, including its own. If both of those numbers are zero, the resource is omitted from the profile. The resources in both the complete list, and in individual profiles are always listed in lexicographically sorted order.
The elaborator turns lists of clauses into a single resource profile. The order of resource specifications and how they are divided into clauses doesnt matter. For example, uses A; provides B elaborates to the same profile as provides B; uses A; likewise, uses A, B means the same as uses A; uses B; uses 1 A means the the same as just
uses A; and uses 3 4 A is equivalent to uses 12 A. Analogously for provides and requires.
For example, the specification
component c: provides r2, 5 r1, r3; uses 2 r1; requires 4 r3, 7 r1; uses 1 r2.
corresponds to the following component profile in the final database:
c, R r1, 3,7, R r3, 1, 4
Also, scaling a resource specification is semantically equivalent to repeating it the cor responding number of times. That is, uses 3 A behaves essentially the same as uses A,A,A, except that the former will generally be handled more efficiently, especially for large scaling factors andor more complicated specifications A.
Note that tracking net resource usages alone sometimes oversimplifies matters. For example, a USB hub like a power strip converts a single port into multiple ones. But if we naively specified it as
component USBhub4: uses 1 USBport; provides 4 USBport.
it would behave identically to
component USBhub4: provides 3 USBport.
8
Advanced Programming DIKU, B120192020
which says that, to connect a USB device to the system, we only need a hub, but no actual USB port on the PC. To express that we cannot simply plug a hub into itself, we need to extend either of the above specifications with a clause requires USBhost, a resource that would be provided only by the computer, so that the whole USB tree can be properly rooted.
Alternatives between resourcesare handled by introducing virtual resources and components. A clause or conjunct in a clause provides AB can be seen as a conceptual abbreviation for provides pAorB, where the new virtual resource pAorB is only used by two virtual components that in turn provide the actual resources A and B:
component AorB1: uses pAorB; provides A. component AorB2: uses pAorB; provides B.
The virtual components are not physical pieces of hardware; they merely correspond to a implicit selection of which resource to pick when a choice is offered.
The names pAorB, AorB1, and AorB2 have no actual significance, and were simply chosen mnemonically. An actual implementation of the elaborator should just generate arbitrary fresh names, not necessarily meaningful, but guaranteed not to clash with userprovided names, or with each other. Also, it wouldnt first construct explicit textual or intermediate database representations of the virtual components, but just calculate their resource profiles directly.
Note also that A andor B could themselves be composite specifications, possibly involving further alternatives. e.g., A could actually be the 2 CD; then we would need to also introduce a virtual resource pCorD with virtual components to turn a pCorD into a C or a D, and so on.
Ideally, when theres a direct choice between more than two alternatives, e.g., ABC, whether parenthesized as ABC or ABC, the elaborator would only introduce a single virtual resource pAorBorC, with one virtual component for each of the three branches, rather than using nested binary choices. Explain in your report whether and if so how you implemented this optimization.
Note that there is a difference between the clauses provides 2 A2 B and provides 2 AB: the former says that the component only provides either two As or two Bs, while the latter
can also provide one of each, if thats whats needed.
Analogously, we may use alternatives to signify that a request for resources may be satisfied in different ways. For example, uses AB would become uses uAorB, where
component 1AorB: uses A; provides uAorB. component 2AorB: uses B; provides uAorB.
Note how the uses and provides clauses go in the other direction here. Again, the above are not real components, but merely reflect choices in how an alternativeresource need was actually satisfied.
9
Advanced Programming DIKU, B120192020
Finally, a clause requires AB becomes uses rAorB not requires rAorB, so as not to leave stray virtual resources lying around, with
component 1AorB: requires A; provides rAorB. component 2AorB: requires B; provides rAorB.
That is, the virtual components provide an exclusive, ephemeral copy of the virtual resource, as long as one of the alternative underlying resources is already available.1
API Your elaborator module should export the following functions: lookres :: ResourceRNameEither ErrMsg Resource
elaborate :: IDBEither ErrMsg DB
lookres rs s should look in the duplicatefree list of resources rs for the name s, and
either return the canonically capitalized resource, or a suitable error message.
elaborate pdb should return a final database corresponding to the intermediate one. This involves the following checks and transformations:
1. Allclausesforeachcomponentshouldbesuitablycombinedintoasingle,flatresource profile for that component.
2. All resource names in resource profiles should be canonicalized to the capitalization used in the resource declaration. Uses of undeclared resources should be reported with a suitable error message as a Lefttagged result, not by calling error!. It is not a requirement that resources are declared before being used. Indeed, an IDB contains no information about the relative placement of resource and component declarations in the original input string.
3. The elaborator should check that all resources are declared exactly once. Multiple declarations differing in case are not allowed. Also, it is illegal to declare a component and a resource with the same name.
Multiple declarations of a component are allowed, with the meaning that all clauses are put together as if they were in a single declaration. E.g,
component Foo: provides A, 2 B; requires C.
possibly other declarations here
component Foo: provides 3 A; uses 10 D.
should give the same result as the single declaration
component Foo: provides 4 A, 2 B; requires C; uses 10 D.
1This elaboration of requiresclauses is a bit too simplistic to correctly express sharing of multiple units of a resource, as in require n AB where n1. However, doing it properly would require a significantly more complicated elaboration process, keeping closer track of when shared resources are reserved and released, so just implement it as described here.
10
Advanced Programming DIKU, B120192020
The component names in the declarations must match exactly, including case. Other wise, such as if the second declaration above was instead for component foo , an error should be reported.
4. Alternative resource specifications are translated to plain resource profiles, by gen erating fresh virtual resources as well as corresponding components for each choice. The fresh names should be distinguished from userprovided names by starting with
. The rest of the name can be just a unique number; it doesnt have to relate to actual resource names in the alternatives.
Note: you get points for correctly handling each of these aspects. It is strongly suggested that you prioritize the lowernumbered ones.
You should structure your elaborator using an appropriate readerwriterstateerror monad. In your report, explain how you exploit each of these components of the monad or why not.
Question 1.3: Solver
We define a few more type synonyms:
type GoalRProf
type SolutionCName, Int
A goal is simply a resource profile, i.e., a list of resources real or virtual, for each one giving the number of units available, and needed. The initial goal will typically make available some amount of the money resource, and say that particular amounts of various functionality resources RAM, disk, etc. are needed. A goal is solved if, for each resource, the amount available is at least as large as the amount needed. Note that a solution will often provide excess resources: there may be some money left over; there may be more units of some resource than what was required; or the solution may provide some resources that were not asked for at all; those are all fine.
A purported solution is a list CName, Int, where for each entry c, n, the integer n says how many of real or virtual component c to include. Each c should only occur once in the list, and all ns should be strictly positive.
Your module should export three functions:
combine :: RProfRProfRProf
verify :: DBGoalSolutionEither ErrMsg RProf
solve :: DBGoalIntEither ErrMsg Solution
combine rp1 rp2 combines two resource profiles: for each resource, the net contribution is the sum of the two contributions, while the requirement is the maximum of the requirements. You may assume that rp1 and rp2 are well formed profiles sorted, and no 0, 0entries, and you should ensure that the result is, too. For example,
11
Advanced Programming DIKU, B120192020
combine R r1, 3, 5, R r3, 2, 0, R r4, 3,0
R r1, 2, 7, R r2, 3, 4, R r4, 3,0
should return
R r1, 5, 7, R r2, 3, 4, R r3, 2, 0
verify dbsol checks whether sol is a well formed solution to initial goali.e., that combiningwith the resource profiles of the components with associated multiplicities from db, as listed in the solution, results in a solved goal. If so, verify should return the resource profile of the complete system including whats left over from resources provided in the initial goal; otherwise, it should return a suitable error message. You may assume that db andare well formed, but you should not assume anything about sol.
solve dbn tries to solve goalusing as few components including any virtual ones from db as possible, and at most n. If this succeeds, it return a possible solution. Naturally, this solution should verify successfully. You may assume that db andare well formed, andn0.
If the solver fails, the error message should start with either Impossible, which means that the goal cannot possibly be solved in the given database, no matter how many components we allow; or Exhausted, meaning that there is definitely no solution using n or fewer components, but there may be one with more components. After that, you may optionally give additional, humanreadable information that might be useful in diagnosing the problem.
If a solution is found, it does not have to be the cheapest, nor provide a maximal number of excess resources; it merely has to be correct and not use more components than necessary. To find better solutions, we may rerun the solver with a tighter initial goal a lower budget andor higher requirements, possibly combined with a higher bound on the number of components to consider.
Efficiency is not a significant concern. However, your solver should not get bogged down in exploring pointless parts of the search space. You should organize it as a goaldirected search like in logic programming, where you start from the initial goal, and consider adding relevant components one by one until the goal is solved. In particular, the solver should never consider adding a component whether real or virtual unless that component actually addresses some unsatisfied constraint in the current goal, i.e., the amount needed of some resource strictly exceeds the amount currently available, and the component contributes at least one unit of that resource.
Use monads where you deem appropriate. Justify your choices in the report.
If you cannot get the solver to work in full generality, consider whether you can implement a limited version, e.g., one that only works under some assumptions about the database andor initial goal, or one that is not guaranteed to find minimal solutions.
Main program
This is provided in the handout. Usage is:
12
Advanced Programming DIKU, B120192020
configuraptor DB.crDB.hcr n NUMp NUM RESr NUM RES
An argument DB.cr reads in the database in file DB.cr the extension .cr can actually be anything. Similarly DB.hcr reads the database from an already parsed Haskell value of type IDB. The contents of all databases are concatenated resource and component lists separately between parsing and elaboration.
An argument n NUM where NUM is a nonnegative integer sets the maximum number of components to consider for the solver. If more than one n is given, the last one takes effect; if none are present, the maximum defaults to 10.
An argument p NUM RES says that, initially, NUM units of RES are available. NUM
should be nonnegative, and RES a resource declared in one of the databases. Analogously, r NUM RES says that the initial requirement is for the specified number of units of a
resource. Both p and r may be repeated, for specifying multiple resources.
Using multiple databases, you can have general specifications of all components in a fixed database and a detailed specification of the goal requirements in another, e.g. goal.cr could contain:
resource system.
component DreamPC:
provides system; requires 16 GBRAM, 27 inchdisplay, Linux, .
Then you can invoke the main program as
configuraptor components.cr goal.cr p 10000 DKK r 1 system n 20
You may use the commandline program for informal experimentation or integration testing, but you should still do proper, automated testing of your individual modules. You are not expected to test the main program, nor to employ it it in your tests. In fact, you are free to ignore it entirely if you dont find it useful.
General instructions
You should use the skeleton files provided. Do not modify the types in the abstract syntax or APIs, as this will likely prevent your solution from working with our automated tests.
For each of the three modules, there is both a file Module.hs, which exports only the required API, and ModuleImpl.hs, which exports everything. You should place your code in the latter only. Then, if you want to to do whitebox testing of some of your internal functions, you may import them from the relevant implementation modules.
Your implementation modules should not import directly from each other. If you have a nontrivial type definition or function that you want to share between two or more modules, place it in Utils.hs and import it from both.
You should not modify the package.yaml file, except possibly in the testing section. That is, your main code should import only from the allowed packages. For testing, you may use anything from LTS14.1 e.g., a different testing framework if necessary. As usual, your
13
Advanced Programming DIKU, B120192020
tests should be reproducible by running stack test. If any special instructions are needed, give them in the report.
We provide some sample database files in the examples directory both .cr and .hcr versions. Note that these do not exercise all the relevant functionality of the parser, elaborator, and solver, so you should also test with your own probably smaller, and targeted sample databases. Do consider whether any parts of the system would be particularly suitable for propertybased testing, and if relevant, implement some such tests as well.
14
Advanced Programming DIKU, B120192020
Question 2: Shared Optimism
The task in this question is to implement a server that maintains a shared state in the form of a keyvalue map and can execute operations on this shared state using a simplified version of what is called optimistic concurrency control.
General comments
This question consists of two subquestions: Question 2.1 about implementing an API for starting an optimistic server and for working with operations, and Question 2.2 about writing QuickCheck tests against this API. Question 2.1 counts for 70 and Question 2.2 counts for 30 of this question. Note that Question 2.2 can be solved with a simple partial or even without implementation of the optimistic module from Question 2.1.
In Appendix A you can find an example on how to use the API.
Remember that it is possible to make a partial implementation of the API that does not support all features. If there are functions that you dont implement, then leave them to return the atom notimplemented.
There is a section at the end of this question, on page 19, that suggests topics for your report.
Terminology
An optimistic server maintains a shared state in the form of a keyvalue map. An operation is a function that reads a view of the state, and computes a result and a potential change to the shared state. A view is a submap of the state. Similarly a change to a state is a map with the insertions or updates to be performed. Thus, we use the following Erlang representation of views, changes, and operations:
type view :: map.
type change :: map.
type operation :: funviewany, change.
The read set of an operation is the keys of its view, and the write set is the keys of its change. The dependency set is the union of the read and write sets. Two operations are in conflict if one of their write sets overlaps the others dependency set. We define the dirty set of an operation to be the keys that are written by other concurrent operations.
We want to execute operations concurrently but isolated. That is, if we have two concurrent operations, op1 and op2, that are not in conflict, then the end state of running those two operations should be the same as running them sequentially in some order.
To perform an operation on the shared state, you must first start the operation which creates an intermediate map containing the view of the shared state, and then, when the operation has completed, commit the changes to the shared state. An operation can either be successfully committed and updating the state of the server with the changes or it can be aborted and the state is not updated.
15
Advanced Programming DIKU, B120192020
There are three ways that an operation, op, can be aborted:
The operation is explicitly aborted.
The operation fails throws an exception, exits, generate an error, or return a value of the wrong type.
Another operation, op, that conflicts with op is executed concurrently with op, and op commits its updated state before op. In other words, the intersection of the dependency set and the dirty set is nonempty.
Multiple operations can be executed concurrently. However, if the keys in an operations dependency set are updated, the operation is operating on a now obsolete state and should be stopped. Queries of the result on such an operation must be answered with aborted.
Question 2.1: The optimistic module
Implement an Erlang module optimistic with the following API, where S denotes a process ID of an optimistic server and OR denotes a reference to an operation you decide the representation. All functions are allowed to return error, Reason if some unspeficied error occurred.
startState for starting an optimistic server with State as the initial shared state. State is an Erlang map. Returns ok, S on success.
stopS for stopping an optimistic server; this includes aborting all ongoing opera tions. Returns ok, State after all operations have been aborted, where State is the current shared state of S.
resetS, State for resetting the shared state of the optimistic server to State; this includes aborting all ongoing operations. Returns ok.
deleteS, Keys for deleting Keys from the shared state; this includes updating the dirty set of ongoing operations which may cause them to abort. Where Keys is a list of keys, and nonexisting keys are ignored. Returns ok.
operationS, Reads, OFun for starting a new operation on S. Returns ok, OR. Where Reads is a list of keys to be read from the shared state and OFun is a function that takes a View corresponding to Reads nonexisting keys are ignored and returns a pair Res, Change.
The server executes OFunView with a view of the shared state corresponding to Reads. If OFun fails, the operation is aborted.
The keys in Reads and Change are added to the dependency set of OR and the keys in Change are added to the write set of OR.
Remember that it is up to you to decide how to represent OR.
16
Advanced Programming DIKU, B120192020
commitOR for committing an operation. Waits for the operation to complete. Updates the shared state of S with the changes from the operation, if OR is not aborted and the intersection between the dirty set and the dependency set is empty. The dirty set of other ongoing operations are updated with the write set of OR which may cause them to be aborted. Return either ok, Res if the operation returned
Res, Change, or aborted if the operation is aborted.
Note that commit1 is idempotent, meaning that multiple application of commit1 to the same argument should give the same result as long as the optimistic server that OR originates from is running.
abortOR for aborting the operation OR. Returns aborted if OR is not committed or if it is already aborted; otherwise returns toolate.
Your library must be robust against erroneous operations. In particular, your solution should be able to deal with slow or nonterminating operation functions.
Concurrent execution of operations
Operations can only affect other running operations and the shared state when they are committed. To exploit this inherent concurrency, each operation should run in its own process. That is, when an optimistic server is asked to start an operation, it should spawn a new helper operationprocess for maintaining the intermediate state of the operation.
Note that your implementation should stop all unused processes and should not keep operationprocesses around longer than necessary. Also, processes waiting for answers from aborted operations should be answered with aborted as quickly as possible. Explain in your report how you solve these challenges.
How to get started
Try to start by implementing a version of the optimistic library that only runs a single operation at a time, perhaps even without spawning a separate operationprocess. When this works, you can extend it to handle concurrent operations.
17
Advanced Programming DIKU, B120192020
Question 2.2: Testing optimistic
Make a module testoptimistic that uses eqc QuickCheck for testing an optimistic module. We evaluate your tests with various versions of the optimistic module that contains different planted bugs and check that your tests find the planted bugs. Thus, your tests should only rely on the API described in the exam text. In this question, we assume that all values in keyvalue maps are integers, we dont assume anything about the keys in general.
Your testoptimistic module should contain the following functions:
mkoprOpr, Args a helper function that generates an operation function from the atom Opr and the arguments Args. You decide which atoms are valid for Opr and what Args should be. We will not call this function directly, only through symbolic calls generated by your generators.
A QuickCheck generator goodopr0 that generates an operation in the form of a tuple with three elements: the first element is a list of keys corresponding to the read set, the second element is a list of keys corresponding to the write set, and the third element is a symbolic call which can be evaluated to an operation function that works on a view corresponding to the keys in the read set.
For instance, five samples from this generator could be:
d, d, call,testoptimistic,mkopr,incr,d
c,d, c,d, call,testoptimistic,mkopr,swap,c,d
f, f, call,testoptimistic,mkopr,incr,f
e,a, e,a, call,testoptimistic,mkopr,swap,e,a
a,d,f, a,d,f,
call,testoptimistic,mkopr,
seq,
call,testoptimistic,mkopr,swap,d,f,
call,testoptimistic,mkopr,incr,a
Again, remember that the arguments to mkopr2 are up to you to decide. The given samples are just for inspiration.
A parameterised QuickCheck property propopraccurateOprGen that takes a generator that generates tuples in the same form as goodopr0. That is tuples with three elements: the first element is a list of keys corresponding to the read set, the second element is a list of keys corresponding to the write set, and the third element is a symbolic call which can be evaluated to an operation function that works on a view corresponding to the keys in the read set. Hence goodopr0 could be an argument to test this property with, but we might test it with other generators as well.
The property should check that a generated tuple Reads, Writes, SymOpr is accurate. That is, SymOpr can be evaluated to a function, Fun, and when Fun is applied on a view, corresponding to Reads, it gives a result of the right form: a pair where the second element is a change that corresponds to Writes.
18
Advanced Programming DIKU, B120192020
Remember that Fun may fail raise an exception of any class or return a result of the wrong form, and your property should be robust against that. You may assume that Fun terminates though.
A property propservercommit0 that checks that starting an operation and then committing it on an optimistic server, gives the right result and that the shared state is updated correctly. This property should test the following functions from the optimistic module: start1, operation3, commit1, and perhaps also stop1.
In your report, make sure that you document whether your implementation of this property deals with operations that might fail or not.
A property propisolated0 that checks that running two nonconflicting oper ations concurrently gives the same result as running them sequentially in some order.
A testall0 function that runs all your tests that only depends on the specified optimistic API. These tests should involve the required properties in this module, but should also test aspects and functionality not covered by the required properties.
We also evaluate your tests on your own implementation, for that you should export a testeverything0 function that could just call testall0.
You may want to put your tests in multiple files especially if you use both eqc and eunit as they both define a ?LET macro, for instance. If you use multiple files, they must all start with the prefix test.
You are welcome even encouraged to make more QuickCheck properties than those explicitly required. Properties that only depend on the specified optimistic API should start with the prefix prop. If you have properties that are specific to your implementation of the optimistic library perhaps they are related to an extended API or you are testing submodules of your implementation, they should start with the prefix myprop, so that we know that these properties most likely only work with your implementation of optimistic.
Topics for your report for Question 2
You should clearly document if you have implemented all parts of the question. Likewise, remember to detail how you have tested your module. In general, as always, remember to test your solution and include your tests in the handin.
Your report should also document:
What erroneous behaviours your implementation can handle, and how you have tested that.
How you deal with the management of helper operationprocesses. Including your strategy for aborting helper processes as early as possible and how you have tested this.
The quality or limitations of your tests. Especially those explicitly required in Question 2.2. Explain how you have measured this quality.
19
Advanced Programming DIKU, B120192020
Appendix A: Example use of optimistic
The following example demonstrates how to use the optimistic API. The function incr2 increments the value for a given key. The function updates3 spawns two processes that both increment the value for a given key N times. The given keys can either be the same key, conflictingupdates1, in which case some operations are aborted with high probability; or the keys can be different, nonconflictingupdates1, in which case none of the operations should be aborted. The functions conflictingupdates1 and nonconflictingupdates1 both return a tuple with four elements; you should think about the relation of these four numbers.
modulecounter.
exportconflictingupdates1, nonconflictingupdates1.
incrS, C
ok, ORoptimistic:operationS, C,
fun View
Vmaps:getC, View,
success, CV1
end,
updatesK1, K2, N
ok, Soptimistic:startK10, K20, Meself,
P1spawnfunMe ! self,
lists:mapfunincrS, K1 end, lists:seq1, N
end,
P2spawnfunMe ! self,
lists:mapfunincrS, K2 end, lists:seq1, N
end,
L2receive P2, Res2Res2 end,
L1receive P1, Res1Res1 end,
ok, K1 : C1, K2 : C2optimistic:stopS, C1, length Rok, RL1,
C2, length Rok, RL2. conflictingupdatesN
updatescrash, crash, N.
nonconflictingupdatesNupdatesx, y, N.
optimistic:commitOR.
20
Reviews
There are no reviews yet.