The Relational Database Model
MONASH
INFORMATION TECHNOLOGY
Overview
We now have a conceptual model for Monash Software, it is time to move to the second stage and map this to a logical model
For our unit this will involve mapping to the Relational Model in preparation for implementation in a RDBMS
Relational Model
RelationalAlgebra
2
The Relational Model
IntroducedbyCODDin1970-thefundamentalbasisfortherelationalDBMS
BasicstructureisthemathematicalconceptofaRELATIONmappedtothe
concept of a table (tabular representation of relation)
Relation abstract object
Table pictorial representation
Storage structure real thing eg. isam file of 1s and 0s
RelationalModelTerminology
DOMAIN set of atomic (indivisible) values
specify
name
data type
data format Examples:
customer_number domain 5 character string of the form xxxdd
name domain 20 character string
address domain 30 character string containing street, town & postcode
credit_limit domain money in the range $1,000 to $99,999
3
A Relation
Arelationconsistsoftwoparts heading
body
RelationHeading
Also called Relational Schema consists of a fixed set of attributes R(A1,A2,.An)
R = relation name, Ai = attribute i
Each attribute corresponds to one underlying domain:
Customerrelationheading:
CUSTOMER (custno, custname, custadd, credlimit)
dom(custno) = customer_number dom(custname) = name
dom(custadd) = address
dom(credlimit) = credit_limit
custno
custname
custadd
credlimit
4
Relation Body
RelationBody
Also called Relation Instance (state of the relation at any point in time)
r(R)={t1,t2,t3,.,tm}
consistsofatime-varyingsetofn-tuples
Relation R consists of tuples t1, t2, t3 .. tm
m = number of tuples = relation cardinality eachn-tupleisanorderedlistofnvalues
t=
n = number of values in tuple (no of attributes) = relation degree
In the tabular representation:
Relationheadingcolumnheadings
Relationbodysetofdatarows
custno
custname
SMI13
JON44 BRO23
BROWN
Here Rd, Clayton, 3168
2000
10000
10000
SMITH
JONES
custadd
Wide Rd, Clayton, 3168
Narrow St, Clayton, 3168
credlimit
5
Relation Properties
No duplicate tuples
bydefinitionsetsdonotcontainduplicateelements
hence tuples must be unique Tuples are unordered within a relation
bydefinitionsetsarenotordered
hence tuples can only be accessed by content
No ordering of attributes within a tuple bydefinitionsetsarenotordered
6
Relation Properties contd
Tuple values are atomic cannot be divided
EMPLOYEE (eid, ename, departno, dependants)
not allowed: dependants (depname, depage) multivalued
hencenomultivalued(repeating)attributesallowed, called the first normal form rule
COMPARE with tabular representation
normally nothing to prevent duplicate rows rows are ordered
columns are ordered
tablesandrelationsarenotthesamething
7
Functional Dependency
Functional Dependency:
A set of attributes A functionally determines an attribute B if, and only if,
for each A value, there is exactly one value of B in the relation. It is denoted as A B (A determines B, or B depends on A)
order_no order_date
prod_no prod_desc
order_no, prod_no qty_ordered
9
Relational Model Keys
A superkey of a relation R is an attribute or set of attributes which exhibits only the uniqueness property
No two tuples of R have the same value for the superkey (Uniqueness property)
t1[superkey] = t2[superkey]
A candidate key CK of a relation R is an attribute or set of attributes
which exhibits the following properties:
Uniqueness property (as above), and
No proper subset of CK has the uniqueness property
(Minimality or Irreducibility property) ie. a minimal superkey
One candidate key is chosen to be the primary key (PK) of a relation. Remaining candidate keys are termed alternate keys (AK).
Only ONE primary key
(may be composed of
many attributes a
composite primary key) 10
Many possible superkeys
Potentially many possible candidate keys
Selection of a Primary key
A primary key must be chosen considering the data that may be added to the table in the future
Names, dates of birth etc are rarely unique and as such are not a good option
PK should be free of extra semantic meaning and security compliant, preferably a single attribute, preferably numeric (see Table 5.3 Coronel & Morris)
Natural vs Surrogate primary key
PATIENT_TREATMENT (patient_id, physician_id, treatment_code, pt_date, pt_time, pt_result)
Superkey
CK
PK
Issues with PK?
14
15
Null in the Relational Model
NULL is NOT a value is a representation of the fact that there is NO VALUE
Reasons for a NULL:
VALUE NOT APPLICABLE
EMP relation empno, deptno, salary, commission commission only applies to staff in sales dept
VALUE UNKNOWN
Joes salary is NULL, Joes salary is currently unknown
VALUE DOES NOT EXIST
Tax File Number is applicable to all employees but Joe may not
have a number at this time
VALUE UNDEFINED
Certain items explicitly undefined eg. divide by zero
Columns Number_of_payments, Total_payments ColumnAverage_payment_made
If Number_of_payments = 0 => Average undefined
16
Writing Relations
Relations may be represented using the following notation:
RELATION_NAME(attribute1,attribute2,)
The primary key is underlined.
Example:
STAFF(staffid,surname,initials,address,phone)
17
Relational Database
A relational database is a collection of normalised relations.
Normalisation is part of the design phase of the database and will be discussed in a later lecture.
Example relational database:
ORDER (order_id, orderdate,)
ORDER_LINE (order_id, product_id, quantity) PRODUCT (product_id, description, unit_price)
18
Foreign Key (FK)
FK: An attribute/s in a relation that exists in the same, or another relation as a Primary Key.
Referential Integrity
AForeignKeyvaluemusteithermatchthefull
primary key in a relation or be NULL.
The pairing of PK and FK creates relationships (logical connections) between tables when implemented in a RDBMS. Hence the abstraction away from the underlying storage model.
19
Data Integrity
Entity integrity
PrimarykeyvaluemustnotbeNULL.
No duplicate tuple property then ensures that each primary key must be unique
Referential integrity
ThevaluesofFKmusteithermatchavalueofafull PK in the related relation or be NULL.
Column/Domain integrity
Allvaluesinagivencolumnmustcomefromthe same domain (the same data type and range).
22
Relational DMLs
Relational Calculus
Relational Algebra
Transform Oriented Languages (e.g. SQL)
Graphical Languages
Exhibit the closure property queries on relations produce relations
24
Relational Calculus
Based on mathematical logic.
Non-procedural.
Primarily of theoretical importance.
May be used as a yardstick for measuring the power of other relational languages (relational completeness).
Operators may be applied to any number of relations.
25
RELATIONAL ALGEBRA
Manipulation of relational data
26
Relational Algebra
Relationally complete.
Procedural.
Operators only apply to at most two relations at a time. 8 basic operations:
single relation: selection, projection
cartesianproduct,join
union
intersection
difference
division
27
Relational Operation PROJECT
28
Relational Operation SELECT
29
Relational Operation Multiple Actions 2
1
30
SQL vs Relational Algebra in the Database
31
JOIN
Join operator used to combine data from two or more relations, based on a common attribute or attributes.
Different types: theta-join
equi-join
naturaljoin outerjoin
33
THETA JOIN (Generalised join)
(Relation_1) F (Relation_2)
Fisapredicate(i.e.truth-valuedfunction)whichis
of the form Relation_1.ai Relation2.bi
CUSTOMER.cust_no ORDER.cust_no
isoneofthestandardarithmeticcomparison operators, i.e. <, , =, , >
Mostcommonly,isequals(=),butcanbeanyof the operators
EMPLOYEE.emp_sal > SALARYSCALE.step_5
34
NATURAL JOIN
STUDENT MARK
ID Name ID Subj Marks
1Alice 1100495 2Bob 2104555
Step 1: STUDENT X MARK
Step 2: delete rows where IDs do not match (select =)
1 1045 90
STUDENT. Name ID
1 Alice 1 Alice
1 Alice
2 Bob
2 Bob 2 Bob
MARK.ID Subj Marks
1 1004 95
2 1045 55
1 1045 90
1 1004 95
2 1045 55
1 1045 90
37
NATURAL JOIN
STUDENT MARK
ID Name ID Subj Marks
1Alice 1100495 2Bob 2104555
Step 1: STUDENT X MARK
Step 2: delete rows where IDs do not match (select =) Step 3: delete duplicate columns (project away)
1
1045 90
STUDENT.I Name D
MARK.ID Subj Marks
1 1004 95 1 1045 90
1 Alice
1 Alice
2 Bob 2 1045 55
38
NATURAL JOIN
STUDENT MARK
ID Name ID Subj Marks
1 Alice 1 100495 2Bob 2104555
Step 1: STUDENT X MARK
Step 2: delete rows where IDs do not match (select =) Step 3: delete duplicate columns (project away)
1
1045 90
Marks 95
90
55
ID 1 1 2
Name Subj Alice 1004 Alice 1045 Bob 1045
A natural join of STUDENT and MARK
39
STUDENT
ID Name
1 Alice
2 Bob
3 Chris
MARK
ID Subj Marks
1 1004 95
2 1045 55
1 1045 90 4 1004 100
OUTER JOIN
No information for Chris (no mark, e.g. just enrolled) and the student with ID 4 (no student, e.g. quit uni)
ID 1 1 2
Name Subj Marks Alice 1004 95 Alice 1045 90 Bob 1045 55
A natural join of STUDENT and MARK
41
STUDENT ID
1 2 3
Name Alice Bob Chris
MARK
ID Subj Marks
FULL OUTER JOIN
Get (incomplete) information of both Chris and student with ID 4
ID Name 1 Alice
1 Alice
2 Bob
3 Chris
4
Subj Marks 1004 95 1045 90 1045 55
1 1004 95 2 1045 55 1 1045 90 4 1004 100
1004 100 A full outer join of STUDENT and MARK
Null Null
Null
42
STUDENT
ID Name
1 Alice 2 Bob 3 Chris
MARK
ID Subj Marks
1 1004 95 2 1045 55 1 1045 90 4 1004 100
LEFT OUTER JOIN
Get (incomplete) information of only Chris
ID Name 1 Alice 1 Alice 2 Bob
3 Chris
Subj Marks 1004 95 1045 90 1045 55
Null Null
A left outer join of STUDENT and MARK
Memory aid: Chris is on the LEFT of the nulls.
43
STUDENT
ID Name
1 Alice 2 Bob 3 Chris
MARK
ID Subj Marks
1 1004 95 2 104555 1 1045 90
4 1004 100
ID 1 1 2 4
Name Alice Alice Bob
Subj Marks 1004 95 1045 90 1045 55 1004 100
RIGHT OUTER JOIN
Get (incomplete) information of the student with ID 4
Null
A right outer join of STUDENT and MARK.
Memory aid: the marks data is on the RIGHT of the null.
44
Reviews
There are no reviews yet.