[SOLVED] python Spark SQL database 95 703 D: Database Management DB Implementation Project

$25

File Name: python_Spark_SQL_database_95__703_D:_Database_Management_DB_Implementation_Project.zip
File Size: 772.44 KB

5/5 - (1 vote)

95 703 D: Database Management DB Implementation Project
This project is to be completed in Oracle Database 18c Express by groups of two students. Pairing will be provided by the instructor. This document is broken down into the following parts: description of the model (Part I), the specific tasks you need to do (Part II), and what is required for submission (Part III). Submit your project (hardcopy) on December 4th, by 6 p.m.
Part I. Description of the model
In Homer Consulting each employee is assigned to (employed by) a specific department. One of the employees from a department manages the department. Some employees (especially in larger departments) supervise other employees in their department. Thus, one employee may have a supervisor and may also supervise other employees as well. The CEO supervises the top- level employee of each department.
The company runs projects for external clients. Information on both current clients (those that have had projects done in the past) as well as potential clients (those who have not requested projects yet) is kept. For billing purposes, each project is contained within one department. Therefore, most (but not necessarily all) of employees assigned to a project come from one department. Administration department does not run any projects. For each project start date and the total cost is recorded. The total cost of a project can only be estimated (and recorded) after the project is completed. For an ongoing project, the total cost will have a NULL value.
Each project has several employees working on (assigned to) the project. Newly hired employees are not assigned to any project for the first few months and may be trained internally or by attending a formal workshop. Each assignment is recorded on month-by-month base. Within each month the date employee was assigned to that project, and the end date of the assignment in that month (not necessarily the last day of the month) is recorded. In addition, the total number of hours spent by the employee on the project in that month is kept. If an employee worked on a project for two months, then two records would be created, one record for each month. Number of hours used within each month is known and recorded after the monthly assignment is completed. For an assignment in a current month, both the end date and hours spent are NULL.
Homer Consulting maintains a list of desired skills. For each desired skill, code and description is kept. Not all of these skills might be represented among the employees. Whenever an employee acquires a skill, through some form of training (i.e. completing a course at a college, completing a technical training program, or a workshop), approved by the company, the date when it was acquired and a comment regarding the training is recorded. Each employee may take the same training more that once since some of the skills might need to be updated or renewed.

ER Diagram:
TRAINING
SKILL
(0, 1)
DEPARTMENT
(1, 1)
operates (0, N)
(0,N) has
Schema:
manages
employs
(1, 1) (1, N)
(1, N)
(0, N) (1, 1)
(0, N)
(1, 1)
(0, N)
supervises
EMPLOYEE
(1, 1)
(1, 1)
(1,1)
(1, 1)
(0, 1)
(0, N)
PROJECT
ASSIGNMENT
CLIENT
SKILL (Code, Description)
TRAINING (Train_Num, Code@, Emp_Num@, Date_Acquired, Name, Comments)
DEPARTMENT (Dept_Code, Name, Location, Phone, Manager_ID@)
EMPLOYEE (Emp_Num, Emp_Last, Emp_First, DOB, Hire_Date, Super_ID@, Dept_Code@)
CLIENT (Client_ID, Name, Street, City, State, Zip_Code, Industry, Web_Address, Phone, Contact_Name)
PROJECT (Proj_Number, Name, Start_Date, Total_Cost, Dept_Code@, Client_ID@)
ASSIGNMENT (Assign_Num, Proj_Number@, Emp_Num@, Date_Assigned, Date_Ended, Hours_Used)

Part II.
A. Create two separate SQL scripts for creating all necessary tables, data integrity constraints, and for inserting data into the tables to implement the database as defined above.
1. Script #1: Write all CREATE TABLE statements that have the necessary integrity constraints including primary keys, foreign keys, check constrains, etc. Start the script with a set of DROP TABLE statements that will allow you to clean up the database before creating it (useful when you recreate the database using the script). Use the names of tables as well as the names of attributes exactly as provided in the model above.
All constraints, except one, must be created and enforced within the Create Table statements and must be named using the convention discussed in class (lecture 12). Run the script to create the DB objects and to create the missing constraint (as a last statement in your script). Your CREATE TABLE statements must also include the constraints defined below:
a. The employees Hire_Date, the project Start_Date, and the Date_Assigned should have a default value that is current date,
b. The Date_Ended cannot be earlier than the Date_Assigned, and
c. Three more appropriate and useful check constraints (but not the Not Null
constraints) of your choice. Make sure that you clearly identify them.
2. Script #2: After the DB tables and all data integrity constraints are created, insert about ten rows of data into each table. Provide data that would allow you to test all queries as defined in section B below (each query must give you results). It may be required to add more than 10 rows in some tables in order to show that your queries work as intended. The Department and Skill tables could have fewer rows. Run the script to perform the task of populating the tables in your database. (Note: All constraints must be enabled and enforced not deferred before you populate your tables)
3. After all tables are populated, list the complete content of each table. Format content of each table to make it readable and understandable.
B. CreateSQLqueriestoanswerthefollowingquestions:
1. The companys human resources department needs information on the current training levels of the employees. They need information that shows, for each employee, and his/her acquired skills, the number of times he or she received training for that skill, the most recent date of the training, and the number of months (full months only) that have passed since the most recent training for the skill. Include all employees in the output, even if they have not received any training yet.
2. Due to a recent re-organization, the company needs a list that shows who supervises who. List all the employees in the company clearly indicating the organizational hierarchy. Include the level of the hierarchy each employee is at and the department name of the employee.

3. For every ongoing project (i.e., a project with total cost being NULL) provide the project name, project start date. In addition, for each month of the project list the number of employees assigned that month and hours spent on the project that month. At the end provide an overall total of the number of employees assigned to the project and the total hours spent on the project.
4. The company has decided to adopt a bonus policy for their employees. Each employee will receive a bonus of $200 for every project started last calendar year, on which they have charged at least 40 hours. As a result, management has asked you to update the employee records to include their current bonus information. In order to accomplish this, you will have to add another column to the EMPLOYEE table (i.e., column BONUS_AMT), before you update the data. After the single Update statement is performed, list the contents of the Employee table.
5. For each employee hired last calendar year, list the name and the hire date of the employee, name of training received (if any), date of the training, and the number of days between the hire date and the training. Also include number of projects that employee worked on so far.
6. Some projects are discontinued for certain period of time and resumed later. Find out project name and start date of those projects that have discontinuous activity. Indicate whether the discontinued projects have been completed (i.e., your output should clearly indicate this by completed or on-going value in a column).
7. We need a summary of project information for each quarter of this year. For each quarter include: number of projects started in the quarter, number of employees working on those projects in the quarter, number of hours spent on average per project.
8. For each employee and each skill, list the number of times the employee had training and the latest date (most recent) when the employee acquired that particular skill. Also, provide the number of trainings provided/attended for each skill, and the number of skills acquired by each employee. The output should look like this:
(Note: the values in the table are made up; use the names/values from your own database):
9. For each department and each skill combination, list the number of trainings completed within the department that was associated with the particular skill. Also provide a rank of each skill within each department. The rank should be based on the number of trainings completed for that skill. Same rank should be given when the number of trainings is the same.
Latest Date Acquired
Latest Date Acquired
Latest Date Acquired
Latest Date Acquired
Number of Skills:
ID
Employee Name
ETL
Spark
Python
Tableau
100
Mary Smith
1
10/25/19
0

0

0

1
110
Alice Rodriguez
0

1
11/02/19
2
6/24/19
2
9/07/19
3
120
Kunal Shah
1
9/25/19
2
6/02/19
2
11/23/18
1
4/07/19
4
130
Tao Li
1
6/23/19
1
12/21/18
1
11/23/18
1
10/17/18
4
140
Pat Gomez
0

1
11/02/19
1
6/24/19
1
10/17/18
3
Number of Trainings:
3
5
6
5

10. For each employee, list ID and name, his/her supervisors ID and name (concatenate ID and name for both employees and supervisors), any training each employee has had, and the skill(s) associated with the training. For the employees training, list the name of training, training date, skill(s) obtained through the training, and the number of months since the training date. In addition, provide the number of trainings each employee has had.
11. The firm defines utilization as the number of hours worked since employment divided by the number of hours available. It also awards the best utilized employee with $10,000, second best with $9000 and so on till the fifth best with $6000. Given that each employee can work only 40 hours-per-week, list the top five employees with highest utilization. Also list the utilization rate and the assigned award/bonus.
12. Create a single query that accesses the system catalog to retrieve information about your database tables and their columns, as well as any constraints on those columns. Include the following:
a. Table name
b. Column name
c. Constraint name
d. Constraint type. Instead of the Oracles one letter codes, use the two letter indicators as
defined in lecture #12
e. Search condition (for any check constraints), and
f. Table and column that each FK references
Sort the result by table name. List column in the same order as they were created in their tables. Include commands used for formatting the results (e.g., use BREAK command to suppress repeating names of tables and columns and COLUMN command on individual columns).
Part III.
Create a title page that includes your names, email addresses, and the group number as it was assigned to your group. Clearly identify each part of your submitted project and each answer. Submit the following:
1. Printout of the two separate SQL script files (Script#1 & Script#2) as you created in Part II.A. as well as the content of the tables. (Note: Do not print the spool file of executing the scripts)
2. The queries created in Part II.B. Include the SQL statements as you execute them as well as
the results. Make sure you appropriately format both the statements and results (refer to the Formatting SQL Output document provided for your SQL Assignments).

Grading
Projects will be graded based on the following criteria:
Correctness of SQL based on requirements given in the assignment. Although there may be more than one way to approach each question, the SQL query/results must match the requirements as defined in each question.
Readability of your SQL statements as they are executed and well formatted results.
Punctuality. The deadline for the submission is 6 p.m. on December 4th.
o Projects submitted on December 4th but later than 6 p.m. will receive a 25% penalty o Projects will not be accepted after December 4th.

Reviews

There are no reviews yet.

Only logged in customers who have purchased this product may leave a review.

Shopping Cart
[SOLVED] python Spark SQL database 95 703 D: Database Management DB Implementation Project
$25