[SOLVED] 代写 CSI 3030 Project Guidelines

30 $

File Name: 代写_CSI_3030_Project_Guidelines.zip
File Size: 320.28 KB

SKU: 1021422584 Category: Tags: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,

Or Upload Your Assignment Here:


CSI 3030 Project Guidelines

Objective
• Implement a new symmetric hash join query operator replacing the current hash join implementation.
• Modifications to be done in Optimizer and Executor component

What you need to know
• Understand what a hash join is.
• How is it implemented in POSTGRESQL.
• Need to know what all files to be modified.
• Understand POSTGRESQL backend architecture.

What you are to implement • Symmetric Hash Join

A Tour of PostgreSQL Internals https://www.postgresql.org/files/developer/tour.pdf

More reading to do
• Understand how Optimizer and Executor works
• Refer to src/backend/optimizer/README • Refer to src/backend/executor/README

Relevant Files • Src/backend/executor
– nodeHashJoin.c: This file implements the actual processing of the hash join operator.
– nodeHash.c: This file is responsible for creating and maintaining a hash table.
*You will be given a list of methods to be modified for this Project (slide 19 onwards ).

Relevant Files • src/backend/optimizer/plan/
– createplan.c: This file contains the code that creates a hash join node in the query plan.
• src/include/nodes/
– execnodes.h: This file contains the structure
HashJoinState that maintains the state of the hash join during execution.

Deliverables
• All of the relevant files listed in the two previous slides to zipped and submitted
• Send the zipped file to any one of the TA’s emails. Copying the other two lab TAs’ (emails available here https://github.com/ferna11i/3130-DBMS)
• Insert comments in the parts you have modified. Also add in any necessary explanations.
• Need a test query which would execute your code as per the requirement. Test Query is available here (schema.txt, query.txt, output.txt) (https://github.com/ferna11i/3130- DBMS/tree/master/Project)
• All comments in code to be preceded by ‘CSI3130:’
• Project Deadline: Monday December 9 at 6:00 pm at the latest.

Additional Help
• http://doxygen.postgresql.org/: Source code browser
• ELOG in POSTGRESQL

Hash Join & Symmetric Hash Join explained

What is a JOIN • Between tables or self
• Query accessing multiple rows of the same or different tables
• Order of execution : Sequential

Hash Join
• Loads candidate records from one side of the join into a hash table.
• Probe for each record from other side of the join
• Purpose of the Hash Table – Indexing

Hash Join

Facts about hash joins
• Hash joins do not need indexes on predicates
• Reduce the hash table size to improve the performance
• Cannot perform joins that have range conditions in the join predicates

Drawbacks with Hash Join • Bottleneck in query execution pipeline
• Sufficient memory required to store inner relation.
• Hybrid Hash Join!!!

Symmetric Hash Join
• Maintains two hash tables • Two Hash functions

Other requirements to be considered
• Should run until it gives one output tuple each time
• State should be saved – should include detail on which tuple is running using state node.

Further Project Details

Disabling use of Multiple Batches
• Present implementation
– Hybrid hash join
– Tuples from inner / outer are considered in batches

Changes in nodeHash.c
• Modify ExecHash
• Disable Batches
• Implement ExecScanHashBucket for both Inner and Outer

ExecHash
• Remove the present error message
• It should have the same code as MultiExecHash

Where to disable?
• nodeHash.c
– ExecHashTableCreate

Changes to scanhashbucket • Present implementation
– Two functions : one for probing inner and other for outer
– Returns HeapTuple

Changes to scanhashbucket
• Need only one method for scannning both hash tables
• Return HashTuple instead of HeapTuple
• Check for type of hash table in the method : whether its inner or outer.

Sample Implementation Code

Deliverables
• All of the relevant files listed to be submitted
• Send in the Zipped file to Any of the TA’s emails. Copying the other TAs (emails available here https://github.com/ferna11i/3130-DBMS)
• Insert comments in the parts you have modified. Also add in any necessary explanations.
• Need a test query which would execute your code as per the requirement. Test Query is available here (schema.txt, query.txt, output.txt) (https://github.com/ferna11i/3130- DBMS/tree/master/Project)
• All comments in code to be preceded by ‘CSI3130:’
• Project Deadline: Monday December 9 at 6:00 pm at the latest.

Reviews

There are no reviews yet.

Only logged in customers who have purchased this product may leave a review.

Shopping Cart
[SOLVED] 代写 CSI 3030 Project Guidelines
30 $