[SOLVED] data structure database SQL Overview

$25

File Name: data_structure_database_SQL_Overview.zip
File Size: 339.12 KB

5/5 - (1 vote)

Overview
In this assignment, you will first construct the basic building block of a table: Tuples. Next, you will implement a simple Catalog that keeps track of what tables are currently in the database and provides a way to access them. Finally, you will implement HeapFiles which are the physical representation of the data in our database.

Tuples
Your first task is to complete theTupleDescandTupleclasses.TupleDescis a class that specifies a schema for aTuple. Examine the methods and come up with a design before beginning your implementation.

Some implementation hints forTupleDesc:

* For the purposes of our database, we will only be using two datatypes: int and string (char). ints should be 4 byte signed values. Strings can be up to 128 characters in length. The first byte of a string will contain the length of the string, followed by the bytes containing the content of the string (one byte per character). If a string is less than 128 characters, the leftover bytes will be filled with zeroes until it reaches a size of 128.

* ThehashCodemethod is not required, but if you intend to hashTupleDescobjects (for use with aHashMapfor example, then you will need to implement it.

* When implementingtoStringplease use the format specified in the comments, since we will be using this for testing purposes.

Once you haveTupleDescimplemented, it is time to focus on theTupleclass. ATupleis responsible for storing data and making sure that it conforms to the schema specified by aTupleDesc.

Some implementation hints forTuple:

* You will need to decide on a data structure that can be used to map field names to their content.

Once you have finished these two classes, your code should pass theTupleDescTestandTupleTesttests.

Catalog
Next, we will shift our focus to tables. Tables will consist of manyTuples. Before we construct the tables themselves, we will implement aCatalogthat will keep track of what tables currently exist and provide access to those tables. Examine theCatalogclass and come up with a design before implementing it.

Some implementation hints forCatalog:

* You will likely need to create a structure to hold relevant information for a table (such as its name, the location of the file, and its primary key). The best way to do this is probably to create aprivate inner class.

* Many of the methods in this class rely on a table id. This id is generated by aHeapFilewhich you have not implemented yet. It may be worthwhile to take a look at that method (it is very simple) before testing yourCatalogimplementation.

Heap Files
The actual physical storage of our data is called aHeapFile.HeapFilesconsist of manyHeapPages. The primary purpose of aHeapFileis to manage access to theHeapPagesand also to stream the data out to disk. The primary function of aHeapPageis to manage the various tuples making sure that each tuple is written to a proper location and accessing tuples as necessary.

It is highly recommended that you design and implement theHeapPageclass first, then design and implement theHeapFileclass.

Some implementation hints for theHeapPageclass:

* EachHeapPagewill have a header and some slots. Slots are used to store the actual data the size of each slot is the same as the size of the tuples contained in theHeapPage. The header is a binary encoding of which slots are occupied. Each slot is assigned one bit in the header. If that bit is a one, then that slot is occupied, and contains data that can be accessed. If the bit is a zero, then that slot contains no data and can have new data written to it. The header format should track slot occupancy starting with the least significant bit. In other words, the header bit for slot 0 would be in byte 0, bit 0 of the header. Slot 1 would be in byte 0, bit 1. Slot 8 would be in byte 1, bit 0, etc.

* It is important to consider how much space the header will require. All pages are of constant size (see the static variable inHeapPage) so the space that we are using for the header will decrease the amount of space that we can use for tuples. Headers must be a whole number of bytes (no partial bytes).

* Any space that is not taken up by the header can be used to store tuples. Tuples should be stored immediately following the header. No partial tuples can be stored in aHeapPage. If there is any space leftover, theHeapPagewill be padded with zeroes until it reaches its defined size.

* When adding or deleting a tuple from aHeapPagebe sure to update the header to indicate the occupied status of the modified slot.

Here are some implementation hints for theHeapFileclass:

* To create an id for yourHeapFileit is sufficient to create some sort of hash based on theFile.

* When reading or writing a page, it is strongly recommended that you use aRandomAccessFilefor file access.

* When adding a tuple, you must first find a page with an empty slot. If there are no empty slots, create a new heap page instead. Feel free to create some additional methods inHeapPageto help with this if necessary. You should then pass the new tuple to theHeapPage. Finally, be sure to write the modifiedHeapPageto disk.

* When deleting a tuple, you should be able to find the page id and the tuple id (slot number) from theTupleobject. Be sure to write the modifiedHeapPageto disk.

* It will be helpful to use the iterator from yourHeapPageclass to aggregate all of the tuples in thegetAllTuples()method.

Example
The following example will load a table from a file, then display all of the tuples contained in that table. In other words, it is equivallent toSELECT * FROM test;. The .txt file contains the schema and the catalog looks for the data in a file called test.dat.

Catalog c = Database.getCatalog();
c.loadSchema(testfiles/test.txt);

int tableId = c.getTableId(test);
td = c.getTupleDesc(tableId);
System.out.println(td);

hf = c.getDbFile(tableId);
ArrayList tups = hf.getAllTuples();
Iterator it = tups.iterator();

while(it.hasNext()) {
System.out.println(it.next());
}

Right now we arent properly reading in any SQL code or processing any queries. In future assignments we will add this functionality to our database.

Testing
You have been provided some tests that you can use to check basic functionality of each of the objects. These tests are NOT comprehensive. Just because you pass the tests does not mean that you will get a perfect score on this assignment. Please feel free to write some of your own tests to check your implementation. In particular, notice that none of the tests coverHeapFileswith multiple pages.

Your tests will not be part of your grade. When we do grade your assignment, we will replace the existing tests with some more thorough tests and record the output.

Rubric
The below rubric is intended as a guideline and will be followed as closely as possible, however submissions that deviate too much from the specification may be graded using an alternate rubric.

TupleDesc (15 points): instance variables and getters and setters (5 points), toString (5 points), field access methods (5 points)

Tuple (15 points): instance variables and getters and setters (5 points), toString (5 points), field access methods (5 points)

Catalog (15 points): instance variables and getters and setters (5 points), implementation of inner class (5 points), table management methods (5 points)

HeapPage (25 points): instance variables and getters and setters (5 points), numSlots and header size (5 points), adding and deleting tuples (10 points), iterator (5 points)

HeapFile (30 points): instance variables and getters and setters (5 points), adding and deleting tuples (10 points), read and write page (10 points), getAllTuples() (5 points)

I reserve the right to deduct points for code that is poorly formatted or poorly documented.

/docProps/thumbnail.jpeg

Reviews

There are no reviews yet.

Only logged in customers who have purchased this product may leave a review.

Shopping Cart
[SOLVED] data structure database SQL Overview
$25