ECS 150: Project #4 File system
Prof.Jol Porquet
UC Davis, Winter Quarter 2019
1 Changelog
2 General information
3 Objectives of the project
4 Program description
5 The ECS150-FS specifications
6 Suggested work phases
7 Submission
8 Academic integrity
1 Changelog
Note that the specifications for this project are subject to change at anytime for additional clarification. Make sure to always refer to the latest version.
v1: First publication
2 General information
Due before 11:59 PM, Thursday, March 7th, 2019.
You will be working with a partner for this project.
Remember, you cannot keep the same partner for more than 2 projects over the course of the quarter.
The reference work environment is the CSIF.
3 Objectives of the project
The objectives of this programming project are:
Implementing an entire FAT-based filesystem software stack: from mounting and unmounting a formatted partition, to reading and writing files, and including creating and removing files.
Understanding how a formatted partition can be emulated in software and using a simple binary file, without low-level access to an actual storage device.
Learning how to test your code, by writing your own testers and maximizing the test coverage.
Writing high-quality C code by following established industry standards.
4 Program description
4.1 Introduction
The goal of this project is to implement the support of a very simple file system, ECS150-FS. This file system is based on a FAT (File Allocation Table) and supports up to 128 files in a single root directory.
The file system is implemented on top of a virtual disk. This virtual disk is actually a simple binary file that is stored on the real file system provided by your computer.
Exactly like real hard drives which are split into sectors, the virtual disk is logically split into blocks. The first software layer involved in the file system implementation is the block API and is provided to you. This block API is used to open or close a virtual disk, and read or write entire blocks from it.
Above the block layer, the FS layer is in charge of the actual file system management. Through the FS layer, you can mount a virtual disk, list the files that are part of the disk, add or delete new files, read from files or write to files, etc.
4.2 Constraints
Your library must be written in C, be compiled with GCC and only use the standard functions provided by the GNU C Library (aka libc). All the functions provided by the libc can be used, but your program cannot be linked to any other external libraries.
Your source code should adopt a sane and consistent coding style and be properly commented when necessary. One good option is to follow the relevant parts of the Linux kernel coding style.
4.3 Assessment
Your grade for this assignment will be broken down in two scores:
Auto-grading: ~60% of gradeRunning an auto-grading script that tests your program and checks the output against various inputs
Manual review: ~40% of gradeThe manual review is itself broken down into different rubrics:
Report file: ~50%
Submission : ~5%
Makefile: ~5%
Phase 1 implementation: ~5%
Phase 2 implementation: ~5%
Phase 3 implementation: ~10%
Phase 4 implementation: ~10%
Testing: ~5%
Code style: ~5%
5 The ECS150-FS specifications
For this project, the specifications for a very simple file system have been defined: its the ECS150-FS file system.
The layout of ECS150-FS on a disk is composed of four consecutive logical parts:
The Superblock is the very first block of the disk and contains information about the file system (number of blocks, size of the FAT, etc.)
The File Allocation Table is located on one or more blocks, and keeps track of both the free data blocks and the mapping between files and the data blocks holding their content.
The Root directory is in the following block and contains an entry for each file of the file system, defining its name, size and the location of the first data block for this file.
Finally, all the remaining blocks are Data blocks and are used by the content of files.
The size of virtual disk blocks is 4096 bytes.
Unless specified otherwise, all values in this specifications are unsigned and the order of multi-bytes values is little-endian.
5.1 Superblock
The superblock is the first block of the file system. Its internal format is:
Offset
Length (bytes)
Description
0x00
8
Signature (must be equal to ECS150FS)
0x08
2
Total amount of blocks of virtual disk
0x0A
2
Root directory block index
0x0C
2
Data block start index
0x0E
2
Amount of data blocks
0x10
1
Number of blocks for FAT
0x11
4079
Unused/Padding
If one creates a file system with 8192 data blocks, the size of the FAT will be 8192 x 2 = 16384 bytes long, thus spanning 16384 / 4096 = 4 blocks. The root directory block index will therefore be 5, because before it there are the superblock (block index #0) and the FAT (starting at block index #1 and spanning 4 blocks). The data block start index will be 6, because its located right after the root directory block. The total amount of blocks for such a file system would then be 1 + 4 + 1 + 8192 = 8198.
5.2 FAT
The FAT is a flat array, possibly spanning several blocks, which entries are composed of 16-bit unsigned words. There are as many entries as data blocks in the disk.
The first entry of the FAT (entry #0) is always invalid and contains the special FAT_EOC (End-of-Chain) value which is 0xFFFF. Entries marked as 0 correspond to free data blocks. Entries containing a positive value are part of a chainmap and represent a link to the next block in the chainmap.
Note that although the numbering in the FAT starts at 0, entry contents must be added to the data block start index in order to find the real block number on disk.
The following table shows an example of a FAT containing two files:
The first file is of length 18,000 bytes (thus spanning 5 data blocks) and is contained in consecutive data blocks (DB#2, DB#3, DB#4, DB#5 and DB#6).
The second file is of length 5,000 bytes (this spanning 2 data blocks) and its content is fragmented in two non-consecutive data blocks (DB#1 and DB#8).
Each entry in the FAT is 16-bit wide.
FAT index:
0
1
2
3
4
5
6
7
8
9
10
Content:
0xFFFF
8
3
4
5
6
0xFFFF
0
0xFFFF
0
0
5.3 Root directory
The root directory is an array of 128 entries stored in the block following the FAT. Each entry is 32-byte wide and describes a file, according to the following format:
Offset
Length (bytes)
Description
0x00
16
Filename (including NULL character)
0x10
4
Size of the file (in bytes)
0x14
2
Index of the first data block
0x16
10
Unused/Padding
An empty entry is defined by the first character of the entrys filename being equal to the NULL character.
The entry for an empty file, which doesnt have any data blocks, would have its size be 0, and the index of the first data block be FAT_EOC.
Continuing the previous example, lets assume that the first file is named test1 and the second file test2. Lets also assume that there is an empty file named test3. The content of the root directory would be:
Filename (16 bytes)
Size (4 bytes)
Index (2 bytes)
Padding (10 bytes)
test1
18000
2
xxx
test2
5000
1
xxx
test3
0
FAT_EOC
xxx