DNA Search
This project explores pattern-matching techniques to find a pattern within a DNA sequence composed of the alphabet {A, C, G, T}.
Example:
Consider the following DNA sequence:
ATGACGATCTACGTATGGCAGCCACGCTTTTGATGTTAAGTCACACAGCCAAGTCAACAAGGGC
GACTTCATGATCTTTCCGCTCCGTTGGTGTAGGCCCGTGTTCAAATTCAATGGCTGATTGGAAT
TACCTTTGAAATACTCCAACCGACCGCCACGGCCAGGGTCCCGCTCGCTCTCTGTGGCCCTCCC
ACAAAACTCCGGTGAAAGTTGATTTGGACACGGACCCAAAGCAGCGTAGATTATTCGAGCGTAT
TCGGTAGTCATTGAGGCCCCAA
The pattern “GCTTTT” is found at index 25 (where the first character of the sequence is at index 0).
Note: Overlapping matches are treated as separate occurrences. For instance, in the sequence ‘AAAAAA’ with the pattern ‘AAA’, there are 4 occurrences at indices 0, 1, 2, and 3.
You will write a C program and a RISC-V program to identify the indices where a given pattern appears in a DNA sequence.
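The overlap rule in the note above can be illustrated with a minimal C sketch. The function name count_overlapping is hypothetical and not part of the provided shell; it only shows that advancing the start position by one character at a time counts every overlapping occurrence.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the project shell): counts how many times
   pat occurs in seq. The start index advances by one each step, so
   overlapping occurrences are counted separately. */
int count_overlapping(const char *seq, const char *pat) {
    size_t n = strlen(seq);
    size_t m = strlen(pat);
    int count = 0;

    if (m == 0 || m > n) {
        return 0;               /* empty or oversized pattern: no matches */
    }
    for (size_t i = 0; i + m <= n; i++) {
        if (memcmp(seq + i, pat, m) == 0) {
            count++;            /* match starting at index i */
        }
    }
    return count;
}
```

Under this sketch, count_overlapping("AAAAAA", "AAA") returns 4, which agrees with the indices 0, 1, 2, and 3 listed in the note.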
Strategy
1. Pre-coding Analysis:
Before writing any code, analyze the task requirements and constraints. Mentally explore various approaches and algorithms, considering their potential performance, code length, and storage costs. There are often trade-offs between these metrics.
2. High-Level Language (HLL) Implementation:
Choose a promising approach and first implement it in a high-level language (e.g., C) to deepen understanding. HLL implementations are more flexible for exploring solutions and should be developed before creating the assembly version, where design changes are more difficult.
For P1-1, you will write a C implementation of the program.
3. Assembly Translation:
Once a working C version is completed, “be the compiler” to translate it into RISC-V assembly. This step helps you understand how HLL constructs map to machine-level instructions and offers opportunities to optimize performance and efficiency. You will write the assembly version for P1-2.
P1-1: High-Level Language Implementation
1. Development Approach:
Start with a simple implementation to understand the problem, then experiment with optimizations. Time spent here can save significant effort during assembly coding.
2. Shell Program:
Use the provided P1-1-shell.c as a template. Rename it to P1-1.c and modify it by adding your code. The shell automatically generates a 10240-character DNA sequence and a random pattern (3-10 characters) using the ASCII characters A, C, G, T.
3. Match Function Requirements:
Implement the Match function with four parameters:
   1. Pointer to the pattern string
   2. Pattern length
   3. Pointer to the DNA sequence
   4. Sequence length
The function must store matching indices in ascending order in the global array MatchIndices, ending with -1. Example: For the sequence “AACAAC” and pattern “AAC”, MatchIndices should be [0, 3, -1].
4. Grading Considerations:
   - Do not modify the provided print statements, as they are used for grading.
   - Use the DEBUG flag to wrap debug prints (set to 0 in submitted code).
5. Submission Requirements:
   - File named P1-1.c
   - Compile and run with gcc on Linux (no warnings).
   - Self-contained code (no header files).
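The Match function requirements above could be met with a straightforward scan such as the sketch below. This is only one possible starting point, not the intended solution. Several details are assumptions, since the shell file is not shown here: the void return type, the parameter names, the SEQ_MAX capacity, and the local definitions of DEBUG and MatchIndices (the real shell presumably provides its own).

```c
#include <stdio.h>

#define DEBUG 0          /* assumed flag: 1 while debugging, 0 when submitting */
#define SEQ_MAX 10240    /* assumed capacity, taken from the sequence length */

/* Assumed declaration; the provided shell is expected to define this array. */
int MatchIndices[SEQ_MAX + 1];

/* Naive scan: test every start position in ascending order and record each
   index where the whole pattern matches. The list is terminated with -1. */
void Match(const char *Pattern, int PatLen, const char *Seq, int SeqLen) {
    int count = 0;

    for (int i = 0; i + PatLen <= SeqLen; i++) {
        int j = 0;
        while (j < PatLen && Seq[i + j] == Pattern[j]) {
            j++;
        }
        if (j == PatLen) {
            MatchIndices[count++] = i;
            if (DEBUG) {
                printf("debug: match at index %d\n", i);
            }
        }
    }
    MatchIndices[count] = -1;   /* terminator required by the specification */
}
```

Calling Match("AAC", 3, "AACAAC", 6) with this sketch leaves MatchIndices holding 0, 3, -1, matching the example in the requirements. Because start positions are visited in increasing order, the ascending-order requirement is met without a separate sort.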
P1-2: Assembly Level Implementation
1. Shell Program:
Use the provided P1-2-shell.asm as a starting point; rename it to P1-2.asm. The assembly program uses ecall to generate a random DNA sequence (4800 characters, packed as 2-bit nucleotides) and a pattern (3-7 nucleotides).
2. ECALL Routines:
   - 512 (Generate DNA Sequence):
     a0: Address for the pattern (stored as right-to-left 2-bit pairs).
     a1: Address for the sequence (4800 characters in 600 words; the lower 16 bits of each word hold 2-bit nucleotides).
     a2: Pattern length (3-7).
   - 513 (Verify Solution):
     a3: Address of the MatchIndices array (sorted indices followed by -1). Outputs debug messages for correctness.
   - 552 (Highlight Letter):
     a6: Offset in the sequence (0-4799) for debugging visualization.
3. Memory Constraints:
   - The Matches array must be allocated with alloc 128 (do not modify the size).
4. Performance Metrics:
   - Baseline Scores:
     Static code size: 44 instructions
     Dynamic execution length: 45740 instructions
     Storage: 743 words
   - Scoring Formula:
     \[ \text{PercentCredit} = 2 - \frac{\text{Metric}_{\text{Your Program}}}{\text{Metric}_{\text{Baseline Program}}} \]
   - Accuracy (25 points): Reduced by 10% per failed trial (out of 100), with style deductions possible.
   - Performance scores are adjusted by trial failures (10% deduction per error; no credit for ≥10 failures).
5. Submission Requirements:
   - File named P1-2.asm.
   - Call ecall 513 to report results and use jalr to exit.
   - No infinite loops or simulator errors.
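Before translating to assembly, it may help to prototype the packed 2-bit access in C. The sketch below assumes a particular bit layout that the text above does not fully specify: nucleotide 0 of each word occupies bits 1:0, nucleotide 1 occupies bits 3:2, and so on for 8 nucleotides per word, with the pattern packed the same way in a single word. The function names nuc_at and matches_at are hypothetical. Check the layout against the shell program and the simulator before relying on it.

```c
#include <stdint.h>

/* ASSUMED layout: 8 nucleotides per word in the lower 16 bits, with the
   earliest nucleotide in the least significant 2 bits. */
unsigned nuc_at(const uint32_t *seq, int i) {
    return (seq[i / 8] >> (2 * (i % 8))) & 3u;
}

/* ASSUMED layout: pattern nucleotide j sits in bits (2j+1):(2j) of pat.
   Returns 1 if the pattern matches the sequence starting at offset i. */
int matches_at(const uint32_t *seq, uint32_t pat, int patlen, int i) {
    for (int j = 0; j < patlen; j++) {
        if (nuc_at(seq, i + j) != ((pat >> (2 * j)) & 3u)) {
            return 0;           /* mismatch at pattern position j */
        }
    }
    return 1;
}
```

In assembly, the division and modulo by 8 reduce to a shift and a mask, and comparing several nucleotides at once with a single masked word comparison is one way to trade static code size against dynamic instruction count.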
Project Grading
| Part Description | Percentage |
|---|---|
| P1-1 (C code: correctness/style) | 25% |
| P1-2 (Assembly: correctness/style) | 25% |
| Static code size | 15% |
| Dynamic execution length | 25% |
| Operand storage requirements | 10% |
| Total | 100% |
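To see how the scoring formula from the Performance Metrics section behaves, here is a small C sketch. The function name percent_credit is hypothetical. The sketch applies the formula exactly as written, without any clamping or capping, since the handout does not state either.

```c
/* PercentCredit = 2 - (your metric / baseline metric), as a fraction.
   1.0 means 100% credit. No clamping is applied here because the handout
   does not specify any. */
double percent_credit(double yours, double baseline) {
    return 2.0 - yours / baseline;
}
```

With the static code size baseline of 44 instructions, a program that also uses 44 instructions earns 1.0 (100%), a hypothetical 22-instruction program earns 1.5 (150%), and a hypothetical 88-instruction program earns 0.0. The same calculation applies to the dynamic execution length (baseline 45740) and storage (baseline 743 words).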
Honor Policy
All code must be independently designed, implemented, and tested by the student. Use of AI tools, code sharing, or collaboration constitutes academic misconduct.
ECE2035 Project One