[SOLVED] 程序代写代做代考 Bioinformatics c++ matlab scheme algorithm HW4

30 $

File Name: 程序代写代做代考_Bioinformatics_c++_matlab_scheme_algorithm_HW4.zip
File Size: 668.82 KB

SKU: 3471424789 Category: Tags: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,

Or Upload Your Assignment Here:


HW4

Homework#4forBioimagingandBioinformatics
BME2210–Spring2017
Bioinformaticsportion

Due(as5files–4.cppfilesand1.txtfile–totheICONdropbox)by11amonMonday,February
20

ImplementNeedleman-Wunschalgorithm

ImplementtheglobalalignmentNeedleman-Wunschalgorithmforproteinsequencesinan
iterativefashion(i.e.,donotuserecursion,whichsimplifiestheimplementationforthosethat
donothaveexperiencewithrecursion.)Foraniterativesolution,wewilluse3matrices:

• The“A”matrixwillstorethefinalalignmentscores.
• The“B”matrixissimplyaninitialized“A”matrixwithBlosum62scoresforeachpairof

residues.
• The“D”matrixwillstoreourdirections.

Ifyouwish,youcanlookatthesamplecode(includedbelow),whichprovidesyouwith
functionsto:

• initializeAand/orBmatrices
• printmatrices,and
• usetheBLOSUM62substitutionmatrixforscoring

However,notethatthesamplecodeiswritteninMatlab,tohelpyouunderstandtheoverall
algorithm,butyouneedtoturninyourcodeforparts1-4inC++(.cpp)files.Thus,whilethe
Matlabcodelimitsthenumberofaminoacidsto25,yourcodeshouldusestringsinorderto
handleanysizeofinput;andwhiletheMatlabcodemerelyinitializesthematrixtobefilled
withzeros(insteadofundefinedvalues),youactuallyneedtofillthematrixwithrealvalues.Its’
mainuseistoshowyouhowamatrixcanbefairlyeasilyprinted.

Hint1:C++caneasilyhandlepaddingastringwithspacesusingthesetwandsetfillcommands,
forwhichyouwouldneedtoincludeiomanipaswellasiostream–i.e.,asdescribedat
http://www.cplusplus.com/reference/iomanip/setw/.(actuallyyouwon’tevenneedsetfill(),
sincespacesarethedefaultpaddingmaterial–Ijustmentionthatforfuturereference)Ifyou
trytousetheolderCstyleofusingastatementsuchas‘printf(“%3s
”,x);’,forwhichyou
wouldneedtoincludestdio.h,youwillhaveamuchhardertimeprintingoffthematricesinthis
assignment,sinceyouwillhavetofigureoutawaytoconvertanobjectofstringclass(and
beingofanyarbitrarily-givensize)totheolderC-style‘characterarray’thatprintfdemands.
Thus,IhighlyrecommendtheC++style,asshowninthecodebelowwhereIprintanarbitrary
string(here,avalueof-2)withanarbitraryamountofspace-padding(here,3spaces,although
notethatthemaximumwidthofanyvalueintheBLOSUM62matrixisonly2characterspaces,
sinceits’maximumvalueis‘11’andits’minimumvalueis‘-4’):

#include
#include
#include
usingnamespacestd;

intmain()
{
stringx=”-2″;
cout<<setw(3)<<x<<endl; //somewhatequivalentto:printf(“%3s
“,x); return0;}Hint2:theNeedleman-WunschalgorithmisdescribedintheReadingMaterial(especiallyL8-PairwiseAlignment1-Fall2015.pdf),andinevenmorehelpfuldetailinthebook(chapter3,especiallypages76-80,anditmaybehelpfultokeepgoingandreadportionsoftherestofthechapter,likeSmith-Watermanon82,heuristicsonpage84,dotplotson85,etc.).Payattentiontothedetails–e.g.,eachmatrixisofdimensionality(n+1)*(m+1),wherethelengthoftheinputsequence1isnandthelengthoftheinputsequence2ism.Inotherwords,thesematricesareonedimensionlonger(inboth#ofrowsandcolumns)thanthelengthsofthetwosequences.(doyouseewhythisisso?the1addedtothenistheleftcolumn,andthe1addedtothemistheupperrow,showninFigure3.21(a)intheonlinebook;notethatthesearefilledinfirstbeforeanythingelseisaddedtothematrix,asshowninparts(b)-(g)ofthatfigure).Hint3:YoualreadyknowthatC++arrayindicesstartwith0,sothatthefirstpositionofastringisstoredinarrayindex0,thesecondinarrayindex1,etc.However,Needleman-Wunschintroducesatwist,inthatb/coftheadditionaldimensionaddedtoboththerowandthecolumn(addedatthe0thposition),thefirstpositionofthesequence1correspondstotheSECONDcolumnofthematrix,whichendsupmakingits’arrayindexequalto1afterall.Thesameappliestosequence2andtherowsofthematrix.Ratherthanattempttodescribethisfurtherin1,000+words,hereisapictureinstead:0 1 2 3 4 5Sequence Indices A R N C D ESequence Characters0 1 2 3 4 5 6Matrix Indices 0 0-2-4-6-8 -10 -12 0 A 1-2 4-1-2 0-2-11 R 2-4-1 5 0-3-2 02 N 3-6-2 0 6-3 1 03 D 4-8-2-2 1-3 6 24 C 5 -10 0-3-3 9-3-45 Q 6 -12-1 1 0-3 0 26 E 7 -14-1 0 0-4 2 5Hint4:Toforestallquestionsfrompeoplewhoarenotcarefulreaders,Iwillpointoutthatwhilesomepartsoftheonlinebooktalksaboutusinganidentitymatrixtoscorethealignment,otherpartstalkaboutusingaBLOSUM62oneinstead.e.g.,Figure3.20demonstratesanexamplewithBLOSUM62.Thelessonhereis:donotsimplyreadasentenceoutofcontext!Yes,youshouldactuallyreadtheWHOLEthing,ifyouwanttoknowhowitworks.Part1:Createafilehw4.1.cpp.Inthispart,simplyinitializethe“B”matrixandprintitout.Scoringscheme:• Gappenaltyof-2• Match/mismatchscoresfromBLOSUM62Youcangetthelatterfrom:ftp://ftp.ncbi.nih.gov/blast/matrices/BLOSUM62.Notethatyouarenotrequiredtowritecodetoreadthismatrixfromafile–youmayhard-codeitintoyourprogramasamatrixifyouprefer.Exampleoutput:Sequence 1: ARNDCQE Sequence 2: ARNCDE B matrix A R N C D E 0-2-4-6-8 -10 -12A-2 4-1-2 0-2-1R-4-1 5 0-3-2 0N-6-2 0 6-3 1 0D-8-2-2 1-3 6 2C -10 0-3-3 9-3-4Q -12-1 1 0-3 0 2E -14-1 0 0-4 2 5Part2:Createafilehw4.2.cpp.Inthispart,nowcreatethe“A”matrixandprintitout.(RememberthatBEFOREyoudothecalculationstoobtainthefinalvaluesthatthismatrixwillcontain,youwillneedtoinitializethismatrixthesamewaythatyouinitializedthe“B”matrix.e.g.,youcouldsimplycopythe“B”matrixinto“A”,ifthatiseasierforyou.AlthoughIfounditeasiertojustputtheinitializationcodeintoafunction,andthenIcalledittwice,firstwithBandthenwithA.)Hint:exceptforthefirstcolumnandfirstrow,eachmatrixscoreisthemaximumof3possibilities:analignmentoftworesidues,agapinsequence1,oragapinsequence2.Continuingtheexampleoutputfromabove:A matrix A R N C D E 0-2-4-6-8 -10 -12A-2 4 2 0-2-4-6R-4 2 9 7 5 3 1N-6 0 7151311 9D-8-2 513121917C -10-4 311222018Q -12-6 1 9202222E -14-8-1 7182227score = 27(doyouseewhere“score=”comesfrom?hint:bottom-rightcornerofthematrix…) Part3:Createafilehw4.3.cpp.Inthispart,nowcreatethe“D”matrixandprintoutthecontents.AsyouwerecalculatingtheAmatrixinpart2above,youhadtoselectadirectionfromwhichtocalculatethenewsub-totals(Vertical(V),Horizontal(H),orDiagonal(D)–thesearethearrowsshowninthelectureslides/readingmaterial/onlinebookfigures).Keeptrackofthesedirectionsinthis“D”matrix.PrintoutthisDmatrixaspartofyouranswer.IfmorethanonedirectionispossiblethatincludesDiagonal(D),thensimplyselectD.IfmorethanonedirectionispossiblebutDisnotincluded(i.e.,HandVonly),thenarbitrarilyselectV.Notethatyoucaninitializethefirstrow/columntoallH/V.[Optional]:Insteadofpickingarbitrarily,youcouldencodethecombinationsofdifferentdirectionswithadditionalcharacters.I’dsuggestthefollowingscheme,forcompatibilitywithotherpeople’scode:forV&H=>+,forD&V=>y,forD&H=>2,andforV,D&H=>*.Note
thatonereasontodothisisthatifyoudonotdealwithsuchscenarios,thenthecodebecomes
non-deterministic(meaningthatifIreversetheorderofthetwoinputsequences–i.e.,swap
sequence#1and#2–Icangetadifferentanswer,mainlyb/cofarbitrarilypickingV,and
thereforeswappingH&Vwillaffecttheoutcome).

Continuingthesampleoutputfromabove:

D matrix
A R N C D E
0 H H H H H H
A V D H H H H H

R V V D H H H H
N V V V D H H H
D V V V V D D H
C V V V V D H H
Q V V V V V D D
E V V V V V D D

Part4:Createafilehw4.4.cpp.Inthispart,itisnowstraight-forwardtousethe“D”matrixto
generatethealignment.Dothat,andthenprintoutthatalignment.

Continuingthesampleoutputfromabove,thealignmentwouldbe:

ARNDCQE
||| | |
ARN-CDE

Notehowtheverticalbarisonlyshownforidenticalletters,butnotforgapsormismatches.

[OptionalChallenge].Purelyforyourowninterest,youcouldimplementthisNeedleman-
Wunschalgorithmusingrecursion.Thereareanumberofreasonsthatyoumightwanttodo
this–butifyouendupdoingthat,whichdoyoufindthatyouprefer?

Part5:Createaplain-textfilehw4.5.txt.Inthispart,writeafewsentencestobrieflydiscuss
(i.e.,justafewsentenceseach)thefollowingissues:

a) Whatwerethemajorobstacles/hurdlesthatyouhadtoovercomeduringthis
assignment?

a. Personally,wasityourmasteryofthetoolthatyouused(C++),orbasicbiology
(knowingwhatan‘aminoacid’is),orbioinformatics(knowingwhatan
‘alignment’is),orjustfindingtimetodothehomework?Thoseareallhonest
andvalidresponsestothisquestion.

b. WhatIamparticularlyinterestedin,though,is:whatwerethemajorengineering
challengesthathadtobeovercometosolveproblemsofthistype?

b) NowthatyouhavecodethatimplementstheNeedleman-Wunschalgorithm,what
wouldbethenextstep(s)forsomeonetotake?(don’tworry,itdoesn’thavetobeyou,

atleastnotforthisclass:-)

a. Ifyouweregoingtomakeimprovementstoyourcode,whataresomeideasthat
youcouldstartoffwith,andwhymightyouwanttodothoseimprovements?

(Igaveyousomeideasabove,likeaddingsymbolsthatgobeyondjustH/V/D,
whichwouldallowmoreoptionstobeexploredinsteadofbeingcompletely
arbitrary;andanotherideawouldbetoreadintheBLOSUM62matrixfromafile
insteadofhard-codingitin,b/cthatwouldallowyoutouseother,alternative
matricesinthefuture,suchasBLOSUM45,orPAM250,etc.ButsinceIgaveyou
theseideas,pleasepickotherones–notethatrecursionisfinetoinclude,as
longasyoustatewhyitwouldbeuseful,butinthatcasepleasealsoincludeat
leastonemoreideaaswell)

b. Goingbeyondjustthiscode,whataresomemoregeneralimprovementsthat
youcouldmake,andwhywouldyouwanttodothem?

Hint:themainanswerthatIamlookingforhereisthenameofanalgorithmthat
isbetterthanNeedleman-Wunsch…canyouthinkofany?

c) PretendthatIamaworld-classexpertatthesubjectofbioinformatics(e.g.,Iknowall
aboutalignments,andalgorithms,andscoring,etc.),butyetsomehowIknewnothing
abouttheNeedleman-Wunschalgorithm(asifthatwerepossible!MaybeI’mfroman
alternateuniverseorsomething:-).DescribeformewhatNeedleman-Wunschisin
general,whatitdoesinparticular,andwhatitcandoforme.Like,whenshouldIuseit?
WhenwouldIbebetteroffnottouseit,andshouldusesomethingelseinstead?
Notably,whatarethekeyconceptsthatdistinguishitfromallotherbioinformatics
algorithms?

GRADINGRubric(100pointstotal):

15pointsforworkingcode,part1.
15pointsforworkingcode,part2.
30pointsforworkingcode,part3.
40pointsforworkingcode,part4.
5pointsforanswers,part5.

NOTE:Ifyourprogramdoesnotcompileyouwillreceiveazeroonthatpartofyourhomework.
Inaddition,latehomeworkwillnotbeaccepted.�DONOTWORKTOGETHER!Studentscaught
workingtogetheronthisoranyassignmentwilldropawholelettergradeforthecourse!�

% Code to start part HW4 from.
function HW4_0

% A trial sequence with all amino acids:
% ARNDCQEGHILKMFPSTWYV

% The trial sequences in the HW description:
% ARNDCQE
% ARNCDE

% Ask the user for protein sequence 1.
prompt = ‘Enter the first amino acid sequence of length 1 to 25: ‘;
seq1 = input(prompt, ‘s’);
len1 = length(seq1);
fprintf(1, ‘Seq 1 %s has %d residues
’, seq1, len1);

% Ask the user for protein sequence 2.
prompt = ‘Enter the second amino acid sequence of length 1 to 25: ‘;
seq2 = input(prompt, ‘s’);
len2 = length(seq2);
fprintf(1, ‘Seq 2 %s has %d residues
’, seq2, len2);

% Initialize a 25×25 matrix of zeros.
A = zeros(25,25);

% The supplied initMatrix must be modified!
A = initMatrix(A, seq1, seq2);

% Print the matrix A.
printMatrix(A, seq1, seq2);

end

% Print out a matrix
function mat = initMatrix(mat, seq1, seq2)
numRows = length(seq1) + 1;
numCols = length(seq2) + 1;

mat(1,1) = 0; % Init the value at (1,1) to 0.

% Initialize the first column.
for i=2:numRows
% Do something here!
end

% Initialize the first row.
for c=2:numCols
% Do something here!
end

% Init the rest of the matrix.
for r=2:numRows
for c=2:numCols
% Do something here!
end
end

end

% Print out a matrix
function printMatrix(mat, seq1, seq2)
len1 = length(seq1);
len2 = length(seq2);

% Print the first row of sequence letters
fprintf(1,’tt’);
for r=1:len2
fprintf(1,’%2ct’, seq2(r));
end
fprintf(1,’
’);

% Print the first row of scores.
fprintf(1,’t’);
for c=1:len2+1
fprintf(‘%2dt’, mat(1,c));
end
fprintf(1,’
’);

% Print the rest of the scores.
for r=2:len1+1
% Print the first letter of vertical sequence
fprintf(1, ‘%ct’, seq1(r-1));
for c=1:len2+1
fprintf(1,’%2dt’,mat(r,c));
end
fprintf(1,’
’);
end
end

function value = scorePair(char1, char2)
% Order: ARNDCQEGHILKMFPSTWYV
i1 = aaToInt(char1);
i2 = aaToInt(char2);
A = [
4 -1 -2 -20 -1 -10 -2 -1 -1 -1 -1 -2 -110 -3 -20;
-150 -2 -310 -20 -3 -22 -1 -3 -2 -1 -1 -3 -2 -3;
-2061 -30001 -3 -30 -2 -3 -210 -4 -2 -3;
-2 -216 -302 -1 -1 -3 -4 -1 -3 -3 -10 -1 -4 -3 -3;
0 -3 -3 -39 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1;
-1100 -352 -20 -3 -210 -3 -10 -1 -2 -1 -2;
-1002 -425 -20 -3 -31 -2 -3 -10 -1 -3 -2 -2;
0 -20 -1 -3 -2 -26 -2 -4 -4 -2 -3 -3 -20 -2 -2 -3 -3;
-201 -1 -300 -28 -3 -3 -1 -2 -1 -2 -1 -2 -22 -3;
-1 -3 -3 -3 -1 -3 -3 -4 -342 -310 -3 -2 -1 -3 -13;
-1 -2 -3 -4 -1 -2 -3 -4 -324 -220 -3 -2 -1 -2 -11;
-120 -1 -311 -2 -1 -3 -25 -1 -3 -10 -1 -3 -2 -2;
-1 -1 -2 -3 -10 -2 -3 -212 -150 -2 -1 -1 -1 -11;
-2 -3 -3 -3 -2 -3 -3 -3 -100 -306 -4 -2 -213 -1;
-1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -47 -1 -1 -4 -3 -2;
1 -110 -1000 -1 -2 -20 -1 -2 -141 -3 -2 -2;
0 -10 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -115 -2 -20;
-3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -11 -4 -3 -2 112 -3;
-2 -2 -2 -3 -2 -1 -2 -32 -1 -1 -2 -13 -3 -2 -227 -1;

0 -3 -3 -3 -1 -2 -2 -3 -331 -21 -1 -2 -20 -3 -14];
value = A(i1, i2);
end

function value = aaToInt(aa)
switch aa
case ‘A’,
value = 1;
case ‘R’,
value = 2;
case ‘N’,
value = 3;
case ‘D’,
value = 4;
case ‘C’,
value = 5;
case ‘Q’,
value = 6;
case ‘E’,
value = 7;
case ‘G’,
value = 8;
case ‘H’,
value = 9;
case ‘I’,
value = 10;
case ‘L’,
value = 11;
case ‘K’,
value = 12;
case ‘M’,
value = 13;
case ‘F’,
value = 14;
case ‘P’,
value = 15;
case ‘S’,
value = 16;
case ‘T’,
value = 17;
case ‘W’,
value = 18;
case ‘Y’,
value = 19;
case ‘V’,
value = 20;
otherwise,
value = 1;
end
end

Reviews

There are no reviews yet.

Only logged in customers who have purchased this product may leave a review.

Shopping Cart
[SOLVED] 程序代写代做代考 Bioinformatics c++ matlab scheme algorithm HW4
30 $