Homework 2
Debating with Quanteda
In the second problem set, you will study the US presidential debates
using the quanteda package. Since R is collaborative software, you will
start with a few snippets of code, written by last quarter's QTA
students, that clean the debate fragments.
1) Install the quanteda package if you have not done so.
[A quick introduction to quanteda is provided here:
https://cran.r-project.org/web/packages/quanteda/vignettes/quickstart.html]
2) Use the starting code to read the presidential debate object.
3) A text fragment in the DEBATES corpus object is a response to a
question or a rebuttal and can be thought of as a self-contained
document. Explore to what extent Heaps' law applies for Trump vs
Clinton. Is it stronger or weaker for either speaker? (tip: see the
code used in class).
4) Analyze the evolution of lexical diversity across the candidates
before and after they became their parties' respective candidates.
[tip: the R script provided has a primary indicator variable for
identifying the fragments that came from non-primary debates]
5) After exploring 4), do you have a hypothesis why patterns may be
more or less pronounced between Trump vs Clinton? How could you test
this?
6) Remove stopwords from the corpus.
7) Using the tokenize function, construct separate bigrams for the
Hillary Clinton / Donald Trump parts of the corpus. Tabulate the ten
most frequent bigrams by speaker. Are these informative? Why or why
not?
8) Using the collocation function in quanteda (which takes a tokenize
object as argument), construct collocations based on the Chi2 test for
each speaker. Order by the Chi2 test statistic. What do you notice, or
what is strange? Can you provide a formal reasoning relating to the
Chi2 test statistic formula?
9) Using the code provided in the lectures, identify collocations that
are distinct to Trump vs Clinton. Based on your perception of each
candidate, do the results of this analysis make sense? Provide a brief
answer where you summarize the most important results and your
explanations.
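As a reminder for item 3, Heaps' law relates vocabulary size to text length. The symbols below are standard textbook notation rather than anything defined in this handout: $N$ is the number of tokens read so far, $V(N)$ the number of distinct types among them, and $K$, $\beta$ are corpus-specific constants.

```latex
V(N) = K \, N^{\beta}, \qquad 0 < \beta < 1
\quad\Longleftrightarrow\quad
\log V(N) = \log K + \beta \log N
```

The log form is what makes the law easy to check: if it holds, a plot of $\log V$ against $\log N$ is roughly a straight line with slope $\beta$.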
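For item 3, one possible way to check Heaps' law is to regress log vocabulary size on log token count. The sketch below uses base R only and three made-up fragments; the whitespace tokenizer and the names `fragments`, `K_hat`, and `beta_hat` are assumptions for illustration, not the code used in class.

```r
# Toy fragments standing in for one speaker's debate texts (hypothetical data).
fragments <- c(
  "we will make the economy great again",
  "the economy needs jobs and we need trade deals",
  "trade deals must protect workers and jobs"
)

# Crude whitespace tokenizer on lower-cased text.
tokens_list <- strsplit(tolower(fragments), "\\s+")
all_tokens <- unlist(tokens_list)

# N = tokens seen after each fragment, V = distinct types seen so far.
N <- cumsum(lengths(tokens_list))
V <- sapply(N, function(n) length(unique(all_tokens[seq_len(n)])))

# V = K * N^beta is linear on the log-log scale, so fit it with lm().
fit <- lm(log(V) ~ log(N))
K_hat <- exp(coef(fit)[["(Intercept)"]])
beta_hat <- coef(fit)[["log(N)"]]
```

Running the same fit separately on each speaker's fragments gives two slopes to compare. With real data you would want many more points than the three used here.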
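For item 4, a simple lexical diversity measure is the type-token ratio (TTR), the number of distinct types divided by the number of tokens. The sketch below is a base R illustration on invented data; the column names `speaker`, `primary`, and `text` are hypothetical and may not match the variables in the provided R script.

```r
# Hypothetical data frame: one row per debate fragment.
debates <- data.frame(
  speaker = c("A", "A", "B", "B"),
  primary = c(TRUE, FALSE, TRUE, FALSE),
  text = c("jobs jobs jobs and trade",
           "we need trade and jobs and growth",
           "a plan for every family",
           "a plan a plan a plan"),
  stringsAsFactors = FALSE
)

# Type-token ratio of a single text, using a crude whitespace tokenizer.
ttr <- function(text) {
  toks <- unlist(strsplit(tolower(text), "\\s+"))
  length(unique(toks)) / length(toks)
}

debates$ttr <- vapply(debates$text, ttr, numeric(1), USE.NAMES = FALSE)

# Mean TTR for each speaker, split by primary vs non-primary fragments.
ttr_by_group <- aggregate(ttr ~ speaker + primary, data = debates, FUN = mean)
```

Raw TTR tends to fall as texts get longer, so comparisons across fragments of very different lengths should be read with care.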
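For item 7, the idea of building and tabulating bigrams can be sketched in base R as follows. This does not use quanteda's tokenize function; the helper `bigrams` and the example texts are hypothetical and only show what the resulting table looks like.

```r
# Adjacent word pairs from one text, using a crude whitespace tokenizer.
bigrams <- function(text) {
  toks <- unlist(strsplit(tolower(text), "\\s+"))
  if (length(toks) < 2) return(character(0))
  paste(head(toks, -1), tail(toks, -1))
}

# Made-up fragments standing in for one speaker's part of the corpus.
texts <- c("we need more jobs", "we need fair trade")

# Count every bigram across the fragments and keep the ten most frequent.
counts <- sort(table(unlist(lapply(texts, bigrams))), decreasing = TRUE)
top10 <- head(counts, 10)
```

Applying the same steps to each speaker's fragments separately gives the per-speaker tables the question asks for.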
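For the formal reasoning asked for in item 8, it may help to have the chi-squared statistic for a bigram written out. The notation is the usual textbook one and is an assumption here, since the handout does not state which variant quanteda computes: for a candidate bigram $w_1 w_2$, $O_{11}$ counts bigrams where both words match, $O_{12}$ and $O_{21}$ where exactly one position matches, $O_{22}$ where neither does, and $N$ is the total number of bigrams.

```latex
\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{2} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{\left(\sum_{k} O_{ik}\right)\left(\sum_{k} O_{kj}\right)}{N}

% Equivalent closed form for a 2x2 table:
\chi^2 = \frac{N \,(O_{11} O_{22} - O_{12} O_{21})^2}
              {(O_{11} + O_{12})(O_{11} + O_{21})(O_{12} + O_{22})(O_{21} + O_{22})}
```

Looking at how each term behaves when the expected counts $E_{ij}$ are very small is a reasonable starting point.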
Send your PDF version of the Markdown document to
Please use the following naming convention: HW1_Surname_FirstName.pdf
Deadline: by Friday, 28th April 2017.