[Solved] DATA51100 Assignment 7: Aggregating ACS PUMS Data

Aggregating ACS PUMS Data

For this assignment, you will work again with the same ACS PUMS dataset as in Assignment 6 to produce several tables that aggregate the data.

Note: For this assignment, you are free to program in any programming language of your choice (Python, R, Matlab, C++, Java, etc.). Please indicate the programming language you will use.

Requirements

You are to create a program in Python that performs the following using the pandas package:

  1. Load the ss13hil.csv file that contains the PUMS dataset (assume it is in the current directory) and create a DataFrame object from it if you use Python.
  2. Create 3 tables:
    TABLE 1: Statistics of HINCP Household income (past 12 months), grouped by HHT Household/family type
    • Table should use the HHT types (text descriptions) as the index
    • Columns should be: mean, std, count, min, max
    • Rows should be sorted by the mean column value in descending order
    TABLE 2: HHL Household language vs. ACCESS Access to the Internet (Frequency Table)
    • Table should use the HHL types (text descriptions) as the index
    • Columns should be the text descriptions of ACCESS values
    • Each table entry is the sum of the WGTP column for the given HHL/ACCESS combination, divided by the sum of WGTP values in the data. Entries need to be formatted as percentages.
    • Any rows containing NA values in the HHL, ACCESS, or WGTP columns should be excluded.
    TABLE 3: Quantile Analysis of HINCP Household income (past 12 months)
    • Rows should correspond to different quantiles of HINCP: low (0-1/3), medium (1/3-2/3), high (2/3-1)
    • Columns displayed should be: min, max, mean, household_count
    • The household_count column contains entries with the sum of WGTP values for the corresponding range of HINCP values (low, medium, or high)
  3. Display the tables to the screen as shown in the sample output on the last page.
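The Table 1 requirements above can be sketched roughly as follows. This is a minimal illustration, not a reference solution: the toy rows stand in for ss13hil.csv, and the HHT code numbers in the mapping are an assumption taken from memory of the PUMS data dictionary, so they should be checked against it. Only three of the seven HHT labels from the sample output are included.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of ss13hil.csv; the real data would
# come from pd.read_csv("ss13hil.csv").
households = pd.DataFrame({
    "HHT": [1, 1, 2, 3, 3, None],
    "HINCP": [90000, 110000, 45000, 30000, 50000, None],
})

# Partial code-to-text mapping. The labels are copied from the sample output;
# the numeric codes are ASSUMED and should be verified in the data dictionary.
HHT_LABELS = {
    1: "Married couple household",
    2: "Other family household:Male householder, no wife present",
    3: "Other family household:Female householder, no husband present",
}

# Drop rows without a household type or income, then swap codes for labels.
labelled = households.dropna(subset=["HHT", "HINCP"]).copy()
labelled["HHT"] = labelled["HHT"].astype(int).map(HHT_LABELS)

# One row per household type, five statistics, highest mean first.
table1 = (
    labelled.groupby("HHT")["HINCP"]
    .agg(["mean", "std", "count", "min", "max"])
    .sort_values("mean", ascending=False)
)
print(table1)
```

Passing a list of statistic names to agg produces the columns in the order required by the assignment, so no reordering step is needed afterwards.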

Additional Requirements

  1. The name of your source code file should be tables.py for Python (or tables.m for Matlab, tables.java for Java, tables.r for R, etc.). All your code should be within a single file.
  2. You need to use the pandas DataFrame object for storing and manipulating data (if you use Python).
  3. Your code should follow good coding practices, including good use of whitespace and use of both inline and block comments.
  4. You need to use meaningful identifier names that conform to standard naming conventions.
  5. At the top of each file, you need to put in a block comment with the following information: your name, date, course name, semester, and assignment name.
  6. The output should exactly match the sample output shown on the last page.

What to Turn In

You will need to turn in the single tables.py file (or tables.m for Matlab, tables.java for Java, tables.r for R, etc.) as well as a screenshot of the created tables using BlackBoard.

HINTS

  • To get the right output, use the following functions to set pandas display parameters in Python: pd.set_option('display.max_columns', 500) and pd.set_option('display.width', 1000)
  • To display entries as percentages, use the applymap method, giving it a string conversion function as input. The string conversion function should take a float value v as an input and output a string representing v as a percentage. To do this, you can use the format() method.
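The two hints can be combined with the Table 2 requirements in a short sketch. The toy rows, the helper name as_percent, and the choice of pd.crosstab are illustrative assumptions rather than the expected solution. The denominator here is the WGTP total after NA rows are dropped, which is consistent with the 100.00% grand total in the sample output. Because applymap is deprecated in recent pandas releases (2.1 and later) in favor of DataFrame.map, the sketch formats each column with Series.map, which behaves the same way across versions.

```python
import pandas as pd

# Display settings from the hint; note that the option names are strings.
pd.set_option("display.max_columns", 500)
pd.set_option("display.width", 1000)

# Hypothetical toy rows standing in for the HHL, ACCESS and WGTP columns,
# with the codes already replaced by their text descriptions.
households = pd.DataFrame({
    "HHL": ["English only", "English only", "Spanish", "Spanish", None],
    "ACCESS": ["Yes w/ Subsrc.", "No", "Yes w/ Subsrc.", "No", "No"],
    "WGTP": [60, 20, 15, 5, 10],
})

# Exclude rows with NA in any of the three columns used by the table.
clean = households.dropna(subset=["HHL", "ACCESS", "WGTP"])

# Sum WGTP for each HHL/ACCESS cell and add "All" row and column totals.
weighted = pd.crosstab(
    clean["HHL"], clean["ACCESS"],
    values=clean["WGTP"], aggfunc="sum",
    margins=True, margins_name="All",
)
shares = weighted / clean["WGTP"].sum()


def as_percent(value):
    """Format a fraction such as 0.0783 as the string '7.83%'."""
    return "{:.2%}".format(value)


# Element-wise string conversion, one column at a time.
table2 = shares.apply(lambda column: column.map(as_percent))
print(table2)
```

With the toy data, the English only / Yes w/ Subsrc. cell is 60 out of a cleaned total of 100, so it prints as 60.00%.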

Sample Program Output

DATA-51100, [semester] [year]
NAME: [put your name here]
PROGRAMMING ASSIGNMENT #7

*** Table 1 Descriptive Statistics of HINCP, grouped by HHT ***
                                                                         mean            std  count     min      max
HHT Household/family type
Married couple household                                        106790.565562  100888.917804  25495   -5100  1425000
Nonfamily household:Male householder:Not living alone            79659.567376   74734.380152   1410       0   625000
Nonfamily household:Female householder:Not living alone          69055.725901   63871.751863   1193       0   645000
Other family household:Male householder, no wife present         64023.122122   59398.970193   1998       0   610000
Other family household:Female householder, no husband present    49638.428821   48004.399101   5718   -5100
Nonfamily household:Male householder:Living alone                48545.356298   60659.516163   5835   -5100
Nonfamily household:Female householder:Living alone              37282.245015   44385.091076   8024  -11200

*** Table 2 HHL vs. ACCESS Frequency Table ***
                                               sum
                                              WGTP
ACCESS                              Yes w/ Subsrc.  Yes, wo/ Subsrc.      No      All
HHL Household language
English only                                58.71%             2.93%  16.87%   78.51%
Spanish                                      7.83%             0.52%   2.60%   10.95%
Other Indo-European languages                5.11%             0.18%   1.19%    6.48%
Asian and Pacific Island languages           2.73%             0.06%   0.28%    3.08%
Other language                               0.80%             0.03%   0.14%    0.97%
All                                         75.19%             3.73%  21.08%  100.00%

*** Table 3 Quantile Analysis of HINCP Household income (past 12 months) ***
           min      max           mean  household_count
HINCP
low     -11200    37200   19599.486904          1629499
medium   37210    81500   57613.846298          1575481
high     81530  1425000  159047.588900          1578445
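
The layout of Table 3 in the sample output can be approximated with the sketch below. It assumes that the three bands come from an unweighted three-way split of HINCP using pd.qcut, with WGTP used only for household_count. That is one reading of the requirements, and the toy values are hypothetical.

```python
import pandas as pd

# Hypothetical toy rows standing in for the HINCP and WGTP columns.
households = pd.DataFrame({
    "HINCP": [10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000],
    "WGTP": [10, 20, 30, 40, 50, 60, 70, 80, 90],
}).dropna(subset=["HINCP"])

# qcut assigns each row to one of three equal-sized bins by HINCP value.
# The Series is renamed so that it does not share a name with the column.
income_band = pd.qcut(
    households["HINCP"], 3, labels=["low", "medium", "high"]
).rename("income_band")

# Named aggregation: each output column is given as (source column, function).
table3 = households.groupby(income_band, observed=False).agg(
    min=("HINCP", "min"),
    max=("HINCP", "max"),
    mean=("HINCP", "mean"),
    household_count=("WGTP", "sum"),
)
print(table3)
```

In this sketch the index is named income_band; renaming it to HINCP afterwards (for example with rename_axis) would bring the printed header closer to the sample output.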
