append_census() functionality #13

@lecy

Census data consists of the geo-level crosswalk and base harmonized census data tables.

The append_census() function would take the geographic level as its argument and then operate as follows:

Aggregate the census table up to the requested level. For example, if the requested unit is MSA then population would be aggregated up from tracts:

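library( dplyr )

# sum tract-level counts within each MSA; add other count fields as needed
# (grouping column is named msaID here so it matches the merge key used below)
d.msa <- 
  d.tract %>% 
  group_by( msaID ) %>%
  summarize( pop=sum(pop), unemployed=sum(unemployed) )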

Data tables are in nccsdata/geo/data.

Then the aggregated data is merged with the nonprofit data. The tractID field should be in every dataset, so add the MSA field and merge. Something like:

ids <- tractx %>% select( tractID, msaID )
core <- merge( core, ids, by="tractID", all.x=TRUE )
core <- merge( core, d.msa, by="msaID", all.x=TRUE )
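Putting the two steps together, here is a minimal sketch of what append_census() might look like. The function signature, the toy tables, and the column names other than tractID and msaID are assumptions for illustration, not the package's actual API. It is written in base R so that it runs without extra packages.

```r
# Hypothetical sketch: aggregate tract-level counts to a higher geography
# and append them to a nonprofit table. All names are illustrative.
append_census <- function( core, census, crosswalk, geo.id, count.vars ) {
  # one row per tract: tractID plus the higher-level ID (e.g., msaID)
  ids <- unique( crosswalk[ , c( "tractID", geo.id ) ] )
  # attach the higher-level ID to each tract in the census table
  census <- merge( census, ids, by="tractID" )
  # sum the count variables within each higher-level unit
  agg <- aggregate( census[ count.vars ], by=census[ geo.id ], FUN=sum )
  # attach the higher-level ID to the nonprofit records, then the aggregates
  core <- merge( core, ids, by="tractID", all.x=TRUE )
  core <- merge( core, agg, by=geo.id, all.x=TRUE )
  return( core )
}

# Toy inputs (made-up values)
tractx  <- data.frame( tractID=c("t1","t2","t3"), msaID=c("m1","m1","m2") )
d.tract <- data.frame( tractID=c("t1","t2","t3"),
                       pop=c(100,200,50), unemployed=c(10,30,5) )
core    <- data.frame( ein=c("A","B","C"), tractID=c("t1","t3","t9") )

out <- append_census( core, d.tract, tractx,
                      geo.id="msaID", count.vars=c("pop","unemployed") )
```

In this toy example the nonprofit in tract t9 has no match in the crosswalk, so it is kept (all.x=TRUE) with missing census values rather than dropped.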

The big caveat is that Census variables have to be counts for the aggregation process to work exactly. Weighted averages work as well for fields that are means or rates, as long as the weights are the matching counts. Fields like median income, however, are more challenging because the weighted average of tract-level median incomes is not mathematically equivalent to the median income of the full population at the higher level of aggregation. It is probably good enough for most cases, but we need to provide some documentation.
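A small illustration of the difference, using made-up incomes for two hypothetical tracts in one MSA. The variable names and values are assumptions chosen only to show the arithmetic.

```r
# Individual incomes (in thousands) for two hypothetical tracts
inc.a <- c( 10, 20, 90 )
inc.b <- c( 30, 40, 50 )
n     <- c( length( inc.a ), length( inc.b ) )

# Means: the count-weighted average of tract means equals the pooled mean
wtd.mean    <- weighted.mean( c( mean( inc.a ), mean( inc.b ) ), w=n )
pooled.mean <- mean( c( inc.a, inc.b ) )

# Medians: the count-weighted average of tract medians does not
# generally equal the pooled median
wtd.median    <- weighted.mean( c( median( inc.a ), median( inc.b ) ), w=n )
pooled.median <- median( c( inc.a, inc.b ) )
```

Here wtd.mean and pooled.mean are both 40, while wtd.median is 30 and pooled.median is 35, so the aggregated median is only an approximation.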

Income inequality metrics (Gini coefficients) are another example where aggregation is imperfect.
