US Population by Ethnicity Visualization

US Census 2011 (ACS) – choroplethr

As a statistician, I’ve always had a soft spot in my heart for the US Census. I love the rich data sets that are made publicly available and I’ve often experimented with visualizing the results. A couple of months ago, Ari Lamstein (a data scientist at Trulia) released the choroplethr package on CRAN (a repository for R packages). I pulled it up a couple of days ago and found it to be simple and intuitive. Only a couple of simple commands are required to build plots like this:

[Image: USPop]

1) Go to http://www.census.gov/developers/tos/key_request.html to get an ACS API key.
2) Visit http://factfinder2.census.gov/faces/affhelp/jsf/pages/metadata.xhtml?lang=en&type=survey&id=survey.en.ACS_ACS to find the ACS table ID for the attribute that you want to explore.
3) Open R, install the choroplethr package, and register your API key using the api.key.install() command.
4) Explore away!

I started by looking at the US population split by ethnicity.
[Images: USPopWhite, USPopBlack, USPopAsian]

We can see very clearly the heavier concentrations of African-Americans in the Southeastern states, along the Eastern seaboard, and in Southern California. Asian-American population centers are concentrated on the West Coast and in the Northeast.

The R code is shown below:

###### Settings
library(choroplethr)
library(acs)
library(ggplot2)
 
###### API key
# Request an API key at http://www.census.gov/developers/tos/key_request.html, then paste it below
api.key.install("###############")
 
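###### Basic ACS Table IDs
# B19301 = Per Capita Income
# B01003 = Total Population
# B02008 = White alone or in combination with one or more other races
# B02009 = Black or African American alone or in combination with one or more other races
# B02011 = Asian alone or in combination with one or more other races
# B03001 = Hispanic or Latino Origin by Specific Origin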
 
###### Plotting
## Per Capita Income by State
# Default state map
choroplethr_acs(tableId="B19301",lod="state")
# Same map without the state labels
choroplethr_acs(tableId="B19301",lod="state",showLabels=FALSE)
# Set the number of color buckets explicitly
choroplethr_acs(tableId="B19301",lod="state",showLabels=FALSE,num_buckets=9)
# Add a title with ggplot2's labs()
choroplethr_acs(tableId="B19301",lod="state",showLabels=FALSE,num_buckets=9)+labs(title="US 2011 Per Capita Income by State")
 
## Per Capita Income by County
choroplethr_acs(tableId="B19301",lod="county")
choroplethr_acs(tableId="B19301",lod="county",num_buckets=9,states=c("CA"))
 
## Population by County by Ethnicity
choroplethr_acs(tableId="B01003",lod="county")+labs(title="Total US Population by County (2011)")
choroplethr_acs(tableId="B02008",lod="county")+labs(title="US Population by County (2011) - White")
choroplethr_acs(tableId="B02009",lod="county")+labs(title="US Population by County (2011) - Black")
choroplethr_acs(tableId="B02011",lod="county")+labs(title="US Population by County (2011) - Asian")
choroplethr_acs(tableId="B03001",lod="county")+labs(title="US Population by County (2011) - Hispanic")
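
To get a feel for what the num_buckets argument does to the map colors, here is a minimal base R sketch. It assumes that the package sorts the values into roughly equal-sized groups (quantile bins); that is my reading of the argument, not something taken from the package source. The income figures and the bucket_values() helper are made up for illustration and are not Census data or part of choroplethr.

```r
# Hypothetical per-capita income values for ten made-up regions (not Census data).
income <- c(18500, 21000, 22750, 24100, 25600, 27300, 29800, 31200, 35400, 41900)

# Illustrative helper (not part of choroplethr): split the values at evenly
# spaced quantiles so each bucket holds roughly the same number of regions.
bucket_values <- function(x, num_buckets) {
  probs  <- seq(0, 1, length.out = num_buckets + 1)
  breaks <- unique(quantile(x, probs = probs, na.rm = TRUE))
  # include.lowest = TRUE keeps the minimum value inside the first bucket
  cut(x, breaks = breaks, include.lowest = TRUE, dig.lab = 6)
}

buckets <- bucket_values(income, num_buckets = 5)

# Count how many regions fall into each bucket.
print(table(buckets))
```

With these ten values and five buckets, each bucket ends up holding two regions. On a map, each bucket would get its own color, so raising num_buckets gives finer color steps at the cost of a busier legend.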

 

Data Scientist/Statistician Job Market

I returned to the US from Shanghai almost a year ago and I’ve found the two data scientist job markets to be quite different. In general, the employment atmosphere for qualified data workers in the US is much friendlier than in China. The “Big Data” wave has hit the US (particularly the Bay Area) hard, and demand for people who know how to pull/extract/transform/analyze/visualize data has skyrocketed. China’s economy is substantially more focused on production, industrial productivity, and quality control and, consequently, the demand for data scientists is lower.

The Bay Area in particular currently suffers from an imbalance between open data scientist positions and the qualified workers available to fill them. Recruiters and friends alike contact me (and likely every other data scientist in the Bay Area) almost every day with data positions at tech companies large and small. I’m quite happy at my current job, so I’ve been passing these leads along to qualified friends graduating from school, but I also wanted to share a couple of resources that I think might be useful for budding data scientists looking for work:

1) LinkedIn – this seems to be recruiters’ primary platform. Everybody already knows this…
2) Burtch Works – these recruiters focus on the analytics market. Most positions here seem to be located in the Midwest. They are a little too SAS-oriented/marketing-oriented for my taste, but it is a valuable resource nonetheless.
3) Analytic Recruiting – these recruiters seem to have a wide geographic range. There is a dedicated section for Wall St. quants if that’s your thing.
4) DataJobs – a relative newcomer on the scene. I don’t know much about them, but they have a large variety of data science listings.
5) R Jobs – this listing site just started out of R-bloggers and only has listings focused on R.
6) Friends – I think this is always the best way.