R Error: <index> contains invalid column names

I am following this tutorial here https://rpubs.com/brandonkopp/creating-a-treemap-in-r on data visualization.

Since the dataset used for this tutorial is no longer available, I learned how to obtain this dataset in the last question I asked (Finding a Dataset That Is No Longer Available?):

outlays <- read.csv('https://web.archive.org/web/20160705194932if_/https://www.whitehouse.gov/sites/default/files/omb/budget/fy2017/assets/outlays.csv')
str(outlays)

From here, I am now trying to follow the rest of the tutorial:

library(xtable)
library(dplyr)
library(treemap)

#Remove commas from numeric values in number columns
outlays$X2017 <- gsub(",","", outlays$X2017)

#Convert numeric columns to a numeric data type
outlays$X2017 <- as.numeric(outlays$X2017)

#Create new data frame for positive (spending) values
spending <- outlays %>% select(Agency.Name, Bureau.Name, Account.Name, X2017) %>%
  group_by(Agency.Name, Bureau.Name, Account.Name) %>%
  summarize(X2017 = sum(X2017, na.rm=FALSE)) %>%
  filter(X2017 > 0)

#Create new data frame for negative (income) values
receipts <- outlays %>% select(Agency.Name, Bureau.Name, Account.Name, X2017) %>%
  group_by(Agency.Name, Bureau.Name, Account.Name) %>%
  summarize(X2017 = sum(X2017, na.rm=FALSE)) %>%
  filter(X2017 < 0) %>%
  mutate(X2017 = abs(X2017))

I get an error when I try to run the next part of code:

treemap(spending, #Your data frame object
        index=c("Agency.Name","Bureau.Name","Account.Name"),  #A list of your categorical variables
        vSize = "X2017",  #This is your quantitative variable
        type="index", #Type sets the organization and color scheme of your treemap
        palette = "Reds",  #Select your color palette from the RColorBrewer presets or make your own.
        title="Spending in President Obama's 2017 Budget", #Customize your title
        fontsize.title = 14 #Change the font size of the title
)

This is the error I get:

Error in treemap(spending, index = c("Agency.Name", "Bureau.Name", "Account.Name"),  : 
  <index> contains invalid column names

It says there are "invalid" column names – but is there a way to figure out which of these column names are invalid? Does anyone have any ideas how I can resolve this problem and continue with the visualizations?

Thank you!

>Solution :

Running this code:

outlays <- read.csv('https://web.archive.org/web/20160705194932if_/https://www.whitehouse.gov/sites/default/files/omb/budget/fy2017/assets/outlays.csv')
str(outlays)

library(xtable)
library(dplyr)
library(treemap)

#Remove commas from numeric values in number columns
outlays$X2017 <- gsub(",","", outlays$X2017)

#Convert numeric columns to a numeric data type
outlays$X2017 <- as.numeric(outlays$X2017)

#Create new data frame for positive (spending) values
spending <- outlays %>% select(Agency.Name, Bureau.Name, Account.Name, X2017) %>%
  group_by(Agency.Name, Bureau.Name, Account.Name) %>%
  summarize(X2017 = sum(X2017, na.rm=FALSE)) %>%
  filter(X2017 > 0)

#Create new data frame for negative (income) values
receipts <- outlays %>% select(Agency.Name, Bureau.Name, Account.Name, X2017) %>%
  group_by(Agency.Name, Bureau.Name, Account.Name) %>%
  summarize(X2017 = sum(X2017, na.rm=FALSE)) %>%
  filter(X2017 < 0) %>%
  mutate(X2017 = abs(X2017))

treemap(spending, #Your data frame object
        index=c("Agency.Name","Bureau.Name","Account.Name"),  #A list of your categorical variables
        vSize = "X2017",  #This is your quantitative variable
        type="index", #Type sets the organization and color scheme of your treemap
        palette = "Reds",  #Select your color palette from the RColorBrewer presets or make your own.
        title="Spending in President Obama's 2017 Budget", #Customize your title
        fontsize.title = 14 #Change the font size of the title
)

Produces the following output:

treemap output

This looks like the expected output, so the code as posted runs without the error.
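If the error still appears on your machine, one way to see which names treemap might be rejecting is to compare the index vector against the columns that actually exist in the data frame. The snippet below is only a sketch: spending_demo is a made-up stand-in for spending, and index_cols and missing_cols are illustrative names that are not part of the tutorial.

```r
# Toy stand-in for `spending`; the real data frame comes from the dplyr pipeline above.
spending_demo <- data.frame(
  Agency.Name  = c("A", "A"),
  Bureau.Name  = c("B1", "B2"),
  Account.Name = c("X", "Y"),
  X2017        = c(10, 20)
)

# The names passed to treemap() as `index`
index_cols <- c("Agency.Name", "Bureau.Name", "Account.Name")

# Names requested in `index` that are absent from the data frame
missing_cols <- setdiff(index_cols, names(spending_demo))
print(missing_cols)  # an empty character vector means every index column exists

# A name spelled with a space instead of a dot would be reported as missing
misspelled <- setdiff(c(index_cols, "Agency Name"), names(spending_demo))
print(misspelled)
```

On the real data, names(spending) shows the column names as R sees them. Note that read.csv by default rewrites headers such as "Agency Name" to "Agency.Name", so the names in the index vector have to use the dotted form.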
