This vignette shows how to download the raw Form 990 data needed to generate factor scores.
Identify relevant 990 efiler variables
We first list all questions in the 990 series that are needed to generate the governance scores.
See the other vignette for how we selected these questions.
See the NCCS website for a comprehensive list of all 990 efiler variable names.
Part/Schedule | Variable Name | 990 Item Name | Question asked on 990 |
---|---|---|---|
Part 4 | F9_04_AFS_IND_X | F990-PC-PART-04-LINE-12A | Independent audited financial statements? |
Part 4 | F9_04_AFS_CONSOL_X | F990-PC-PART-04-LINE-12B | Consolidated audited financial statement? |
Part 4 | F9_04_BIZ_TRANSAC_DTK_X | F990-PC-PART-04-LINE-28A | Was the organization a party to a business transaction with one of the following parties: A current or former officer, director, trustee, or key employee? |
Part 4 | F9_04_BIZ_TRANSAC_DTK_FAM_X | F990-PC-PART-04-LINE-28B | Was the organization a party to a business transaction with one of the following parties: A family member of a current or former officer, director, trustee, or key employee? |
Part 4 | F9_04_BIZ_TRANSAC_DTK_ENTITY_X | F990-PC-PART-04-LINE-28C | Was the organization a party to a business transaction with one of the following parties: An entity of which a current or former officer, director, trustee, or key employee (or a family member thereof) was an officer, director, trustee, or direct or indirect owner? |
Part 4 | F9_04_CONTR_NONCSH_MT_25K_X | F990-PC-PART-04-LINE-29 | Did the organization receive more than $25,000 in non-cash contributions? |
Part 4 | F9_04_CONTR_ART_HIST_X | F990-PC-PART-04-LINE-30 | Did the organization receive contributions of art, historical treasures, or other similar assets, or qualified conservation contributions? |
Part 6 | F9_06_GVRN_NUM_VOTING_MEMB | F990-PC-PART-01-LINE-03 | Number voting members governing body |
Part 6 | F9_06_GVRN_NUM_VOTING_MEMB_IND | F990-PC-PART-01-LINE-04 | Number independent voting members |
Part 6 | F9_06_GVRN_DELEGATE_MGMT_DUTY_X | F990-PC-PART-06-SECTION-A-LINE-03 | Delegation of management duties? |
Part 6 | F9_06_GVRN_DOC_GVRN_BODY_X | F990-PC-PART-06-SECTION-A-LINE-08A | Minutes of governing body? |
Part 6 | F9_06_POLICY_FORM990_GVRN_BODY_X | F990-PC-PART-06-SECTION-B-LINE-11A | Form 990 provided to governing body? |
Part 6 | F9_06_POLICY_COI_X | F990-PC-PART-06-SECTION-B-LINE-12A | Organization have conflict of interest policy? |
Part 6 | F9_06_POLICY_COI_DISCLOSURE_X | F990-PC-PART-06-SECTION-B-LINE-12B | Annual disclosure of interests by covered persons? |
Part 6 | F9_06_POLICY_COI_MONITOR_X | F990-PC-PART-06-SECTION-B-LINE-12C | Regular monitoring and enforcement? |
Part 6 | F9_06_POLICY_WHSTLBLWR_X | F990-PC-PART-06-SECTION-B-LINE-13 | Whistleblower policy? |
Part 6 | F9_06_POLICY_DOC_RETENTION_X | F990-PC-PART-06-SECTION-B-LINE-14 | Organization has written document retention policy? |
Part 6 | F9_06_POLICY_COMP_PROCESS_CEO_X | F990-PC-PART-06-SECTION-B-LINE-15A | Compensation process CEO? |
Part 6 | F9_06_DISCLOSURE_AVBL_OTH_X | F990-PC-PART-06-SECTION-C-LINE-18 | Form available Other (explain in Schedule O) |
Part 6 | F9_06_DISCLOSURE_AVBL_OTH_WEB_X | F990-PC-PART-06-SECTION-C-LINE-18 | Form990 Part VI - Other website |
Part 6 | F9_06_DISCLOSURE_AVBL_REQUEST_X | F990-PC-PART-06-SECTION-C-LINE-18 | Form available upon request |
Part 6 | F9_06_DISCLOSURE_AVBL_OWN_WEB_X | F990-PC-PART-06-SECTION-C-LINE-18 | Form990 Part VI - Own website |
Part 12 | F9_12_FINSTAT_METHOD_ACC_OTH | F990-PC-PART-12-LINE-01 | Indicates method of accounting is other |
Part 12 | F9_12_FINSTAT_METHOD_ACC_ACCRU_X | F990-PC-PART-12-LINE-01 | Method of accounting - Accrual |
Part 12 | F9_12_FINSTAT_METHOD_ACC_CASH_X | F990-PC-PART-12-LINE-01 | Method of accounting - Cash |
Schedule M | SM_01_REVIEW_PROCESS_UNUSUAL_X | SCHED-M-PART-01-LINE-31 | Review process reference unusual noncash gifts? |
Downloading the data
There are two methods for downloading the data. The first is to download the data sets directly from the NCCS website. This method is quick, but it downloads much more data than you need to calculate a governance score. The second is to download the data with the data.table package in R. This method is more complicated to implement, but it downloads only the data needed to calculate a governance score.
Downloading Data from NCCS Website Directly
All 990 Efiler data is hosted on the NCCS Website here.
The data here is organized by Part/Schedule, then by year. For our purposes, we only need Part 4, Part 6, Part 12, and Schedule M. For each Part/Schedule, click the “download” button for the relevant year(s). Once the data is downloaded, you can use the columns “OBJECTID”, “URL”, “RETURN_VERSION”, and “ORG_EIN” as unique identifiers to bind the data together.
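As a rough sketch of that binding step, assuming each downloaded CSV has already been read into a data frame, the parts can be joined on the four identifier columns. The toy data frames and all of their values are made up for illustration, and the choice of a full join (`all = TRUE`) is an assumption; only the column names come from this vignette.

```r
# Identifier columns shared by every Part/Schedule file (named in the text above)
id_cols <- c("OBJECTID", "URL", "RETURN_VERSION", "ORG_EIN")

# Toy stand-ins for two downloaded files. Real data would come from reading
# the downloaded CSVs; every value below is invented for illustration.
part4 <- data.frame(
  OBJECTID = c("a1", "a2"),
  URL = c("u1", "u2"),
  RETURN_VERSION = c("2013v3.0", "2013v3.0"),
  ORG_EIN = c("111", "222"),
  F9_04_AFS_IND_X = c("true", "false"),
  stringsAsFactors = FALSE
)
part6 <- data.frame(
  OBJECTID = c("a1", "a2"),
  URL = c("u1", "u2"),
  RETURN_VERSION = c("2013v3.0", "2013v3.0"),
  ORG_EIN = c("111", "222"),
  F9_06_POLICY_COI_X = c("true", "true"),
  stringsAsFactors = FALSE
)

# Join the parts on the shared identifiers. all = TRUE (an assumption) keeps
# returns that appear in only one of the files.
merged <- Reduce(
  function(x, y) merge(x, y, by = id_cols, all = TRUE),
  list(part4, part6)
)
print(merged)
```

Further parts can be added to the list passed to `Reduce()`; each one is joined on the same four columns.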
Downloading Data using R
To download the data through R, we first identify which years we are interested in. For this example, say we are interested in years 2013 and 2014.
years <- 2013:2014
We then see from the NCCS Website that the download links for each Part/Schedule are structured as:
https://nccs-efile.s3.us-east-1.amazonaws.com/parsed/F9-PARTNUMBER-TABLENUMBER-TABLENAME-YEAR.csv
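The pattern above can be sketched as a small helper that fills in the year. The function name `build_efile_link` and its `file_stem` argument are hypothetical, introduced only for illustration; the file stem in the example is the Part VI one used in the download code in this vignette.

```r
# Hypothetical helper: fills the YEAR slot of the link pattern shown above.
# `file_stem` is everything between "parsed/" and "-YEAR.csv".
build_efile_link <- function(file_stem, year) {
  base_url <- "https://nccs-efile.s3.us-east-1.amazonaws.com/parsed/"
  paste0(base_url, file_stem, "-", year, ".csv")
}

# paste0() is vectorized, so one call yields one link per year
links <- build_efile_link("F9-P06-T00-GOVERNANCE", 2013:2014)
print(links)
```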
Using this link structure and the variable names we identified in the previous section, we can download the data as follows:
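### Setup ------------------------------------------
#load packages: data.table provides fread() and rbindlist(); dplyr provides %>%, mutate(), and filter()
library(data.table)
library(dplyr)
### Part IV ----------------------------------------
#initialize data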
dat_4 <- vector(mode = "list", length = length(years))
#get columns we want
keep_cols_part4 <- c("OBJECTID", "URL", "RETURN_VERSION", "ORG_EIN", "RETURN_TYPE",
                     "F9_04_AFS_IND_X", "F9_04_AFS_CONSOL_X",
                     "F9_04_BIZ_TRANSAC_DTK_X", "F9_04_BIZ_TRANSAC_DTK_FAM_X",
                     "F9_04_BIZ_TRANSAC_DTK_ENTITY_X",
                     "F9_04_CONTR_NONCSH_MT_25K_X",
                     "F9_04_CONTR_ART_HIST_X")
#download the data, one file per year
for(i in seq_along(years)){
  link <- paste0("https://nccs-efile.s3.us-east-1.amazonaws.com/parsed/F9-P04-T00-REQUIRED-SCHEDULES-", years[i], ".csv")
  temp <- fread(link, select = keep_cols_part4)
  dat_4[[i]] <- temp
}
#clean up data: stack the yearly tables and take year from the first four characters of RETURN_VERSION
dat_all_4 <-
  rbindlist(dat_4) %>%
  mutate(year = as.numeric(substr(RETURN_VERSION, 1, 4)))
### Part VI ----------------------------------------
#initialize data
dat_6 <- vector(mode = "list", length = length(years))
#get columns we want
keep_cols_part6 <- c("OBJECTID", "URL", "RETURN_VERSION", "ORG_EIN", "RETURN_TYPE",
                     "F9_06_GVRN_NUM_VOTING_MEMB",
                     "F9_06_GVRN_NUM_VOTING_MEMB_IND",
                     "F9_06_GVRN_DTK_FAMBIZ_RELATION_X",
                     "F9_06_GVRN_DELEGATE_MGMT_DUTY_X",
                     "F9_06_GVRN_DOC_GVRN_BODY_X",
                     "F9_06_POLICY_FORM990_GVRN_BODY_X",
                     "F9_06_POLICY_COI_X",
                     "F9_06_POLICY_COI_DISCLOSURE_X",
                     "F9_06_POLICY_COI_MONITOR_X",
                     "F9_06_POLICY_WHSTLBLWR_X",
                     "F9_06_POLICY_DOC_RETENTION_X",
                     "F9_06_POLICY_COMP_PROCESS_CEO_X",
                     "F9_06_DISCLOSURE_AVBL_OTH_X",
                     "F9_06_DISCLOSURE_AVBL_OTH_WEB_X",
                     "F9_06_DISCLOSURE_AVBL_REQUEST_X",
                     "F9_06_DISCLOSURE_AVBL_OWN_WEB_X"
)
#download the data, one file per year
for(i in seq_along(years)){
  link <- paste0("https://nccs-efile.s3.us-east-1.amazonaws.com/parsed/F9-P06-T00-GOVERNANCE-", years[i], ".csv")
  temp <- fread(link, select = keep_cols_part6)
  dat_6[[i]] <- temp
}
#clean up data: stack the yearly tables and take year from the first four characters of RETURN_VERSION
dat_all_6 <-
  rbindlist(dat_6) %>%
  mutate(year = as.numeric(substr(RETURN_VERSION, 1, 4)))
### Part XII ----------------------------------------
#initialize data
dat_12 <- vector(mode = "list", length = length(years))
#keep the columns we want
keep_cols_part12 <- c("OBJECTID", "URL", "RETURN_VERSION", "ORG_EIN", "RETURN_TYPE",
                      "F9_12_FINSTAT_METHOD_ACC_OTH",
                      "F9_12_FINSTAT_METHOD_ACC_ACCRU_X",
                      "F9_12_FINSTAT_METHOD_ACC_CASH_X")
#download the data, one file per year
for(i in seq_along(years)){
  link <- paste0("https://nccs-efile.s3.us-east-1.amazonaws.com/parsed/F9-P12-T00-FINANCIAL-REPORTING-", years[i], ".csv")
  temp <- fread(link, select = keep_cols_part12)
  dat_12[[i]] <- temp
}
#clean up data: stack the yearly tables and take year from the first four characters of RETURN_VERSION
dat_all_12 <-
  rbindlist(dat_12) %>%
  mutate(year = as.numeric(substr(RETURN_VERSION, 1, 4)))
### Schedule M --------------------------------------
#initialize data
dat_M <- vector(mode = "list", length = length(years))
#get columns we want
keep_cols_partM <- c("OBJECTID", "URL", "RETURN_VERSION", "ORG_EIN", "RETURN_TYPE",
                     "SM_01_REVIEW_PROCESS_UNUSUAL_X")
#download the data, one file per year
for(i in seq_along(years)){
  link <- paste0("https://nccs-efile.s3.us-east-1.amazonaws.com/parsed/SM-P01-T00-NONCASH-CONTRIBUTIONS-", years[i], ".csv")
  temp <- fread(link, select = keep_cols_partM)
  dat_M[[i]] <- temp
}
#clean up data: stack the yearly tables, derive year from RETURN_VERSION, and drop rows later than the last requested year
dat_all_M <-
  rbindlist(dat_M) %>%
  mutate(year = as.numeric(substr(RETURN_VERSION, 1, 4))) %>%
  filter(year <= max(years))
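The four download blocks above follow the same pattern, so they could be wrapped in a single function. The following is a minimal sketch under stated assumptions, not the vignette's own method: `download_part`, its `reader` argument, and `stub_reader` are hypothetical names. The `reader` argument exists only so the logic can be tried offline; by default it calls `data.table::fread()` with `select`, as in the code above. The stub fabricates one row per link and does not reflect real NCCS data.

```r
# Hypothetical helper wrapping the per-part download pattern used above.
# `reader` defaults to data.table::fread with the requested columns; passing
# another function allows a dry run without network access.
download_part <- function(file_stem, years, keep_cols,
                          reader = function(link) data.table::fread(link, select = keep_cols)) {
  base_url <- "https://nccs-efile.s3.us-east-1.amazonaws.com/parsed/"
  pieces <- lapply(years, function(y) {
    link <- paste0(base_url, file_stem, "-", y, ".csv")
    as.data.frame(reader(link))
  })
  out <- do.call(rbind, pieces)
  # Same year derivation as above: first four characters of RETURN_VERSION
  out$year <- as.numeric(substr(out$RETURN_VERSION, 1, 4))
  out
}

# Offline demonstration: a stub reader that invents one row per link,
# pulling the year out of the link text
stub_reader <- function(link) {
  yr <- sub(".*-([0-9]{4})\\.csv$", "\\1", link)
  data.frame(OBJECTID = paste0("id-", yr),
             RETURN_VERSION = paste0(yr, "v1.0"),
             stringsAsFactors = FALSE)
}

demo <- download_part("F9-P06-T00-GOVERNANCE", 2013:2014,
                      keep_cols = c("OBJECTID", "RETURN_VERSION"),
                      reader = stub_reader)
print(demo)
```

With the default reader, a call such as `download_part("F9-P06-T00-GOVERNANCE", years, keep_cols_part6)` would be expected to produce the same table as the Part VI block, returned as a plain data frame rather than a data.table.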
Once we have downloaded the data for each part, we can merge the parts together. Additionally, we remove any EZ filers since their data will be incomplete for our purposes.