The Herfindahl-Hirschman Index (HHI) is one of the most widely used measures of market structure in nonprofit research and economics. Specifically, it is a measure of market concentration used to understand market competitiveness and organizations’ competitive behavior and performance in markets.
HHI can be calculated as follows:
Approach 1:
The index is calculated as the sum of the squared market shares of the firms competing in a market (Thornton & Belski, 2010). In the nonprofit sector, market share is the ratio of a nonprofit’s total revenue to the aggregate revenue of the organization’s market. The calculated values range from 1/N to 1, where N is the number of organizations in a market. A market with an HHI of 1 is a monopoly. In contrast, a market with an HHI close to 0 is viewed as nearly perfectly competitive.
\[H =\sum_{i=1}^N s_i^2\]
where \(s_i\) is the market share of firm i in the market, and N is the number of firms.
Thus, in a market with two firms that each have a 50 percent market share, the Herfindahl index equals \(0.50^2 + 0.50^2 = 1/2\) (source: Wikipedia).
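The definition above can be sketched in a few lines of R. This is a minimal illustration, not the code used for the analysis below; the helper name `hhi_from_revenue` is hypothetical.

```r
# Hypothetical helper: HHI on the 0-1 scale from a vector of revenues.
hhi_from_revenue <- function( revenue ) {
  shares <- revenue / sum( revenue )   # market share of each organization
  sum( shares^2 )                      # sum of squared shares
}

hhi_from_revenue( c(50, 50) )    # two equal firms: 0.5
hhi_from_revenue( c(100) )       # a single firm (monopoly): 1
hhi_from_revenue( rep(1, 4) )    # four equal firms: 1/N = 0.25
```

With N equal-sized organizations every share is 1/N, so the index reaches its lower bound of 1/N.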
Approach 2:
Another approach uses the same definition of market share as Approach 1; however, each organization’s market share is expressed as a whole-number percentage rather than a decimal (Seaman, Wilsker, & Young, 2014; Soule & King, 2008; Thornton, 2006; Castaneda, Garen, & Thornton, 2008). For example, if a nonprofit’s market share is 0.4 (40%), the value 40 enters the calculation. The calculated values range from 10,000/N to 10,000. A market with an HHI of 10,000 is a monopoly. In contrast, a market with an HHI close to 0 is viewed as nearly perfectly competitive. The U.S. Department of Justice considers a market with an HHI below 1,500 unconcentrated, one with an HHI of 1,500 to 2,500 moderately concentrated, and one with an HHI above 2,500 highly concentrated. Approaches 1 and 2 differ only in scale: the Approach 2 value is 10,000 times the Approach 1 value.
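As a rough sketch, the percentage-scale index and the Department of Justice bands quoted above could be coded as follows. The function names `hhi_points` and `classify_hhi` are hypothetical, and the treatment of values exactly at 1,500 and 2,500 follows the wording in the text.

```r
# Hypothetical helper: HHI on the 0-10,000 scale (shares as percentages).
hhi_points <- function( revenue ) {
  shares_pct <- 100 * revenue / sum( revenue )
  sum( shares_pct^2 )
}

# Hypothetical helper: label a market using the bands quoted in the text.
classify_hhi <- function( hhi ) {
  ifelse( hhi < 1500, "unconcentrated",
    ifelse( hhi <= 2500, "moderately concentrated", "highly concentrated" ) )
}

hhi_points( c(40, 30, 20, 10) )                   # 1600 + 900 + 400 + 100 = 3000
classify_hhi( hhi_points( c(40, 30, 20, 10) ) )   # "highly concentrated"
classify_hhi( hhi_points( rep(10, 10) ) )         # ten equal firms, HHI 1000: "unconcentrated"
```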
Approach 3:
The normalized HHI ranges from 0 to 1, rather than from 1/N to 1 or from 10,000/N to 10,000.
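The normalization commonly used for this purpose rescales the Approach 1 index \(H\) by its lower bound:

\[H^* = \frac{H - 1/N}{1 - 1/N}, \quad N > 1\]

For example, with two firms holding shares of 0.7 and 0.3, \(H = 0.58\) and \(H^* = (0.58 - 0.5)/(1 - 0.5) = 0.16\). A minimal R sketch, assuming this definition (the helper name `normalized_hhi` is hypothetical):

```r
# Hypothetical helper: normalized HHI from a vector of revenues.
normalized_hhi <- function( revenue ) {
  n <- length( revenue )
  if ( n < 2 ) return( NA_real_ )              # undefined (0/0) for a single firm
  h <- sum( ( revenue / sum( revenue ) )^2 )   # Approach 1 HHI
  ( h - 1/n ) / ( 1 - 1/n )
}

normalized_hhi( c(70, 30) )    # 0.16
normalized_hhi( rep(1, 5) )    # equal shares: 0
normalized_hhi( c(100) )       # single firm: NA
```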
We define a market as a Metropolitan Statistical Area (MSA) in which relevant nonprofits compete for resources to survive. The code to create HHI metrics is as follows.
library( haven ) # importing data files
library( tidyr ) # data wrangling
library( dplyr ) # data wrangling
library( ggplot2 ) # fancy plots
library( ggthemes ) # fancy plots
library( scales ) # re-scaling numbers
library( stargazer ) # nice tables
library( pander ) # format tables for HTML
library( DT ) # embed datasets in HTML docs
source( "r-helper-functions/helper-functions.R" )
core.2010 <- readRDS( "Data/nccs.core2010pc.rds" )
names( core.2010 ) <- toupper( names( core.2010 ))
core <- select( core.2010, EIN, MSA_NECH, NTMAJ12, TOTREV, CONT )
core$NTMAJ12[ core$NTMAJ12 == "AR" ] <- "Arts"
core$NTMAJ12[ core$NTMAJ12 == "BH" ] <- "Universities"
core$NTMAJ12[ core$NTMAJ12 == "ED" ] <- "Education"
core$NTMAJ12[ core$NTMAJ12 == "EH" ] <- "Hospitals"
core$NTMAJ12[ core$NTMAJ12 == "EN" ] <- "Environmental"
core$NTMAJ12[ core$NTMAJ12 == "HE" ] <- "Health"
core$NTMAJ12[ core$NTMAJ12 == "HU" ] <- "Human Services"
core$NTMAJ12[ core$NTMAJ12 == "IN" ] <- "International"
core$NTMAJ12[ core$NTMAJ12 == "MU" ] <- "Mutual Benefit"
core$NTMAJ12[ core$NTMAJ12 == "PU" ] <- "Public Benefit"
core$NTMAJ12[ core$NTMAJ12 == "RE" ] <- "Religion"
core$NTMAJ12[ core$NTMAJ12 == "UN" ] <- "Unknown"
core$NTMAJ12 <- factor( core$NTMAJ12 )
head(core) %>% pander()
EIN | MSA_NECH | NTMAJ12 | TOTREV | CONT |
---|---|---|---|---|
10017496 | NA | Human Services | 113807 | 23160 |
10024645 | 733 | Arts | 568558 | 156641 |
10078060 | NA | Hospitals | 81381144 | 1551088 |
10088397 | 5602 | Human Services | 36894 | 36894 |
10130427 | 6403 | Hospitals | 42460089 | 196583 |
10133442 | NA | Human Services | 1064865 | 312518 |
Negative revenues can push HHI above 1, so values above 1 are capped at 1 and negative aggregate revenues are recoded as zero:
dat.hhi <-
core %>%
group_by( MSA_NECH, NTMAJ12 ) %>%
summarize( hhi= sum( (TOTREV / sum(TOTREV))^2 ),
n=n(),
revenue= sum(TOTREV),
contribution=sum(CONT) )
dat.hhi$hhi[ dat.hhi$hhi > 1 ] <- 1
dat.hhi$revenue[ dat.hhi$revenue < 0 ] <- 0
dat.hhi <- merge( dat.hhi, msa.pop, all.x=T )
dat.hhi %>%
# rescale for printing
mutate( hhi = 100 * hhi,
revenue_thousands = revenue / 1000,
contributions_thousands = contribution / 1000 ) %>%
select( hhi, n, revenue_thousands, contributions_thousands, MSA_POP ) %>%
stargazer( type = s.type,
digits=0,
summary.stat = c("min","p25","median","mean","p75","max", "sd"))
Statistic | Min | Pctl(25) | Median | Mean | Pctl(75) | Max | St. Dev. |
---|---|---|---|---|---|---|---|
hhi | 0 | 12 | 24 | 36 | 51 | 100 | 31 |
n | 1 | 5 | 20 | 123 | 64 | 26,573 | 674 |
revenue_thousands | 0 | 3,893 | 30,766 | 524,531 | 190,652 | 73,476,463 | 2,695,455 |
contributions_thousands | 0 | 1,635 | 9,938 | 117,360 | 42,890 | 11,886,380 | 555,880 |
MSA_POP | 0 | 159,385 | 316,931 | 953,737 | 772,842 | 19,840,495 | 2,105,962 |
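To see why negative revenues can produce an HHI above 1, consider a toy market with made-up numbers. The flooring step at the end is one possible remedy shown for illustration only.

```r
# Toy market in which one organization reports negative revenue.
revenue <- c( 100, -50, 50 )

shares  <- revenue / sum( revenue )    # 1.0, -0.5, 0.5
hhi_raw <- sum( shares^2 )             # 1.5, outside the 1/N to 1 range

# Illustrative remedy: floor revenues at zero before computing shares.
revenue_floor <- pmax( revenue, 0 )
hhi_floor <- sum( ( revenue_floor / sum( revenue_floor ) )^2 )   # 5/9, about 0.556
```

Because shares are squared, a negative share adds to the index while shrinking the denominator, which is what pushes the total above 1.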
# There is a clarification called TOTREV2 on the list.
# What is the difference between TOTREV and TOTREV2?
# CONT represents total contributions.
dat.nhhi <-
core %>%
group_by( MSA_NECH, NTMAJ12 ) %>%
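summarize(
# normalized HHI: (H - 1/N) / (1 - 1/N), where N is the number of nonprofits in the market
# single-organization markets give 0/0 (NaN)
nhhi= ( sum( (TOTREV / sum(TOTREV))^2 ) - (1 / n()) ) / ( 1 - (1 / n()) ),
n=n(), revenue= sum(TOTREV))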
dat.nhhi$nhhi[ dat.nhhi$nhhi > 1 ] <- 1
dat.nhhi$revenue[ dat.nhhi$revenue < 0 ] <- 0
Statistic | N | Mean | St. Dev. | Min | Pctl(25) | Pctl(75) | Max |
---|---|---|---|---|---|---|---|
MSA_NECH | 2,967 | 4,402.272 | 2,647.043 | 0.000 | 2,120.000 | 6,740.000 | 9,360.000 |
nhhi | 2,976 | 0.357 | 0.306 | 0.001 | 0.118 | 0.514 | 1.000 |
n | 2,979 | 123.245 | 674.347 | 1 | 5 | 64 | 26,573 |
revenue | 2,979 | 524,531,232.000 | 2,695,454,623.000 | 0 | 3,893,027 | 190,651,631 | 73,476,463,183 |
NTMAJ12 | n | min | max |
---|---|---|---|
Arts | 280 | 0.01755 | 1 |
Education | 285 | 0.005273 | 1 |
Environmental | 276 | 0.003676 | 1 |
Health | 279 | 0.006371 | 1 |
Hospitals | 263 | 0.00335 | 1 |
Human Services | 282 | 0.0008126 | 1 |
International | 256 | NA | NA |
Mutual Benefit | 168 | 0.1664 | 1 |
Public Benefit | 280 | 0.007284 | 1 |
Religion | 280 | 0.01384 | 1 |
Universities | 216 | NA | NA |
Unknown | 114 | 0.07098 | 1 |
# library( DT )
these.buttons <- c( 'copy', 'csv', 'excel', 'print' )
datatable( dat.nhhi,
filter='bottom', rownames=FALSE,
#options=list( pageLength=5, autoWidth=TRUE ),
fillContainer=TRUE,
style="bootstrap",
class='table-condensed table-striped',
extensions = 'Buttons',
options=list( dom='Bfrtip',
buttons=these.buttons )) %>%
formatStyle( "MSA_NECH", "white-space"="nowrap" )
dat.msa <-
core %>%
group_by(MSA_NECH) %>%
summarize( hhi= sum( ( TOTREV / sum(TOTREV) )^2 ),
n=n(),
revenue= sum(TOTREV) )
# dat.msa$hhi[ dat.msa$hhi > 1 ] <- 1
# dat.msa$revenue[ dat.msa$revenue < 0 ] <- 0
summary( dat.msa ) %>% pander()
MSA_NECH | hhi | n | revenue |
---|---|---|---|
Min. : 0 | Min. :0.00213 | Min. : 1 | Min. :0.000e+00 |
1st Qu.:2160 | 1st Qu.:0.07627 | 1st Qu.: 163 | 1st Qu.:4.602e+08 |
Median :3990 | Median :0.14422 | Median : 341 | Median :1.179e+09 |
Mean :4409 | Mean :0.20635 | Mean : 1262 | Mean :5.370e+09 |
3rd Qu.:6712 | 3rd Qu.:0.27447 | 3rd Qu.: 786 | 3rd Qu.:3.115e+09 |
Max. :9360 | Max. :1.00000 | Max. :70410 | Max. :1.791e+11 |
NA’s :1 | NA’s :1 | NA | NA |
# msa.pop <- read.csv("Data/MSA_POP.csv")
# msa.names <- read.csv("Data/msa-names.csv")
# msa.pop <- merge( msa.pop, msa.names, all.x=TRUE )
dat.msa <- merge( dat.msa, msa.pop, by="MSA_NECH", all.x=T )
jplot( log10(dat.msa$MSA_POP),
dat.msa$hhi, xaxt="n",
xlab="MSA Population", ylab="HHI" )
axis( side=1,
at=c(0,1,2,3,4),
labels=c("1","10","100","1,000","10,000") )
ggplot( dat.msa, aes(log10(MSA_POP),hhi) ) +
geom_point( col="gray30", alpha=0.7) +
xlab( "MSA Population (logged)" ) +
ylab( "HHI Across MSAs" ) +
theme_minimal()
dat.sub <-
core %>%
group_by(NTMAJ12) %>%
summarize( hhi= round( sum( (TOTREV / sum(TOTREV))^2 ), 3 ),
revenue= sum(TOTREV) ) %>%
arrange( - hhi )
# print the table separately so dat.sub stays a data frame
dat.sub %>% pander( )
summary(dat.sub)
ggplot( dat.hhi, aes(x = hhi)) +
geom_density( alpha = 0.5, fill="blue" ) +
xlim( -0.05, 1 ) +
xlab( "HHI" )
ggplot( dat.hhi, aes( x=hhi ) ) +
geom_density( alpha = 0.5, fill="blue" ) +
xlim( -0.05, 1 ) +
xlab( "HHI" ) + facet_wrap( ~ NTMAJ12, nrow=3 ) +
theme_minimal()
ggplot( dat.hhi, aes(y = hhi) ) +
geom_boxplot( col="gray30", alpha=0.7) +
ylab( "HHI Across MSAs" ) +
facet_wrap( ~ NTMAJ12, nrow=3 ) +
theme_minimal()
jplot( log10(dat.hhi$n), dat.hhi$hhi, xaxt="n",
xlab="Number of Nonprofits", ylab="HHI" )
axis( side=1,
at=c(0,1,2,3,4),
labels=c("1","10","100","1,000","10,000") )
ggplot( dat.hhi, aes(log10(n),hhi) ) +
geom_point( col="gray30", alpha=0.7) +
xlab( "Number of Nonprofits (logged)" ) +
ylab( "HHI Across MSAs" ) +
theme_minimal()
ggplot( dat.hhi, aes(log10(n),hhi) ) +
geom_point( col="gray30", alpha=0.7) +
xlab( "Number of Nonprofits (logged)" ) +
ylab( "HHI Across MSAs" ) +
facet_wrap( ~ NTMAJ12, nrow=3 ) +
theme_minimal()
jplot( log10(dat.hhi$revenue), dat.hhi$hhi, xaxt="n",
xlab="Total Revenue (logged)", ylab="HHI" )
axis( side=1,
at=c(0,1,2,3,4,5,6,7,8,9,10,11),
labels=c("1","10","100","1K","10K","100K",
"1M","10M","100M","1B","10B","100B") )
ggplot( dat.hhi, aes(log10(revenue),hhi) ) +
geom_point( col="gray30", alpha=0.7) +
xlab( "Total Revenue (logged)" ) +
ylab( "HHI Across MSAs" ) +
theme_minimal()
ggplot( dat.hhi, aes(log10(revenue),hhi) ) +
geom_point( col="gray30", alpha=0.7) +
xlab( "Total Revenue (logged)" ) +
ylab( "HHI Across MSAs" ) +
facet_wrap( ~ NTMAJ12, nrow=3 ) +
theme_minimal()
ggplot( dat.hhi, aes(log10(revenue),hhi) ) +
geom_point( col="gray30", alpha=0.7) +
geom_smooth(method="lm") +
xlab( "Total Revenue (logged)" ) + ylab( "HHI Across MSAs" ) +
facet_wrap( ~ NTMAJ12, nrow=3 ) +
theme_minimal()
jplot( log10(dat.hhi$contribution), dat.hhi$hhi,
xlab="Contribution (logged)", ylab="HHI",
xaxt="n" )
axis( side=1,
at=c(0,1,2,3,4,5,6,7,8,9,10,11),
labels=c("1","10","100","1K","10K","100K",
"1M","10M","100M","1B","10B","100B") )
ggplot( dat.hhi, aes(log10(contribution),hhi) ) +
geom_point( col="gray30", alpha=0.7) +
xlab( "Contribution (logged)" ) +
ylab( "HHI Across MSAs" ) +
theme_minimal()
ggplot( dat.hhi, aes(log10(contribution),hhi) ) +
geom_point( col="gray30", alpha=0.7) +
xlab( "Contribution (logged)" ) +
ylab( "HHI Across MSAs" ) +
facet_wrap( ~ NTMAJ12, nrow=3 ) +
theme_minimal()
ggplot( dat.hhi, aes(log10(contribution),hhi) ) +
geom_point( col="gray30", alpha=0.7) +
geom_smooth(method="lm") +
xlab( "Contribution (logged)" ) +
ylab( "HHI Across MSAs" ) +
facet_wrap( ~ NTMAJ12, nrow=3 ) +
theme_minimal()
Seaman, B., Wilsker, A., & Young, D. R. (2014). Measuring concentration and competition in the U.S. nonprofit sector: Implications for research and public policy. Nonprofit Policy Forum, 5(2), 231-259.
Thornton, J. (2006). Nonprofit fund-raising in competitive donor markets. Nonprofit and Voluntary Sector Quarterly, 35, 204-224.
Soule, S. A., & King, B. G. (2008). Competition and resource partitioning in three social movement industries. American Journal of Sociology, 113(6), 1568-1610.
Castaneda, M. A., Garen, J., & Thornton, J. (2007). Competition, contractibility, and the market for donors to nonprofits. The Journal of Law, Economics, & Organization, 24(1), 215-246.
Thornton, J. P., & Belski, W. H. (2010). Financial reporting quality and price competition among nonprofit firms. Applied economics, 42(21), 2699-2713.
Mendoza-Abarca, K. I., & Gras, D. (2017). The Performance Effects of Pursuing a Diversification Strategy by Newly Founded Nonprofit Organizations. Journal of Management, 0149206316685854.