New NYC Business Incorporation 2005-2013: An Exploration of Non-Minority and Minority-Owned Enterprise Creation

By Shelby Ahern [email protected]

NYC Data Science Academy, Student Demo Day, 07-21-2014
R005: Data Science by R (Beginner level)



Explore

• New Business Incorporation in NYC between 2005 and 2013, and

• New Business Incorporation, by Minority and Non-Minority Ownership

Data Sources

• Active New York Corporations: Beginning in 1800 [1]

• NYC Online Directory of Certified Businesses: Minority-Owned Business Enterprises (MBE) [2, 3]

• U.S. Census Population Estimates [4]

• Entity Type:
  • Domestic Business Corporation
  • Domestic Cooperative Corporation
  • Domestic Professional Corporation

Parameters and Notes

• 2005-2013 (9 years)

• Borough = County (i.e., Manhattan = New York County, Brooklyn = Kings County, Queens = Queens County, Bronx = Bronx County, Staten Island = Richmond County)
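The borough-to-county correspondence above is what lets borough-level population figures be joined to county-level corporation filings. A minimal sketch in Python (the deck does not show its own R code); the names `BOROUGH_TO_COUNTY` and `borough_for` are hypothetical, and the upper-case county spelling follows the "NEW YORK" value in the example data frame.

```python
# Borough -> county name, as listed under "Parameters and Notes".
BOROUGH_TO_COUNTY = {
    "Manhattan": "New York",
    "Brooklyn": "Kings",
    "Queens": "Queens",
    "Bronx": "Bronx",
    "Staten Island": "Richmond",
}

# Reverse lookup keyed by upper-case county name (e.g. "NEW YORK").
COUNTY_TO_BOROUGH = {
    county.upper(): borough for borough, county in BOROUGH_TO_COUNTY.items()
}


def borough_for(county_name):
    """Return the borough for a county name, ignoring case and outer spaces."""
    return COUNTY_TO_BOROUGH[county_name.strip().upper()]


print(borough_for("NEW YORK"))
print(borough_for("Richmond"))
```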


• Create Data Frames of Data from Each Source

• Run Summary Statistics for Validation

• Split by Borough and Combine DFs from Different Sources

• Perform Calculations, i.e., New Incorporations per Capita

• Data Viz!

• Test: “Density” of New MBE Corps per Minority Population ≠ “Density” of New Non-MBE Corps per Non-Minority Population

Initial Exploration: Corporation Filings

An initial review of the summaries of the corporation data and the MBE-certified corporations shows:

A major disparity between the number of incorporations per year and the number of MBEs established in that year.

Why?

- Data Quality: change in ownership structure, restrictions on MBE certification, and/or filing lag

- !! What the data actually represent: MBE application purpose & process

Collating and Calculating Data


   County    year  NewCorps  NewMBECorps  Tot_Pop  MBE_pop1  NwCorpsperCap  NwMBECorpsperCap  NwMBECorpsperMBECap  NwNonMBECorpsperCap
1  NEW YORK  2005      5101           35  1529774    690696         0.0033          2.30E-05             5.10E-05                0.006
2  NEW YORK  2006      5395           42  1611581    738221         0.0033          2.60E-05             5.70E-05               0.0061
3  NEW YORK  2007      5373           39  1620867    724926         0.0033          2.40E-05             5.40E-05                0.006
4  NEW YORK  2008      5602           38  1634795    696413         0.0034          2.30E-05             5.50E-05               0.0059
5  NEW YORK  2009      7617           39  1629054    669583         0.0047          2.40E-05             5.80E-05               0.0079
6  NEW YORK  2010      9872           34  1585873    674800         0.0062          2.10E-05             5.00E-05               0.0108
7  NEW YORK  2011      9909           24  1601948    703250         0.0062          1.50E-05             3.40E-05                0.011
8  NEW YORK  2012     10326           15  1619090    697407         0.0064          9.30E-06             2.20E-05               0.0112
9  NEW YORK  2013     10345            3  1585873    546732         0.0065          1.90E-06             5.50E-06                 0.01

After merging data from different data frames, we are able to calculate the number of new corporations filed per capita, on a yearly basis.

Further, we calculate the number of new corporations filed per capita for specific populations: new MBE corporations per minority capita and new non-MBE corporations per non-minority capita.

Example Data Frame, Manhattan
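The per-capita columns can be reproduced from the count and population columns. The sketch below, in Python rather than the R used in the course, recomputes the Manhattan 2005 row. The formula for the non-MBE rate, (NewCorps - NewMBECorps) / (Tot_Pop - MBE_pop1), is an assumption inferred from the table values rather than something the deck states; the function name `per_capita_rates` is hypothetical.

```python
# Manhattan (New York County), 2005, copied from the example data frame.
row = {
    "county": "NEW YORK",
    "year": 2005,
    "new_corps": 5101,      # NewCorps
    "new_mbe_corps": 35,    # NewMBECorps
    "tot_pop": 1529774,     # Tot_Pop
    "mbe_pop": 690696,      # MBE_pop1 (minority population)
}


def per_capita_rates(r):
    """Return the four per-capita columns for one county-year row."""
    # Assumed definitions: everything not MBE / not minority is the complement.
    non_mbe_corps = r["new_corps"] - r["new_mbe_corps"]
    non_minority_pop = r["tot_pop"] - r["mbe_pop"]
    return {
        "NwCorpsperCap": r["new_corps"] / r["tot_pop"],
        "NwMBECorpsperCap": r["new_mbe_corps"] / r["tot_pop"],
        "NwMBECorpsperMBECap": r["new_mbe_corps"] / r["mbe_pop"],
        "NwNonMBECorpsperCap": non_mbe_corps / non_minority_pop,
    }


rates = per_capita_rates(row)
for name, value in rates.items():
    print(f"{name}: {value:.2e}")
```

Rounded to the table's precision, these agree with the first row of the example data frame, which supports the assumed definition of the non-MBE column.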

Incorporation Activity per Capita


Incorporations per Capita and MBE Incorporations per Capita, 2005-2013

MBE Incorporation Activity per Capita

MBE Incorporations per Capita, 2005-2013



The per-capita incidence of incorporations increased across all boroughs from 2005 to 2013.

Manhattan, Queens, and Brooklyn had the highest per-capita incorporations.

Queens appears to have the steepest increase in corporation filings.

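MBE incorporations per capita are orders of magnitude smaller than the general level of per-capita incorporation (in the Manhattan example, roughly 150 times smaller in 2005 and more than 3,000 times smaller in 2013).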

The per-capita incidence of MBE incorporations varied by borough (led by Manhattan), and trended downward after 2009.

Incorporation Activity per Capita, cont.

Does the Frequency of Incorporations Vary between Minority and Non-Minority Populations?

Hypothesis: The number of MBE incorporations per non-white person is not equal to the number of non-MBE incorporations per white person.

The approach:

1. Select value to test:
   ▪ MBE Corps per Minority capita
   ▪ Non-MBE Corps per Non-Minority capita
   ▪ Utilize data from all years and boroughs (5 boroughs x 9 years x 2 categories = 90 obs.)

2. Evaluate which test(s) to conduct:
   ▪ Parametric vs. Non-parametric
   ▪ Means test vs. Other

3. Conduct test and analyze results.

Reviewing the MBE-Minority Data Set

Histogram, MBE Incorporations per Minority capita

QQplot, MBE Incorporations per Minority capita

Reviewing the Non-MBE – Non-Minority Data Set

Histogram, Non-MBE Incorporations per Non-Minority capita

QQplot, Non-MBE Incorporations per Non-Minority capita
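A QQ plot pairs each sorted observation with the quantile that a fitted normal distribution would place at the same rank; points that stray from a straight line suggest non-normality. Below is a minimal sketch in Python (the deck's own R code is not shown) that computes those pairs. It uses only the nine Manhattan values of MBE incorporations per minority capita from the example data frame, not the full 45 borough-years, and the plotting position (i + 0.5) / n is one common convention, assumed here.

```python
from statistics import NormalDist, mean, stdev

# NwMBECorpsperMBECap for Manhattan, 2005-2013 (example data frame).
mbe_per_minority_capita = [
    5.1e-05, 5.7e-05, 5.4e-05, 5.5e-05, 5.8e-05,
    5.0e-05, 3.4e-05, 2.2e-05, 5.5e-06,
]


def qq_points(sample):
    """Return (theoretical normal quantile, observed value) pairs."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    ordered = sorted(sample)
    # (i + 0.5) / n keeps every probability strictly inside (0, 1).
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]


points = qq_points(mbe_per_minority_capita)
for theoretical, observed in points:
    print(f"{theoretical: .2e}  {observed: .2e}")
```

Plotting the second element of each pair against the first gives the QQ plot; the three lowest values (2011-2013) fall well below the line implied by the rest of the sample.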

Testing for Similarity/Dissimilarity

Neither MBE nor Non-MBE per capita data appear to be normally distributed. Hence, we’ll consider the following two non-parametric tests:

Mood’s Median Test

A nonparametric test of the null hypothesis that the medians of the populations from which two or more samples are drawn are identical. (Wikipedia)

H0: The medians of MBE incorporations per minority capita and non-MBE incorporations per non-minority capita are equal.

H1: The medians of MBE incorporations per minority capita and non-MBE incorporations per non-minority capita are NOT equal.
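As an illustration of how Mood's median test works, the sketch below counts how many values in each group exceed the pooled median and applies a Pearson chi-square test to the resulting 2x2 table. It is written in Python with the standard library only and is not the author's R code. Two assumptions are worth flagging: no continuity correction is applied, and only the nine Manhattan rows per group are used, so the numbers do not reproduce the deck's 90-observation result. With counts this small the chi-square approximation is rough.

```python
from math import erfc, sqrt
from statistics import median

# Manhattan, 2005-2013, from the example data frame.
mbe = [5.1e-05, 5.7e-05, 5.4e-05, 5.5e-05, 5.8e-05,
       5.0e-05, 3.4e-05, 2.2e-05, 5.5e-06]          # NwMBECorpsperMBECap
non_mbe = [0.006, 0.0061, 0.006, 0.0059, 0.0079,
           0.0108, 0.011, 0.0112, 0.01]             # NwNonMBECorpsperCap


def moods_median_test(x, y):
    """Return (chi-square statistic, p-value) for two samples."""
    grand_median = median(list(x) + list(y))
    a = sum(v > grand_median for v in x)   # x values above pooled median
    b = sum(v > grand_median for v in y)   # y values above pooled median
    c = len(x) - a                         # x values at or below it
    d = len(y) - b                         # y values at or below it
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the 2x2 table is empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = moods_median_test(mbe, non_mbe)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2g}")
```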

Mann-Whitney-Wilcoxon Test

A nonparametric test of the null hypothesis that two populations are the same against an alternative hypothesis, especially that a particular population tends to have larger values than the other. (Wikipedia)

H0: MBE incorporations per minority capita and non-MBE incorporations per non-minority capita could be drawn from the same population.

H1: MBE incorporations per minority capita and non-MBE incorporations per non-minority capita could NOT be drawn from the same population.
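The Mann-Whitney-Wilcoxon test ranks the pooled observations and asks whether one group's ranks are systematically lower than the other's. The following Python sketch (standard library only, not the author's R code) computes the U statistic and a two-sided p-value from the normal approximation. It assumes no tie correction in the variance, and it again uses only the nine Manhattan rows per group, so it illustrates the mechanics rather than reproducing the deck's result.

```python
from math import erfc, sqrt

# Manhattan, 2005-2013, from the example data frame.
mbe = [5.1e-05, 5.7e-05, 5.4e-05, 5.5e-05, 5.8e-05,
       5.0e-05, 3.4e-05, 2.2e-05, 5.5e-06]          # NwMBECorpsperMBECap
non_mbe = [0.006, 0.0061, 0.006, 0.0059, 0.0079,
           0.0108, 0.011, 0.0112, 0.01]             # NwNonMBECorpsperCap


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1        # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p-value) using the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # no tie correction
    z = (u - mu) / sigma
    return u, erfc(abs(z) / sqrt(2))


u_stat, p_value = mann_whitney_u(mbe, non_mbe)
print(f"U = {u_stat}, p = {p_value:.2g}")
```

Every Manhattan MBE value lies below every non-MBE value, so U takes its minimum possible value, which is the strongest separation this test can register for samples of this size.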


In both tests, the null hypothesis is rejected; thus we find that the incidence of new business incorporations per capita differs between the two populations.