MACHINE LEARNING FOR ZOMBIE HUNTING · 2020. 1. 14.

MACHINE LEARNING FOR ZOMBIE HUNTING

FALCO BARGAGLI STOFFI, MASSIMO RICCABONI, ARMANDO RUNGI

2nd Annual Conference of the JRC Community of Practice in Financial Research

• OECD estimates a good share of firms could be ‘zombies’, based on poor interest coverage ratios.

• For example, in the case of Italy, 19% of capital stock and 10% of employment could be sunk in ‘zombies’.

MOTIVATION

2

• Banks can be stuck in zombie lending (Peek and Rosengren, 2005; Caballero et al., 2008);

• Crowding-out of financial resources, especially in times of crisis (Schivardi et al., 2017);

• Lower aggregate productivity, as zombies drag down country averages (McGowan et al., 2018);

• Deterred entry of more productive firms, hence less competitive pressure on incumbents (see also the discussion on reforms of bankruptcy laws).

WHY SHOULD WE CARE?

3

• We propose a machine learning algorithm to predict firms’ failures after training on large datasets of firm-level financial accounts. We start with Italian manufacturing in 2008-2017.

• In this way, we are able to assign a probability value (risk of failure) to each firm.

• We improve on existing proxy methods, which classify firms as above or below thresholds on a few indicators.

• Potentially useful for assessing credit risk, but also for detecting the share of firms under distress in an economy.

OUR CONTRIBUTION

4

• Originally, ‘zombie lending’ (Caballero et al., 2008): under-capitalized banks can decide to cut credit to more viable projects to avoid a public disclosure of non-performing loans in their portfolio. The intuition is that ‘zombie firms’ receive hidden subsidies in the form of bank credit. See also Schivardi et al. (2017) on the crowding out of resources for healthy Italian firms in times of crisis.

• But what is a ‘zombie’? The seminal working definition by Caballero et al. (2008) is based on how present interest payments compare to a benchmark estimated from debt structure and market interest rates. Other proxy indicators by the Bank of England (2013) are negative value added and profitability.

• McGowan et al. (2018) also consider the misallocation of productive resources (not only financial): they look at productivity levels and consider market entry/exit barriers (e.g. bankruptcy laws). See also a few discussion papers by the OECD (2017a; 2017b).

• A parallel strand of research uses proxy methods for ‘predicting’ credit risk:

1. Z-scores (Altman, 1968; Altman et al., 2000) combine five financial ratios in a weighted equation;

2. Distance-to-default (Merton, 1974), according to which a firm's equity can be seen as a call option on the underlying assets, focuses on financial information from the firm and from the market.
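The two proxy methods above can be sketched in a few lines. This is only an illustration of the existing proxies, not of the method proposed in this presentation. The Z-score uses the commonly quoted 1968 coefficients for listed manufacturing firms, with ratios expressed as decimals. The distance-to-default uses the textbook Merton form, and it assumes that asset value, asset drift and asset volatility are already available, although in practice they are unobserved and have to be estimated. All function and argument names are hypothetical.

```python
import math
from statistics import NormalDist


def altman_z(working_capital, retained_earnings, ebit, market_equity,
             sales, total_assets, total_liabilities):
    """Weighted sum of five ratios (commonly quoted Altman 1968 weights)."""
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_equity / total_liabilities
    x5 = sales / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def distance_to_default(asset_value, debt_face_value, drift, asset_vol,
                        horizon=1.0):
    """Number of standard deviations between expected log assets and debt."""
    numerator = (math.log(asset_value / debt_face_value)
                 + (drift - 0.5 * asset_vol ** 2) * horizon)
    return numerator / (asset_vol * math.sqrt(horizon))


def default_probability(dd):
    """Probability that assets end below the face value of debt: N(-DD)."""
    return NormalDist().cdf(-dd)
```

Both proxies map a handful of inputs to a single score that is then compared with a fixed threshold, which is the limitation the proposed approach tries to overcome.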

LITERATURE REVIEW

5

ML techniques have been applied, so far, to a variety of economic problems (Mullainathan and Spiess, 2017):

• Generation of new data sets (Jean et al., 2016; Cavallo and Rigobon, 2016)

• Prediction (Bajari et al., 2015; Kleinberg et al., 2015; Kleinberg et al., 2017)

• Testing theory (Hartford et al., 2016; Erev et al., 2017; Plonsky et al., 2017)

• Causal inference (Hill, 2011; Belloni et al., 2011, 2014; Athey and Imbens, 2016; Bargagli-Stoffi and Gnecco, 2019)

MACHINE LEARNING IN ECONOMICS

6

WHICH ML TECHNIQUES WE USE

7

Bayesian additive regression trees (BART) provides a flexible approach to fitting a variety of regression models while avoiding strong parametric assumptions. The sum-of-trees model is embedded in a Bayesian inferential framework to support uncertainty quantification and provide a principled approach to regularization through prior specification (Hill et al., 2019).
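For reference, the sum-of-trees model in its standard formulation can be written as below. Here $T_j$ is the structure of the $j$-th tree, $M_j$ its set of leaf values, and $m$ the number of trees. The second line is the usual probit variant for a binary outcome such as failure; that this exact variant is the one used in this work is an assumption.

```latex
Y = \sum_{j=1}^{m} g(x;\, T_j, M_j) + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma^2)

P(Y = 1 \mid x) = \Phi\!\left( \sum_{j=1}^{m} g(x;\, T_j, M_j) \right)
```

The priors on $(T_j, M_j)$ keep each tree small, so that every tree contributes only a little to the overall fit. This is the regularization through prior specification mentioned above.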

BAYESIAN ADDITIVE REGRESSION TREES

8

• We train our algorithm on 304,869 manufacturing firms in Italy active in the period 2008-2017 with at least one known value for sales/turnover. The original source is Orbis, by Bureau van Dijk.

• For each firm we have a status, as in the following figure, with a precise status date. We consider that a firm exits the market when it is classified as bankrupt, dissolved or in liquidation.
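The exit rule above can be sketched as a small labelling function. The status strings are hypothetical placeholders (the labels in the actual Orbis extract may differ), and taking the year of the status date as the exit year is an assumption made for illustration.

```python
from datetime import date

# Hypothetical status labels; the real Orbis strings may differ.
EXIT_STATUSES = {"bankruptcy", "dissolved", "in liquidation"}


def exit_year(status, status_date):
    """Return the year of market exit, or None if the firm counts as active."""
    if status.strip().lower() in EXIT_STATUSES:
        return status_date.year
    return None


def failed_in(status, status_date, year):
    """Binary outcome for a given year: True if the firm exited in that year."""
    return exit_year(status, status_date) == year
```

For example, `failed_in("Dissolved", date(2014, 6, 30), 2014)` evaluates to `True`, while a firm with status `"Active"` never receives an exit year.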

DATA

9

PREDICTORS

10

The procedure explicitly takes into account that missing values may not be random, e.g. firms may avoid disclosure relatively more when in trouble. This information is used as a further ‘predictor’.
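One simple way to turn missingness into a predictor is to add a 0/1 flag per financial account, as in the sketch below. This is an illustration under assumptions: the field names are hypothetical, and the procedure actually used may treat missing values differently inside the trees.

```python
def add_missing_indicators(records, fields):
    """Return copies of the records with a '<field>_missing' flag per field.

    A flag is 1 when the value is absent or None, and 0 otherwise.
    The original values are left untouched.
    """
    flagged = []
    for record in records:
        row = dict(record)
        for field in fields:
            row[field + "_missing"] = 1 if record.get(field) is None else 0
        flagged.append(row)
    return flagged


# Hypothetical firm-year records with an undisclosed item.
firms = [
    {"firm_id": "A", "sales": 120.0, "interest_paid": 4.0},
    {"firm_id": "B", "sales": 80.0, "interest_paid": None},
]
with_flags = add_missing_indicators(firms, ["sales", "interest_paid"])
```

The flags can then enter the model alongside the reported values, so that non-disclosure itself can carry predictive signal.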

MISSING VALUES MAY NOT BE RANDOM

11

GOODNESS OF FIT

12

• Eventually, we can rank which financial accounts best predicted failure after using a LASSO.

• Yet, the prediction power is the result of all bits of information in any financial account we used.

• This is a strong benefit, as the method adapts dynamically to ever-changing environments.
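As a reminder of why a LASSO yields a ranking, its standard objective in the linear case is shown below. The $\ell_1$ penalty shrinks some coefficients exactly to zero, so predictors can be ordered by whether and how strongly they survive as the penalty $\lambda$ varies. The exact specification used here is not stated in these slides; for a binary outcome such as failure, the squared loss would typically be replaced by a logistic loss, which is an assumption.

```latex
\hat{\beta} = \arg\min_{\beta}\;
\frac{1}{2n} \sum_{i=1}^{n} \left( y_i - x_i^{\top} \beta \right)^2
+ \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
```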

E PLURIBUS UNUM

13

For example, we may set a working threshold of risk for our purpose. Say, a firm whose predicted risk stays above the 8th decile (Q8) for three consecutive years is a persistently distressed firm.
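The working rule above can be sketched as follows. Two choices here are assumptions rather than details taken from the slides: the decile cutoff is computed separately for each year with a nearest-rank rule, and "above" means strictly above the cutoff. All names are hypothetical.

```python
def decile_cutoff(values, decile=8):
    """Nearest-rank cutoff: smallest value with decile/10 of the data at or below it."""
    ordered = sorted(values)
    rank = (decile * len(ordered) + 9) // 10  # integer ceiling of decile * n / 10
    return ordered[max(rank, 1) - 1]


def persistently_distressed(risk_by_year, decile=8, run_length=3):
    """Firms whose risk exceeds the yearly cutoff for run_length consecutive years.

    risk_by_year maps year -> {firm_id: predicted probability of failure}.
    """
    streaks = {}  # firm_id -> (last year above cutoff, length of current run)
    flagged = set()
    for year in sorted(risk_by_year):
        cutoff = decile_cutoff(list(risk_by_year[year].values()), decile)
        for firm, risk in risk_by_year[year].items():
            if risk > cutoff:
                last_year, length = streaks.get(firm, (None, 0))
                # The run continues only if the firm was also above last year.
                length = length + 1 if last_year == year - 1 else 1
                streaks[firm] = (year, length)
                if length >= run_length:
                    flagged.add(firm)
    return flagged
```

Changing `decile` or `run_length` gives stricter or looser definitions of persistent distress, which is the flexibility a probability-based score offers over fixed proxy thresholds.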

A DISTRIBUTION FOR THE RISK OF FAILURE

14

This is how the firms we just defined as persistently distressed behave over time and in relation to the GDP growth rate.

DISTRESSED FIRMS AND GDP

15

Ex post, we can look at firms’ characteristics along the distribution of the probability of failure. For example, here we have the (log of) total factor productivity of firms above/below the 8th decile.

WHO IS DISTRESSED

16

WHERE THEY ARE

17

• Stability/Generalizability of the algorithm: what happens to prediction errors when further (non-orthogonal) predictors are included and/or longer time series are used?

• Across countries: first exercises with Spain, France, and Portugal show different rankings of predictors.

• Separate the wheat from the chaff: some innovative firms may appear distressed in the short run but they are not. How far can we go with ML?

ROBUSTNESS AND SENSITIVITY CHECKS

18

• We propose a machine learning technique to assess the probability of failure of a firm after training on firm-level financial accounts.

• This is potentially useful for attributing a ‘score’ to the firm, but also to check how much of an economy is in trouble, and why.

• Much work remains to be done on further training and on checking the stability/generalizability of the algorithm, yet this is an improvement on previous proxy thresholds for assessing credit risk.

CONCLUSIONS

19
