Lyn Thomas-Book


  • 8/10/2019 Lyn Thomas-Book

    1/85

    Credit Scoring: from risk assessment to pricing, profits and portfolios

    Lyn C. Thomas

    Quantitative Financial Risk Management Centre, School of Management

    University of Southampton

    Santiago de Chile, June 11 2008


    Structure

    Recap of consumer credit and credit scoring

    Methodologies for building default risk scorecards

    Current Pressures

    Issues Arising and Future Developments

    Changing objectives of risk assessment bring new methodologies

    Using survival analysis to build scorecards

    Profitability modelling

    Variable pricing

    New issues in data cleaning and enhancing

    Impact of new Basel Accord

    Low default portfolios

    Loss Given Default modelling

    Need for models of credit risk of portfolios of consumer loans

    Conclusions


    History of consumer credit

    Babylonians lent for seed to be repaid at harvest

    Commercial consumer lending has been around for 750 years, since the medieval pawnbrokers

    1920s saw Ford/Sloan not only mass produce cars but also ways of financing them for the masses

    1960s saw the arrival of credit cards and the start of the explosion in consumer credit. The same period saw the growth in home ownership in most Western countries

    Now consumer credit is ubiquitous; it is even argued to be a human right.


    Current consumer credit levels

    Chile (main information from Cox, Parrado, Ruiz-Tagle 2006):

    Household debt is 60% of average annual income (US is 130%; UK, Canada, etc. > 100%)

    75% own their home: 64% of consumer debt is mortgage but held by only 16% of households (US 76%, Canada 69%, UK 73%)

    3 million+ Mastercard credit cards

    5 million+ private label credit/store cards


    Figure 1.1.1: Comparison of US household and corporate debt, 1974-2006

    [Figure: debt in $ billions (0 to 14,000) by year (1970 to 2010). Series: total household, mortgage, consumer credit, total business, corporate.]


    Countries with largest Mastercard/Visa circulation, 2003

    VISA/MC (credit + debit) cards in circulation, 000's

    Rank   Country        Cards
    1      USA            755,300
    2      China          177,359
    3      Brazil         148,435
    4      UK             125,744
    5      Japan          121,281
    6      Germany        109,482
    7      S. Korea       94,632
    8      Taiwan         60,330
    9      Spain          56,239
    10     Canada         51,100
           TOTAL TOP 10   1,699,902
           TOTAL GLOBAL   2,362,042

    Top 10 represent 72% of global VISA/MC cards


    Recap of default based credit scoring: application scoring

    Two types of credit scoring: application scoring and behavioural scoring

    Application scoring:

    Grant credit to new applicant?

    Information available:

    applicant's application form details

    credit reference agency check

    application details/credit histories of previous applicants

    No information available on credit histories of previous applicants who were rejected. Leads to bias.


    Application scoring

    Shaped by original application: whether to accept a new customer

    pragmatic philosophy, predict not explain, no causal modelling

    assumes credit worthiness time independent over 2-3 years

    redo scorecard rather than put in dynamics

    Objective to rank applicants correctly; default level forecast is secondary

    reflected in performance measures Gini coefficient, swap sets

    specific risk: prob. of missing 3 consecutive months in next year.

    50 years since the first commercial application scoring was introduced

    Credit bureau data greatly improved decision making accuracy

    different information levels available in different countries

    Legal considerations

    what cannot be used (race/gender/age?);

    what must be used (affordability in Australia)

    Can there be a world wide consumer credit risk system?


    History of consumer credit modelling: behavioural scoring

    Behavioural models arrived in the 1960s: the revolution that wasn't.

    uses performance data as well as application data (performance data dominates the latter in strength of characteristics)

    What is behavioural scoring used for?

    different decisions to application scoring (credit limit, cross selling)

    same risk: PD in next 12 months

    uses classification (static) models rather than modelling the dynamics of consumer credit risk behaviour

    application scoring: snapshot to snapshot

    behavioural scoring: video clip to snapshot

    profit scoring: video clip to video clip


    Classification methods used in credit granting

    Take sample of previous applicants; classify into good payers or defaulters one year later.

    Classification methods find characteristics identifying the two groups.

    In future accept those with good characteristics; reject bad.

    Existing credit scoring classification methods

    discriminant analysis/linear regression

    logistic regression

    classification trees, random forests

    linear programming

    Developmental credit scoring approaches

    neural networks

    support vector machines

    expert systems

    genetic algorithms

    nearest neighbour methods

    Bayesian learning networks


    Graph of simple scorecard on age and income

    [Figure: scatter plot of goods and bads against age and income, with two separating boundaries drawn.]

    Linear scorecard, age + a(income) = b: not perfect, but only two parameters

    Better classifier: but lots more parameters


    Linear regression and logistic regression

    Discriminant analysis (LDF) is equivalent to linear regression if there are only two classification groups, so can use least squares:

    $p_i = \mathrm{E}\{Y_i\} = w_1 X_1 + \dots + w_p X_p$, where $Y_i = 1$ if the $i$th applicant is good; 0 if bad

    Logistic regression (LR) assumes

    $\log\left(p_i/(1-p_i)\right) = w_1 X_1 + \dots + w_p X_p$

    LR holds for a much wider class of models than LDF

    In both cases need to coarse classify variables to deal with non-monotonicity in the relationship with defaulting


    Default risk with age

    [Figure: default risk (vertical axis, 0 to 30) against age (horizontal axis, 18 to 78).]


    All variables are categorical

    Since risk is not linear in the continuous variables, make these variables categorical as well

    So age splits into bands: 18-21; 22-28; 29-36; 37-59; 60+

    So coarse classify all variables, categorical and continuous

    [Figure: bar chart of default risk (vertical axis, 0 to 25) by age band: 18-21, 22-28, 29-36, 37-59, 60+.]
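A minimal sketch of how a coarse-classified characteristic can feed a log-odds scorecard, assuming the logistic form log(p/(1-p)) = w.x and the age bands 18-21, 22-28, 29-36, 37-59, 60+. The weights and intercept below are hypothetical illustration values, not estimates from any data set.

```python
from math import exp

# Hypothetical log-odds points per age band (illustration only).
AGE_BAND_WEIGHTS = {"18-21": -0.6, "22-28": -0.2, "29-36": 0.1, "37-59": 0.4, "60+": 0.7}
INTERCEPT = 1.5  # hypothetical population log-odds of being good


def age_band(age):
    """Coarse classify a continuous age into one of the five bands."""
    if age <= 21:
        return "18-21"
    if age <= 28:
        return "22-28"
    if age <= 36:
        return "29-36"
    if age <= 59:
        return "37-59"
    return "60+"


def log_odds_score(age):
    """Score = intercept + weight of the band the applicant falls in."""
    return INTERCEPT + AGE_BAND_WEIGHTS[age_band(age)]


def prob_good(score):
    """Invert log(p / (1 - p)) = score to get the probability of being good."""
    return 1.0 / (1.0 + exp(-score))


if __name__ == "__main__":
    for age in (19, 25, 40, 65):
        print(age, age_band(age), round(prob_good(log_odds_score(age)), 3))
```

Because each band gets its own weight, the score does not have to rise or fall monotonically with age, which is the point of coarse classifying.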


    Linear Programming approach

    Assume $n_G$ goods labelled $i = 1, 2, \dots, n_G$ and $n_B$ bads labelled $i = n_G+1, \dots, n_G+n_B$

    Require weights $w_j$, $j = 1, 2, \dots, p$ and a cut-off value $c$ such that

    For goods: $w_1 x_{i1} + w_2 x_{i2} + \dots + w_p x_{ip} > c$

    For bads: $w_1 x_{i1} + w_2 x_{i2} + \dots + w_p x_{ip} < c$


    Classification trees: grouping rather than scoring

    Methods like classification trees, expert systems and neural nets classify applicants into groups rather than giving a scorecard.

    Classification trees were developed in both statistics and computer science, so the method is also called the recursive partitioning algorithm

    Splits sample A into two subsets, using attributes of one characteristic, so the two subsets have maximum difference in bad rate

    Take each subset and repeat the process until one decides to stop

    Each terminal node is classified as Good or Bad

    Classification tree depends on

    Splitting rule: how to choose best daughter subsets

    Stopping rule: when one decides this is a terminal node

    Assigning rule: which categories for terminal nodes
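The splitting rule described above can be sketched as follows: a minimal illustration, assuming a single numeric characteristic and a binary split chosen to maximise the difference in bad rate between the two daughter subsets. The function names and the toy sample are hypothetical.

```python
def bad_rate(bad_flags):
    """Fraction of bads (1 = bad, 0 = good) in a subset."""
    return sum(bad_flags) / len(bad_flags)


def best_binary_split(values, is_bad):
    """Return (threshold, difference) for the split value <= threshold
    versus value > threshold with the largest gap in bad rate."""
    best_threshold, best_diff = None, 0.0
    # The largest value is skipped: it would leave the right subset empty.
    for threshold in sorted(set(values))[:-1]:
        left = [b for v, b in zip(values, is_bad) if v <= threshold]
        right = [b for v, b in zip(values, is_bad) if v > threshold]
        diff = abs(bad_rate(left) - bad_rate(right))
        if diff > best_diff:
            best_threshold, best_diff = threshold, diff
    return best_threshold, best_diff


if __name__ == "__main__":
    ages = [19, 22, 25, 31, 40, 45, 52, 60]
    bads = [1, 1, 1, 0, 0, 0, 0, 0]
    print(best_binary_split(ages, bads))
```

A full tree would apply this recursively to each daughter subset until a stopping rule is met; other splitting criteria are equally possible.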


    Classification tree: credit risk example

    [Figure: classification tree. The whole sample is split on residential status (owner / not owner); owners are split on years at bank (> 2 / < 2) and non-owners on age (< 26 / > 26). Lower levels split on number of children (0 / 1+), employment (professional / not professional), age (at 21) and residential status (with parents / other).]

    Extend to random forests: lots of such trees, each on a subset of the sample data and a subset of characteristics. Majority voting to classify


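    Neural Network

    [Figure: artificial neuron. Inputs X1, X2, ..., Xp (for example years at address, income, age) with weights W1, W2, ..., Wp are combined into NET, which passes through the activation function to give OUT.]

    $\mathrm{OUT} = f(\mathrm{NET}) = f(\mathbf{w}\cdot\mathbf{x}) = f(w_1 x_1 + \dots + w_p x_p)$

    Neural network: computer system consisting of a number of processing units

    Processors connected together in layers

    For credit scoring, characteristics are the input layer and the prediction of Bads is the output layer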


    TWO LAYER NEURAL NETWORK

    [Figure: inputs X1, X2, ..., Xp (for example age, years at bank, income) feed hidden nodes K1, K2, K3 through weights W11, W12, ..., W1q; the hidden nodes feed the output Z (Good/Bad).]

    $w_{11} x_1 + w_{21} x_2 + \dots = s_1$

    $o_1 = 1 / (1 + e^{-s_1})$

    If only input and output layer then can be no better than linear regression

    Train by pattern discrimination or backward propagation
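A minimal sketch of the single-neuron calculation, assuming the logistic activation o = 1 / (1 + e^(-s)) applied to the weighted sum s = w.x. The weights and inputs in the usage lines are hypothetical.

```python
from math import exp


def neuron_output(weights, inputs):
    """Weighted sum of the inputs passed through a logistic activation."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + exp(-net))


if __name__ == "__main__":
    # Hypothetical standardised inputs: age, years at bank, income.
    print(neuron_output([0.8, 0.3, 0.5], [0.2, -1.0, 1.5]))
```

A hidden layer simply applies this calculation once per hidden node and feeds the outputs into a further neuron.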


    Problems when using Neural networks in Credit Scoring

    Can take too long to run

    Do not meet legal requirements that one can give reasons for rejecting

    Local minima

    [Figure: error against time, with points A, B and C marked on the curve.]

    How many hidden layers? Often only three*

    How many nodes in each layer?

    How to interpret weights or restrict connections?


    Is there a best classification method ?

    Logistic regression is the industry norm

    often used in conjunction with other approaches

    classification trees, linear regression, linear programming

    Segmented population: different scorecard for each segment

    System reasons (e.g. new accounts)

    Statistical reasons (way of dealing with interactions in variables)

    Strategic reasons (want to be able to deal differently with some groups)

    Newer classification methods have been piloted

    don't have transparency or robustness

    Flat maximum effect: lots of almost equally good scorecards


    Relative ranking of 17 methods on 8 consumer credit data sets (Baesens JORS 2003)

    Methods applied to 8 datasets using 3 measures (24 tests)

    Method                                   Times best of 17 tried   Times statistically insignificant difference with best
    Linear regression                        1                        10
    Logistic regression                      1                        13
    Linear programming                       4                        18
    Best version of SVM                      4                        12
    Other versions of SVM                    3                        10
    Neural nets                              6                        21
    Best version of classification trees     2                        12
    Other versions of classification trees   2                        5
    Nearest neighbour                        0                        9


    Differences are in other features

    Regression approach allows statistical tests to say how important each characteristic is to classification

    gives lean/mean scorecards

    helps devise new application forms

    Linear programming allows firms to set requirements on scores

    score (age score (age >60)

    deals more easily with large numbers of application characteristics

    Classification trees, neural nets and support vector machines pick up relationships between variables which may not be obvious


    Measuring scorecards in credit scoring

    Three aspects of scorecard performance

    Discriminatory power (only scorecard needed)

    How good is the system at separating the two classes of goods and bads

    Divergence statistic

    Mahalanobis distance

    Somers' D-concordance statistic

    Kolmogorov-Smirnov statistic

    ROC curve

    Gini coefficient

    Calibration of forecast (scorecard plus population odds)

    Not used much until Basel requirements, and so few tests

    Chi-square (Hosmer-Lemeshow) test

    Binomial and normal tests

    Prediction error (scorecard + population odds + cut-off)

    How many erroneous classifications

    Error rates

    Confusion matrix, swap sets, specificity, sensitivity


    ROC curves and Gini Coefficient

    [Figure: ROC curve with axes F(s|G) and F(s|B); points A, B and C mark the region between the curve and the diagonal.]

    Gini coefficient, G = 2 x (area between curve and diagonal) = 2 x (area ABC)

    G = 1: perfect discrimination; G = 0: no discrimination.

    K-S is the greatest vertical distance from the diagonal to the curve.
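A minimal sketch of both measures, assuming the ROC curve plots F(s|B) against F(s|G) as the cut-off score s increases, and that a higher score means a better applicant. The function names and sample scores are hypothetical; the area is computed with the trapezium rule.

```python
def roc_points(good_scores, bad_scores):
    """Points (F(s|G), F(s|B)) for every observed cut-off s, starting at (0, 0)."""
    points = [(0.0, 0.0)]
    for s in sorted(set(good_scores) | set(bad_scores)):
        f_good = sum(g <= s for g in good_scores) / len(good_scores)
        f_bad = sum(b <= s for b in bad_scores) / len(bad_scores)
        points.append((f_good, f_bad))
    return points


def gini_and_ks(good_scores, bad_scores):
    """Gini = 2 x area between curve and diagonal; KS = largest vertical gap."""
    pts = roc_points(good_scores, bad_scores)
    area_under_curve = sum(
        (x1 - x0) * (y0 + y1) / 2.0 for (x0, y0), (x1, y1) in zip(pts, pts[1:])
    )
    gini = 2.0 * (area_under_curve - 0.5)
    ks = max(f_bad - f_good for f_good, f_bad in pts)
    return gini, ks


if __name__ == "__main__":
    print(gini_and_ks([600, 650, 700, 520], [400, 450, 500, 610]))
```

With perfectly separated scores the function returns (1.0, 1.0); with identical score distributions it returns (0.0, 0.0).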


    Current pressures

    Lenders want to maximise profit not minimise default rates

    want to optimise all decisions in customer relationship

    not just whether to accept customer for vanilla loan.

    Consumers

    market near saturation in some countries

    so take rates dropping, attrition rates rising

    want customized products: will they buy into risk-based pricing?

    Industry

    Basel New Capital Accord, begun in 2007, means IRB systems of consumer credit are de rigueur

    need models of credit risk of portfolios of consumer loans

    Basel II uses corporate model, need models for Basel III

    securitization: bundling and pricing models are primitive


    Changing objectives of risk assessment bring new methodologies

    Changes in objectives are more likely than a need for improved accuracy to force changes in methodology

    Move to assessing profitability not just default risk

    Need to estimate several events (default, cross selling, churn) and also when these events will occur

    survival analysis approaches

    Markov chain models

    need for dynamic models which incorporate economic/market effects


    Traditional approach to credit scoring

    Take fixed time horizon T

    If default occurs within that time: Bad; if no default within that time: Good/Indeterminate

    Arbitrary: if time horizon is T, default at T-1 is bad, default at T+1 is good (or at least indeterminate).

    Lose information: indeterminates left out.

    Those who fail at 3 months classified same as those who fail at T-1 months.

    Competing risks ignored: those who leave/pay off early during the outcome period are left out of default scorecard building and vice versa.


    Survival analysis: ask when

    Ask when events happen: default, early repayment, purchase

    deals with censored data easily

    gives a handle on profit, as profit depends on time until certain events occur (default, switch lenders)

    does not require any choice of time horizon, so no arbitrariness

    uses the data on everyone, so no loss of information

    allows competing risks models, so can build default, purchase and attrition models on same data.


    Censoring Mechanism

    [Figure: accounts A, B and C plotted against months on books (0, 6, 12, 24, 47) up to the end of sample date. Outcomes shown: default; censored (closed account); censored (truncated); censored (truncated and started after start of sample).]


    Using Survival Analysis

    How long do customers survive before they default?

    How long do customers stay before they change companies?

    How long until a customer makes the next purchase?

    How long do deteriorating systems survive before failure?

    Survival analysis: analysis of lifetime data when there is censoring

    Lifetime T: time before loan defaults (or repays early, or purchase is made).

    Standard ways of describing the randomness of T are

    distribution function $F(t)$, where $F(t) = \Pr\{T \le t\}$ ($S(t) = 1 - F(t)$ is the survivor function)

    density function $f(t)$, where $\Pr\{t \le T \le t + \delta t\} = f(t)\,\delta t$

    hazard function $h(t) = f(t)/(1 - F(t))$, so $h(t)\,\delta t = \Pr\{t \le T \le t + \delta t \mid T \ge t\}$


    Hazard Function

    T: r.v. representing failure time (time to default/early pay-off)

    Hazard function: $h(t) = \lim_{\delta t \to 0} \Pr\{t \le T < t + \delta t \mid T \ge t\} / \delta t$

    In discrete time, $h(t)$ is the probability of default in period $t$ given no default before.
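A minimal sketch of the discrete-time link between hazard and survival, assuming h(i) is the probability of default in period i given no default before, so that the chance of surviving periods 1..i is the product of (1 - h(j)) over j = 1..i. The hazard values in the usage lines are hypothetical.

```python
def survival_curve(hazards):
    """Return [S(0), S(1), ..., S(n)] where S(0) = 1 and
    S(i) = S(i - 1) * (1 - h(i)) for the discrete hazards h(1..n)."""
    survival = [1.0]
    for h in hazards:
        survival.append(survival[-1] * (1.0 - h))
    return survival


if __name__ == "__main__":
    monthly_hazards = [0.002, 0.004, 0.006, 0.005, 0.004]  # hypothetical values
    print(survival_curve(monthly_hazards))
```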


    Proportional hazards (PH) and accelerated life (AL) models

    Explanatory variables allow for heterogeneity of the population.

    Proportional hazard models and accelerated life models connect explanatory variables to failure times in survival analysis

    Let $\mathbf{x} = (x_1, x_2, \dots, x_N)$ be explanatory variables.

    Accelerated life models assume

    $S(t, \mathbf{x}) = S_0(e^{\mathbf{b}\cdot\mathbf{x}} t)$ or $h(t, \mathbf{x}) = e^{\mathbf{b}\cdot\mathbf{x}} h_0(e^{\mathbf{b}\cdot\mathbf{x}} t)$

    $S_0$, $h_0$ are the baseline survivor and hazard rate functions, and the x's speed up or slow down 'ageing'

    Proportional hazard models assume

    $h(t, \mathbf{x}) = e^{\mathbf{b}\cdot\mathbf{x}} h_0(t)$

    Explanatory variables have a multiplier effect on the base hazard rate.


    Cox Proportional Hazards Model (non-parametric approach)

    Cox showed one can estimate $\mathbf{b}$ without knowledge of $h_0(t)$ by using the rank of failure and censored times.

    If times are discrete, so 'lots of ties', need an approximation in the Maximum Likelihood estimator.

    So if T is the r.v. representing failure time (time to default/early pay-off) and $\mathbf{x}$ is the vector of covariates, $h(t, \mathbf{x})$ is the hazard for an individual with characteristics $\mathbf{x}$:

    $h(t, \mathbf{x}) = e^{-s_h(\mathbf{x})} h_0(t)$

    $s_h(\mathbf{x}) = b_1 x_1 + b_2 x_2 + \dots + b_n x_n$

    $s_h$ acts like a scorecard (the minus ensures a higher score means a better loan)


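    Comparison of logistic regression and survival analysis

    For a borrower with characteristics $\mathbf{x}$

    Logistic regression

    Performance horizon of $t^*$; if $p = P_G(t^*, \mathbf{x})$, score $s$ is defined by

    $s = \ln\left(\frac{p}{1-p}\right) = \mathbf{w}\cdot\mathbf{x}$, so $p = \frac{1}{1 + e^{-s}}$

    Proportional hazards

    Can estimate $P_G(t, \mathbf{x})$ for any $t$ and $\mathbf{x}$. Consider $p = P_G(t^*, \mathbf{x})$

    $p = c^{e^{-s_h}}$ with $s_h = \mathbf{w}\cdot\mathbf{x}$, so $-\log(-\log p) = s_h - \log(-\log c)$ (under the proportional hazards model $c = S_0(t^*)$, the baseline survival probability at $t^*$)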
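A minimal sketch of the two score-to-probability mappings being compared, assuming the logistic form p = 1 / (1 + e^(-s)) and the proportional hazards form p = c^(e^(-s_h)), where c is taken to be the baseline probability of surviving to the horizon. The function names and the value c = 0.9 are hypothetical.

```python
from math import exp, log


def prob_good_logistic(score):
    """Logistic regression: s = ln(p / (1 - p))  =>  p = 1 / (1 + e^-s)."""
    return 1.0 / (1.0 + exp(-score))


def prob_good_ph(score_h, baseline_survival):
    """Proportional hazards: p = c ** exp(-s_h), c = baseline survival at the horizon."""
    return baseline_survival ** exp(-score_h)


def ph_score_from_prob(p, baseline_survival):
    """Invert the PH mapping: s_h = -log(-log p) + log(-log c)."""
    return -log(-log(p)) + log(-log(baseline_survival))


if __name__ == "__main__":
    for s in (-1.0, 0.0, 1.0, 2.0):
        print(s, round(prob_good_logistic(s), 4), round(prob_good_ph(s, 0.9), 4))
```

Both mappings are increasing in the score, so either score ranks borrowers; they differ in how the score translates into a probability of being good.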


    Building a credit scorecard for estimating when customers default using proportional hazards

    Take sample of past customers with their application and bureau characteristics (as usual)

    For each, give time of default or the time history was censored (no further info in sample/time left lender)

    Coarse classify variables without using time horizon

    Check need for time dependent variables

    Build proportional hazards model

    Statistical tests for validating model


    Coarse-classifying using PH approach

    Split variable into n binary variables (each covering a category or, in the continuous variable case, a range of (1/n)th of the population).

    Apply PH model with these binary variables as characteristics

    Chart parameter estimates

    choose splits based on similarity of parameter estimates

    Note: It is important to do splits separately for every type of failure.

    [Figure: parameter estimates for default (left) and early repayment (right).]


    Comparing Logistic Regression and Proportional Hazards for estimating default risk

    Two definitions of bad

    1. Defaulted on loan in first 12 months

    2. Still repaying after 12 months but defaulted in the next twelve months.

    Two separate LR models, one for each definition.

    One PH model predicting time to early pay-off.

    So LRs should be best as they are designed for each specific definition of bad

    Compare model performance using ROC curves


    ROC curves for PH and LR predicting default

    [Figure: two ROC panels, PH vs LR 1 (1st 12 months) and PH vs LR 2 (2nd 12 months).]


    Application in Basel II

    Basel II is the new set of regulations concerning how much banks need to set aside to cover credit losses

    Use credit scoring to identify PD, the probability of default in the next 12 months, which feeds into the Basel formulae for how much to set aside

    Low default portfolios (like mortgages) do not have enough bads over 12 months to build good models

    Use longer time intervals: go bad at any time

    How to recover 12 month PD?

    Answer: survival analysis
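One way to read the answer, sketched under assumptions: if a survival model built on all the data gives S(t), the probability of no default in the first t months on books, then the probability of default in the next 12 months for an account that has survived t months is 1 - S(t + 12) / S(t). The function name and the constant 1% monthly hazard are hypothetical.

```python
def pd_next_months(survival, months_on_books, horizon=12):
    """PD over the next `horizon` months given survival to `months_on_books`.

    `survival[t]` is the probability of no default in the first t months,
    so the list must extend to index months_on_books + horizon.
    """
    return 1.0 - survival[months_on_books + horizon] / survival[months_on_books]


if __name__ == "__main__":
    # Hypothetical survival curve from a constant 1% monthly hazard.
    survival = [0.99 ** t for t in range(49)]
    print(round(pd_next_months(survival, 24), 4))
```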


    Profitability Modelling

    Emphasis moving away from minimising default to maximising profit

    Acceptance decisions (no longer yes/no)

    Several variants of the product

    Customized product

    price appropriate for profit

    Customize non price features on line

    Operational decisions

    Credit limit adjustments

    Cross sell or up sell

    Counter attrition measures

    optimise collections process for defaulters

    Behavioural score on its own not enough

    Current Profit Approach: Risk/Reward Matrix

    Overdraft limit    Balance < $1000   Balance $1000-$5000   Balance > $5000
    Behav score 500    No overdraft      $1000                 $5000

    Use behavioural score (risk) and average balance (return)

    No recognition of dynamics of customer behaviour

    Subjective decision in each cell. No optimization model within each cell

    Overcome this by using dynamic models: survival analysis and Markov chains


    PH model to calculate profit on fixed term loan

    L: loan amount; T: term of the loan; a: repayment per period

    r: interbank lending rate

    $h_e(i)$: hazard function for early repayment in period i

    Can generalise and allow r to be time dependent (yield curve) or stochastic

    Build PH model to estimate time to default and hence S(i), the probability of no default before month i

    Similarly build PH model for time until early repayment and hence E(i), the probability of no early repayment before month i

    Profit (no consideration of default/early repayment) $= \sum_{i=1}^{T} \frac{a}{(1+r)^i} - L$

    True profit: [equation not recoverable from the source; it is a sum over $i = 1, \dots, T$ of terms discounted by $(1+r)^i$ involving $a$, $S(i)$, $E(i)$, $h_e(i)$ and $R(r, L)$, less $L$]
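A minimal sketch of the profit calculation, assuming the no-default profit is the discounted repayments less the loan, sum over i of a / (1 + r)^i minus L. The second function is a deliberately simplified illustration, not the full true-profit formula: it only weights each repayment by the probability that the loan has neither defaulted nor been repaid early, and ignores recoveries and the settlement received on early repayment. All names and numbers are hypothetical.

```python
def profit_no_default(loan, repayment, rate, term):
    """Discounted repayments minus the loan: sum_{i=1..T} a / (1 + r)^i - L."""
    return sum(repayment / (1 + rate) ** i for i in range(1, term + 1)) - loan


def survival_weighted_profit(loan, repayment, rate, no_default, no_early):
    """Simplified expected profit: repayment i counts only with probability
    S(i) * E(i), where no_default[i - 1] = S(i) and no_early[i - 1] = E(i).
    Recoveries and early-settlement amounts are ignored (an assumption)."""
    expected = sum(
        s * e * repayment / (1 + rate) ** i
        for i, (s, e) in enumerate(zip(no_default, no_early), start=1)
    )
    return expected - loan


if __name__ == "__main__":
    print(profit_no_default(1000.0, 400.0, 0.05, 3))
    print(survival_weighted_profit(1000.0, 400.0, 0.05, [0.99, 0.97, 0.95], [0.98, 0.95, 0.92]))
```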


    Plots of profits for loans of different durations

    [Figure: profit against default score (increasing default probability) for loans of different durations.]


    Markov Chain Models

    Already used for roll rate analysis

    Overdue by   0 months   1 month   2 months   3 months+
    0 months     0.95       0.05      0          0
    1 month      0.4        0.2       0.4        0
    2 months     0.3        0.1       0.1        0.5

    Extend to more general states

                       BS1    BS2    BS3    BS4    Closed   Overlimit   Default
    Beh Score Band 1   0.85   0.03   0.02   0      0.1      0           0
    Beh Score Band 2   0.4    0.4    0.05   0.05   0.05     0.05        0
    Beh Score Band 3   0.1    0.3    0.3    0.2    0.05     0.1         0.05
    Beh Score Band 4   0      0.2    0.3    0.3    0.05     0.05        0.1
    Closed             0      0      0      0      1        0           0
    Overlimit          0      0      0      0      0        1           0
    Default            0      0      0      0      0        0           1
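A minimal sketch of how a roll-rate matrix is used, assuming the three rows for 0, 1 and 2 months overdue given above. The fourth row, which treats 3 months+ overdue as an absorbing state, is an added assumption so that the matrix is square.

```python
# Rows and columns: 0, 1, 2, 3+ months overdue.
ROLL_RATES = [
    [0.95, 0.05, 0.0, 0.0],
    [0.4, 0.2, 0.4, 0.0],
    [0.3, 0.1, 0.1, 0.5],
    [0.0, 0.0, 0.0, 1.0],  # assumption: 3+ months overdue is absorbing
]


def step(distribution, matrix):
    """One month of transitions: new[j] = sum_i distribution[i] * matrix[i][j]."""
    size = len(matrix)
    return [sum(distribution[i] * matrix[i][j] for i in range(size)) for j in range(size)]


def distribution_after(months, start, matrix):
    """State distribution after the given number of months."""
    current = list(start)
    for _ in range(months):
        current = step(current, matrix)
    return current


if __name__ == "__main__":
    # All accounts start up to date.
    print(distribution_after(12, [1.0, 0.0, 0.0, 0.0], ROLL_RATES))
```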

    Credit Limits set by Markov chain profitability model (Capital One; Interfaces 2003)

    State: s; credit limit: L

    Estimate monthly profit $r(s, L)$ and transition probability $p(s' \mid s, L)$

    Markov decision process: $V_n(s, L)$ is the optimal profit over n periods starting in state s with limit L

    $V_n(s, L) = \max_{L' \ge L} \left\{ r(s, L') + \sum_{s'} p(s' \mid s, L')\, V_{n-1}(s', L') \right\}$

    Can improve model by

    Second order Markov chain: $s(t) = (BS(t), BS(t-1))$

    Include economic variables in transition matrix

    Include age of loan in transition matrix

    Segment population

    Mover/stayer

    Revolver/transactor
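A minimal sketch of the recursion, assuming the form V_n(s, L) = max over L' >= L of { r(s, L') + sum over s' of p(s' | s, L') V_{n-1}(s', L') }, with V_0 = 0 and credit limits that can only be kept or raised. The two states, two limits, rewards and transition probabilities are hypothetical illustration values.

```python
def optimal_values(states, limits, reward, transition, periods):
    """Finite-horizon value iteration over (state, limit) pairs.

    reward[(s, L)] is the one-period profit, transition[(s, L)][s2] is the
    probability of moving from s to s2 under limit L.
    """
    values = {(s, L): 0.0 for s in states for L in limits}
    for _ in range(periods):
        updated = {}
        for s in states:
            for L in limits:
                updated[(s, L)] = max(
                    reward[(s, new_L)]
                    + sum(transition[(s, new_L)][s2] * values[(s2, new_L)] for s2 in states)
                    for new_L in limits
                    if new_L >= L  # the limit can be kept or raised, never cut
                )
        values = updated
    return values


if __name__ == "__main__":
    states = ["good", "bad"]
    limits = [1000, 2000]
    reward = {("good", 1000): 10.0, ("good", 2000): 15.0,
              ("bad", 1000): -5.0, ("bad", 2000): -20.0}
    transition = {
        ("good", 1000): {"good": 0.9, "bad": 0.1},
        ("good", 2000): {"good": 0.85, "bad": 0.15},
        ("bad", 1000): {"good": 0.5, "bad": 0.5},
        ("bad", 2000): {"good": 0.4, "bad": 0.6},
    }
    print(optimal_values(states, limits, reward, transition, periods=12))
```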


    Pricing

    Surprising: for 40 years consumer lending has had only one price

    Decision was: is the risk acceptable or not acceptable?

    Now beginning to price for risk

    Company                       APR rate advertised
    Yourpersonalloan.co.uk        6.3%
    Bradford and Bingley          6.7%
    GE Money                      6.9%
    Sainsbury                     7.1%
    Northern Rock                 7.4%
    Royal Bank of Scotland        7.8%
    Nat West Bank                 8.0%
    Halifax                       8.7%
    Nationwide Building Society   8.9%
    Intelligent Finance           9.7%
    Tesco                         9.9%
    Lloyds TSB                    11.4%
    Autocredit                    16.7%
    Citi Finance                  23.1%
    Provident                     177.0%

    Key points in developing pricing models

    Lost quote data is valuable

    Find out who did not take offer ( and if possible why)

    Regulations will set constraints on minimum and maximum prices

    Could say take everyone but the price for some is so high no one will accept, but there are always idiots (who the regulations will protect).

    Market changes much faster than economic changes

    response scorecards need to be rebuilt faster than risk ones

    Utilization of product is important for profitability

    Pre-payment and re-financing need modelling

    Prices will always be negotiable once they are variable.

    Adverse selection

    Offer at interest rate 6% does not get the normal population mix, but more of those who could not get a better offer than 6%


    Take Probability q(p,r)

    Profitability depends vitally on take probability

    Take probability is function of

    risk probability p (probability of being good) of borrower

    rate offered r

    Take probability can also depend on other features

    Need to estimate this probability

    Cannot estimate without considering adverse selection (i.e. it does depend on p, and more so than you may estimate)


    [Figure: Pr{Take} as a function of Pr{Good}.]


    Common risk free take functions q(r)

    $q(r)$: fraction who will take loan at rate r

    $dq/dr \le 0$

    $w(r)$: density function of maximum willingness to pay

    $\int_{r_1}^{\infty} w(r)\,dr$ = fraction of population willing to pay $r_1$ or more = $q(r_1)$

    Linear response function

    $q(r) = \max\{0,\ 1 - b(r - r_L)\}$ for $r \ge r_L > 0$

    Logistic response function

    $q(r) = \frac{e^{a - br}}{1 + e^{a - br}} \iff \ln\left(\frac{q(r)}{1 - q(r)}\right) = a - br = s_{response}$
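A minimal sketch of the two take functions, assuming the linear form q(r) = max{0, 1 - b(r - r_L)} for r >= r_L and the logistic form q(r) = e^(a - br) / (1 + e^(a - br)). The parameter values in the usage lines are hypothetical.

```python
from math import exp


def linear_take(rate, b, r_low):
    """Linear response: everyone takes at r_low, take falls by b per unit of rate."""
    return max(0.0, 1.0 - b * (rate - r_low))


def logistic_take(rate, a, b):
    """Logistic response: log-odds of taking the offer are a - b * rate."""
    return exp(a - b * rate) / (1.0 + exp(a - b * rate))


if __name__ == "__main__":
    for r in (0.05, 0.10, 0.15, 0.20):
        print(r, round(linear_take(r, 4.0, 0.05), 3), round(logistic_take(r, 4.0, 32.0), 3))
```

Both functions are non-increasing in the rate, in line with dq/dr <= 0.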


    Optimal price for risk free response function

    $\max_r E[P_A(r)] = q(r)\left((r - r_F)p - (l_D + r_F)(1 - p)\right)$

    $q'(r)\left((r - r_F)p - (l_D + r_F)(1 - p)\right) + q(r)p = 0$

    $\Rightarrow r = r_F + \frac{(l_D + r_F)(1 - p)}{p} - \frac{q(r)}{q'(r)}$

    $\Rightarrow r = r_F + (l_D + r_F)e^{-s} - \frac{q(r)}{q'(r)}$, where $s = \ln(p/(1-p))$ is the log-odds score

    Example with logistic response

    a=4, b=32, r_F=0.05, l_D=0.5

    Probability of being Good, p   Optimal interest rate r as %   Take probability q(r) as %
    1.00                           11.9                           54.7
    0.99                           12.2                           52.7
    0.98                           12.4                           50.5
    0.96                           13.0                           45.7
    0.94                           13.7                           40.2
    0.9                            15.5                           28.0
    0.8                            22.0                           4.5
    0.7                            31.7                           0.2
    0.6                            44.8                           0.003
    0.5                            63.1                           0.000009
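A minimal sketch of how such a table can be computed, assuming expected profit per applicant of q(r)((r - r_F)p - (l_D + r_F)(1 - p)) and a logistic take function with log-odds a - br. Setting the derivative to zero gives r = r_F + (l_D + r_F)(1 - p)/p + (1 + e^(a - br))/b, which is solved here by bisection. The bracket [0, 10] for the rate is an assumption that holds for the p values shown.

```python
from math import exp


def take_probability(rate, a=4.0, b=32.0):
    """Logistic take function with log-odds a - b * rate."""
    return 1.0 / (1.0 + exp(-(a - b * rate)))


def optimal_rate(p, a=4.0, b=32.0, r_f=0.05, l_d=0.5):
    """Solve r = r_f + (l_d + r_f)(1 - p)/p + (1 + e^(a - b r))/b by bisection."""
    def gap(r):
        return r - (r_f + (l_d + r_f) * (1.0 - p) / p + (1.0 + exp(a - b * r)) / b)

    low, high = 0.0, 10.0  # gap(low) < 0 < gap(high) for the p values used here
    for _ in range(100):
        mid = (low + high) / 2.0
        if gap(mid) < 0.0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


if __name__ == "__main__":
    for p in (1.0, 0.99, 0.98, 0.96, 0.94, 0.9, 0.8, 0.7, 0.6, 0.5):
        r = optimal_rate(p)
        print(p, round(100 * r, 1), 100 * take_probability(r))
```

The gap function is strictly increasing in r, so the root found by bisection is unique.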


    Risky response rates q(r,p)

    Same principle but now have to worry about

    Adverse selection

    Affordability: $\tilde{p}(r, p)$ is the probability of the borrower being Good if the interest rate charged is r, where p is the probability of being Good at the benchmark interest rate

    $E[P_A(r, p)] = q(r, p)\left((r(p) - r_F)\,\tilde{p}(r, p) - (l_D + r_F)(1 - \tilde{p}(r, p))\right)$

    $\frac{\partial q(r, p)}{\partial r}\left((r(p) - r_F)\,\tilde{p}(r, p) - (l_D + r_F)(1 - \tilde{p}(r, p))\right) + q(r, p)\left(\tilde{p}(r, p) + \frac{\partial \tilde{p}(r, p)}{\partial r}(r(p) + l_D)\right) = 0$

    $\Rightarrow r(p) = -l_D + \dfrac{(l_D + r_F)\frac{\partial q(r, p)}{\partial r} - \tilde{p}(r, p)\,q(r, p)}{\frac{\partial}{\partial r}\left(\tilde{p}(r, p)\,q(r, p)\right)}$

    Example with logistic response

    a=4, b=32, r_F=0.05, l_D=0.5 and c=50

    Risky response rate: $q(r) = \frac{e^{a - br - cp}}{1 + e^{a - br - cp}} \iff \ln\left(\frac{q(r)}{1 - q(r)}\right) = a - br - cp = s_{response}$

    Probability of being Good, p   Optimal interest rate r as %   Take probability q(r) as %   Take probability from equivalent risk free logit response rate function
    1.00                           11.9                           54.7                         54.7
    0.99                           13.0                           58.2                         52.7
    0.98                           14.2                           61.3                         50.5
    0.96                           16.6                           66.5                         45.7
    0.94                           19.1                           70.6                         40.2
    0.9                            24.4                           76.5                         28.0
    0.8                            38.5                           84.2                         4.5
    0.7                            53.3                           87.4                         0.2
    0.6                            68.6                           88.4                         0.003
    0.5                            84.6                           87.3                         0.000009


    Profit scoring and pricing


    Profit scoring involves much more of the organisation than default based scoring

    Risk based pricing needs much more careful modelling and parameter estimation

    Adverse selection

    Cannibalisation

    Other features might affect response rate, not just price (interest rate charged)

    Dynamic price modelling will come

    proved successful in airlines, hotels, car rentals

    Has arrived in consumer credit

    HBOS claim benefits of 7 million per year already

    Storing up trouble: Data cleaning and parameter estimation

    Reject inference :

    sample biased because of those rejected in the past

    Well established problem with controversial but standard techniques used by industry

    resurgence of interest, new ideas suggested and old approaches revisited. Some ideas coming from economics literature

    Surely 1 in n(s) must be satisfactory compromise

    Drop/withdrawal (churn) inference

    this group can be 2 to 5 times larger than reject group

    should they be in the sample? Could make product attractive to them

    Policy inference

    Customer scores are used in more operating decisions, which will affect subsequent performance of the customer, including default risk

    Can one (and how to) construct what performance/risk would have been under a vanilla operating policy?

    New Basel Capital Accord (started parallel implementation 1 January 2007, started for real 1 January 2008)

    Basel committee of banking regulators (Fed etc.) required banks to set aside 8% of loans (capital requirement) to cover risks on losses.

    New system based on using banks' internal risk rating systems.

    Risks split into market, credit and operational. Capital set aside to cover each

    For credit risk, minimum capital requirement can be set using an internal ratings based (IRB) model as well as the standard (fixed %) model

    In IRB models, segment portfolio of loans and for each segment give

    PD (long run average probability of default in next 12 months)

    LGD (downturn loss given default)

    EAD (expected exposure at default)

    Used in Basel formulae to calculate capital needed to cover UL (unexpected loss due to credit risk).

    EL (expected loss due to credit risk) should be covered by provisions

    For customer lending IRB is credit scoring

    Basel forces scores to forecast accurately, not just rank accurately.

    Credit risk weighted assetsfor corporate and retail exposures

  • 8/10/2019 Lyn Thomas-Book

    64/85

    for corporate and retail exposures

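    Capital needed is

    $$K = \mathrm{LGD}\left[N\!\left(\left(\frac{1}{1-R}\right)^{1/2} N^{-1}(\mathrm{PD}) + \left(\frac{R}{1-R}\right)^{1/2} N^{-1}(0.999)\right) - \mathrm{PD}\right]\frac{1+(M-2.5)\,b}{1-1.5\,b}$$

    where $N$ is the cumulative Normal distribution, $N^{-1}$ is its inverse and $R$ is the correlation

    Only covers unexpected risk; so if R=0, K=0; if R=1, K=LGD(1-PD)

    Retail exposures: M=1 (maturity term disappears)

    For Mortgages R=0.15

    For Revolving R=0.04

    For other retail

    $$R = 0.03\,\frac{1-e^{-35\,\mathrm{PD}}}{1-e^{-35}} + 0.16\left(1-\frac{1-e^{-35\,\mathrm{PD}}}{1-e^{-35}}\right)$$

    Corporate exposures: $b = \left(0.11852 - 0.05478\ln(\mathrm{PD})\right)^2$

    $$R = 0.12\,\frac{1-e^{-50\,\mathrm{PD}}}{1-e^{-50}} + 0.24\left(1-\frac{1-e^{-50\,\mathrm{PD}}}{1-e^{-50}}\right)$$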
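The capital formula on this slide can be sketched in a few lines of Python. This is a minimal illustration using the standard Basel II IRB functions with the parameter values quoted on the slide, not a regulatory implementation; the function names `correlation` and `capital` and the exposure labels are choices made for the example.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_N = NormalDist()  # standard Normal: cdf is N, inv_cdf is N^-1


def correlation(pd: float, kind: str) -> float:
    """Correlation R for each exposure class, using the slide's values."""
    if kind == "mortgage":
        return 0.15
    if kind == "revolving":
        return 0.04
    if kind == "other_retail":
        low, high, k = 0.03, 0.16, 35.0
    elif kind == "corporate":
        low, high, k = 0.12, 0.24, 50.0
    else:
        raise ValueError(f"unknown exposure class: {kind}")
    weight = (1.0 - exp(-k * pd)) / (1.0 - exp(-k))
    return low * weight + high * (1.0 - weight)


def capital(pd: float, lgd: float, kind: str, maturity: float = 2.5) -> float:
    """Capital K per unit of exposure covering unexpected loss (0 < pd < 1)."""
    r = correlation(pd, kind)
    z = (_N.inv_cdf(pd) + sqrt(r) * _N.inv_cdf(0.999)) / sqrt(1.0 - r)
    k = lgd * (_N.cdf(z) - pd)
    if kind == "corporate":
        # Maturity adjustment; for retail M = 1 makes this factor equal to 1.
        b = (0.11852 - 0.05478 * log(pd)) ** 2
        k *= (1.0 + (maturity - 2.5) * b) / (1.0 - 1.5 * b)
    return k


k_mortgage = capital(0.02, 0.5, "mortgage")
k_revolving = capital(0.02, 0.5, "revolving")
```

At the same PD and LGD, the higher correlation assumed for mortgages gives a larger K than for revolving credit, which is the ordering the chart of K against PD displays.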


    Updated Basel capital requirements for K to cover UL (LGD = 0.5)

    [Figure: K plotted against PD for residential, revolving, other retail and corporate exposures]

    Impact of Basel Accord on credit scoring development

    Need to estimate calibration of scorecard not just discrimination

    Small numbers of defaults (180 days overdue) means take all defaults not just ones in 12 month period

    Estimate risk with data of different time periods

    Cox's proportional hazards models

    Loss given default (or Recovery Rate) completely new problem where outcome is mix of

    decisions by lenders (collect in house/use agent/sell off debt)

    uncertainty of borrower willing/able to pay back

    Stress testing and need for long run average PD means have to incorporate economic variables into default models or at least the dynamics of the default models

    Mimic corporate credit risk models??

    Problems with validating Low Default Portfolios (LDP)

    Problems:

    Very few defaults to use in back testing so one extra default makes a huge difference

    Procyclicality will be more obvious

    Subprime market is always in recession

    Solutions:

    Use as much data as you can

    Make prudent assumptions

    Low default portfolios: Pluto and Tasche (2005)

    No defaults, assumption of independence

    Use largest set possible

    take PD value whose lowest confidence limit is 0

    Rating grades A, B, C with n_A, n_B, n_C obligors

    Assume borrower ranking to be correct

    PD_A ≤ PD_B ≤ PD_C

    Most prudent estimate of PD_A obtained under assumption

    PD_A = PD_C, that is PD_A = PD_B = PD_C

    Determine confidence region for PD_A at confidence level γ (e.g. γ = 90%)

    Confidence region is values of PD_A such that probability of not observing any default is higher than 1 - γ
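The most prudent estimate described above has a closed form when no defaults are observed and defaults are independent: with n pooled obligors, the probability of seeing no default is (1 - PD)^n, so the largest PD in the confidence region solves (1 - PD)^n = 1 - γ. The sketch below applies this grade by grade, pooling each grade with all worse grades. The function names and the grade sizes are illustrative assumptions.

```python
def no_default_upper_bound(n_obligors: int, gamma: float) -> float:
    """Largest PD for which P(no default among n obligors) >= 1 - gamma."""
    return 1.0 - (1.0 - gamma) ** (1.0 / n_obligors)


def prudent_pd_bounds(grade_sizes, gamma=0.90):
    """Upper PD bound per grade, best grade first.

    Grade k is pooled with every worse grade, which corresponds to the
    most prudent assumption that their PDs are all equal.
    """
    bounds = []
    for k in range(len(grade_sizes)):
        pooled = sum(grade_sizes[k:])
        bounds.append(no_default_upper_bound(pooled, gamma))
    return bounds


# Hypothetical grade sizes n_A, n_B, n_C (illustrative numbers only).
bounds = prudent_pd_bounds([100, 400, 300], gamma=0.90)
```

Because the worst grade is pooled with the fewest obligors, its bound is the largest, so the estimates respect the assumed ranking of the grades.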

    Confidence limits: usual and Pluto and Tasche

    [Figure: two panels. Usual approach: the best estimate of PD is taken from actual data, with lower and upper confidence limits of PD showing what could happen in a given share of cases if the estimate were true. Pluto and Tasche: take the highest value of PD whose lower confidence limit agrees with the actual data.]

    Low Default Portfolios: Using survival analysis directly

    Problem: How to calculate PD as default in first 12 months if using data including defaults and bads at any time?

    Answer 1: Use survival analysis (proportional hazard models) to estimate when loan will default/go bad rather than probability it goes bad in 12 months. Take data on whole portfolio for as long as you have it

    Survival analysis

    Use Cox's proportional hazard models to estimate hazard function h(s,x) for loan with characteristics x

    So obtain PD for 12 month time horizon from the survival probability

    $$S(12) = e^{-\int_0^{12} h(s,x)\,ds}, \qquad \mathrm{PD} = 1 - S(12)$$
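To make the step from a hazard function to a 12 month PD concrete, the sketch below integrates a proportional hazard h(s,x) = h0(s) exp(β·x) numerically and returns 1 - S(12). The baseline hazard, the coefficients and the characteristics are hypothetical values for illustration; in practice they would come from a fitted Cox model.

```python
from math import exp


def survival(hazard, horizon: float, steps: int = 1200) -> float:
    """S(horizon) = exp(-integral of hazard over [0, horizon]), trapezoidal rule."""
    dt = horizon / steps
    total = 0.5 * (hazard(0.0) + hazard(horizon))
    for i in range(1, steps):
        total += hazard(i * dt)
    return exp(-total * dt)


def proportional_hazard(baseline, beta, x):
    """Return h(s) = baseline(s) * exp(beta . x) for fixed characteristics x."""
    risk = exp(sum(b * v for b, v in zip(beta, x)))
    return lambda s: baseline(s) * risk


# Hypothetical baseline hazard per month and coefficients (illustrative only).
h = proportional_hazard(lambda s: 0.002 + 0.0005 * s, beta=[0.8, -0.3], x=[1.0, 2.0])
pd_12 = 1.0 - survival(h, 12.0)
```

The same hazard function gives the PD over any other horizon by changing the upper limit, which is one reason the survival approach can use defaults observed at any time.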

    Modelling Loss Given Default


    Very little work done on modelling this until mid 90s

    Regression models used for LGD corporate loan models

    Modelling approaches

    Regression on type of loan/company, economic conditions (needs lots of data points)

    Segment and use historic averages (need lots of defaults)

    Build model of collections process

    For consumer loans, modelling collection process only option

    LGD models in consumer lending have a mix of random events (defaulter will not pay, cannot pay) and decisions by lender (what collection strategy to use)

    Collections strategy


    Strategic level

    Collect in house: 0 ≤ LGD ≤ 1 (though can exceed both bounds)

    Use agency (who keep 40% collected): 0.4 ≤ LGD ≤ 1

    Sell off debt (say at 5p in £1): LGD = 0.95
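The LGD ranges quoted for the three strategic options follow from simple arithmetic on the fraction of the debt recovered. The sketch below is a deliberately simplified illustration that ignores collection costs and timing; the function names are hypothetical and the 40% commission and 5p price are the slide's example values.

```python
def lgd_in_house(recovered_fraction: float) -> float:
    """In house collection: the lender keeps everything that is recovered."""
    return 1.0 - recovered_fraction


def lgd_agency(recovered_fraction: float, commission: float = 0.40) -> float:
    """Agency collection: the agency keeps a commission on what it collects."""
    return 1.0 - (1.0 - commission) * recovered_fraction


def lgd_sell_off(price_per_unit: float = 0.05) -> float:
    """Debt sale: the lender receives a fixed price per unit of debt."""
    return 1.0 - price_per_unit


best_case_agency = lgd_agency(1.0)   # agency recovers the full debt
worst_case_agency = lgd_agency(0.0)  # agency recovers nothing
sale = lgd_sell_off()
```

With full recovery the agency route still leaves the lender with an LGD of about 0.4, which is why that option is bounded below by 0.4 rather than 0. Recoveries above the outstanding debt, for example through fees and interest, give an in house LGD below 0.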

    Operational level

    What sequence of contacts to make

    Telephone contact possible?

    Arrange repayment schedule

    Letters nice

    Letters nasty

    Legal proceedings

    LGD model for credit cards: Decision tree approach

    [Figure: decision tree. Node labels: Default; No trace / Trace; In house, Agent, Second agent, Sell off; Satisfactory / Not satisfactory.]

    Distribution of LGD for in house collections

    [Figure: density of LGD; horizontal axis LGD from -0.100 to 1.125, vertical axis density]

    Actual LGD can stray outside 0 to 1

    LGD has spikes at LGD = 1 and LGD = 0

    For agent/sold debt, spike at LGD = 1 predominates

    Distribution 0


    Need to model as a mixed distribution. Here there seemed to be three classes:

    Agree and abide by repayment schedule LGD=0

    Pay back reduced amount LGD


    Corporate credit risk models have been developed for last decade and some include economic parameters which can be used for stress testing

    Corporate credit risk models split into four classes

    Structural models

    Assume companies default when debts exceed assets ( Merton model)

    Try to model the dynamics of their assets

    Basel formula based on very simple version of this

    Reduced form models

    Cut to the chase: when will firms default as a function of economic conditions

    Hazard (survival analysis) or intensity models build a model of hazard rate h_i(t), the chance firm i will default at t given it has not done so before

    Markov chain rating based models. Model how firms' credit ratings change dynamically, with one rating being defaulted

    Actuarial models

    Models at segment level not individual level. Estimate default rate and LGD rate using actuarial distributions and historic parameter estimates.

    Very few assumptions so can be used in retail area, but where are the economic variables in it?

    Scorecard based

    z scores, less successful than consumer credit scoring and no economic effects in them

    Can we use these models to build stress tests for consumer loan portfolios?

    No. Assumptions and data available are so different, but the approach might work.

    Introduce economic variables into consumer credit risk models

    Introducing economic variables into credit risk models allows

    Estimating Long run average PD for Basel

    Stress testing required by Basel

    Ways of building correlation between defaults of different loans

    Pricing portfolios for securitization

    Comparison of retail and corporate risk environments and models

    corporate loans

    Objective is to price bonds

    well established market

    market price continuously available

    bonds only infrequently withdrawn

    contingent claim model says default is when loans exceed assets

    Correlation of defaults related to correlation of assets related to correlation of share prices

    Economic conditions built into models

    consumer loans

    Objective is to rank borrowers

    no established market, only occasional securitization sales

    no price available as no public sales

    consumers often leave lender (attrition)

    default caused by cash flow (consumer has no idea of assets nor can realise them)

    no share price surrogate for correlation of defaults

    Economic conditions not in models

    Corporate credit risk modelling


    Corporate credit risk models include

    Structural models

    Assume default when debts exceed assets ( Merton model)

    Model dynamics of their assets ( Basel formula simple version)

    Reduced form models

    Default mode: Hazard (survival analysis) or intensity models build a model of hazard rate

    Mark to Market: Markov chain rating based models.

    Actuarial models

    Models at segment level not individual level. Estimate PD and LGD using actuarial distributions/historic parameter estimates.

    Factors ( risk) used to give dynamics and correlations

    Basel Model: $R_{it} = w F_t + \sqrt{1-w^2}\,U_{it}$, where $R_{it}$ is assets of firm; $c$ is loan; $F_t$ is systemic factor (world economy); $U_{it}$ is idiosyncratic; $z_{t-1}$ is economic factors

    $p_t(f_t, z_{t-1}) = \Pr\{D_{it} = 1 \mid f_t, z_{t-1}\} = N(\ldots)$

    Value of credit worthiness

    Translates into behavioural score above debt cut-off

    Model dynamics of behavioural score

    Affordability

    Repay if cash flow means can afford repayment

    Model dynamics of cash flow

    Consumer credit Default Mode reduced form models

    Extend Cox Proportional Hazard Models to get these

    If t is time since loan started, the hazard of default at time t for a person i with economic conditions EcoVar(t) and behavioural score BehScr_i(t) is

    $$h_i(t) = h_0(t)\, e^{a\,\mathrm{BehScr}_i(t) + b\,\mathrm{EcoVar}(t) + c\,\mathrm{Vintage}_i}$$

    Cox Regression estimates a, b and c and then use Kaplan-Meier form of distribution function to recover baseline hazard function
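Once the coefficients and baseline hazard are estimated, a hazard model of this kind can be used for stress testing by feeding it a stressed path of the economic variable. The sketch below is a minimal illustration: the coefficient values, the monthly baseline hazard and the scenarios are all hypothetical, and each monthly hazard is treated as a constant rate over that month.

```python
from math import exp


def hazard_path(baseline, beh_scores, eco_vars, vintage, a, b, c):
    """Monthly hazards h_i(t) = h0(t) * exp(a*BehScr_i(t) + b*EcoVar(t) + c*Vintage_i)."""
    return [
        h0 * exp(a * score + b * eco + c * vintage)
        for h0, score, eco in zip(baseline, beh_scores, eco_vars)
    ]


def default_probability(hazards) -> float:
    """PD over the whole path, treating each hazard as constant within its month."""
    return 1.0 - exp(-sum(hazards))


# Hypothetical inputs for one borrower over 12 months (illustrative only).
baseline = [0.002] * 12      # baseline hazard h0(t), the months on books effect
scores = [0.0] * 12          # standardized behavioural score
base_economy = [0.0] * 12    # economic variable under normal conditions
stressed_economy = [1.5] * 12
a, b, c = -0.9, 0.4, 0.1     # assumed coefficients, not estimated values

pd_base = default_probability(
    hazard_path(baseline, scores, base_economy, 1.0, a, b, c))
pd_stressed = default_probability(
    hazard_path(baseline, scores, stressed_economy, 1.0, a, b, c))
```

With a positive coefficient on the economic variable, the stressed scenario raises every monthly hazard and so raises the 12 month PD.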

    [Annotations on the hazard formula: idiosyncratic risk, systemic risk, months on books factor, vintage factor]

    Consumer credit reduced form mark to market model: Markov chain approach

    Think of rating grades as states of Markov Chain

    So state is score band or default status (0, 1, 2, 3+ overdue)

    At least one state corresponds to default

    Markov assumption is that the state of the system describes all information concerning credit risk of customer

    Estimate transition probabilities of moving from state i to state j in next time period

    Use logistic regression to get transition probabilities to be functions of economic variables

    In stress test choose the economic variables for a stressed scenario (scenario could last over several periods)
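The Markov chain approach can be sketched as follows. The states, the base transition matrix and the sensitivity `BETA` are hypothetical, and the logistic link used here (only the probability of moving one state worse depends on the economic variable) is a simplifying assumption made for the example, not the specification used in the slides.

```python
from math import exp, log

# Hypothetical monthly transition matrix under normal conditions; states are
# up to date, 1 overdue, 2 overdue and default (absorbing).
BASE = [
    [0.95, 0.05, 0.00, 0.00],
    [0.40, 0.30, 0.30, 0.00],
    [0.15, 0.15, 0.30, 0.40],
    [0.00, 0.00, 0.00, 1.00],
]
BETA = 0.5  # assumed sensitivity of worsening to the economic variable


def transition_matrix(eco: float):
    """Shift each row's 'one state worse' probability through a logistic link."""
    rows = []
    for i, row in enumerate(BASE):
        if i == len(BASE) - 1:
            rows.append(list(row))  # default stays absorbing
            continue
        p = row[i + 1]
        worse = 1.0 / (1.0 + exp(-(log(p / (1.0 - p)) + BETA * eco)))
        scale = (1.0 - worse) / (1.0 - p)  # rescale the rest so the row sums to 1
        rows.append([worse if j == i + 1 else q * scale for j, q in enumerate(row)])
    return rows


def default_probability(eco_path, start: int = 0) -> float:
    """Probability of being in default after following the economic path."""
    n = len(BASE)
    dist = [0.0] * n
    dist[start] = 1.0
    for eco in eco_path:
        P = transition_matrix(eco)
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
    return dist[-1]


pd_base = default_probability([0.0] * 12)
pd_stressed = default_probability([1.0] * 12)
```

A stressed scenario that lasts several periods is expressed simply as a longer run of stressed values in `eco_path`.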

    Are securitization problems due to credit scoring?

    Securitized products were priced top down

    What was market paying last week

    Assumption all products were essentially the same (or could be made so)

    Little investigation of borrowers' credit scores and individual product features

    No model of correlation between default risks

    Previous portfolio credit risk models would allow a bottom up approach

    US Sub prime mortgage crisis: Other half of the disaster

    Main reason was conspiracy of optimism

    Lenders : scores were low but no one had defaulted for ages

    Borrowers: house prices will go up, so can refinance before payments get high

    Some lessons for scoring

    Products had hike in repayments

    Allow for affordability in default probability (recall pricing)

    Survival analysis (allow for rate terms)

    If scorecard known, borrowers will work the system

    Scorecard doctors guaranteed to increase FICO score by 150

    Conclusions


    Profit scoring, pricing and customizing products, credit risk of portfolios of credit loans are just a few of the new problems in the area.

    Still exciting area where many different statistical, probability and OR techniques (Markov chains, survival analysis, Support vector machines, Brownian processes) prove very useful

    After 50 years, research in credit scoring is as vital as ever, and will continue.

    All progress is based upon a universal innate desire of every organism to live beyond its income. (Samuel Butler)