
THE USE OF GENETIC ALGORITHMS TO OBTAIN HIGHER RETURNS ON INVESTMENT STRATEGIES BASED ON FINANCIAL STATEMENT ANALYSIS: A TEST IN BRAZIL

1. Introduction

Investing in value stocks is a recurring subject in the literature (Graham and Dodd, 1934; Fama and French, 1995). There is substantial evidence, particularly in developed markets, that portfolios of high book-to-market (HBM) stocks outperform portfolios of low book-to-market stocks. Rosenberg, Reid and Lanstein (1984), Fama and French (1992, 1995) and Lakonishok et al. (1994) agree that the book-to-market ratio is strongly and positively correlated with future stock performance.

Abarbanell and Bushee (1997) document that an investment strategy based on financial signals helps investors earn significant abnormal returns. Concerning specific accounting signals, Sloan (1996) finds evidence that firms with higher amounts of accruals underperform in the future. Piotroski (2000) combines the HBM effect with financial statement analysis and shows that the mean return earned by an HBM investor can be increased by at least 7.5% annually through the selection of financially strong HBM firms. Piotroski (2000) proposes a strategy based on financial statement analysis to select value stocks with the potential to outperform the market. The strategy consists of selecting high book-to-market (HBM) companies and ranking them according to a point system based on financial signals. When a signal is positive (or “good”), its indicator equals 1; when it is negative (or “bad”), the indicator equals 0. The sum of all indicators gives the score achieved by the company. Nine financial signals and their respective indicators compose the strategy and were chosen to assess profitability, capital structure and operational efficiency: return on assets, variation of the return on assets, accrual, free cash flow, liquidity variation, leverage variation, gross margin variation, asset turnover and public offer. Lopes and Galdi (2007) and Galdi (2008) adapted this strategy to the characteristics of the Brazilian market, also obtaining significantly higher-than-average returns.

The strategy above assumes that all indicators are equally important to the composition of the score and thus have similar relevance to the market. However, an extensive literature shows that financial indicators differ in relevance (Sloan 1996, Ou and Penman 1989, Abarbanell and Bushee 1998, to mention some) and that this relevance varies across countries (Bartov, 2001). The objective of this paper is to contribute to the identification of the different importance levels of financial indicators: if there is a set of weights λi for the indicators Fi that optimizes the selection of stocks, the return of the strategy would be even higher than those found in the studies mentioned above. In addition, it would show empirically which financial figures are taken more seriously by the market.

Assigning weights to financial indicators to form a scalar value is not a new idea. For example, Ou and Penman (1989) propose this method to predict stock returns and use a logit model to estimate weights for the financial indicators. In this paper, however, Genetic Algorithms (GA), a computational technique often used for optimization, were applied to find the set of weights λi that maximizes the return of a portfolio.

Our paper adds to the previous literature by questioning the equally weighted metric used by most of the studies cited. Selecting the financial signals used in a financial statement analysis is a discretionary task, as is assigning the same weight to each of these signals. Our paper analyzes whether a modern technique based on genetic algorithms can improve the returns of investment strategies based on financial statements. If it is arbitrary to use the same weight for all signals, why not try to optimize the result of an investment strategy by allowing the weights to change? Could the weights reveal some peculiarity that could be used in a financial statement analysis that considers the characteristics of the market in which firms are immersed?


In the same vein as Lopes and Galdi (2007), we use Brazil, a prominent emerging market, to test our investment strategy, considering that the weights obtained by an optimization strategy could differ significantly between a developed market and a developing one.

Our results show that the obtained weights increase the market-adjusted return from 36% to 76% (1-year buy-and-hold strategy) and from 68% to 112% (2-year buy-and-hold strategy) when comparing the equally weighted score to the optimized weighted score. Our analysis shows that the most relevant signals are the change in cash and equivalents and the change in the firm's asset turnover. These findings can raise the discussion about the importance of liquidity in the financial statement analysis of firms immersed in a developing country like Brazil, where the cost of capital is significantly high and most public firms raise capital by issuing debt.

The remainder of the paper is organized as follows. Section 2 reviews the strategy adapted to the Brazilian market by Lopes and Galdi (2007). Section 3 presents the main features of genetic algorithms. Section 4 describes our experiment and shows how the strategy was implemented. Results are presented in Section 5. Section 6 concludes the paper.

2. From Fundamental Analysis to BrF_SCORE

The premise that the financial characteristics of a company may be linked to its real value is the central theme of fundamental analysis (Damodaran 2002). A vast literature shows several ways in which interpreting these characteristics may lead to the composition of a portfolio that outperforms the market. In particular, the selection of HBM companies has proven to be a successful strategy (Fama and French 1992; Lakonishok, Shleifer, Vishny 1994). Nonetheless, Piotroski (2000) claims that this superior result relies upon the strong performance of a few companies, tolerating the feeble performance of most of the others. He proposes a strategy based on accounting numbers that aims, by identifying “positive” signals, to select among these HBM companies only those that would outperform the market. He proposes the observation of nine financial figures that assess a company regarding profitability, capital structure and operational efficiency: return on assets, variation of return on assets, accrual, free cash flow, liquidity variation, leverage variation, gross margin variation, asset turnover and public offer. The strategy is simple: each variable is associated with an indicator, equal to one when the value of the variable is recognized as “good” for the performance of the company and equal to zero when it is said to be “bad”. The sum of all nine indicators yields a score (F_SCORE). Each year, after all companies have disclosed their results, the top quintile of HBM firms is identified and, among these firms, those with the highest F_SCORE are chosen to be kept in the portfolio for 1 and 2 years (buy-and-hold). The application of this technique in the US market yielded average adjusted returns (above market average) of 13.4% and 28.7% for 1 and 2 years, respectively.

Lopes and Galdi (2007), adapting Piotroski’s (2000) proposal to the characteristics of the

Brazilian market, changed the F_SCORE and created the BrF_SCORE:

BrF_SCORE = F_ROA + F_∆ROA + F_ACCRUAL + F_CF + F_∆LIQUID + F_∆LEVER + F_∆MARGIN + F_∆TURN + EQ_OFFER

where:

F_ROA = 1 if ROA > 0, zero otherwise. ROA is defined as net income scaled by beginning-of-the-year total assets;

F_∆ROA = 1 if ∆ROA > 0, zero otherwise. ∆ROA is defined as current firm-year ROA less the previous firm-year ROA;

F_ACCRUAL = 1 if ACCRUAL < 0, zero otherwise. ACCRUAL is defined as the variation in current assets (except cash and equivalents) less the variation in current liabilities (except short-term debt) less depreciation in the period. This value is scaled by beginning-of-the-year total assets. (This calculation was applied because Brazilian accounting standards did not require cash flow statements until 2008);

F_CF = 1 if CF > 0, zero otherwise. CF is defined as the current firm-year change in cash and equivalents, scaled by beginning-of-the-year total assets;

F_∆LIQUID = 1 if ∆LIQUID > 0, zero otherwise. ∆LIQUID measures the change in the firm's current ratio in relation to the previous year. The current ratio is defined as the ratio of current assets to current liabilities at the end of the year;

F_∆LEVER = 1 if ∆LEVER < 0, zero otherwise. ∆LEVER is the change in the ratio of total gross debt to total assets in relation to the prior year;

F_∆MARGIN = 1 if ∆MARGIN > 0, zero otherwise. ∆MARGIN is the change in the firm's current gross profit scaled by total sales (gross margin ratio) compared to the previous year;

F_∆TURN = 1 if ∆TURN > 0, zero otherwise. ∆TURN is the change in the firm's current firm-year sales scaled by beginning-of-the-year total assets (asset turnover ratio);

EQ_OFFER = 1 if the firm did not issue equity in the year preceding portfolio construction, zero otherwise.
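As an illustration only, the scoring rule defined above can be sketched in Python. The `FirmYear` container and its field names are hypothetical and assume that the underlying ratios have already been computed for a firm-year; this is not code from the original study.

```python
from dataclasses import dataclass


@dataclass
class FirmYear:
    """Hypothetical container for one firm-year; all ratios are pre-computed."""
    roa: float
    delta_roa: float
    accrual: float
    cf: float
    delta_liquid: float
    delta_lever: float
    delta_margin: float
    delta_turn: float
    issued_equity: bool  # True if equity was issued in the year before portfolio formation


def brf_signals(fy: FirmYear) -> dict:
    """Return the nine binary indicators (1 = "good", 0 = "bad")."""
    return {
        "F_ROA": int(fy.roa > 0),
        "F_dROA": int(fy.delta_roa > 0),
        "F_ACCRUAL": int(fy.accrual < 0),
        "F_CF": int(fy.cf > 0),
        "F_dLIQUID": int(fy.delta_liquid > 0),
        "F_dLEVER": int(fy.delta_lever < 0),
        "F_dMARGIN": int(fy.delta_margin > 0),
        "F_dTURN": int(fy.delta_turn > 0),
        "EQ_OFFER": int(not fy.issued_equity),
    }


def brf_score(fy: FirmYear) -> int:
    """BrF_SCORE: unweighted sum of the nine indicators."""
    return sum(brf_signals(fy).values())


# Made-up firm-year used only to exercise the functions.
example = FirmYear(roa=0.02, delta_roa=-0.01, accrual=-0.03, cf=0.01,
                   delta_liquid=0.10, delta_lever=0.02, delta_margin=0.0,
                   delta_turn=0.05, issued_equity=False)
print(brf_signals(example), brf_score(example))
```

Strict inequalities are used throughout, so a ratio that is exactly zero (as `delta_margin` in the example) yields an indicator of zero.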

Similar to previous research, Lopes and Galdi (2007) also showed that the use of this technique in the Brazilian market yielded higher adjusted returns¹ for 1- and 2-year periods.

3. Genetic Algorithms

Genetic Algorithms (GA) are an Evolutionary Computation (EC) technique. EC is a branch of Artificial Intelligence that proposes a new paradigm, alternative to conventional data processing, in which previous knowledge about how to solve the problem is not required to find a solution. Its mechanism is based on Charles Darwin's theory of natural selection, in which the best-adapted individuals have a higher chance to reproduce, leading to the evolution of the population (Bittencourt, 1998).

When using GAs, the problem to be optimized is the environment and the population is a set of individuals, each of them a possible solution to the problem. Individuals are composed of genes, the solution's parameters, and their capability to solve the problem is evaluated by means of a fitness function. For example, to optimize the fuel consumption of transporting a load between two known points, where it is possible to define the load's weight and the vehicle's average speed, each individual would be composed of a pair of genes (weight, speed) and would be assessed by the fuel consumption that this combination produces. GAs are usually applied to search and optimization problems, especially when these are not differentiable, present multiple local optima, or have a mathematical model that is too complex or simply does not exist.

Figure 1 represents the GA mechanism using, as an example, individuals with eight genes. An initial population (a) is assessed by the fitness function (b), which assigns the values 24, 23, 20 and 11 to the four individuals: the better the individual, the higher the value assigned and the greater its chances to reproduce. In this particular example, the probability that an individual reproduces is directly proportional to the result of the fitness function. In (c), couples are randomly selected for reproduction following the probabilities in (b). In this example, one of the individuals was selected twice, while another was not selected at all. Still in (c), the crossover points are also randomly selected, creating in (d) the individuals of the next generation. This mechanism consists of combining the first part (before the crossover point) of one parent with the latter part of the other parent: the first descendant of the first couple inherited its first three genes from the gray individual and the last five from the white individual, while the second descendant inherited its first three genes from the white individual and the last five from the gray individual. Finally, in (e) each individual may undergo random mutation in one of its genes, producing genes that might not be present in the initial population (Russell and Norvig, 2003).

(a) Initial Population   (b) Fitness Function   (c) Selection      (d) Crossover      (e) Mutation
2 4 7 4 8 5 5 2          24   31%               3 2 7 5 2 4 1 1    3 2 7 4 8 5 5 2    3 2 7 4 8 1 5 2
3 2 7 5 2 4 1 1          23   29%               2 4 7 4 8 5 5 2    2 4 7 5 2 4 1 1    2 4 7 5 2 4 1 1
2 4 4 1 5 1 2 4          20   26%               3 2 7 5 2 4 1 1    3 2 7 5 2 1 2 4    3 2 2 5 2 1 2 4
3 2 5 4 3 2 1 3          11   14%               2 4 4 1 5 1 2 4    2 4 4 1 5 4 1 1    2 4 4 1 5 4 1 7

Figure 1 – Genetic Algorithm. Initial population (a) is assessed and ranked (b), selecting

couples for reproduction (c). The generated descendents (d) may have their genes affected

by mutation (e).

Source: Russell and Norvig (2003)
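The three operators illustrated in Figure 1 can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in the experiment: the function names, the mutation rate and the set of admissible gene values are assumptions, and only the fitness values and the first couple are taken from the figure.

```python
import random


def selection_probabilities(fitness):
    """Fitness-proportional reproduction probabilities."""
    total = sum(fitness)
    return [f / total for f in fitness]


def crossover(parent_a, parent_b, point):
    """Single-point crossover: each child takes its head from one parent
    and its tail (from `point` onwards) from the other."""
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2


def mutate(individual, rate, gene_values, rng):
    """With probability `rate`, replace one randomly chosen gene
    by a value drawn from `gene_values` (both are assumptions)."""
    mutated = list(individual)
    if rng.random() < rate:
        mutated[rng.randrange(len(mutated))] = rng.choice(gene_values)
    return mutated


# Fitness values from panel (b) of Figure 1.
probs = selection_probabilities([24, 23, 20, 11])
percent = [round(100 * p) for p in probs]

# First couple from panel (c), crossed after the third gene.
first = [3, 2, 7, 5, 2, 4, 1, 1]
second = [2, 4, 7, 4, 8, 5, 5, 2]
child_1, child_2 = crossover(first, second, 3)

rng = random.Random(0)
mutant = mutate(child_1, 1.0, list(range(1, 9)), rng)
print(percent, child_1, child_2, mutant)
```

With the fitness values of the figure, the rounded probabilities reproduce the 31%, 29%, 26% and 14% shown in panel (b), and the two children match the first two rows of panel (d).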

In a good GA implementation, the population is said to “converge”, that is, to evolve successively and become more homogeneous, while the evaluation of the best individual converges to the global optimum. However, given the characteristics of the method, there is no guarantee that a global optimum will be found. Still, “acceptably good” optima can usually be found in acceptable time (Beasley, Bull, Martin 1993).

¹ Using Bovespa’s IBrX-100 index as benchmark.


4. Experiment Description

Firms listed on Bovespa between 1997 and 2008, excluding financial and insurance firms, were selected. Data were extracted from the Economatica database and refer to the period from 1997 to 2006 (financial reports) and from 1997 to 2008 (1- and 2-year market-adjusted returns). For each firm-year, 1- and 2-year buy-and-hold market-adjusted returns are calculated using the first price information available in the 100 days following May 1st. Bovespa's IBrX-100 index was used as the benchmark to market-adjust the returns.
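The return calculation described above might be sketched as follows. Two points are assumptions, since the text does not spell them out: the market adjustment is taken to be the stock's buy-and-hold return minus the index's return over the same window, and the 100-day tolerance is taken to include May 1st itself. Function names and the price dictionary layout are hypothetical, and dividends are ignored.

```python
from datetime import date, timedelta


def first_price_in_window(prices, start, window_days=100):
    """Return the earliest price dated within `window_days` after `start`.

    `prices` maps a date to a closing price; None is returned when no
    quotation exists inside the window.
    """
    end = start + timedelta(days=window_days)
    candidates = [d for d in prices if start <= d <= end]
    return prices[min(candidates)] if candidates else None


def buy_and_hold_return(price_start, price_end):
    """Simple holding-period return."""
    return price_end / price_start - 1.0


def market_adjusted_return(stock_start, stock_end, index_start, index_end):
    """Assumption: adjustment by subtraction of the benchmark's return."""
    return (buy_and_hold_return(stock_start, stock_end)
            - buy_and_hold_return(index_start, index_end))


# Made-up quotations for an illiquid share.
quotes = {date(2000, 4, 28): 9.0, date(2000, 5, 3): 10.0, date(2000, 9, 1): 12.0}
entry = first_price_in_window(quotes, date(2000, 5, 1))
adjusted = market_adjusted_return(entry, 15.0, 1000.0, 1200.0)
print(entry, adjusted)
```

In the example the share rises 50% while the index rises 20%, giving a market-adjusted return of about 30%.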

In Brazil, both common and preferred shares are considered equity, and in many cases there is more than one type of preferred share. The most liquid² share class for each firm-year was selected, and firm-years missing the data needed to calculate book-to-market ratios, indicators or returns were excluded.

Then, the HBM portfolio for each year was composed of the firms in the top quintile of BM, resulting in 370 observations. The cut-off BM ratios are presented in Table 1 and the descriptive statistics in Table 2. Finally, the 1% highest and lowest firm-year returns (3 for the 1-year return and 3 for the 2-year return) were excluded, resulting in 367 firm-year observations.

Table 1 – Cut-off BM ratios

Year   1997   1998   1999   2000   2001   2002   2003   2004   2005   2006
BM     3.68   4.94   2.91   2.95   2.91   2.98   2.13   1.92   1.47   1.06

One can notice in Table 1 the reduction of the cut-off BM ratio since 2003, caused by the growth in market value of firms listed on Bovespa and possibly indicating the formation of a bubble that ended up bursting at the end of 2008.

² Liquidity is measured by the formula $La = 100 \cdot (p \cdot n \cdot v) / (P \cdot N \cdot V)$, where p is the number of days in which the share was traded at least once in the chosen period, P is the number of days in the chosen period, n is the number of times the share was traded in the chosen period, N is the total number of trades with all shares in the chosen period, v is the share’s trade volume (in cash terms) in the chosen period and V is the total volume traded of all shares in the chosen period.


Table 2 – Descriptive Statistics

Descriptive Statistics of HBM firms’ financial indicators (Firm-year observations from 1997 to 2006)

Variable                     Mean    Std. Dev.        Min        Q1    Median         Q3          Max     n
BM                         5.4763       4.9631     1.0581    2.9939    4.2462     6.5059      61.0257   370
Mkt Cap (R$ thousand)     622,401    3,092,327        510    10,685    31,138    112,142   33,247,392   370
Assets (R$ thousand)    3,769,002   16,383,111     19,717   117,280   332,411  1,057,393  121,891,641   370
Share Liquidity            0.0979       0.4271     0.0000    0.0000    0.0004     0.0059       4.0458   370
ROA                       -0.0076       0.0820    -0.4477   -0.0345    0.0040     0.0240       0.5284   370
CF                        -0.0018       0.0807    -0.9132   -0.0088    0.0003     0.0144       0.3267   370
∆ROA                      -0.0027       0.1075    -0.6643   -0.0306   -0.0031     0.0270       0.8952   370
Accrual                   -0.0199       0.1069    -0.4489   -0.0657   -0.0269     0.0154       0.7450   370
∆Liquid                   -0.1771       2.8207   -47.3857   -0.2087   -0.0239     0.1970       7.9159   370
∆Lever                    -0.0035       0.0905    -0.4658   -0.0270    0.0000     0.0322       0.4371   370
∆Turn                      0.0038       0.2053    -0.8631   -0.0592    0.0035     0.0777       1.3805   370
∆Margin                    0.0067       0.1442    -1.0611   -0.0346    0.0000     0.0427       1.5088   370

BM – book-to-market ratio, or end-of-the-year equity divided by end-of-the-year Mkt Cap. Mkt Cap (R$ thousand) – share price at the end of the year multiplied by the number of outstanding shares. Assets (R$ thousand) – firm’s total assets at the end of the year. Share Liquidity – index representing the liquidity of shares traded at Bovespa. The higher the value, the more liquid the share is. Defined as $La = 100 \cdot (p \cdot n \cdot v) / (P \cdot N \cdot V)$, where p is the number of days in which the share was traded at least once in the chosen period, P is the number of days in the chosen period, n is the number of times the share was traded in the chosen period, N is the total number of trades with all shares in the chosen period, v is the share’s trade volume (in cash terms) in the chosen period and V is the total volume traded of all shares in the chosen period. ROA – return on assets, or net profit divided by the firm’s beginning-of-the-year total assets. CF – free cash flow, defined as the variation of cash and equivalents divided by the firm’s beginning-of-the-year total assets. ∆ROA – variation of the return on assets, or current firm-year ROA less previous firm-year ROA. Accrual – variation of current assets (except cash and equivalents) less the variation of current liabilities (except short-term debt) less depreciation in the period. This value is scaled by the firm’s beginning-of-the-year total assets (this calculation was applied because Brazilian accounting standards did not require cash flow statements until 2008). ∆Liquid – variation of the firm’s current ratio in relation to the previous year. The current ratio is calculated by dividing the firm’s current assets by its current liabilities at the end of the year. ∆Lever – leverage variation in relation to the previous year. Leverage is calculated as total debt divided by the firm’s assets at the beginning of the year. ∆Turn – variation of asset turnover. Asset turnover is calculated as the firm’s net revenue divided by total assets at the beginning of the year. ∆Margin – variation of the firm’s current gross margin scaled by beginning-of-the-year assets. Gross margin is calculated by dividing the firm’s gross profit by sales.

For the sake of simplification, the original method was modified: the indicator EQ_OFFER was excluded from the model since, in the analyzed period, very few firms listed on Bovespa raised capital by issuing shares. Rather, companies financed themselves mostly by means of debt, the more common way to raise funds in the Brazilian market. Moreover, Galdi (2008) shows that this indicator has a significantly smaller correlation with returns than the other indicators.

Table 3 shows the returns of the strategy based on BrF_SCORE for the set of firm-year observations considered. The returns are lower than, for instance, those obtained by Lopes and Galdi (2007): a 1-year (2-year) accumulated return of 12.9% (23.9%) above all HBM firms, against 20.9% (77.7%) obtained by that study. Moreover, the difference between high-score firms’ returns and the average return is not statistically significant, contrary to what Galdi (2008) observed. This difference may be due to the different time frames of the two samples (our sample also includes the 2005 and 2006 financial reports) or to the way the returns were calculated (the tolerance for finding the first quotation after May 1st). Nevertheless, this does not invalidate our results, since our objective is to check whether the optimization of weights yields returns higher than those obtained with uniform weights (BrF_SCORE). As mentioned above, the effectiveness of BrF_SCORE was demonstrated by previous research (Piotroski, 2000; Lopes and Galdi, 2007).

In this experiment an individual is composed of eight genes, each of them being the weight of one of the indicators. The individual is assessed by the fitness function, defined as the yearly average adjusted return between 1997 and 2004. In this way, the individuals whose genes (or weights) lead to a better assessment (higher return) have higher chances to reproduce, spreading their genes and generating better-adapted descendants that are able to select shares that produce higher returns.

An initial population of 100 individuals was set, reproducing for up to 1000 generations. The algorithm is also interrupted if no new individual with a better fitness result arises in 500 generations. The initial population is randomly defined, with its genes ranging from -0.5 to +0.5, and then evolves according to the reproduction and mutation mechanisms. No restrictions are applied to the range of values a gene can assume (i.e., negative values are allowed). Separate experiments were carried out for 1-year and 2-year adjusted returns.
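The set-up above (population of 100, at most 1000 generations, interruption after 500 generations without improvement, initial genes between -0.5 and +0.5, unbounded gene values) can be sketched as the loop below. The text does not specify the selection, crossover or mutation operators, so fitness-proportional selection on shifted scores, single-point crossover, a 10% mutation rate and Gaussian perturbation are all assumptions made for illustration. The `fitness` argument is a placeholder for the portfolio-return evaluation.

```python
import random


def run_ga(fitness, n_genes=8, pop_size=100, max_generations=1000,
           patience=500, mutation_rate=0.1, seed=0):
    """Maximize `fitness` over real-valued weight vectors.

    Returns (best_individual, best_fitness).
    """
    rng = random.Random(seed)
    # Initial genes drawn uniformly from [-0.5, +0.5].
    population = [[rng.uniform(-0.5, 0.5) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best, best_fit, stale = None, float("-inf"), 0

    for _ in range(max_generations):
        scores = [fitness(ind) for ind in population]
        gen_best = max(range(pop_size), key=scores.__getitem__)
        if scores[gen_best] > best_fit:
            best, best_fit, stale = list(population[gen_best]), scores[gen_best], 0
        else:
            stale += 1
            if stale >= patience:  # no improvement for `patience` generations
                break

        # Assumption: shift scores so that selection weights are positive,
        # since average adjusted returns may be negative.
        low = min(scores)
        weights = [s - low + 1e-9 for s in scores]

        next_population = []
        while len(next_population) < pop_size:
            a, b = rng.choices(population, weights=weights, k=2)
            point = rng.randrange(1, n_genes)  # single crossover point
            for child in (a[:point] + b[point:], b[:point] + a[point:]):
                if rng.random() < mutation_rate:
                    # Assumption: Gaussian perturbation; genes stay unbounded.
                    child[rng.randrange(n_genes)] += rng.gauss(0.0, 0.25)
                next_population.append(child)
        population = next_population[:pop_size]

    return best, best_fit


def toy_fitness(weights):
    """Toy objective used only to exercise the loop; not a return measure."""
    return -sum((w - 0.3) ** 2 for w in weights)


best_weights, best_value = run_ga(toy_fitness, pop_size=30,
                                  max_generations=40, patience=20, seed=1)
print(best_weights, best_value)
```

Fixing the seed makes a run reproducible, which is convenient when the optimization is repeated several times and the resulting weight sets are compared.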


A new score (O_BrF_SCORE) was created to combine the weights into the strategy:

$$O\_BrF\_SCORE = \sum_{i} \lambda_i F_i$$

The strategy based on BrF_SCORE assigns to each firm-year a score that ranges from 0 to 9 and selects those with a high score (7 to 9) to form the portfolio. However, the values that O_BrF_SCORE can assume do not have a fixed range, since the weights may assume any value during the optimization process. Therefore, “high score” is defined as the top third of O_BrF_SCORE for a given set of weights (or a given individual). High-score firm-year observations are then selected to compose the portfolio of that year. We stipulated that at least five companies would compose the portfolio. If fewer than five companies achieve a high score in a given year, the next companies in the O_BrF_SCORE ranking complete the portfolio.
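The selection rule for O_BrF_SCORE can be sketched as below. The reading adopted here is an assumption: the cut-off is the score of the observation that closes the top third of the pooled firm-year ranking for a given weight vector, and each year's portfolio is then topped up to five firms when needed. The tuple layout and function names are hypothetical.

```python
def weighted_score(signals, weights):
    """O_BrF_SCORE: weighted sum of the binary indicators F_i with weights lambda_i."""
    return sum(w * f for w, f in zip(weights, signals))


def select_portfolios(observations, weights, min_firms=5):
    """Build one portfolio per year from (year, firm, signals) tuples."""
    scored = [(year, firm, weighted_score(signals, weights))
              for year, firm, signals in observations]

    # Assumption: "high score" means the top third of the pooled ranking.
    ordered = sorted((score for _, _, score in scored), reverse=True)
    cutoff = ordered[max(len(ordered) // 3 - 1, 0)]

    portfolios = {}
    for year in sorted({y for y, _, _ in scored}):
        ranked = sorted((row for row in scored if row[0] == year),
                        key=lambda row: row[2], reverse=True)
        n_high = sum(1 for row in ranked if row[2] >= cutoff)
        # Complete the portfolio with the next firms in the ranking if needed.
        portfolios[year] = [firm for _, firm, _ in ranked[:max(n_high, min_firms)]]
    return portfolios


# Made-up data with two indicators only, to keep the example short.
data = [
    (2000, "A", [1, 0]), (2000, "B", [1, 0]), (2000, "C", [1, 1]),
    (2000, "D", [0, 0]), (2000, "E", [0, 1]), (2000, "F", [0, 0]),
    (2001, "G", [1, 1]), (2001, "H", [0, 1]), (2001, "I", [0, 1]),
    (2001, "J", [0, 1]), (2001, "K", [0, 1]), (2001, "L", [0, 1]),
]
portfolios = select_portfolios(data, weights=[1.0, -0.5])
print(portfolios)
```

In the example only one firm reaches the cut-off in 2001, so the following four firms in that year's ranking complete the portfolio.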

Although the highest adjusted returns (1% of HBM firm-year observations) were excluded from the sample, there were still observations whose adjusted return was much higher than the average. This represents a robustness problem for the optimization algorithm, since it tends to “choose” weights so that these companies remain in the portfolio, even if these weights end up also picking low-return firms. This situation was bypassed by imposing a limit on the 1- and 2-year adjusted returns of the 2% highest-return firm-year observations (n=6), this limit being equal to the sixth highest adjusted return (for 1 and 2 years). It is important to mention that this limitation was imposed only on the set of data used for weight optimization, while the final result obtained with the optimized weights was calculated using the original adjusted returns.
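The cap on extreme returns amounts to a one-sided limit on the upper tail, which could be sketched as follows. The function name is hypothetical and the sample returns are made up.

```python
def cap_top_returns(returns, n_top=6):
    """Limit the highest returns to the n_top-th highest value.

    Only the upper tail is affected; the original list is left untouched so
    that final results can still be computed on uncapped returns.
    """
    if len(returns) < n_top:
        return list(returns)
    limit = sorted(returns, reverse=True)[n_top - 1]
    return [min(r, limit) for r in returns]


raw = [0.1, 5.0, 0.3, 4.0, -0.2, 3.0, 2.5, 2.0, 1.5, 0.0]
capped = cap_top_returns(raw)
print(capped)
```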

The mutation mechanism, described in the previous section, exists to help the algorithm escape from a local optimum, but it does not ensure finding a global optimum. Thus, in an attempt to increase the chances of finding a better result, the optimization process was repeated 20 times for each experiment and then, among these 20 sets of weights, the best one was selected.

Finally, the weights were normalized (i.e., scaled by the highest weight in absolute value) to ease the comparison between different sets of weights.
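Choosing the best of several runs and normalizing the resulting weights can be sketched as follows. The function names are hypothetical, and the `(weights, fitness)` pair layout for a run's result is an assumption.

```python
def normalize(weights):
    """Scale weights by the largest absolute weight, so the largest has modulus 1."""
    scale = max(abs(w) for w in weights)
    return [w / scale for w in weights] if scale else list(weights)


def best_of_runs(results):
    """Pick the (weights, fitness) pair with the highest fitness among repeated runs."""
    return max(results, key=lambda result: result[1])


# Made-up results of three optimization runs.
runs = [([0.2, 0.1], 0.20), ([0.5, -2.0], 0.70), ([1.0, 1.0], 0.50)]
weights, value = best_of_runs(runs)
print(normalize(weights), value)
```

Normalization preserves the sign and the relative size of each weight, so it does not change the ranking of firms produced by a given weight vector.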


Table 3 – Returns of the strategy based on BrF_SCORE

This table presents in panels A and B, respectively, the 1- and 2-year buy-and-hold returns based on the financial signals extracted from the financial statements of HBM firms. Low-score firms are those with BrF_SCORE from 0 to 2, while high-score firms are those ranging from 6 to 8.

Panel A – 1-year market adjusted return (firm-year observations from 1997 to 2006)

                         Mean   Pctl 10%   Pctl 25%     Median   Pctl 75%   Pctl 90%     n
All firms              0.2350    -0.4728    -0.2951     0.0309     0.4922     1.1875   367
BrF_SCORE = 0         -0.2864          -    -0.7568    -0.2855     0.1822          -     4
BrF_SCORE = 1         -0.0180    -0.5038    -0.3618    -0.2888     0.2577     0.7810    20
BrF_SCORE = 2          0.0209    -0.5491    -0.3750    -0.2294     0.3758     0.7082    41
BrF_SCORE = 3          0.3727    -0.3675    -0.1948     0.0938     0.6272     1.2884    57
BrF_SCORE = 4          0.0966    -0.5385    -0.2820    -0.0153     0.4065     0.7158    84
BrF_SCORE = 5          0.3494    -0.4121    -0.2277     0.1392     0.7694     1.5427    70
BrF_SCORE = 6          0.3562    -0.4728    -0.1899     0.1689     0.5417     1.2170    57
BrF_SCORE = 7          0.3689    -0.4777    -0.3142     0.0148     0.4218     1.6049    28
BrF_SCORE = 8          0.4072          -    -0.4475     0.0729     1.3572          -     6
Low Score (0-2)       -0.0100    -0.5491    -0.3750    -0.2520     0.3525     0.7162    65
High Score (6-8)       0.3635    -0.4728    -0.2654     0.1281     0.5385     1.3572    91
High – Low             0.3734     0.0763     0.1096     0.3801     0.1861     0.6410
  t-statistic          3.0624
  p-Value              0.0013
High – All             0.1285     0.0000     0.0297     0.0972     0.0463     0.1697
  t-statistic          1.1714
  p-Value              0.1219

Panel B – 2-years market adjusted return (firm-year observations from 1997 to 2006)

                         Mean   Pctl 10%   Pctl 25%     Median   Pctl 75%   Pctl 90%     n
All firms              0.4380    -0.5739    -0.3664     0.0507     0.9607     2.1624   367
BrF_SCORE = 0         -0.5746          -    -0.9139    -0.5457    -0.2932          -     4
BrF_SCORE = 1          0.2348    -0.7901    -0.6099    -0.2118     0.8378     2.4004    20
BrF_SCORE = 2          0.0986    -0.7132    -0.5164    -0.0926     0.3083     1.4264    41
BrF_SCORE = 3          0.6670    -0.5487    -0.3363     0.2763     1.2733     2.4793    57
BrF_SCORE = 4          0.1923    -0.6042    -0.4672    -0.1036     0.4526     1.4113    84
BrF_SCORE = 5          0.5521    -0.5800    -0.2269     0.1552     1.2250     2.0800    71
BrF_SCORE = 6          0.6114    -0.5179    -0.2819     0.3672     1.0930     2.2326    56
BrF_SCORE = 7          0.7230    -0.5058    -0.2876     0.1138     1.0043     3.5245    28
BrF_SCORE = 8          1.0730          -    -0.2151     1.0336     2.1025          -     6
Low Score (0-2)        0.0991    -0.7521    -0.5365    -0.1765     0.3585     1.4667    65
High Score (6-8)       0.6769    -0.4057    -0.2523     0.3329     1.0930     2.4974    90
High – Low             0.5778     0.3464     0.2841     0.5094     0.7345
  t-statistic          3.1804
  p-Value              0.0009
High – All             0.2389     0.1683     0.1140     0.2822     0.1323
  t-statistic          1.5886
  p-Value              0.0574


5. Results

Figure 2 and Figure 3 show the results of the optimization experiments for the 1- and 2-year periods, comparing them with the original strategy of uniform weights. The returns are significantly higher (α=5%) than those of the original strategy for both 1- and 2-year-ahead returns.

Two points stand out at first glance: the difference between the weights for the 1- and 2-year holding periods and the presence of negative weights.

                              F_ROA   F_CF   F_∆ROA   F_ACCRUAL   F_∆LIQUID   F_∆LEVER   F_∆MARGIN   F_∆TURN
Uniform Weights                1.00   1.00     1.00        1.00        1.00       1.00        1.00      1.00
Optimized Weights - 1 year    -0.03   1.00    -0.70       -0.60       -0.18       0.99        0.27      0.62
Optimized Weights - 2 years   -0.53   0.91     0.40       -1.00        0.79       0.38        0.22      0.82

Figure 2 – Weights of Indicators

                     1 year   2 years
Uniform Weights      0.3635    0.6769
Optimized Weights    0.7614    1.1215
p-Value              0.0183    0.0351

Figure 3 – Average Adjusted Returns of 1 and 2-years buy-and-hold strategy (1997 to 2006)

From the first observation one can notice that the companies with the best 1-year performance will not necessarily be among the top performers in the 2-year period (otherwise the weights would be similar). Moreover, it shows that the characteristics of these firms are different.

The existence of negative weights is even more interesting given that, by definition, all indicators should point to “good” conditions. Nevertheless, analyzing the weights jointly, rather than individually, may shed light on some hypotheses, discussed below.

Among the optimized weights for the 1-year return, the two most important are those related to free cash flow and leverage reduction. Together they could indicate that the firm used cash generated by operations to reduce leverage. Asset turnover and margin increase also had positive weights, but lower in absolute value. ROA’s weight was negligible, while ∆ROA’s weight points to firms that reduced their return on assets. One hypothesis for this apparently contradictory result is that the portfolio is built in May, some time after the disclosure of companies’ results, possibly providing enough time for a negative (maybe excessive) market reaction and creating opportunities for superior results. This hypothesis is consistent with the evidence of conservatism found in the North American (Basu 1997) and Brazilian (Costa, Lopes and Costa 2006; Almeida, Scalzer and Costa 2008) markets.

The reduction of liquidity (negative weight on F_∆LIQUID), despite being the second least important indicator, may indicate that part of the noncurrent liabilities turned into current liabilities and may be, together with a negative ∆ROA, one of the reasons for the negative reaction of the market. F_ACCRUAL’s weight was not consistent with those of F_ROA and F_CF: with a negative (or close to zero) profit and a positive cash flow, a negative accrual was expected (recall that F_ACCRUAL = 1 if ACCRUAL < 0, i.e., a negative weight tends to select positive accruals). However, observing the way the accruals were calculated may lead to other hypotheses:

Accruals = (∆CurrAssets – ∆Cash&Eq) – (∆CurrentLiab– ∆STDebt) – Depreciation

If the firm negotiates an extension of payment terms with its suppliers, these

liabilities could be transferred to the non-current liabilities account. Alternatively, the

shareholders may decide to invest more money in the company and this money is used to pay

claimants other than debtholders. These situations would have a positive impact on the accrual

calculated as above and overestimate it, but would not affect the real accrual. It is possible that

this kind of situation happens to financially distressed firms. As a matter of fact, a closer look

at firms with negative ROA, positive CF and positive ACCRUAL (unreported) revealed these

situations. In summary, the indicators seem to sketch firms that were about to go bankrupt

but did not. Perhaps for that reason these firms have a high 1-year return, as the market corrects

its overly pessimistic expectations, but are not among those with the highest 2-year return, as in the

longer term their performance still does not look so bright.
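The overstatement of calculated accruals described above can be illustrated with a small worked example. The sketch below applies the accrual formula to invented balance-sheet changes (the class name, field names and all figures are assumptions made for illustration, not data from the sample) and shows how reclassifying supplier payables as non-current raises the calculated accrual and flips F_ACCRUAL, even though nothing changed in the firm's operations.

```python
from dataclasses import dataclass


@dataclass
class BalanceChanges:
    """Year-over-year changes entering the accrual formula (hypothetical values)."""
    d_curr_assets: float
    d_cash_eq: float
    d_current_liab: float
    d_st_debt: float
    depreciation: float


def accruals(c: BalanceChanges) -> float:
    # Accruals = (dCurrAssets - dCash&Eq) - (dCurrentLiab - dSTDebt) - Depreciation
    return ((c.d_curr_assets - c.d_cash_eq)
            - (c.d_current_liab - c.d_st_debt)
            - c.depreciation)


def f_accrual(value: float) -> int:
    # Indicator as stated in the text: F_ACCRUAL = 1 if ACCRUAL < 0, else 0
    return 1 if value < 0 else 0


# Baseline firm-year (invented numbers).
base = BalanceChanges(d_curr_assets=10.0, d_cash_eq=4.0,
                      d_current_liab=12.0, d_st_debt=2.0, depreciation=5.0)

# Same firm-year, but 10.0 of supplier payables is renegotiated and moved to
# non-current liabilities, so the change in current liabilities is 10.0 lower.
reclassified = BalanceChanges(d_curr_assets=10.0, d_cash_eq=4.0,
                              d_current_liab=2.0, d_st_debt=2.0, depreciation=5.0)

base_accruals = accruals(base)
reclassified_accruals = accruals(reclassified)
print("baseline:    ", base_accruals, "F_ACCRUAL =", f_accrual(base_accruals))
print("reclassified:", reclassified_accruals, "F_ACCRUAL =", f_accrual(reclassified_accruals))
```

An equity injection used to pay current claimants other than debtholders has the same arithmetic effect: the cash inflow and outflow cancel, while the change in current liabilities falls by the amount paid.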

In the set of weights for 2-year returns F_ROA is negative and F_∆ROA is positive,

meaning companies that were not profitable, but that improved their performance. Not being

profitable may have contributed to the firm being an HBM one, while the improvement in ROA

might indicate the beginning of an operational performance improvement leading to higher future

returns. The hypothesis of performance improvement is also reflected in the positive (and

relevant) weights of cash generation, asset turnover and liquidity increase, indicating that the

company is somehow becoming better prepared for operations in the long term. Leverage

reduction and margin increase are also sought, but are less important. F_ACCRUAL’s negative

weight could be interpreted in the same way as for the 1-year-return weights, especially the

hypothesis of shareholders’ investment to liquidate liabilities other than debt (consistent with

liquidity improvement). Unlike the 1-year-return weights, these apparently seek

companies that have been improving operational results and have growth prospects.

Obviously, the above-mentioned hypotheses are merely speculative and require further

investigation.
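To make the role of the weights concrete, the following is a minimal sketch of how a weighted score can reorder firms relative to the uniform (unweighted) count of positive signals. The weight values are invented: only their signs and rough ordering follow the 1-year discussion above, the public-offer weight is arbitrary because the text does not discuss it, and the key names other than F_ROA, F_CF and F_ACCRUAL are hypothetical labels.

```python
from typing import Mapping

# Hypothetical weights. Signs follow the 1-year discussion in the text;
# the magnitudes are invented for illustration and are NOT the optimized values.
WEIGHTS = {
    "F_ROA": 0.0,        # negligible weight
    "F_dROA": -0.3,      # negative: selects firms whose ROA fell
    "F_CF": 0.9,         # one of the two most important
    "F_ACCRUAL": -0.4,   # negative: tends to select positive accruals
    "F_dLIQUID": -0.1,   # negative, second least important
    "F_dLEVER": 0.8,     # one of the two most important
    "F_dMARGIN": 0.3,
    "F_dTURN": 0.4,
    "F_EQ_OFFER": 0.2,   # arbitrary: not discussed in the text
}


def weighted_score(indicators: Mapping[str, int], weights: Mapping[str, float]) -> float:
    """Sum of weight * indicator, where each indicator is 0 or 1."""
    return sum(weights[name] * indicators[name] for name in weights)


def uniform_score(indicators: Mapping[str, int]) -> int:
    """Unweighted score: the number of positive signals."""
    return sum(indicators.values())


# Firm A: every signal is "good".
firm_a = {name: 1 for name in WEIGHTS}

# Firm B: positive cash flow, leverage reduction, margin and turnover gains only.
firm_b = {name: 0 for name in WEIGHTS}
firm_b.update({"F_CF": 1, "F_dLEVER": 1, "F_dMARGIN": 1, "F_dTURN": 1})

for label, firm in (("A", firm_a), ("B", firm_b)):
    print(label, "uniform:", uniform_score(firm),
          "weighted:", round(weighted_score(firm, WEIGHTS), 2))
```

Under these assumed weights firm B outranks firm A despite having fewer positive signals, which is the kind of reordering that negative weights make possible.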

Table 4 presents the distribution of returns by O_Br_SCORE ranges. Panel A shows that

firms with a high score obtained a 1-year market-adjusted return 49 p.p. higher than those with a low score

and 53 p.p. higher than all HBM firms from 1997 to 2006. The results are statistically

significant at 1%. Similarly, panel B shows companies with a high score obtaining 2-year

market-adjusted returns 89 p.p. above those with a low score and 68 p.p. higher than all HBM firms in

the same period. In this case, too, the results are significant at 1%.
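The percentage-point differences quoted above follow directly from the mean returns reported in Table 4. As a quick check, the sketch below recomputes them from the published means (the dictionary layout and function name are illustrative choices).

```python
# Mean market-adjusted returns copied from Table 4 (Mean column).
MEAN_RETURNS = {
    "1-year": {"all": 0.2350, "low": 0.2721, "high": 0.7614},
    "2-year": {"all": 0.4380, "low": 0.2275, "high": 1.1215},
}


def spread_pp(high: float, other: float) -> float:
    """Difference between two mean returns, expressed in percentage points."""
    return round((high - other) * 100, 2)


spreads = {
    horizon: {
        "high_minus_low": spread_pp(m["high"], m["low"]),
        "high_minus_all": spread_pp(m["high"], m["all"]),
    }
    for horizon, m in MEAN_RETURNS.items()
}

for horizon, s in spreads.items():
    print(horizon, s)
```

The recomputed 2-year High-All spread differs from the tabulated 0.6836 in the fourth decimal place, which is consistent with the table having been computed from unrounded means.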

Table 4 – Return of the strategy using optimized weights

Panels A and B present the 1 and 2-years buy-and-hold returns yielded by the weights optimized using Genetic Algorithms applied to financial data from 1997 to 2006. For 1-year (2-years) weights, firms ranked as high score obtained O_BrF_SCORE from 1.42 to 2.88 (from 1.84 to 3.52), while firms ranked as low score obtained O_BrF_SCORE from -1.51 to -0.05 (from -1.53 to 0.15).

Panel A – 1-year market adjusted return (firm-year observations from 1997 to 2006)

                                  Mean  Pctl 10%  Pctl 25%    Median  Pctl 75%  Pctl 90%     n
All firms                       0.2350   -0.4728   -0.2951    0.0309    0.4922    1.1875   367
O_BrF_SCORE range
  -1.51 to -0.78                0.4594   -0.5491   -0.1932   -0.0251    0.2499    3.5690     9
  -0.78 to -0.05                0.2409   -0.4019   -0.3082    0.0671    0.4471    1.3551    54
  -0.05 to 0.69                 0.0247   -0.5385   -0.3441   -0.1345    0.2837    0.7162   142
  0.69 to 1.42                  0.2344   -0.4475   -0.2655    0.1038    0.4917    0.9888   112
  1.42 to 2.15                  0.8531   -0.2654   -0.0692    0.4790    1.5092    2.3862    43
  2.15 to 2.88                  0.3792   -0.7477   -0.0705    0.4752    1.1967    1.2222     7
Low Score                       0.2721   -0.4019   -0.3082    0.0628    0.4471    1.5095    63
High Score                      0.7614   -0.2699   -0.0705    0.4764    1.2989    2.0133    50
High-Low                        0.4893    0.1320    0.2378    0.4136    0.8518    0.5039
  t-statistic                   2.5689
  p-value                       0.0060
High-All                        0.5264    0.2028    0.2247    0.4455    0.8066    0.8258
  t-statistic                   3.1319
  p-value                       0.0014

Panel B – 2-year market adjusted return (firm-year observations from 1997 to 2006)

                                  Mean  Pctl 10%  Pctl 25%    Median  Pctl 75%  Pctl 90%     n
All firms                       0.4380   -0.5739   -0.3664    0.0507    0.9607    2.1624   367
O_BrF_SCORE range
  -1.53 to -0.69                0.2586   -0.6880   -0.5293   -0.1765    0.9670    1.2588    27
  -0.69 to 0.15                 0.2169   -0.5979   -0.4679   -0.0704    0.5325    1.6498    79
  0.15 to 1.00                  0.3688   -0.5499   -0.3675    0.0474    0.7619    1.7360   108
  1.00 to 1.84                  0.3796   -0.5487   -0.3395    0.1375    0.7424    1.5487   103
  1.84 to 2.68                  1.2038   -0.3320   -0.0485    0.6591    2.1025    4.0301    37
  2.68 to 3.52                  1.0110   -0.3432    0.2037    1.2968    1.8718    2.1624    13
Low Score                       0.2275   -0.6838   -0.4696   -0.0774    0.6695    1.5867   106
High Score                      1.1215   -0.3432   -0.2003    0.7546    2.0800    2.6274    50
High-Low                        0.8940    0.3406    0.2693    0.8320    1.4105    1.0407
  t-statistic                   3.8414
  p-value                       0.0001
High-All                        0.6836    0.2308    0.1660    0.7039    1.1193    0.4650
  t-statistic                   3.1371
  p-value                       0.0014

The figures below show the performance of the optimized weights year by year comparing

them with the uniform weights. It can be observed that the optimized weights yield higher

returns in all 10 periods (1-year returns) and in 9 out of 10 periods (2-years returns). Such a

result was already expected, since all periods were used in the optimization process. The

robustness of the weights (i.e. their performance in periods not used in the optimization process)

was not tested and could be a matter for further studies.

Figure 4 – Yearly returns (1-year buy-and-hold). [Chart: returns by year, 1997 to 2006; series: Uniform Weights and Optimized Weights.]

Figure 5 – Yearly returns (2-years buy-and-hold). [Chart: returns by year, 1997 to 2006; series: Uniform Weights and Optimized Weights.]

As mentioned above, the optimization process was repeated 20 times for each experiment

and only the best set of weights was explored in this paper. Hence there are other sets of

weights that, although not yielding returns as high as those presented here, also lead to much higher

returns than those with uniform weights. These other sets of weights also reveal different

profiles of firms that achieved superior returns. A deeper analysis of them could be the subject

of further studies, as could the variation in the relative importance of the weights, which may indicate a

possible change in the market’s assessment criteria.

6. Conclusion

This paper seeks to identify which financial indicators matter most when selecting

firms whose shares would have future returns higher than the market average, and speculates

about the relations among these indicators. It is based on a strategy developed by Piotroski (2000)

and adapted to the Brazilian market by Lopes and Galdi (2007) and Galdi (2008), which

observes 9 financial indicators and selects firms with more “positive” signals. The empirical

approach presented here assigns weights to these indicators and, by means of the computational

technique of Genetic Algorithms, optimizes the selection of firms (and the return of the portfolio)

by choosing these weights.

The results showed that the obtained weights yield an increase of the adjusted return from

36% to 76% (1-year buy-and-hold strategy) and from 68% to 112% (2-years buy-and-hold

strategy) when compared to uniform weights, both results being significant at 5%, but not at

1%. Moreover, they show significant differences in the importance of the financial

indicators and reveal that the relation among them is not trivial. They also show that companies

with the best return over a 1-year period are not the same as those with the best return over 2 years,

since different sets of weights were found when optimizing the portfolio for 1 and 2-years

returns.

Given the nature of the method employed, it is possible that the weights found yield

significantly higher returns only for the period used in the optimization. In this sense, further

research could test the robustness of the weights in subsequent periods, not used in the

optimization process. Another avenue for further study would be to test an investment

strategy combining more than one set of weights (the results of other optimization processes), which

would increase the number of eligible shares and allow the formation of higher-volume

portfolios, given the reduced liquidity of many HBM firms.
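The robustness test suggested above could be organized along the following lines. This is only a sketch under stated assumptions: the observations, the score cutoff, the year split and the function name are all invented for illustration, and no such test was run in this paper. The idea is to compare the excess return of high-score firms in the years used for optimization with the same quantity in years held out from it.

```python
from statistics import mean

# Hypothetical firm-year observations: (year, weighted score, market-adjusted return).
OBSERVATIONS = [
    (2005, 2.1, 0.60), (2005, 0.3, 0.10), (2005, -0.5, -0.20),
    (2006, 1.9, 0.80), (2006, 0.4, 0.05), (2006, -0.2, 0.15),
    (2007, 2.0, 0.10), (2007, 0.5, 0.20), (2007, -0.4, 0.00),
]


def high_minus_all(observations, years, cutoff):
    """Mean return of firms scoring at or above `cutoff` minus the mean
    return of all firms, restricted to the given years."""
    sample = [(score, ret) for year, score, ret in observations if year in years]
    high = [ret for score, ret in sample if score >= cutoff]
    everyone = [ret for _, ret in sample]
    return mean(high) - mean(everyone)


CUTOFF = 1.5  # assumed threshold for a "high score"

in_sample = high_minus_all(OBSERVATIONS, {2005, 2006}, CUTOFF)   # years used to fit weights
out_of_sample = high_minus_all(OBSERVATIONS, {2007}, CUTOFF)     # held-out year

print("in-sample High-All:    ", round(in_sample, 4))
print("out-of-sample High-All:", round(out_of_sample, 4))
```

In this invented data the excess return disappears in the held-out year, which is the outcome such a test is designed to detect.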

References

ABARBANELL, J.S.; BUSHEE, B.J. Fundamental Analysis, Future Earnings and

Stock Prices. Journal of Accounting Research 35, p.1-24, 1997.

ABARBANELL, J.S.; BUSHEE, B.J. Abnormal Returns to a Fundamental Analysis Strategy.

The Accounting Review 73, no. 1, 1998.

ALMEIDA, J.C.G., SCALZER, R.S., COSTA, F.M. Níveis Diferenciados de Governança

Corporativa e Grau de Conservadorismo: Estudo Empírico em Companhias Abertas Listadas

na Bovespa. Revista de Contabilidade e Organizações, v. 2, n. 2, p. 117 - 130 jan./abr. 2008.

BARTOV, E., GOLDBERG, S.R., KIM, M.S. The Valuation-relevance of Earnings and Cash

Flows: an International Perspective. Journal of International Financial Management and

Accounting, vol 12, issue 2, 2001.

BASU, S. The conservatism principle and the asymmetric timeliness of earnings. Journal of

Accounting and Economics 24, p.3-37, 1997.

BEASLEY, D., BULL, D.R., MARTIN, R.R. An overview of genetic algorithms: Part 1,

fundamentals. Tech. rep., Inter-University Committee on Computing, 1993.

BITTENCOURT, G. Inteligência Artificial: Ferramentas e Teorias. Florianópolis: Editora da

UFSC, 1998.

CHEN, C.W.S., CHIANG, T.C., SO, M.K.P. Asymmetrical reaction to US stock-return news:

evidence from major stock markets based on a double-threshold model. Journal of Economics

and Business 55, p.487-502, 2003.

COSTA, F.M., LOPES, A.B., COSTA, A.C.O. Conservadorismo em Cinco Países da América

do Sul. R. Cont. Fin. – USP, São Paulo, n. 41, p. 7 – 20, Maio/Ago. 2006.

DAMODARAN, A. Investment Valuation: Tools and Techniques for Determining the Value of

Any Asset. 2nd ed. New York: Wiley, 2002.

FAMA, E.F.; FRENCH, K.R. The cross section of expected stock returns. Journal of Finance 47,

p.427-465, 1992.

FAMA, E.F.; FRENCH, K.R. Size and Book-to-Market factors in Earnings and Returns.

Journal of Finance 50, p.131-155, 1995.

GALDI, F.C. Estratégias de investimento em ações baseadas na análise de demonstrações

contábeis: é possível prever o sucesso? PhD Thesis – Universidade de São Paulo, São Paulo,

2008.

GRAHAM, B., and DODD, D.L. Security Analysis. McGraw-Hill, New York, 1934.

LAKONISHOK, J.; SHLEIFER, A.; VISHNY, R. Contrarian investment, extrapolation and

risk. Journal of Finance 49, p.1541-1578, 1994.

LOPES, A.B.; GALDI, F.C. Does Financial Statement Analysis Generate Abnormal Returns

Under Extremely Adverse Conditions? In: American Accounting Association (AAA) Meeting,

Chicago, USA, 2007.

MARTINEZ, A.L. Gerenciamento dos resultados contábeis: estudo empírico das empresas

abertas brasileiras. Tese (Doutorado) – Faculdade de Economia e Administração da

Universidade de São Paulo, São Paulo, 2001.

OU, J.A.; PENMAN, S.H. Accounting Measurement, Price-Earnings Ratio and the Information

Content of Security Prices. Journal of Accounting Research, Vol. 27, 1989.

PEREIRA, L.C.S. Impacto do Gerenciamento de Resultado no Retorno Anormal: estudo

empírico dos resultados das empresas listadas na Bolsa de Valores de São Paulo – BOVESPA.

Dissertação (Mestrado) – Fundação Instituto de Pesquisas em Contabilidade, Economia e

Finanças – FUCAPE, Vitória, 2007.

PIOTROSKI, J.D. Value Investing: The Use of Historical Financial Statement Information to

Separate Winners from Losers. Journal of Accounting Research 38, p.1-41, 2000.

ROSENBERG, B., REID, K., and LANSTEIN, R. Persuasive evidence of market inefficiency.

Journal of Portfolio Management 11. 1984, p. 9-17.

RUSSELL, S.; NORVIG, P. Artificial Intelligence: A Modern Approach. 2nd edition. Upper

Saddle River, New Jersey: Prentice Hall, 2003.

SLOAN, R.G. Do Stock Prices Fully Reflect Information in Accruals and Cash Flows About

Future Earnings? The Accounting Review, vol. 71, no. 3, July 1996.