
JOURNAL OF APPLIED ECONOMETRICS
J. Appl. Econ. 23: 235–256 (2008)
Published online in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/jae.981

USING THE VARIANCE STRUCTURE OF THE CONDITIONAL AUTOREGRESSIVE SPATIAL SPECIFICATION TO MODEL KNOWLEDGE SPILLOVERS

OLIVIER PARENT a,* AND JAMES P. LESAGE b

a Department of Economics, University of Cincinnati, Cincinnati, Ohio, USA
b McCoy College of Business Administration, Department of Finance and Economics, Texas State University–San Marcos, San Marcos, Texas, USA

SUMMARY

This study investigates the pattern of knowledge spillovers arising from patent activity between European regions. A Bayesian hierarchical model is developed that specifies region-specific latent effects parameters modeled using a connectivity structure between regions that can reflect geographical proximity in conjunction with technological and other types of proximity. This approach exploits the fact that interregional relationships may exhibit industry-specific technological linkages or transportation network linkages, which is in contrast to traditional studies relying exclusively on geographical proximity. We also allow for both symmetric and asymmetric knowledge spillovers between regions, and for heterogeneity across the regional sample. A series of formal Bayesian model comparisons provides support for a model based on technological proximity combined with spatial proximity, asymmetric knowledge spillovers, and heterogeneity in the disturbances. Estimates of region-specific latent effects parameters structured in this fashion are produced by the model and used to draw inferences regarding the character of knowledge spillovers across the regions. The method is illustrated using sample data on patent activity covering 323 regions in nine European countries. Copyright © 2008 John Wiley & Sons, Ltd.

Received 15 December 2005; Revised 9 February 2007

1. INTRODUCTION

Recent literature has recognized that knowledge spillovers from external sources may have an important impact on innovation processes and economic development (see, for example, Keller, 2002). The new growth theory has devoted considerable attention to the regional dimension of innovation and economic development. Numerous studies (e.g., Jaffe et al., 1993; Maurseth and Verspagen, 2002) find that in both the USA and Europe knowledge spillovers tend to be more intense between agents located nearby in space. This central conclusion that proximity is important for knowledge transmission has led to the use of spatial econometric models that accommodate the implied spatial dependence inherent in patent activity (Anselin et al., 1997).

One contribution of this study is a formal spatial econometric methodology that introduces geographical proximity to address spatial dependence in combination with other types of proximity during model specification, estimation and testing. For example, we combine geographical proximity with technological similarity/proximity, and we also demonstrate a model that combines geographical and transport proximity. Intuitively, knowledge generated by a region should be more easily adopted by another nearby region whose research activity and production occur within a similar scientific field or involve similar production technologies.

* Correspondence to: Olivier Parent, Department of Economics, University of Cincinnati, 1208 Crosley Towers, PO Box 210371, Cincinnati, OH 45221-0371, USA. E-mail: [email protected]

Another contribution of our methodology is allowance for asymmetric spatial spillover effects. Technology gap theories posit that technology diffusion is in no sense automatic, but requires a certain level of economic development and absorptive capacity to benefit from knowledge spillovers. For example, leading regions might exploit their development advantage by acquiring innovations from their less developed neighbors.

Recognition that geographical space is of crucial importance for the spillover of knowledge raises questions regarding the underlying mechanism that creates the observed spatial dependence in spillovers as well as the processes governing their diffusion over space. A complete exploration might involve issues such as: the importance of 'path dependence' since the birthplace of high-technology firms may be important, proximity to leading universities and their role in spawning technology firm start-ups, the role of amenities that might influence location of research scientists and entrepreneurs, and so on. Many of these issues would best be explored with a space–time model and panel data sources.

Our cross-sectional sample of 323 European regions is used in conjunction with a Bayesian hierarchical linear model that results in a more limited focus for this investigation. Region-specific latent effects parameters are introduced and specified using a host of alternative structures. Some of the specifications involve only geographic space, while others involve geographic space in addition to potentially important factors such as technological or transport network proximity. These specifications are interacted with either symmetric or asymmetric structures for the spillovers, as well as normal distributions or independent Student-t (mixture of normals) distributions for the disturbances (Geweke, 1993), which have the ability to accommodate heteroscedasticity across the sample of regions. From this bevy of alternative specifications for the region-specific latent effects parameters that capture spatial spillovers, we attempt to answer questions regarding the symmetric versus asymmetric nature of spillovers, the importance of heterogeneity, and the role of technological versus communication/transport networks in knowledge spillovers.

Our findings based on formal Bayesian model comparison methods indicate that allowing for asymmetric spillovers and heterogeneity across regions is important for modeling knowledge spillovers. In addition, we find that models which incorporate both geographic and technological proximity are more consistent with the sample data than those based on geographic or network proximity.

Section 2 of the paper provides background on the uses of geographic, technology, and transportation network proximity indices. It also motivates a model-based construction for the first stage of the hierarchical linear model that specifies a heterogeneous variance structure and introduces the basic conditional autoregressive (CAR) spatial model specification in the context of a hierarchical linear model. In Section 3 we further develop the CAR specification for spatial dependence in innovative activity in the context of a knowledge production function. Section 4 discusses estimation of the model as well as model comparison procedures that allow us to test models based on alternative structures that incorporate various types of spatial, transport network, and technological proximity. An applied illustration of the method can be found in Section 5 and conclusions are in the final section.


2. CONSTRUCTION AND USE OF TECHNOLOGICAL AND GEOGRAPHICAL PROXIMITY INDICES

The relation between innovative activity and geography is a subject of growing interest in economics. Empirical studies have confirmed the importance of geographical proximity in the transmission of knowledge (Jaffe et al., 1993), and have shown that some regions have advantages in generating innovations. The spatial dimension of knowledge spillovers can be extremely important as motivated by Autant-Bernard (2001), who provides an extensive discussion detailing the role of spatial proximity in the innovation process. One way to quantify the potential for spillovers that takes into account both technological and spatial proximity is an extension of the Jaffe (1986) index to measure technological similarity of patents in neighboring regions.

2.1. Conceptualization and Measurement of Technological, Transportation, and Geographical Indexes

It is well known that there are large differences between European regions in terms of technological competencies. The empirical application in this study is based on 323 regions in nine European countries for the year 1996. The data are taken from the EUROSTAT Regio data set. This provides a large sample for exploring differences in innovation based on both spatial and technological proximity. We use the NUTS 3 (Nomenclature of Territorial Units) level of regional aggregation for Denmark, France, Great Britain, Ireland, and Italy, NUTS 2 for Belgium, Holland, and West Germany, and NUTS 1 for Luxembourg. These decisions were made to achieve similar sized regions. Figure 1 shows the various levels of innovative activity across regions for the year 1996. The innovation intensity is high in southern England, Belgium, the Netherlands, southern West Germany, northern Italy, and eastern and central France. The mean of patents granted per 100,000 inhabitants for all European regions in our sample is $8.92 \times 10^{-2}$ and the variance is $1.60 \times 10^{-2}$, reflecting substantial dispersion across regions. This section focuses on quantitative measures that capture both the spatial and technological nature of these differences in innovative activity.

Figure 1. Log of patents per 100,000 inhabitants for the year 1996

Jaffe (1986) measured technological proximity using the distribution of patents over different technology fields for a sample of US firms. He measured closeness between any two firms by calculating a correlation coefficient (cosine index) between vectors representing the distribution of firm-level patents over technology fields. Following Jaffe (1986), we adopt a technology classification scheme that distinguishes $k = 1, \dots, m$ different fields of technology based on the International Patent Classification (IPC), where we rely on $m = 8$ different technology sectors. The IPC is an internationally agreed upon, mutually exclusive and comprehensive system for patent classification. We measure distances between technological fields by analyzing patent activity occurring between regions in the same technological codes (IPC). Using $i = 1, \dots, 323$ to index the regions, we assign patents granted in each region $i$ to one or more classification codes. The Jaffe index shown in (1) provides a measure of technological proximity between two regions $i$ and $j$ based on technological classes, where $F_{ki}$ represents the number of patents granted in classification code $k$ for region $i$:

$$S_{ij} = \frac{\sum_{k=1}^{m} F_{ki} F_{kj}}{\sqrt{\sum_{k=1}^{m} F_{ki}^{2} \, \sum_{k=1}^{m} F_{kj}^{2}}} \qquad (1)$$

From (1), it is easy to see that $S_{ij}$ takes on larger values when two regions grant patents in the same technological fields (i.e., the two classes are cognitively close to each other). Conversely, $S_{ij}$ will be zero for regions which have technological classes with no overlap.
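The two properties just noted can be illustrated numerically. The following is a minimal sketch of the index in (1), not the authors' code; the function name `jaffe_index` and the patent counts are hypothetical, and every region is assumed to have at least one patent so that the denominator is nonzero.

```python
import numpy as np

def jaffe_index(F):
    """Cosine index between regional patent profiles, as in equation (1).

    F is an (m, n) array with F[k, i] = patents granted in technology
    field k for region i. Returns the (n, n) matrix S.
    Assumes every region has at least one patent (nonzero column).
    """
    F = np.asarray(F, dtype=float)
    cross = F.T @ F                    # sum_k F_ki * F_kj
    norms = np.sqrt(np.diag(cross))    # sqrt(sum_k F_ki^2)
    return cross / np.outer(norms, norms)

# Hypothetical counts: 3 technology fields (rows) by 3 regions (columns).
F = np.array([[10, 5, 0],
              [ 0, 0, 7],
              [ 2, 1, 0]])
S = jaffe_index(F)
```

In this toy example regions 0 and 1 have proportional patent profiles, so their index reaches its maximum, while region 2 patents only in a field the others do not use, so its index with both is zero.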

Audretsch and Feldman (1996) argue that transportation costs are inversely related to the distance shipped, so that higher transportation costs should be associated with lower geographic concentration of production. Extending this idea to the quality of transportation infrastructure, we can assume that spatial knowledge interaction is inversely related to transportation time between areas, which we measure using (2):

$$T_{ij} = \frac{1}{TT_{ij}} \qquad (2)$$

where $TT_{ij}$ is a measure of transportation time between the main administrative city of region $i$ and the main administrative city of region $j$.

Note that both the technological and transportation proximity measures are symmetric. An important question that relates to spatial modeling is the amount of regional variation in the measure of technological proximity. It may be the case that proximity among technologies is an intrinsic feature related to the properties of the underpinning knowledge bases, making it relatively invariant across regions. On the other hand, technological proximity may depend heavily upon the characteristics of each regional innovation system. Audretsch and Feldman (1996) note that the location of manufacturing activity is one of many factors that explain the spatial distribution of innovative activity. Our model-based approach will allow us to explore the relative importance of geographical, technological, and transport proximity by comparing models that include these different proximity indexes.

2.2. A Motivation for Model-Based Indices

Despite their intuitive appeal, empirical indices such as the Jaffe index are based on a number of important structural assumptions. Most studies that take interregional spillovers into account have assumed that the technological distance between two regions is symmetric (Jaffe, 1986). This assumption may be convenient from a measurement and modeling point of view, but asymmetric technological distance between regions seems plausible and should be accommodated in applied modeling. This can be accomplished by using the level of gross domestic product (GDP) as a proxy for the size of each region. One motivation for use of GDP is that the level of economic activity may have an impact on observed spillover effects. It seems plausible that spillovers depend on both technological proximity and the intensity of economic activity. For example, if two regions innovate in the same field, their interactions may differ depending on whether they share the same level of economic activity, or the spillover producing and receiving regions exhibit different levels of activity.

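Incorporating the relative levels of economic activity for each region can be accomplished by modifying the technological proximity index from region $i$ to $j$ as shown in (3):

$$W_{ij} = \left(\frac{GDP_i}{GDP_j}\right)^{1/2} \times S_{ij} = \left(\frac{GDP_i}{GDP_j}\right)^{1/2} \times \frac{\sum_{k=1}^{m} F_{ki} F_{kj}}{\left(\sum_{k=1}^{m} F_{ki}^{2} \, \sum_{k=1}^{m} F_{kj}^{2}\right)^{1/2}} \qquad (3)$$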

In (3), the ratio $(GDP_i/GDP_j)$ measures the output gap between areas $i$ and $j$ (Maurseth and Verspagen, 2002). Larger values of this ratio correspond to larger gaps between $j$ (low level of economic activity) and $i$ (high level of economic activity). Thus, the asymmetric effects between the technological distances $W_{ij}$ and $W_{ji}$ are more pronounced when the economic activity gap between areas $i$ and $j$ increases. We note that the Jaffe index suggests technological proximity can only accentuate spillovers. Our modification of this index with the output gap measure creates an asymmetry between spillovers from $i$ to $j$ and $j$ to $i$ that depends on differences in regional economic activity. These asymmetric spillover effects are consistent with the idea that leading regions may exploit their technological advantage through acquisitions from neighbors (Van Pottelsberghe and Lichtenberg, 2001).
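To make the asymmetry concrete, the sketch below applies the output-gap scaling of (3) to a symmetric proximity matrix. It is an illustration under stated assumptions rather than the authors' implementation: the function name `asymmetric_weights`, the proximity value 0.8, and the GDP figures are all hypothetical. The same scaling could be applied to any symmetric proximity matrix, such as $S$ from (1) or the transport measure in (2).

```python
import numpy as np

def asymmetric_weights(P, gdp):
    """Scale a symmetric proximity matrix P by the output gap, as in (3).

    W[i, j] = sqrt(gdp[i] / gdp[j]) * P[i, j]
    """
    P = np.asarray(P, dtype=float)
    root = np.sqrt(np.asarray(gdp, dtype=float))
    return np.outer(root, 1.0 / root) * P

# Hypothetical inputs: two regions with proximity 0.8 and a fourfold GDP gap.
P = np.array([[0.0, 0.8],
              [0.8, 0.0]])
gdp = np.array([400.0, 100.0])
W = asymmetric_weights(P, gdp)
```

With a fourfold GDP gap the square-root scaling doubles the weight in one direction and halves it in the other, so the two directed weights differ by a factor of four even though the underlying proximity is symmetric.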

The role of the transportation network quality on regional innovative activity is measured by an asymmetric index:

$$W_{ij} = \left(\frac{GDP_i}{GDP_j}\right)^{1/2} \times T_{ij} \qquad (4)$$

where $T_{ij}$ are elements of the symmetric matrix (which we denote $T$) incorporating transportation times between regions. This measure will capture unequal access between regions and allow us to test the importance of transport infrastructure on inter-regional knowledge spillovers. For convenience, we will refer to this as transportation network proximity in the sequel.

A key aspect of the methodology proposed here is use of alternative matrices $W$ with elements $W_{ij}$ such as those in (3) and (4) in the connectivity structure of our random effects CAR specification. This allows us to produce alternative models that emphasize different types of proximity. Details regarding how this is accomplished are provided in the next section.

3. SPATIAL HIERARCHICAL MODELING

The model methodology set forth here allows us to systematically explore the relative importance of geographical versus technological distance between the regions in our European sample. The previous section argued that several non-geographic factors may play an important role in determining knowledge flows. The effects of technological specialization, output gaps (between spillover-receiving and spillover-producing regions), the level of innovative activity and the effect of national laws and regulations on innovation need to be accommodated by our model. Spatial spillover effects on innovation need to be considered in light of these non-spatial factors.

3.1. General Model Structure

One approach to empirically model the characteristics of localized knowledge flows and their influence on regional innovation is the knowledge production function framework introduced by Griliches (1979). We slightly modify this production function so that the increment of the innovative output depends upon a number of additional factors related to the economic and institutional environment within which the process of innovation takes place. For each region $i = 1, \dots, n$, patents are used as a proxy for knowledge output (Hall et al., 2001; Anselin et al., 1997). We propose a Cobb–Douglas production function that relates knowledge output ($y$) to public and private research and development inputs in a regression-based framework. This framework is extended to include both a spatially structured component as well as an unstructured component to model latent or unobservable influences.


As explanatory variables we use two measures of research and development: (1) R&D expenditure (EXP) of the business enterprise sector (measured in euros) as a percentage of GDP in 1996, and (2) the number of R&D personnel (NPriv) in full-time employment of the business enterprise sector. A measure of the sectorial decomposition of innovation (Div) for each region is also introduced as an explanatory variable to capture diversity or specialization of technology in the regions. A greater variety of technology within a region may promote or hinder knowledge spillovers and innovation (Jacobs, 1969). This diversity variable is defined as the share of the largest four patenting sectors in the total number of patents granted in each region, and allows us to test the extent to which the nature of economic organization within regions is important in the innovation production process. Higher values for this variable reflect less industrial diversity in the region. These variables (in log form) along with a constant term constitute the explanatory variables matrix $X$ in our model.

Due to a lack of data on regional R&D expenditures at the NUTS3 geographical level, the logarithms of the EXP and NPriv variables are based on larger geographic regions for each area $i$. Specifically, we use NUTS2 sample data information for areas based on NUTS3 geography, and NUTS1 information for areas corresponding to NUTS2 geography. The dependent variable vector $y$ contains logged patents granted per 100,000 inhabitants in each of the 323 regions.

The linear mixed model provides a natural framework to introduce random effects as explanatory variables. A common statement of the linear mixed model leads to an alternative formulation of the basic regression shown in (5):

$$y = \alpha + X\beta + \theta + \varepsilon \qquad (5)$$

The $n \times 1$ vector $y$ contains the (log of) patents granted per 100,000 inhabitants, $X$ is an $n \times p$ matrix of explanatory variables, and $\alpha$ is an intercept. The $p \times 1$ vector of regression coefficients $\beta$ is often referred to as fixed effects, whereas $\theta$ and $\varepsilon$ represent spatially structured and unstructured components, respectively. The $n \times 1$ vector $\theta$ represents latent regional effects, and the innovative aspect of our methodology centers on specification of both $\theta$ and $\varepsilon$.

As previously stated, we assume that the production of knowledge in a region depends not only on its own research efforts and internal factors but also on the knowledge available in other regions. In addition to conventional factors of production such as labor and capital, there are additional influences such as externalities related to human capital, public capital, network externalities, and agglomeration economies, which are internal to the region. With our specification of spatial effects, all of these factors may also influence innovative activities in neighboring areas. This is in contrast to past empirical studies that proxy knowledge available in other regions by either a spatial lag of innovation output from neighboring regions measured through their patents (Anselin et al., 1997), or by explanatory variables reflecting research effort in neighboring regions.

We allow for a richer specification where spatial effects are based on the difference $(y - \alpha - X\beta)$ between the output and the input of the regions. This can be viewed as representing systematic differences between knowledge output ($y$) (proxied by patents granted) and observed inputs $(\alpha + X\beta)$, since the latent vector $\theta$ is structured to exhibit spatial (and possibly technological and transport) dependence. That is, our approach can be viewed as conditioning on the difference between observed output and inputs $(y - \alpha - X\beta)$. Since the structured latent regional effects are used to capture regional knowledge output not explained by own-region inputs, we interpret the magnitudes of the latent parameter vector as reflecting spatial spillovers that arise in the knowledge production process.


3.2. Incorporating Spatial Dependencies

First, we specify the joint distribution of the spatially structured effects as

$$\theta \sim N(0, \sigma_\theta^2 \Sigma_\rho) \qquad (6)$$

where $\Sigma_\rho$ is an $n \times n$ positive definite correlation matrix. The parameter $\sigma_\theta^2 > 0$ is introduced to control the overall variance of the spatial components $\theta_i$ and the parameter $\rho$ may be interpreted as measuring the strength of spatial dependence. Various structured forms may be assumed for $\Sigma_\rho$ to model the spatial dependency (see Berger et al., 2001, for a review). Spatial variability will be introduced through a Gaussian conditional autoregressive (CAR) specification, which requires that a matrix $W$ reflecting the spatial connectivity structure between regions be specified. This represents a difficulty that arises in CAR model specification, since it is often unclear how to choose elements of the weight matrix $W_{ij}$. We take regions $i$ and $j$ to be neighbors if they share a common boundary, often referred to as first-order contiguity. This is reasonable because the sample regions were defined so that all regions are of a similar size.

Cressie (1995) notes that if we rely on properties of a Markov random field, the conditional distribution of $\theta_i$ given $\{\theta_j, i \neq j\}$ depends only on $\{\theta_j, j \in \delta_i\}$, where $\delta_i$ represents a set of regions that are 'neighbors' to region $i$. This definition is in concordance with several empirical works arguing that spatial proximity favors technological externalities (Jaffe et al., 1993). Specifically:

$$\theta_i \mid \{\theta_j, i \neq j\} \sim N\left(\sum_{j \in \delta_i} \rho W_{ij} \theta_j,\; \sigma_\theta^2 M_{ii}\right) \qquad (7)$$

Various constraints are needed on the specification of the matrices $W$ and $M$ to ensure that $\Sigma_\rho$ is symmetric positive definite. First, $\Sigma_\rho$ is only symmetric if $W_{ij} M_{jj} = W_{ji} M_{ii}$. Second, $\mathrm{Var}(\theta_i \mid \theta_j) = \sigma_\theta^2 M_{ii} > 0$, so $M_{ii}$ must be positive. From the CAR specification (7), the joint distribution corresponding to (6) is given by

$$\theta \sim N\left(0, \sigma_\theta^2 (I_n - \rho W)^{-1} M\right) \qquad (8)$$

Introducing the output gap $M_{ii} = GDP_i$ allows us to satisfy the condition of symmetry $W_{ij} M_{jj} = W_{ji} M_{ii}$. The asymmetric matrix $W_{ij}$ represents either the technological proximity index (3), or the index of transport infrastructure from (4).
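The symmetry condition can be checked directly for the technological index in (3). The short derivation below is a sketch that assumes $M_{ii} = GDP_i$ as stated above and uses only the symmetry of the Jaffe index, $S_{ij} = S_{ji}$:

```latex
\begin{aligned}
W_{ij} M_{jj} &= \left(\frac{GDP_i}{GDP_j}\right)^{1/2} S_{ij}\, GDP_j
               = \left(GDP_i \, GDP_j\right)^{1/2} S_{ij}, \\
W_{ji} M_{ii} &= \left(\frac{GDP_j}{GDP_i}\right)^{1/2} S_{ji}\, GDP_i
               = \left(GDP_i \, GDP_j\right)^{1/2} S_{ji}.
\end{aligned}
```

The two right-hand sides coincide because $S_{ij} = S_{ji}$. The same argument goes through for the transport index (4) with the symmetric $T_{ij}$ in place of $S_{ij}$.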

The variance matrix in (6) takes the form $\Sigma_\rho = M^{1/2}(I - \rho M^{-1/2} W M^{1/2})^{-1} M^{1/2}$. To ensure $\Sigma_\rho$ is positive definite, $\rho$ must lie between $\lambda_{\min}^{-1}$ and $\lambda_{\max}^{-1}$, where $\lambda_{\min}$ and $\lambda_{\max}$ are the smallest and largest eigenvalues of $M^{-1/2} W M^{1/2}$. We also note that the variance $\sigma_\theta^2 M_{ii}$ of each spatial effect should be proportional to economic activity $(GDP_i)$, allowing for high variability between regions that differ in terms of GDP.
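The construction of the CAR covariance and the admissible range for the dependence parameter can be sketched numerically. The code below is an illustrative sketch, not the authors' estimation code: the function names `car_covariance` and `rho_bounds`, the three-region contiguity pattern, and all numerical values are hypothetical. It assumes that $W$ has a zero diagonal, is zero for non-contiguous regions, and satisfies the symmetry condition $W_{ij} M_{jj} = W_{ji} M_{ii}$.

```python
import numpy as np

def car_covariance(W, m, rho, sigma2=1.0):
    """CAR covariance sigma2 * (I - rho W)^{-1} M with M = diag(m), as in (8)."""
    n = len(m)
    return sigma2 * np.linalg.solve(np.eye(n) - rho * W, np.diag(m))

def rho_bounds(W, m):
    """Admissible interval for rho from the eigenvalues of M^{-1/2} W M^{1/2}."""
    root = np.sqrt(m)
    A = W * np.outer(1.0 / root, root)         # M^{-1/2} W M^{1/2}
    lam = np.linalg.eigvalsh((A + A.T) / 2.0)  # A is symmetric under the condition
    return 1.0 / lam[0], 1.0 / lam[-1]

# Hypothetical example: three regions on a line (0-1 and 1-2 are contiguous).
gdp = np.array([400.0, 100.0, 225.0])
P = np.array([[0.0, 0.8, 0.0],
              [0.8, 0.0, 0.5],
              [0.0, 0.5, 0.0]])                # symmetric proximity among neighbors
root = np.sqrt(gdp)
W = np.outer(root, 1.0 / root) * P             # asymmetric weights as in (3)

lo, hi = rho_bounds(W, gdp)
Sigma = car_covariance(W, gdp, rho=0.5 * hi)
```

With $M_{ii} = GDP_i$ and $W$ built as in (3), the matrix $M^{-1/2} W M^{1/2}$ reduces to the symmetric proximity matrix itself, which is why its eigenvalues are real and the bounds on the dependence parameter are well defined in this sketch.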

This structured CAR approach represents a departure from typical spatial models in that both spatial and technological or transportation connectivity of the regions under study may enter the model. Since empirical studies of innovation are often confronted with a lack of data pertaining to the myriad of factors that might explain variation in knowledge flows, this approach to modeling these as unobservable or latent factors seems appropriate.


3.3. A Comparison of our Approach to Other Models

There are alternatives to our approach that have been used to model innovation and patent activity. The most frequently used are spatial regression models that include a spatial lag of the dependent variable (Anselin et al., 1997). Our specification allows different variances for the structured spatial effects and unstructured regional heterogeneity, and we focus on both local spatial clustering as well as global heterogeneity. This two-component error decomposition cannot be used in models that rely on spatial lags of the dependent variable.

There is an additional complication that interpretation of the parameters $\beta$ is not straightforward in these models. The model takes the form in (9), with an implied data-generating process in (10) (see Kim et al., 2003):

$$y = \rho W y + X\beta + \varepsilon \qquad (9)$$

$$y = (I_n - \rho W)^{-1} X\beta + (I_n - \rho W)^{-1} \varepsilon \qquad (10)$$

$$(I_n - \rho W)^{-1} = I_n + \rho W + \rho^2 W^2 + \rho^3 W^3 + \dots$$

where $W$ is a row-stochastic spatial weight matrix, with zeros on the diagonal, and the matrix $X$ represents explanatory variables. The $k$-element vector $\beta$ contains regression parameters associated with fixed factors, and the $n$-element vector $\varepsilon$ is typically assumed distributed as $N(0, \sigma^2 I_n)$.

In our extension of conventional regression, partial derivatives of $y_i$ with respect to the $i$th observation and $r$th variable, $x_{ir}$, have a simple form: $\partial y_i/\partial x_{ir} = \beta_r$ for all $i, r$; and $\partial y_i/\partial x_{jr} = 0$ for $j \neq i$ for all $r$. In contrast, the spatial regression model in (9) results in complicated derivatives of $y_i$ with respect to $x_{ir}$ and $x_{jr}$. Since the explanatory variables matrix $X$ is transformed by the $n \times n$ matrix inverse $(I_n - \rho W)^{-1}$ this results in a situation where any change to an explanatory variable in one region can affect the dependent variable in all other regions, and the conventional interpretation of estimated parameters from the conventional model no longer holds. Specifically, $\partial y_i/\partial x_{jr} \neq 0$, and $\partial y_i/\partial x_{ir} \neq \beta_r$. Typical applications of these models ignore this and interpret the parameters $\beta$ as if they were equivalent to those from least-squares (for a discussion of this see Kim et al., 2003, and Abreu et al., 2004).
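A small numerical illustration may help show why the derivatives in the spatial lag model differ from the regression coefficient. The sketch below is hypothetical throughout: the three-region row-stochastic weight matrix and the values of the dependence parameter and coefficient are chosen only for illustration. It computes the matrix of partial derivatives implied by (10) and checks it against a truncated version of the power series given above.

```python
import numpy as np

# Hypothetical row-stochastic weights for three regions on a line.
W = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
rho, beta_r = 0.4, 2.0
n = W.shape[0]

# In (10), dy_i/dx_jr is element (i, j) of (I_n - rho W)^{-1} * beta_r.
effects = np.linalg.inv(np.eye(n) - rho * W) * beta_r

# Truncated power series I + rho W + rho^2 W^2 + ... as a check on the inverse.
series = sum(np.linalg.matrix_power(rho * W, p) for p in range(60)) * beta_r
```

In this example the diagonal of `effects` exceeds `beta_r` because of feedback through neighbors, and the entry linking regions 0 and 2 is nonzero even though they are not contiguous, since the second-order term of the series connects them through region 1.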

Two competing approaches to specifying spatially structured effects are the simultaneous and conditional autoregressive models that we label SAR and CAR. An illustration of spatially structured effects parameters based on the SAR model is provided by Smith and LeSage (2004), whereas Cressie (1995) treats the CAR specification in detail.

A spatial autoregressive specification for the latent effects parameters would lead to a variance–covariance structure for the effects equal to $(I_n - \rho W)^{-1} \Omega (I_n - \rho W')^{-1}$, where a homoscedastic specification based on $\Omega = \sigma^2 I_n$ is usually assumed. This contrasts with the heteroscedastic variances of the CAR structured effects, $\sigma^2 (I_n - \rho W)^{-1} M$, that arise from the need to achieve symmetry. Use of one structure versus another may be largely a matter of modeling preference, computational convenience, or empirical performance in any particular application.

We compared the SAR and CAR structures for the effects parameters specification of our model by relying on both a homoscedastic SAR specification and the heteroscedastic SAR specification proposed by Wall (2004), where $\Omega = \sigma^2 M$. A number of variants for our CAR specification are compared in Section 5.1 using log-marginal likelihood values described in Section 4.2. These values ranged from −505.5 to −411.1 for our CAR specifications, versus −513.42 and −527.61 for the heteroscedastic and homoscedastic variants of the SAR structured effects, respectively. The SAR structure was applied to effects in the model that produced the highest CAR log-marginal likelihood value of −411.1, resulting in a difference of over 100 between these two models. The difference between the log-marginal of −513.42 from the best (heteroscedastic) SAR specification and the worst of our CAR specifications (−505.5) was 7.92. In Section 4.2 we discuss use of log Bayes factors for model comparison and note that differences between log-marginal likelihood values for two models greater than 4.60 are interpreted as 'decisive evidence' against the competing model, in this case the SAR specification for the spatial effects parameters. This leads us to conclude that our modeling preference for the CAR over the SAR structure in this application is consistent with our sample data.

3.4. Robust Modeling Using the t-Distribution

The Student-t distribution has frequently been used to deal with sample data containing outliers (e.g., Geweke, 1993), since aberrant observations can exert a large impact on the parameter estimates. In the context of spatial samples, where some regions have very small or large numbers of patents, these outliers can contaminate parameter estimates of the neighboring locations. This may create an artifact that resembles a spatial clustering pattern in the estimates, since it allows a single outlier to produce a contagion effect that can impact estimates for an entire region of the sample. To overcome this problem we use an approach proposed by Geweke (1993) that takes into account non-constant variance and outliers. In these models, the disturbances take the following form:

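$$\varepsilon \sim N(0, \sigma_\varepsilon^2 V), \qquad p(V) = \prod_{i=1}^{n} \mathrm{Ga}^{-1}\left(v_i \,\Big|\, \frac{\nu}{2}, \frac{\nu}{2}\right) \qquad (11)$$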

where $V = \mathrm{diag}(v_1, v_2, \ldots, v_n)$ represents a set of $n$ variance scalars, and $\nu$ represents the single parameter of the inverted Gamma distribution. This specification allows us to estimate the $n$ variance scaling parameters $v_i$ by adding only a single parameter to the model (see Parent and LeSage, 2006, for details on implementation). For the purpose of discussing estimation results we note that posterior estimates indicating small values of the parameter $\nu$ lead to a skewed distribution for the variance scalars in $V$ that can deviate greatly from their prior mean of unity. This provides evidence in favor of a thick-tailed error distribution and heterogeneity. On the other hand, large posterior estimates for $\nu$ suggest thin-tailed disturbances obeying homoscedasticity.
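The scale-mixture representation in (11) can be simulated directly. The sketch below is an illustration rather than the authors' sampler: it assumes the inverted Gamma has shape and rate both equal to $\nu/2$ (so that $\nu/v_i$ is chi-square with $\nu$ degrees of freedom), and the sample size, seed and $\sigma_\varepsilon$ are arbitrary.

```python
import random
import statistics

def draw_variance_scalars(n, nu, rng):
    """Draw v_i from an inverted Gamma(nu/2, nu/2), so nu / v_i is chi-square(nu)."""
    # gammavariate takes (shape, scale); a Gamma with shape nu/2 and scale 2/nu
    # is the distribution of the precision 1 / v_i.
    return [1.0 / rng.gammavariate(nu / 2.0, 2.0 / nu) for _ in range(n)]

rng = random.Random(0)
nu, sigma_eps = 4.0, 1.0
v = draw_variance_scalars(20000, nu, rng)
# Conditional on v_i, each disturbance is normal with variance sigma_eps^2 * v_i.
eps = [rng.gauss(0.0, sigma_eps * vi ** 0.5) for vi in v]

mean_precision = statistics.fmean(1.0 / vi for vi in v)   # close to one
share_large = sum(abs(e) > 3.0 for e in eps) / len(eps)   # tail mass beyond 3
print(round(mean_precision, 3), round(share_large, 4))
```

With $\nu = 4$ the share of draws beyond three standard units is far larger than the roughly 0.3% implied by a normal distribution, which is the thick-tailed behavior the text associates with small $\nu$.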

4. ESTIMATION OF THE MODEL USING MARKOV CHAIN MONTE CARLO

A fully hierarchical Bayesian approach to estimation requires specification of the prior distributions for the fixed effects parameters $\beta$, the parameter $\rho$ measuring the strength of spatial dependence, the variance $\sigma_\theta^2$ for the spatial effects $\theta$, and the variances $\sigma_\varepsilon^2$ and $V$ of the disturbances $\varepsilon$.

Because the general model contains only proper prior distributions, the joint posterior distribution is proper. However, in our application, we compare this general model to special cases that involve improper priors, a subject taken up in the next section.

4.1. Comparing Models of Technological and Spatial Proximity

In an effort to assess the relative importance of technological, transport, and spatial proximity as well as robust modeling for the disturbance variance, we compare models that emphasize these different facets of the general model described in the previous sections. Four modeling scenarios considered here are summarized in Table I.

Model 1 represents the general model that includes asymmetric spatial effects as well as the mixture distribution for the disturbances that is robust to outliers. As noted, we rely on an estimated hyperparameter for $\nu$ to control the amount of heterogeneity in the disturbances, with a Metropolis–Hastings (M-H) step used to draw from the posterior distribution $p(\nu \mid V)$ as described in Parent and LeSage (2006).

Model 2 is a special case of Model 1 where the parameter $\nu$ controlling the variability of the heterogeneity effects is fixed and the hyperparameter $\lambda_0$ is eliminated from the model. The weight matrix $W_{ij}$ remains the same, allowing for asymmetric spatial dependencies.

Model 3 reflects a model with symmetric spatial effects. This is achieved by restricting the weight matrix $W_{ij}$ to be symmetric:

$$\theta_i \mid \{\theta_j,\ i \neq j\} \sim N\left(\bar{\theta}_i,\ \sigma_\theta^2 / n_i\right) \qquad (12)$$

where $\bar{\theta}_i = \sum_{j \in \delta_i} (W_{ij}/W_{i+})\,\theta_j$ and $W_{i+} = \sum_j W_{ij}$. The CAR structure based on spatial contiguity alone implies that $W_{ij} = 1$ if areas $i$ and $j$ are adjacent, and $W_{ij} = 0$ otherwise (with $W_{ii}$ set to 0). In this variant of the CAR specification (7), $M_{ii} = 1/W_{i+} = 1/n_i$, where $n_i$ is the number of areas adjacent to area $i$. This model is still robust to heteroscedastic disturbances, allowing the Student-t distribution to govern the disturbance process as in Model 2. However, the parameter $\rho$ measuring the strength of the spatial dependence is set to $\lambda_{\max} = 1$, the maximum eigenvalue of $W$. One feature of this conditional specification is that, given the specific choice of $W$, the model does not yield a positive definite precision matrix. Hence the corresponding joint specification does not exist. A possible benefit of this is that the model becomes non-stationary and may be capable of capturing spatially irregular behavior. Besag and Kooperberg (1995) resolve the lack of a full joint specification by including a separate intercept term with an invariant uniform prior and imposing the constraint that the $\theta_i$ sum to zero.

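Table I. Models

Level     Model 1                                  Model 2                                  Model 3                                  Model 4

Level 1   $y = a + X\beta + \theta + \varepsilon$
Level 2   $\theta \sim N(0, \sigma_\theta^2 (I_n - \rho W)^{-1} M)$
          $\varepsilon \sim N(0, \sigma_\varepsilon^2 V)$
Level 3   $p(v_i) \sim Ga^{-1}(\nu/2, \nu/2)$      -                                        -                                        -

Prior     $p(\beta) \sim N(f, T)$                  $p(\beta) \sim N(f, T)$                  $p(\beta) \sim N(f, T)$                  $p(\beta) \sim N(f, T)$
          $p(a) \sim N(f_0, T_0)$                  $p(a) \sim N(f_0, T_0)$                  $p(a) \sim U(-\infty, +\infty)$          $p(a) \sim U(-\infty, +\infty)$
          $p(\sigma_\theta^2) \sim Ga^{-1}(a, b)$  $p(\sigma_\theta^2) \sim Ga^{-1}(a, b)$  $p(\sigma_\theta^2) \sim Ga^{-1}(a, b)$  $p(\sigma_\theta^2) \sim Ga^{-1}(a, b)$
          $p(\sigma_\varepsilon^2) \sim Ga^{-1}(c, d)$  $p(\sigma_\varepsilon^2) \sim Ga^{-1}(c, d)$  $p(\sigma_\varepsilon^2) \sim Ga^{-1}(c, d)$  $p(\sigma_\varepsilon^2) \sim Ga^{-1}(c, d)$
          -                                        $p(v_i) \sim Ga^{-1}(\nu_1/2, \nu_1/2)$  $p(v_i) \sim Ga^{-1}(\nu_1/2, \nu_1/2)$  -
          $p(\rho) \sim Beta(a_0, a_0)$            $p(\rho) \sim Beta(a_0, a_0)$            -                                        -
          $p(\nu) \sim Exp(\lambda_0)$             -                                        -                                        -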

Model 4 fits the simplest model with the same CAR structure based only on spatial proximity as Model 3, but without the mixture distribution for the disturbances. This allows us to draw an inference regarding the impact of accounting for heteroscedasticity in the sample data.

Alternative choices for the spatial weight matrix $W$ lead to three different variance–covariance structures (a, b, c) for Models 1 and 2 and three others (d, e, f) for Models 3 and 4. Asymmetric spatial effects are introduced only in Models 1 and 2. For each of these two models, the three choices for the $W$ matrix are set forth:

(a) $W_{ij} = (GDP_i/GDP_j)^{1/2} \times S_{ij}$ if region $i$ is contiguous to $j$, and 0 otherwise, introducing the technological proximity index between contiguous regions.

(b) $W_{ij} = (GDP_i/GDP_j)^{1/2} \times T_{ij}$ if region $i$ is contiguous to $j$, and 0 otherwise, taking into account information about network transportation proximity.

(c) $W_{ij} = (GDP_i/GDP_j)^{1/2}$ if $i$ is contiguous to $j$, and 0 otherwise, measuring only asymmetric geographical effects. Spatial effects will be more variable between regions with large differences in their level of economic activity.
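The asymmetric weights in (c) can be sketched as follows. The GDP figures and the contiguity pairs are hypothetical and serve only to show how the GDP ratio makes $W_{ij}$ differ from $W_{ji}$; variants (a) and (b) would multiply each non-zero entry by the corresponding $S_{ij}$ or $T_{ij}$.

```python
import math

# Hypothetical GDP levels and shared borders for four regions.
gdp = [120.0, 80.0, 45.0, 60.0]
borders = {(0, 1), (1, 2), (2, 3)}   # undirected contiguity pairs

def is_contiguous(i, j):
    """True when regions i and j share a border."""
    return (i, j) in borders or (j, i) in borders

n = len(gdp)
# W_ij = sqrt(GDP_i / GDP_j) for contiguous pairs, 0 otherwise (zero diagonal).
W = [[math.sqrt(gdp[i] / gdp[j]) if i != j and is_contiguous(i, j) else 0.0
      for j in range(n)]
     for i in range(n)]

for row in W:
    print([round(w, 3) for w in row])
```

In this toy example the higher-GDP region 0 places a weight above one on its neighbor 1, while region 1 places the reciprocal weight on region 0.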

These weight matrices $W$ are asymmetric since they all contain the GDP ratio, allowing contiguous regions with higher capacity of absorption to benefit more from spatial spillovers. Models 3 and 4 correspond to symmetric cases of the latter two models. Three specifications are set forth here as well:

(d) $W_{ij} = S_{ij}$ if $i$ is contiguous to $j$, and 0 otherwise, and $M_{ii} = 1/(\sum_j S_{ij})$, where the $S_{ij}$ are defined in (1). This specification favors spatial effects for contiguous regions which hold patents in similar fields of innovative activity.

(e) The information regarding transport proximity between regions is implemented in the weight matrix: $W_{ij} = T_{ij}$ if $i$ is contiguous to $j$, and 0 otherwise, and $M_{ii} = 1/(\sum_j T_{ij})$, where the $T_{ij}$ are defined in (2).

(f) Here we set $W_{ij} = 1$ when region $i$ is contiguous to $j$, and 0 otherwise, and $M_{ii} = 1/n_i$. This specification allows only spatial contiguity effects, as noted in the discussion surrounding model (12).

Direct evaluation of the joint posterior distribution for estimation and model comparison purposes would involve multidimensional numerical integration and is not computationally feasible. We use MCMC sampling methods which involve generating sequential samples from the complete set of conditional posterior distributions detailed in Parent and LeSage (2006).

4.2. Model Comparison

Bayesian model comparison often relies on Bayes factors, which involve calculation of the marginal likelihood for competing models. For our hierarchical models we need to produce an estimate of the marginal likelihood, because exact calculation would require integration over the spatially structured effects parameters, whose number equals the number of observations in the model.

We rely on the MCMC method of Chib (1995) and Chib and Jeliazkov (2001) to estimate the marginal likelihood, with a detailed exposition of how their method applies to our model in Parent and LeSage (2006).

Based on results from Chib and Jeliazkov's method for estimating the marginal likelihood of the models we wish to compare, any two rival models denoted $M_f$ and $M_g$ can be compared using the Bayes factor:

$$BF_{fg} = \exp\{\log m(y \mid M_f) - \log m(y \mid M_g)\}$$

The Bayes factor can be viewed as the relative success of the two models in predicting the sample observations $y$. According to the well-known scale of Jeffreys, log Bayes factors with values in the ranges of (0, 1.15), (1.15, 3.45), (3.45, 4.60) and (4.60, $\infty$) provide 'very slight', 'slight', 'strong to very strong' and 'decisive' evidence against the model $M_g$ relative to $M_f$.
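The comparison rule above amounts to differencing two log marginal likelihoods and reading the result against the quoted ranges. The helper names below are hypothetical, and the treatment of values falling exactly on a boundary is an assumption, since the text gives open intervals; the log marginal likelihoods are the values reported in the paper for Models 1a, 1b, 3d and 3e.

```python
def log_bayes_factor(log_ml_f, log_ml_g):
    """Log Bayes factor in favor of model f over model g."""
    return log_ml_f - log_ml_g

def jeffreys_label(log_bf):
    """Map a non-negative log Bayes factor to the evidence categories quoted in the text."""
    if log_bf < 0:
        raise ValueError("swap the models so that the log Bayes factor is non-negative")
    if log_bf < 1.15:
        return "very slight"
    if log_bf < 3.45:
        return "slight"
    if log_bf < 4.60:
        return "strong to very strong"
    return "decisive"

# Log marginal likelihoods reported for four of the competing models.
log_ml = {"1a": -411.1, "1b": -420.1, "3d": -481.2, "3e": -482.9}
evidence_1a_vs_1b = jeffreys_label(log_bayes_factor(log_ml["1a"], log_ml["1b"]))
evidence_3d_vs_3e = jeffreys_label(log_bayes_factor(log_ml["3d"], log_ml["3e"]))
print(evidence_1a_vs_1b, "|", evidence_3d_vs_3e)
```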

Due to the computationally intense nature of Chib and Jeliazkov's method, a host of alternative techniques have been proposed to provide an asymptotic approximation to the marginal likelihood approach for choosing among competing models. We present results based on the deviance information criterion (DIC) of Spiegelhalter et al. (2002) for comparison purposes, where smaller values of DIC indicate a better-fitting model while controlling for complexity. While the DIC addresses how well the posterior distribution might predict future data, abstracting from sensitivity to specification of priors, the Bayes factor focuses on how well the prior and model predict the observed sample data.
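For reference, the DIC of Spiegelhalter et al. (2002) combines the posterior mean deviance $\bar{D}$ with an effective number of parameters $p_D = \bar{D} - D(\bar{\theta})$, giving $DIC = \bar{D} + p_D$. The sketch below applies this arithmetic to the deviance figures reported for Models 1a and 4f in Table II, on the assumption that the first two deviance columns of that table are $\bar{D}$ and $D(\bar{\theta})$; small discrepancies from the published $p_D$ and DIC reflect rounding.

```python
def dic_components(mean_deviance, deviance_at_posterior_mean):
    """Return (p_D, DIC) from the posterior mean deviance and the deviance at the posterior mean."""
    p_d = mean_deviance - deviance_at_posterior_mean   # effective number of parameters
    dic = mean_deviance + p_d                          # fit plus complexity penalty
    return p_d, dic

# Deviance values as reported for Model 1a and Model 4f.
p_d_1a, dic_1a = dic_components(256.5, -91.7)
p_d_4f, dic_4f = dic_components(634.0, 559.5)
print(round(p_d_1a, 1), round(dic_1a, 1), round(p_d_4f, 1), round(dic_4f, 1))
```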

4.3. Prior Parameter Settings

To complete the model specification, we specify the prior parameter settings used. In the absence of any prior information it is reasonable to adopt a flat prior for the intercept and a relatively non-informative prior for the covariate coefficients $\beta = (\beta_1, \beta_2, \beta_3)'$, achieved using a zero mean and diagonal variance–covariance matrix with scalar variances of 10,000. We assume that the variance components $\sigma_\theta^2$ and $\sigma_\varepsilon^2$ are mutually independent, and it seems reasonable to assume these two components have similar magnitudes. Without any prior information, we choose vague prior settings for the parameters $a$, $b$, $c$ and $d$ of the gamma distributions, $a = c = 0.5$ and $b = d = 0.0005$. A priori, we adopt a Beta prior (13) for $\rho$, which is distributed in the interval $(\lambda_{\min}^{-1}, \lambda_{\max}^{-1})$, where $\lambda_{\min} < 0$ is the smallest eigenvalue and $\lambda_{\max} > 0$ is the largest from the matrix $M^{-1/2} W M^{1/2}$:

$$p(\rho) = \frac{1}{Be(a_0, a_0)} \, \frac{(\rho - \lambda_{\min}^{-1})^{a_0 - 1} (\lambda_{\max}^{-1} - \rho)^{a_0 - 1}}{(\lambda_{\max}^{-1} - \lambda_{\min}^{-1})^{2a_0 - 1}} \qquad (13)$$

We set values of $a_0 = 1.01$ to produce a relatively uninformative prior that downweights to zero the prior weight placed on end points of the interval for $\rho$, consistent with theoretical restrictions.
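A minimal sketch of the rescaled Beta prior in (13) follows, written so that the density is positive on the interval (upper bound minus lower bound in the normalizing term). The interval end points −0.39 and 0.20 are the inverse eigenvalues reported later in the paper; the function name and the midpoint-rule check are illustrative assumptions.

```python
import math

def rho_prior_density(rho, lower, upper, a0=1.01):
    """Beta(a0, a0) density rescaled from (0, 1) to the interval (lower, upper)."""
    if not lower < rho < upper:
        return 0.0   # no prior mass at or beyond the end points
    beta_const = math.gamma(a0) ** 2 / math.gamma(2.0 * a0)   # Be(a0, a0)
    kernel = (rho - lower) ** (a0 - 1.0) * (upper - rho) ** (a0 - 1.0)
    return kernel / (beta_const * (upper - lower) ** (2.0 * a0 - 1.0))

lower, upper = -0.39, 0.20
steps = 20000
width = (upper - lower) / steps
# Midpoint-rule check that the density integrates to (approximately) one.
total_mass = sum(rho_prior_density(lower + (k + 0.5) * width, lower, upper) * width
                 for k in range(steps))
print(round(total_mass, 4), round(rho_prior_density(0.0, lower, upper), 3))
```

With $a_0$ this close to one the density is nearly flat over the interior of the interval and drops to zero only at the end points.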

In Models 2(a, b, c) and 3(d, e, f) we use the mixture representation of a Student-t distribution for the variance scalars discussed in the previous section, with $\nu = 4$ as suggested by Geweke (1993).

For Models 1(a, b, c), we choose a general specification for the exponential prior distribution on the parameter $\nu$ corresponding to the degrees-of-freedom parameter of the implied Student-t density assigned to the disturbances. The prior parameter $\lambda_0$ was introduced by Geweke (1993) to allow support for varying amounts of thickness in the tails of the disturbance distribution. We consider a value that allocates substantial prior weight to moderately fat-tailed distributions that deviate from the thin-tailed normal error distribution, specifically $\lambda_0 = 0.04$.

5. EMPIRICAL RESULTS

We begin our presentation of results by comparing the 12 models using both the DIC criterion and the log marginals estimated using Chib and Jeliazkov's method described in Parent and LeSage (2006). Estimation results are based on a simulated chain where the first 5000 samples are discarded as a 'burn-in' period, followed by 15,000 iterations. The model comparison results indicate that the most general models (those we labeled Models 1) are superior to the simpler models (labeled Models 2 through 4). Estimates and inferences are presented in Section 5.2 based on Model 1a, which has the highest log marginal likelihood.1

5.1. Comparison of Models

Table II details the components of the DIC and log marginal likelihood of the 12 competing models. Comparing Models 3(d, e, f) with Models 4(d, e, f), which represent the same weighting schemes, indicates clearly that the homoscedastic normal distribution for the disturbances is inferior to the robust Student-t distribution.

Focusing on the spatial weight structures provides evidence of improvement in the fit of these models as we move from strictly contiguous effects (c, f) to the introduction of transportation proximity (b, e) and further improvement for models based on technological proximity (a, d).

Table II. Comparison of models with marginal likelihood and DIC

Model      Weight matrix   Symmetric   $\bar{D}$   $D(\bar{\theta})$   $p_D$    DIC                ln(marg. lik.)
                                                                                Value    Ranking   Value     Ranking

Model 1a   Techno. prox.   No          256.5       −91.7               348.3    604.8    1         −411.1    1
Model 1b   Time trans.     No          266.5       −85.1               351.6    618.1    2         −420.1    2
Model 1c   Contiguity      No          282.1       −66.5               348.6    630.7    3         −421.7    3

Model 2a   Techno. prox.   No          471.5       284.1               187.4    658.9    4         −435.3    4
Model 2b   Time trans.     No          463.5       268.0               195.5    659.0    5         −445.3    5
Model 2c   Contiguity      No          464.5       278.5               186.0    660.5    6         −450.7    6

Model 3d   Techno. prox.   Yes         549.3       433.5               115.8    665.1    7         −481.2    7
Model 3e   Time trans.     Yes         552.1       437.4               114.7    666.8    8         −482.9    8
Model 3f   Contiguity      Yes         566.8       456.5               110.3    677.1    9         −486.5    9

Model 4d   Techno. prox.   Yes         622.1       543.6               78.5     700.6    10        −498.2    10
Model 4e   Time trans.     Yes         623.3       545.2               78.1     701.4    11        −500.5    11
Model 4f   Contiguity      Yes         634.0       559.5               74.4     708.5    12        −505.5    12

1 Following a model diagnostic suggestion of Geweke (2004), the general Model 1a was tested by simulating the joint density of the data $y$ and the model parameters, generating $y$ from its full conditional given the parameters (given by the likelihood) at each pass of the Metropolis–Hastings algorithm. The resulting replications of the parameters indeed reproduced the prior, suggesting no fundamental problems with the sampling scheme.

Models 3 restrict Models 2 by fixing the spatial dependence parameter $\rho$ at the largest eigenvalue and restricting spatial effects to be symmetric. Table II reveals that model specifications with asymmetric spatial effects exhibit lower DIC values and higher log marginal likelihood values. Introducing the output gap between contiguous regions to capture unequal capacity for knowledge absorption improves the fit of the models: the log marginal likelihood rises from −481.2 to −435.3 when considering technological proximity; from −482.9 to −445.3 for the case involving transportation proximity; and from −486.5 to −450.7 when only spatial contiguity effects are taken into account.

Bayes factors reported in Table III suggest no great difference between Models 3d and 3e, reflecting symmetric technological and transport proximity, respectively. In contrast, Bayes factors for the asymmetric cases, Models 2a and 2b, clearly support technological proximity over transport proximity.

We note that the goodness-of-fit measure represented by $\bar{D}$ in Table II is larger for Model 2a (471.5), Model 2b (463.5) and Model 2c (464.5) than for Model 1a (256.5), Model 1b (266.5) and Model 1c (282.1), indicating that allowing for a more flexible specification involving the Student-t distribution provided a large improvement in fit of Models 1 relative to Models 2. (Smaller values of $\bar{D}$ reflect a better fit.) Spatial heterogeneity that takes the form of outliers might be expected to produce large systematic effects on the parameter estimates of the spatial model. The robust approach of Models 1 to accommodate outliers in the error distribution resulted in a great deal of variation in the marginal likelihood estimates as the prior specification for the degrees of freedom parameter changed from one that placed prior weight on moderately fat tails to one that placed weight on thin tails.2 Using Chib and Jeliazkov's method to produce marginal likelihood estimates, the highest log marginal likelihood (−411.1) corresponded to the hyperparameter setting of $\lambda_0 = 0.04$, which allocates substantial prior weight to moderately fat-tailed distributions. We used this prior specification to make comparisons of Model 1 with results from other models in the sequel.

Table III. Estimated log marginal likelihoods (on the diagonal)

          Mod.1a   Mod.1b   Mod.1c   Mod.2a   Mod.2b   Mod.2c   Mod.3d   Mod.3e   Mod.3f   Mod.4d   Mod.4e   Mod.4f

Mod.1a    −411.1   (9)      (10.6)   (24.2)   (34.2)   (39.6)   (70.1)   (71.8)   (75.4)   (87.1)   (89.4)   (94.4)
Mod.1b             −420.1   (1.6)    (15.2)   (25.2)   (30.6)   (61.1)   (62.8)   (66.4)   (78.1)   (80.4)   (85.4)
Mod.1c                      −421.7   (13.6)   (23.6)   (29)     (59.5)   (61.2)   (64.8)   (76.5)   (78.8)   (83.8)
Mod.2a                               −435.3   (10)     (15.4)   (45.9)   (47.6)   (51.2)   (62.9)   (65.2)   (70.2)
Mod.2b                                        −445.3   (5.4)    (35.9)   (37.6)   (41.2)   (52.9)   (55.2)   (60.2)
Mod.2c                                                 −450.7   (30.5)   (32.2)   (35.8)   (47.5)   (49.8)   (54.8)
Mod.3d                                                          −481.2   (1.7)    (5.3)    (17)     (19.3)   (24.3)
Mod.3e                                                                   −482.9   (3.6)    (15.3)   (17.6)   (22.6)
Mod.3f                                                                            −486.5   (11.7)   (14)     (19)
Mod.4d                                                                                     −498.2   (2.3)    (7.3)
Mod.4e                                                                                              −500.5   (5)
Mod.4f                                                                                                       −505.5

Note: The entries in the upper half are the log of the Bayes factor in favor of the row model versus the column model.

2 As proposed by Geweke (1993), three different prior specifications ($\lambda_0 = 0.02$, 0.04 and 0.2) were compared, suggesting that a thinner-tailed prior ($\lambda_0 = 0.02$) for the disturbance distribution, as well as a fatter-tailed prior ($\lambda_0 = 0.2$), both decreased the fit of the model and altered the estimates.

5.2. Estimation Results

Table IV presents estimation results based on both WINBUGS and MATLAB algorithms for models incorporating technological proximity effects.3 Model 1a is characterized by asymmetric spatial effects, with a Student-t distribution assigned to the disturbances, and an exponential prior for the degrees of freedom $\nu$. Model 3d relies on symmetric spatial effects and a less flexible specification for the Student-t distribution based on $\nu = 4$, providing some indication of the impact of these model characteristics on posterior inferences regarding the covariates.

In the following discussion of the fixed effects estimates we relied on 5 and 95 percentage points of the highest posterior density intervals (HPDI) to draw inferences regarding whether the posterior means were different from zero. For both models, the variable EXP reflecting private investment in R&D exerts a positive influence on regional innovative activity, having a coefficient value of +0.50 for Model 1a and +0.35 for Model 3d. The variable N Priv measuring R&D employment is not different from zero, with a coefficient of −0.05 and +0.02, respectively. For both models, the diversity variable was defined so that the negative values −3.69 and −2.68 provide evidence in favor of externalities suggested by Jacobs (1969) that arise from a diverse industry structure. The negative value for this parameter indicates that areas with heterogeneous sectors benefit more from knowledge flows than regions where technology is more specialized. In other words, diversity at the regional level results in more innovative activity measured by patents granted.

Table IV. Posterior quantities for Models 1a and 3d

Parameter                 Model 1a                  Model 3d
                          MATLAB       WINBUGS      MATLAB       WINBUGS

Constant                  1.02         1.02         1.42         1.41
                          (0.16)       (0.15)       (0.16)       (0.15)
EXP                       0.50         0.49         0.35         0.35
                          (0.05)       (0.05)       (0.09)       (0.08)
N Priv                    −0.05        −0.05        0.02         0.03
                          (0.05)       (0.05)       (0.06)       (0.06)
Div                       −3.69        −3.58        −2.68        −2.79
                          (0.52)       (0.51)       (0.65)       (0.65)
$\sigma_\varepsilon$      0.41         0.38         0.47         0.52
                          (0.10)       (0.13)       (0.04)       (0.06)
$\sigma_\theta$           0.01         0.01         0.78         0.55
                          (0.00)       (0.00)       (0.07)       (0.11)
$\rho$                    0.18         0.19
                          (0.02)       (0.01)
$\nu$                     7.84 a       7.78 b
                          (5.48)       (10.6)

a Numerical standard error (NSE) more than 1%.
b MC error more than 1%.

3 See Parent and LeSage (2006) for a discussion of the differences between these two algorithms.

5.3. Spatial Variability and Technological Proximity

We now turn to the spatially structured, $\theta_i$, and unstructured, $\varepsilon_i$, components, where comparison of their relative variances reveals some interesting features. In Model 3d, using draws from the Gibbs sampler in WINBUGS, the heterogeneity component $\sigma_\varepsilon$ was found to have a posterior mean of 0.52 with HPDI for the 2.5 and 97.5 percentiles equal to (0.42–0.61). The spatially structured component $\sigma_\theta$ had a posterior mean of 0.55 and an HPDI of (0.35–0.76). This difference in variability is more evident in the estimates produced using MATLAB, where $\sigma_\varepsilon$ and $\sigma_\theta$ are 0.47 and 0.78, respectively. This suggests that spatial proximity has a greater influence on innovative activity than that attributed to unstructured heterogeneity. Focusing on Model 1a, the variance of spatial effects (+0.01) is different from zero, but smaller than the heterogeneity component (+0.41). This is due to the variance matrix $(\sigma_\theta^2 M_{ii})$ of the CAR prior, where $M_{ii} = GDP_i$ for Model 1a is larger than $M_{ii} = 1/n_i$ in Model 3d.

Estimation of the parameter $\rho$ measuring the strength of spatial dependence reveals positive spatial dependence. Since the inverses of the smallest and largest eigenvalues of $M^{-1/2} W M^{1/2}$ are $\lambda_{\min}^{-1} = -0.39$ and $\lambda_{\max}^{-1} = 0.20$, the parameter estimate for $\rho$ is near the upper bound determined by the inverse of the largest eigenvalue.4
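The admissible interval for $\rho$ can be computed from the eigenvalues of $M^{-1/2} W M^{1/2}$. The following sketch uses a hypothetical four-region example with GDP-ratio weights and $M_{ii} = GDP_i$, as described for the asymmetric models; the GDP values and contiguity pattern are invented, so the resulting bounds differ from the −0.39 and 0.20 obtained for the actual sample.

```python
import numpy as np

# Hypothetical GDP levels and line-shaped contiguity (0-1-2-3).
gdp = np.array([120.0, 80.0, 45.0, 60.0])
C = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

W = C * np.sqrt(gdp[:, None] / gdp[None, :])   # W_ij = sqrt(GDP_i / GDP_j) if contiguous
M_half = np.diag(np.sqrt(gdp))                 # M^{1/2} with M_ii = GDP_i
M_half_inv = np.diag(1.0 / np.sqrt(gdp))       # M^{-1/2}

S = M_half_inv @ W @ M_half                    # symmetric although W is not
eigenvalues = np.linalg.eigvalsh(S)            # sorted in ascending order
rho_lower = 1.0 / eigenvalues[0]               # inverse of the smallest eigenvalue
rho_upper = 1.0 / eigenvalues[-1]              # inverse of the largest eigenvalue
print(round(rho_lower, 3), round(rho_upper, 3))
```

Any $\rho$ strictly inside this interval keeps the implied CAR precision matrix $M^{-1}(I_n - \rho W)$ positive definite in this example.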

Spatial effects estimates from Model 1a are shown in Table V, where we see clear differences in absolute magnitudes as we move from the core of Europe to peripheral regions. The map in Figure 2 presents the spatial effects estimates classified into positive, negative and not different from zero based on posterior probabilities that the estimates are different from zero. The provinces of Emilia-Romagna (20-it, Bologna) in Italy, areas near Paris in France (14-fr, Hauts-de-Seine), the German Länder of Baden-Württemberg (19-de, Stuttgart) and Bayern (16-de, Mittelfranken), southern Netherlands (7-nl, Noord Brabant), Scotland (10-uk, eastern Scotland), and the center of Britain (11-uk, Stoke-on-Trent; 21-uk, Swindon) stand out as having positive spatial effects. Regions exhibiting negative spatial effects are in Ireland (3-ie, Border), southwestern France (1-fr, Lot-Garonne) and southern Italy (2-it, Napoli).

Figure 2 shows a fair amount of concentration in the geographical spread of innovative activity over Europe. This concentration of positive and negative spatial effects occurs in various dimensions. Table V provides a comparison of the spatial effects estimates that we interpret as spillovers associated with innovation for Models 1 and Models 3.

Models 1c and 3f are based exclusively on spatial proximity, with the latter model containing symmetric spatial effects, while Model 1c introduces asymmetric weights using the ratio of GDP between contiguous regions. Small innovative regions that exhibit spatial effects different from zero in Model 3f are not different from zero in Model 1c when they are near regions of high innovative activity. For example, Hartlepool and Stockton (contiguous to the high innovative activity region 12-uk, Darlington) reveal large spatial effects of +0.90 with a standard deviation of 0.43 in Model 3f, which fall to +0.68 with a standard deviation of 0.39 in Model 1c. Moreover, some areas with high levels of innovative activity that are not different from zero in the symmetric case become different from zero in Model 1c, which introduces a measure of regional capacity to receive knowledge spillovers; for example, the Luxembourg region exhibits positive spatial effects only in the asymmetric model. For Model 1c, the estimate of the spatial effect for this country is +0.98 with a standard deviation of 0.44.

Use of spatial proximity alone suggests a smoother, less polarized pattern of spatial spillovers. When transport proximity is used to augment spatial proximity, we find greater regional specificity in the spillover estimates (relative to spatial proximity alone). These presumably arise from the improved measure of interregional connectivity that allows for greater variability in the spatial correlation structure that arises when transport proximity is used to augment the conventional measure of spatial proximity.

4 This is a common result for CAR models (e.g., Besag, 2002, p. 1272; and Militino et al., 2004).

Table V. Estimates of the spatially structured parameter $\theta_i$ (selected regions shown); standard deviations in parentheses

Weight matrices: Model 1a, $S \times (GDP_i/GDP_j)^{1/2}$; Model 1b, $T \times (GDP_i/GDP_j)^{1/2}$; Model 1c, $(GDP_i/GDP_j)^{1/2}$; Model 3d, $S$; Model 3e, $1/TT$; Model 3f, 1.

Region                  Model 1a        Model 1b        Model 1c        Model 3d        Model 3e        Model 3f

1-fr Lot-Garonne        −1.05 (0.42)    −0.86 (0.45)    −1.14 (0.39)    −1.11 (0.33)    −1.06 (0.35)    −1.01 (0.34)
2-it Napoli             −0.94 (0.43)    −1.00 (0.45)    −1.04 (0.38)    −1.59 (0.36)    −1.59 (0.33)    −1.59 (0.36)
3-ie Border             −0.90 (0.44)    −0.70 (0.43)    −0.97 (0.41)    −1.10 (0.37)    −1.25 (0.39)    −1.10 (0.36)
4-fr Gironde            −0.62 (0.37)    −0.65 (0.42)    −0.72 (0.36)    −0.76 (0.38)    −0.75 (0.39)    −0.75 (0.36)
5-fr Morbihan           −0.70 (0.39)    −0.69 (0.43)    −0.69 (0.39)    −0.73 (0.36)    −0.76 (0.38)    −0.73 (0.36)
6-de Tubingen           0.40 (0.38)     0.16 (0.41)     0.38 (0.39)     0.66 (0.33)     0.71 (0.36)     0.66 (0.36)
7-nl N. Brabant         0.75 (0.44)     0.60 (0.33)     0.68 (0.43)     0.60 (0.36)     0.56 (0.36)     0.62 (0.39)
8-uk Gloucester.        0.75 (0.35)     0.51 (0.38)     0.73 (0.35)     0.76 (0.30)     0.79 (0.29)     0.76 (0.31)
9-it Parma              0.76 (0.38)     0.58 (0.41)     0.73 (0.38)     0.34 (0.31)     0.39 (0.31)     0.33 (0.31)
10-uk E.Scotland        0.78 (0.38)     0.39 (0.35)     0.76 (0.38)     0.63 (0.32)     0.69 (0.35)     0.60 (0.32)
11-uk Stoke-Trent a     0.79 (0.36)     0.50 (0.38)     0.76 (0.36)     0.63 (0.29)     0.61 (0.29)     0.63 (0.31)
12-uk Darlington b      0.83 (0.45)     0.74 (0.43)     0.80 (0.41)     0.95 (0.37)     0.94 (0.35)     0.91 (0.38)
13-uk Worcestershire    0.91 (0.39)     0.56 (0.39)     0.87 (0.38)     0.79 (0.31)     0.79 (0.31)     0.81 (0.33)
14-fr Hauts-de-Seine    0.95 (0.45)     0.48 (0.41)     0.90 (0.45)     0.60 (0.32)     0.52 (0.28)     0.64 (0.34)
15-de Dusseldorf        0.99 (0.41)     0.66 (0.44)     0.93 (0.41)     0.75 (0.31)     0.79 (0.33)     0.77 (0.34)
16-de Mtl-franken       1.02 (0.40)     0.72 (0.44)     0.95 (0.40)     0.90 (0.30)     0.94 (0.33)     0.92 (0.33)
17-de Oberbayern        1.05 (0.44)     0.60 (0.43)     0.98 (0.43)     0.93 (0.35)     0.97 (0.38)     0.95 (0.37)
18-it Treviso           1.14 (0.43)     0.93 (0.46)     1.10 (0.41)     0.81 (0.34)     0.78 (0.33)     0.79 (0.35)
19-de Stuttgart         1.15 (0.43)     0.77 (0.45)     1.07 (0.44)     0.97 (0.33)     1.00 (0.33)     1.01 (0.35)
20-it Bologna           1.38 (0.54)     1.17 (0.51)     1.35 (0.48)     0.83 (0.38)     0.87 (0.36)     0.84 (0.39)
21-uk Swindon c         1.54 (0.46)     0.94 (0.47)     1.47 (0.46)     0.96 (0.33)     1.07 (0.35)     1.00 (0.35)

a Stoke-on-Trent and Staffordshire CC.
b Darlington and Durham CC.
c Swindon and Wiltshire CC.

Models 1b and 3e presented in Table V show asymmetric and symmetric spatial effects estimates when transport network proximity is taken into account. For example, Model 1b, which allows for asymmetric spatial effects by measuring the output gap between contiguous areas, exhibits positive spatial effects (+1.18) for the East Lothian and Midlothian region, in eastern Scotland, whereas Model 3e, which includes symmetric effects of transportation proximity, implies estimates that are not different from zero. This may be due to high infrastructure quality which strengthened spatial interactions from the region with a low level of economic activity to the contiguous region with higher activity. Similar differences can be found in northern Italy, southern Germany, and the Netherlands. It is of interest that these regions are known for their high-quality infrastructure and good accessibility to other European regions. In addition, these regions are considered important innovation producers, lending credence to our estimation results. Turning to negative spatial effects, we observe from Table V that these are strengthened for isolated regions when transportation time is introduced in the symmetric Models 3. In the asymmetric model, the magnitude of these negative effects is reduced for regions that exhibit high levels of innovative activity. For example, comparing Model 3f to Model 3e, introduction of the transportation proximity reinforces negative effects from −1.01 to −1.06 for region 1-fr, Lot-et-Garonne in France, and from −1.10 to −1.25 for region 3-ie, Border in Ireland. Neighbors to these areas exert a negative impact on their innovation activity. This pattern is consistent with that found for the positive spillovers, suggesting that introducing information about transport proximity or accessibility between regions triggers a polarization in spatial effects estimates. Moreover, introducing asymmetric spatial effects reduces these negative spillovers for areas with higher innovative activity. For instance, negative spatial effects that appear in Model 3e in regions like 5-fr, Morbihan in western France, vanish in Model 1b, suggesting that a benefit arises from neighboring areas.

Figure 2. Map of positive, negative and zero effects estimates

Output gaps play an important role in the spillover process through their impact on absorptive capacity. Considering the technological dimension represented in Models 1a and 3d, we find that regions at approximately equal technology levels (southern Germany, southern Britain, the Netherlands and northern Italy) display the greatest level of spillovers. For instance, region 11-uk, Stoke-on-Trent and Staffordshire CC in England, exhibits strong spatial effects (+0.79) arising from technological proximity with neighboring areas like Warwickshire and West Midlands. According to the International Patent Classification, these areas have respectively 51%, 51% and 56% of patents granted in section B (performing operations; transporting) and section F (mechanical engineering). High specialization reinforces spatial spillovers through technological proximity. This remark also applies to other regions like 20-it, Bologna in northern Italy, where 45% of patents are granted in the sector of performing operations and transporting. The neighboring areas Ravenna and Modena also have innovation activity in this sector, accounting for 46% and 47% of patent activity. Turning to the impact of the output gap between contiguous regions, we find that asymmetric effects are strengthened around large cities. The area of 19-de, Stuttgart, one of the major cities in Germany, exhibits stronger spatial effects when the asymmetric specification is introduced (from +0.97 in Model 3d to +1.15 in Model 1a), whereas some of its contiguous regions like Tubingen exhibit effects that change from positive and significant to insignificant. The same holds true for other regions contiguous to larger cities like London in England, or Parma in Italy. In recent years, an increasing concentration of innovation activities in and around major urban centers has been observed (Audretsch and Feldman, 1996).
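To make the notion of technological proximity concrete, the following is a minimal sketch of a Jaffe-type index, computed as the uncentered correlation (cosine similarity) between two regions' vectors of patent shares across technology classes. It is an illustration under stated assumptions, not the exact index used in the estimation: the function name `jaffe_proximity`, the three-class grouping, and all share values below are hypothetical and are not taken from the sample data.

```python
import math


def jaffe_proximity(shares_i, shares_j):
    """Uncentered correlation between two patent-share vectors.

    Each vector holds a region's share of patents in each technology
    class (same class ordering in both vectors). The result lies in
    [0, 1] for non-negative shares; 1 means identical specialization.
    """
    if len(shares_i) != len(shares_j):
        raise ValueError("share vectors must have the same length")
    dot = sum(a * b for a, b in zip(shares_i, shares_j))
    norm_i = math.sqrt(sum(a * a for a in shares_i))
    norm_j = math.sqrt(sum(b * b for b in shares_j))
    if norm_i == 0.0 or norm_j == 0.0:
        return 0.0  # a region with no patents has no defined profile
    return dot / (norm_i * norm_j)


# Hypothetical share vectors over three illustrative technology groups
# (made-up numbers, chosen only to contrast similar and dissimilar profiles).
region_a = [0.51, 0.30, 0.19]
region_b = [0.56, 0.25, 0.19]
region_c = [0.10, 0.20, 0.70]

print(jaffe_proximity(region_a, region_b))  # similar specialization
print(jaffe_proximity(region_a, region_c))  # dissimilar specialization
```

Under these assumptions, two regions concentrated in the same classes obtain a proximity close to one, whereas regions specialized in different classes obtain a markedly lower value, which is the sense in which high specialization in common sectors can raise technological proximity between neighbors.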

It appears that in Europe the largest spillovers are for the most part taking place between a limited set of highly developed regions. This result is consistent with those found by Maurseth and Verspagen (2002), and with technology gap theories that posit technology diffusion is in no sense automatic, but demands a certain level of economic development in addition to innovative efforts. Moreover, regions where technological proximity plays a more important role also tend to exhibit a greater geographic concentration. Based on the model comparison and spatial effect estimates, it appears that the propensity for innovative activity to cluster spatially is attributable more to the asymmetric influence of knowledge spillovers coming from regions endowed with high levels of similar technological activities than merely to geographic or transportation proximity. These findings are consistent with those reported by Fischer et al. (2006) using patent citations for high-technology industries.

6. CONCLUSION

A Bayesian hierarchical model of knowledge spillovers was developed that specifies different connectivity structures between regions by relying on technological as well as transportation and geographical proximity. This is in contrast to most spatial econometric models of knowledge spillovers that rely exclusively on geographic connectivity. Model comparisons led us to conclude that a model which combines geographic and technological proximity, takes into account the asymmetric output gap between contiguous regions, and allows for heterogeneity in the error variances was most consistent with the sample data information from our 323 European regions. Another innovative aspect of the model proposed here is the ability to accommodate non-constant variance and outliers, which seems plausible given the high variability in patents granted over the sample of regions studied.

Empirical results lead to the conclusion that innovation in the European Union may be characterized as a process with centers of polarization where knowledge flows relatively freely, helped by relatively small output gaps and similar specialization patterns.

Although any measure of technological proximity is bound to be imperfect, our model-based approach provides a method that allows alternative proximity indexes in the spirit of Jaffe’s index to be used in a spatial CAR specification of latent spatial effects. We use the spatial effects estimates to analyze the role that geographical and technological knowledge spillovers may play in the concentration of innovative activity. Controlling for geographic concentration of production, spatial effects estimates can be used to quantify the influence of technologically specialized clusters of neighboring regions on innovation activity.

ACKNOWLEDGEMENTS

The authors acknowledge comments from P. Deschamps, F. Laisney, the participants at the third Spatial Econometrics Workshop in Strasbourg, 2004, the ESEM, Madrid, 2004, the International Workshop on Spatial Econometrics and Statistics, Rome, 2006, and seminar participants at the University of Manchester, University of Toledo, and West Virginia University. We would also like to thank the reviewers and co-editor John Rust for thoughtful comments and suggestions that have improved this work.

REFERENCES

Abreu M, de Groot HLF, Florax RJGM. 2005. Space and growth: a survey of empirical evidence and methods. Region et Developpement 21: 13–44.

Anselin L, Varga A, Acs Z. 1997. Local geographic spillovers between university research and high technology innovations. Journal of Urban Economics 42: 422–448.

Audretsch DB, Feldman MP. 1996. R&D spillovers and the geography of innovation and production. American Economic Review 86: 630–640.

Autant-Bernard C. 2001. The geography of knowledge spillovers and technological proximity. Economics of Innovation and New Technology 10: 237–254.

Berger JO, De Oliveira V, Sanso B. 2001. Objective Bayesian analysis of spatially correlated data. Journal of the American Statistical Association 96: 1361–1374.

Besag JE. 2002. Discussion: What is a statistical model? (Peter McCullagh). Annals of Statistics 30: 1267–1277.

Besag JE, Kooperberg CL. 1995. On conditional and intrinsic autoregressions. Biometrika 82: 733–746.

Chib S. 1995. Marginal likelihoods from the Gibbs sampler. Journal of the American Statistical Association 90: 1313–1321.

Chib S, Jeliazkov I. 2001. Marginal likelihood from the Metropolis–Hastings output. Journal of the American Statistical Association 96: 270–281.

Cressie N. 1995. Bayesian smoothing of rates in small geographic areas. Journal of Regional Science 35: 659–673.

Fischer MM, Scherngell T, Jansenberger E. 2006. The geography of knowledge spillovers between high-technology firms in Europe: evidence from a spatial interaction modelling perspective. Geographical Analysis 38: 288–309.

Geweke J. 1993. Bayesian treatment of the independent Student-t linear model. Journal of Applied Econometrics 8: 19–40.

Geweke J. 2004. Getting it right: joint distribution tests of posterior simulators. Journal of the American Statistical Association 99: 799–804.

Griliches Z. 1979. Issues in assessing the contribution of research and development to productivity growth. Bell Journal of Economics 10: 92–116.

Hall BH, Jaffe AB, Trajtenberg M. 2001. The NBER patent citation data file: lessons, insights and methodological tools. NBER Working Paper no. 8498.

Jacobs J. 1969. The Economy of Cities. Vintage: New York.

Jaffe AB. 1986. Technological opportunity and spillovers of R&D: evidence from firms’ patents, profits, and market value. American Economic Review 76: 984–1001.

Jaffe AB, Trajtenberg M, Henderson R. 1993. Geographic localization of knowledge spillovers as evidenced by patent citations. Quarterly Journal of Economics 108: 576–598.

Keller W. 2002. Geographic localization of international technology diffusion. American Economic Review 92(1): 120–142.

Kim CW, Phipps TT, Anselin L. 2003. Measuring the benefits of air quality improvement: a spatial hedonic approach. Journal of Environmental Economics and Management 45: 24–39.

Maurseth PB, Verspagen B. 2002. Knowledge spillovers in Europe: a patent citations analysis. Scandinavian Journal of Economics 104: 531–545.

Militino AF, Ugarte MD, Garcia-Reinaldos L. 2004. Alternative models for describing spatial dependence among dwelling selling prices. Journal of Real Estate Finance and Economics 29(2): 193–209.

Parent O, LeSage JP. 2006. Using the variance structure of the conditional autoregressive spatial specification to model knowledge spillovers. Available at SSRN: http://ssrn.com/abstract=924624 [1 September 2007].

Smith TE, LeSage JP. 2004. A Bayesian probit model with spatial dependencies. In Advances in Econometrics. Vol. 18: Spatial and Spatiotemporal Econometrics, LeSage JP, Pace RK (eds). Elsevier: Oxford; 127–160.

Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A. 2002. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, Series B 64(4): 583–639.

Van Pottelsberghe de la Potterie B, Lichtenberg F. 2001. Does foreign direct investment transfer technology across borders? Review of Economics and Statistics 83: 490–497.

Wall MM. 2004. A close look at the spatial correlation structure implied by the CAR and SAR models. Journal of Statistical Planning and Inference 121(2): 311–324.
