Quantitative 3-D analysis of GFAP labeled astrocytes from fluorescence confocal images


Journal of Neuroscience Methods 246 (2015) 38–51

Contents lists available at ScienceDirect

Journal of Neuroscience Methods

journal homepage: www.elsevier.com/locate/jneumeth

Computational Neuroscience

Quantitative 3-D analysis of GFAP labeled astrocytes from fluorescence confocal images

Prathamesh M. Kulkarni a, Emily Barton b, Michalis Savelonas a, Raghav Padmanabhan a, Yanbin Lu a, Kristen Trett c, William Shain c, J. Leigh Leasure b, Badrinath Roysam a,∗

a Department of Electrical and Computer Engineering, University of Houston, N308 Engineering Building 1, Houston, TX 77004-4005, United States
b Department of Psychology, University of Houston, 126 Heyne Building, Houston, TX 77204-5022, United States
c Center for Integrative Brain Research, Seattle Children’s Research Institute, 1900 Ninth Avenue 10th Floor, Mail Stop JMB-10, Seattle, WA 98101-1309, United States

Highlights

• Quantitative 3-D profiling of brain astrocytes from confocal fluorescence images.
• Useful for quantitative studies of astrocytes in health, injury, and disease.
• Identifies astrocyte nuclei and generates 3-D arbor reconstructions.
• Produces comprehensive arbor measurements.
• Performs a harmonic co-clustering of the cell population.
• Uses machine-learning to cope with biological and imaging variability.

Article info

Article history:
Received 31 December 2014
Received in revised form 13 February 2015
Accepted 14 February 2015
Available online 5 March 2015

Keywords:
Astrocyte arbor reconstruction
Machine learning
Astrocyte quantification
L-measure
Harmonic co-clustering
Quantitative arbor analytics

Abstract

Background: There is a need for effective computational methods for quantifying the three-dimensional (3-D) spatial distribution, cellular arbor morphologies, and the morphological diversity of brain astrocytes to support quantitative studies of astrocytes in health, injury, and disease.

New method: Confocal fluorescence microscopy of multiplex-labeled (GFAP, DAPI) brain tissue is used to perform imaging of astrocytes in their tissue context. The proposed computational method identifies the astrocyte cell nuclei, and reconstructs their arbors using a local priority based parallel (LPP) tracing algorithm. Quantitative arbor measurements are extracted using Scorcioni’s L-measure, and profiled by unsupervised harmonic co-clustering to reveal the morphological diversity.

Results: The proposed method identifies astrocyte nuclei, generates 3-D reconstructions of their arbors, and extracts quantitative arbor measurements, enabling a morphological grouping of the cell population.

Comparison with existing methods: Our method enables comprehensive spatial and morphological profiling of astrocyte populations in brain tissue for the first time, and overcomes limitations of prior methods. Visual proofreading of the results indicates a >95% accuracy in identifying astrocyte nuclei. The arbor reconstructions exhibited 3.2% fewer erroneous jumps in tracing, and 17.7% fewer false segments compared to the widely used fast-marching method that resulted in 9% jumps and 20.8% false segments.

Conclusions: The proposed method can be used for large-scale quantitative studies of brain astrocyte distribution and morphology.

∗ Corresponding author. Tel.: +1 713 743 4400; fax: +1 713 743 4444.
E-mail address: broysam@central.uh.edu (B. Roysam).

http://dx.doi.org/10.1016/j.jneumeth.2015.02.014
0165-0270/© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Glia account for a large fraction of the cells in the mammalian brain, with astrocytes the most abundant cell type (Verkhratsky and Butt, 2007). Astrocytes are critical to brain development, physiology, and pathology, including: (i) regulation of neuro-, glio- and synaptogenesis (Song et al., 2002); (ii) development and regulation of the blood–brain barrier (Alonso et al., 2010); (iii) responding to brain insults through reactive astrogliosis (Zhao et al., 2003); (iv) response to diseases, such as HIV, depression, brain ischemia and edema, epilepsy, and dementia (Molofsky et al., 2012; Oberheim et al., 2008; Sidoryk-Wegrzynowicz et al., 2011); (v) reaction to foreign objects including neural probes (Bjornsson et al., 2006; Chen et al., 2012). In executing these functions, astrocytes undergo alterations in structure, functional state, and relationship with other cells. There is growing interest in understanding the roles of astroglia in health and disease (Norton et al., 1992; Ransom et al., 2003; Kanski et al., 2013; López-Hidalgo and Schummers, 2014).

Astrocytes can be imaged three dimensionally in thick brain tissue sections by fluorescence confocal microscopy, commonly by labeling the intermediate filament protein GFAP (glial fibrillary acidic protein), and such images constitute our focus. GFAP has the advantage of labeling the main processes of the majority of brain astrocytes in most brain regions. However, it has important limitations that must be kept in mind. Notably, there is no known single marker that labels all astrocytes with absolute reliability in all brain regions. Some studies (Bushong et al., 2002; Morrens et al., 2012) have shown that GFAP also labels stem cells in some brain regions, and is not detectable in some other astrocytes. GFAP is not present throughout the astrocyte (Sofroniew and Vinters, 2010), being absent from the distal and finely branched arbor processes (Wilhelmsson et al., 2004; Haseleu et al., 2013), as well as astrocyte nuclei, and most cell somas (Fig. 1). With these caveats in mind, we use the term “astrocyte” in this paper to refer to a “GFAP+ cell”, and expect that the reader is aware of the limitations of GFAP.

Despite its limitations, GFAP continues to be a workhorse, and the labeling target of choice for diverse brain tissue imaging studies. We are interested in identifying the locations of GFAP+ cell nuclei in multiplex stained confocal microscope images (Fig. 2), reconstructing the three-dimensional (3-D) structure of the basal GFAP+ portions of their arbors, and extracting quantitative measurements of the arbors to support hypothesis-driven and exploratory studies. For example, such measurements are needed for profiling the morphological diversity of astrocytes in normal tissue, and for comparing normal and altered brain tissues quantitatively (Matyash and Kettenmann, 2010). It is clear from Fig. 2 (and Figs. 1–3, and Movie 1 in the Electronic supplement D) that astrocyte arbors exhibit considerable heterogeneity in fibril thickness, overall arbor size, the appearance of root points, and their spatial relationship with cell nuclei. Other challenges include the presence of connected arbors of nearby cells that must be correctly interpreted (A1 and C1), and the relationships of astrocyte fibrils with blood vessels (B1). Some authors have suggested the use of additional astrocyte-specific markers to overcome GFAP’s limitations, such as GLAST and the calcium binding protein S100β, to achieve more reliable and/or complete astrocyte labeling. We recognize that most instruments have a limited number of imaging channels, and there are competing needs for channels. Accordingly, the methods described here are designed to work with DAPI and GFAP labeling at a minimum, while being capable of taking additional markers into account whenever available.

These challenges have been addressed only partially in prior work. The first generation of methods for astrocyte morphology quantification employed comparatively rudimentary image processing algorithms, and manual assistance. Bushong et al. (2003) were among the first to use computational methods for this purpose. Their method was mostly manual and the computational part was limited to visualization and calculation of image histograms. Narayan et al. (2007) proposed a method for calculating the number of astrocytes based on morphological analysis. Benesova et al. (2009) employed edge-detection algorithms for computing GFAP+ cell areas, followed by manual selection of astrocyte soma. One of the first automated methods was presented by Hashemi et al. (2008). Their method is based on the converging squares algorithm for analyzing intracellular signaling in astrocytes. Recently, Pirici et al. (2009) proposed the use of the fractal dimension on 2-D binarizations of astrocyte images with the goal of performing arbor-based morphological classification of astrocytes. However, their method is sensitive to threshold parameter adjustments and does not exploit the 3-D arbor morphology. Bjornsson et al. (2008) proposed the use of angular variance and nuclear proximity for localizing the root points, and their work inspired the present work. Suwannatat et al. (2012) introduced a semi-automated reconstruction method that uses probabilistic maps derived from intensity-weighted random walks, starting from manually initialized root points. The target cells are identified based on maximum probability, whereas uncertain regions are resolved by manual editing. Although this method probabilistically assigns a cell for each pixel, it requires manual selection of root points and, potentially, large-scale editing of the arbor tracing results. Budde and Frank (2012) estimated piece-wise orientation histograms using structure tensor analysis. While their method can be easily extended to obtain 3-D orientation estimates, it does not provide pixel-wise orientation estimates that are needed for generating accurate arbor reconstructions.

This paper employs active machine learning-based methods for coping with image variability, and an arbor reconstruction method that provides topological guarantees and is designed for large-scale use. We also present an unsupervised co-clustering method for large-scale profiling of the morphological heterogeneity of astrocytes (Lu et al., 2014). Table 1 summarizes the contribution of this paper relative to the prior literature that specifically addresses the problem of astrocyte quantification.

2. Materials and methods

2.1. Tissue preparation and imaging

All experiments were conducted in full compliance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals, and applicable institutional protocols. Our computational method was evaluated on images from three different laboratories. In this paper, we primarily report results from the data acquired in a previously published study of binge alcohol exposure and voluntary exercise (Maynard and Leasure, 2013). Following intracardial perfusion with saline and 4% paraformaldehyde, brains were removed and post-fixed overnight. They were then stored at 4 °C in 30% sucrose until sectioned (50 µm) on a freezing microtome. Sections were stored in cryoprotectant in 96-well microtiter plates at −20 °C until further processing. Three serial sections from the medial pre-frontal cortex of 24 rats were then multiplex labeled to reveal cell nuclei (DAPI), microglia (Iba-1), vessels (SMI-71), neurons (NeuN), and astrocytes (GFAP). The tissue was rinsed in 0.1 M tris-buffered saline (TBS) three times at room temperature for 10 min each, then blocked for 60 min in 3% normal donkey serum (Sigma–Aldrich, MO, USA). Following the blocking step, the tissue was incubated at 4 °C for 72 h in primary antibodies (rabbit anti-Iba1, Wako Chemicals USA, VA, USA, 1:10,000; guinea pig anti-NeuN, EMD Millipore, MA, 1:2000; goat anti-GFAP, Santa Cruz Biotechnology Inc., CA, USA, 1:100). The tissue was then rinsed twice in TBS for 15 min each, and then blocked for 15 min in 3% normal donkey serum. Following this, the sections were incubated for 2 h at room temperature in a cocktail of secondary antibodies (donkey anti-rabbit Alexa 488, Life Technologies, NY, USA, 1:250; donkey anti-guinea pig Alexa 594, Jackson ImmunoResearch, PA, USA, 1:250; donkey anti-goat Alexa 633, Life Technologies, NY, USA, 1:250). Finally, sections were washed three times in TBS, and then treated for 5 min in DAPI (Life Technologies, NY, USA). The sections were then given four final rinses, mounted onto SuperFrost Plus slides and coverslipped using ProlongGold (Life Technologies, NY, USA).

40 P.M. Kulkarni et al. / Journal of Neuroscience Methods 246 (2015) 38–51

Fig. 1. Illustrating GFAP labeling of brain astrocytes. (A) Maximum-intensity projection of a confocal image of a GFAP+ astrocyte from the rat medial pre-frontal cortex. These cells exhibit a stellate morphology with multiple arbors emanating from a central basal region (shown contoured). We refer to the approximate centroid of this region as the root point (indicated by the black dot). (B) Multiplex staining of a freshly dissociated astrocyte from mouse cortex illustrates the asymmetric spatial relationship of the root point to the cell nucleus (blue: Bis:bisbenzimidine), and the partial labeling provided by GFAP (green: GFAP) in the context of the fine peripheral astrocyte processes (red: ezrin) (Panel B image courtesy of Dr. Amin Derouiche, University of Frankfurt, Germany). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 2. Illustrating the challenges associated with astrocyte quantification. (A) Maximum intensity projection of a 3-D multi-channel image (size, 387.07 × 387.07 × 50 µm, red: GFAP, yellow: Iba1, magenta: NeuN, and blue: DAPI). (B) Another example of a multi-channel dataset (size, 268.06 × 268.06 × 200 µm, green: SMI-71). (C, D) GFAP and DAPI channels from (A) and (B) respectively. (A1–C1) Magnified views of the regions indicated by dashed boxes illustrating overlap of astrocyte arbors. These figures illustrate variability in fibril thickness, root point appearance, and arbor size (arrows), structural similarity and close proximity of astrocyte nuclei with other cell types (A and B), and contact of astrocyte arbors (inset boxes) with other astrocytes (C1), other cell types (A1), and blood vessels (B1). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)


Table 1
A summary of the prior work on astrocyte quantification. Comparisons with the proposed method are made with respect to the relevant parameters.

Algorithm                 Method                     Root detection  Arbor reconstruction  2-D/3-D  Nuclei detection  Arbor morphology quantification  Computationally expensive?
Butt et al. (1994)        Manual                     No              No                    2-D      No                Yes                              NA
Bushong et al. (2003)     Manual                     No              No                    2-D      No                Yes                              NA
Narayan et al. (2007)     Thresholding               No              No                    2-D      No                No                               No
Hashemi et al. (2008)     Marching squares           No              Yes                   2-D      Yes               No                               Yes
Pirici et al. (2009)      Thresholding               No              No                    2-D      No                Yes                              No
Benesova et al. (2009)    Edge detection             No              Yes                   3-D      Yes               No                               No
Bjornsson et al. (2008)   Angular variance           Yes             Yes                   3-D      Yes               No                               Yes
Suwannatat et al. (2012)  Random walks               No              Yes                   2-D      No                No                               Yes
Budde and Frank (2012)    Structure tensor analysis  No              No                    3-D      No                Yes                              No
The proposed method       Root detection and LPP     Yes             Yes                   3-D      Yes               Yes                              Yes

Eight confocal image tiles, forming a 2 × 4 montage covering a 775 µm × 1550 µm area, were imaged for each tissue section, resulting in 24 tiles for each animal. The tiles were imaged sequentially in five fluorescence channels using the 405 nm, 488 nm, 560 nm, 594 nm, and 633 nm laser lines of a Leica SP8 upright confocal microscope with a 40× oil immersion objective, using a 1 µm step size. The image dimensions were 1024 × 1024 × 52 voxels for a tissue region of size 387.50 µm × 387.50 µm × 50 µm. Two additional optical slices were collected to cover the slices completely. The acquisition speed was 600 Hz, and the zoom factor was 0.75, so the lateral voxel size is 378 nm. The tiles were set to have approximately 10% spatial overlap, and the z-stacks were collected encompassing the entire thickness of the tissue.
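The geometry quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check; it assumes the montage extent is quoted without subtracting the roughly 10% tile overlap, and all names in it are illustrative.

```python
# Consistency check of the imaging geometry quoted in the text.
# Assumption: the montage extent ignores the ~10% overlap between tiles.

TILE_SIZE_UM = 387.50      # lateral extent of one tile (micrometres)
TILE_PIXELS = 1024         # lateral pixels per tile
MONTAGE_TILES = (2, 4)     # tiles per tissue section
SECTIONS_PER_ANIMAL = 3    # serial sections per rat


def lateral_voxel_size_nm(tile_size_um: float, pixels: int) -> float:
    """Lateral voxel size in nanometres."""
    return tile_size_um / pixels * 1000.0


def montage_extent_um(tile_size_um: float, grid: tuple) -> tuple:
    """Montage extent in micrometres, ignoring overlap between tiles."""
    return (grid[0] * tile_size_um, grid[1] * tile_size_um)


def tiles_per_animal(grid: tuple, sections: int) -> int:
    """Total number of tiles imaged for one animal."""
    return grid[0] * grid[1] * sections


voxel_nm = lateral_voxel_size_nm(TILE_SIZE_UM, TILE_PIXELS)
extent = montage_extent_um(TILE_SIZE_UM, MONTAGE_TILES)
n_tiles = tiles_per_animal(MONTAGE_TILES, SECTIONS_PER_ANIMAL)
print(f"{voxel_nm:.1f} nm, {extent[0]:.0f} x {extent[1]:.0f} um, {n_tiles} tiles")
```

The three derived values agree with the 378 nm voxel size, the 775 µm × 1550 µm montage, and the 24 tiles per animal stated in the text.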

2.2. Computational image analysis methods

Fig. 3 summarizes the computational pipeline for astrocyte image analysis. After image acquisition, each channel is saved as an individual TIFF format file. The individual optical slices are smoothed by median filtering with a 2 × 2 pixel window, followed by illumination correction using the rolling-ball filter with a radius setting of 50 pixels (Sternberg, 1983). The next step is to detect and delineate all cell nuclei from the DAPI channel using a previously published method (Al-Kofahi et al., 2010). The subsequent step is to isolate the sub-population of astrocyte nuclei using a statistical classifier that can be trained efficiently from examples (Padmanabhan et al., 2014). At a minimum, the method requires a two-channel image consisting of GFAP and a nuclear label. When additional cell-type indicating markers are available (e.g., microglia marker Iba-1), the classifier can utilize expression levels of these additional markers over each nucleus to improve the classification accuracy. The final steps of the computational pipeline consist of reconstruction of astrocyte cell arbors, morphological feature computation and quantitative arbor analytics.

The first step for analyzing the GFAP signal is to detect points in the image that correspond reliably to astrocyte fibers, keeping in mind their diverse spatial scales (fiber thickness). For this, the images are filtered using the multi-scale Laplacian of Gaussian (LoG) filter given by:

$$\mathrm{LoG}_{\sigma} = \left(\frac{x^2 + y^2 + z^2}{\sigma^4} - \frac{3}{\sigma^2}\right) e^{-\frac{x^2 + y^2 + z^2}{2\sigma^2}}, \quad \sigma \in (\sigma_{\min}, \sigma_{\max}), \tag{1}$$

where $(x, y, z)$ are the Cartesian coordinates of voxels in the image, and the parameter $\sigma$ determines the spatial scale of the filter, with $\sigma_{\min}$ and $\sigma_{\max}$ being the minimum and maximum expected spatial scales, respectively. For the data presented in this paper, the scales were set to $\sigma = 2l$ such that $l = 0.5 + m$, where $m$ is in the range of 1–3. The interest points belonging to low intensity regions are eliminated by using the coverage metric, $C_{\tau_{\mathrm{LoG}}}$, given by:

$$C_{\tau_{\mathrm{LoG}}} = \frac{\sum_{\mathbf{x} \in F} \left[\rho\, I(x, y, z) + (1 - \rho)\, V(x, y, z)\right]}{\sum_{\mathbf{x} \in I} \left[\rho\, I(x, y, z) + (1 - \rho)\, V(x, y, z)\right]}, \tag{2}$$

where $F$ is a set of candidate interest points, $I(x, y, z)$ is the GFAP intensity, $V(x, y, z)$ is the vesselness value given by Eq. (6), and $\rho$ is a scalar in (0, 1) that allows us to weight the relative contributions of the intensity and vesselness terms. In this work, the value of $\rho$ was empirically set to 0.01. The coverage metric in Eq. (2) is inspired by prior work (Abdul-Karim et al., 2005) for enabling automated selection of the threshold, $\tau_{\mathrm{LoG}}$, which is applied to the output of Eq. (1) for eliminating interest points that do not lie on the astrocyte arbors or roots. To achieve this, $C_{\tau_{\mathrm{LoG}}}$ is iteratively computed by varying $\tau_{\mathrm{LoG}}$ in steps of 0.005 until $C_{\tau_{\mathrm{LoG}}}$ falls in a pre-determined bound of $\Delta_C = (0.0003, 0.003)$. These values were determined by applying Eq. (2) to a set of images that represent the expected variability in imaging quality and were found to be consistent across all images reported in this work. Intuitively, $C_{\tau_{\mathrm{LoG}}}$ corresponds to the amount of image foreground “covered” by the current set of interest points and is therefore expected to be a tightly bounded number across images with variable image content (GFAP+ signal) and imaging quality. We refer to the resulting points from Eq. (2) as “GFAP+ interest points”. Fig. 4A shows an example of GFAP+ interest points, in which the 3-D confocal stack is presented as a maximum intensity projection with the GFAP signal rendered in red, the DAPI channel rendered in blue, and the GFAP+ interest points overlaid in yellow. Inset Panel A1 is an enlargement of the boxed region, and illustrates the ability of this method to detect GFAP+ processes of varying thickness. Fig. 4 in the Electronic supplement D further demonstrates the result of this step on multiple images.
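The kernel of Eq. (1) and the coverage-driven threshold search around Eq. (2) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation. The text does not say how the filter response is normalized or signed before thresholding, where the search starts, or in which direction it moves, so the code assumes a response already scaled to [0, 1], a search that starts at zero and increases, and candidates defined as voxels whose response exceeds the threshold. The function and parameter names (`log_kernel`, `coverage`, `select_threshold`, `weight`) are hypothetical.

```python
import numpy as np


def log_kernel(sigma: float, radius: int) -> np.ndarray:
    """Sample the 3-D LoG of Eq. (1) on a (2*radius+1)^3 grid."""
    ax = np.arange(-radius, radius + 1, dtype=float)
    z, y, x = np.meshgrid(ax, ax, ax, indexing="ij")
    r2 = x**2 + y**2 + z**2
    return (r2 / sigma**4 - 3.0 / sigma**2) * np.exp(-r2 / (2.0 * sigma**2))


def coverage(mask, intensity, vesselness, weight=0.01):
    """Coverage metric of Eq. (2) for a boolean mask of candidate points."""
    weighted = weight * intensity + (1.0 - weight) * vesselness
    total = weighted.sum()
    return float(weighted[mask].sum() / total) if total > 0 else 0.0


def select_threshold(response, intensity, vesselness,
                     bounds=(0.0003, 0.003), step=0.005, weight=0.01):
    """Raise the threshold in fixed steps until the coverage falls inside bounds.

    Assumptions (not stated in the text): `response` is scaled to [0, 1],
    candidates are voxels whose response exceeds the threshold, and the
    search starts at zero. Returns (threshold, coverage), or (None, None)
    if no step lands inside the bounds.
    """
    lo, hi = bounds
    tau = 0.0
    while tau <= 1.0:
        c = coverage(response > tau, intensity, vesselness, weight)
        if lo < c < hi:
            return tau, c
        tau += step
    return None, None
```

A fixed step means the search can step over the target interval on some inputs, which is why the sketch returns `(None, None)` rather than assuming success.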

The GFAP+ interest points are next filtered to isolate the points that are associated with the basal “root” portions of the astrocyte arbors. These regions have a variable appearance in images, and therefore a combination of image cues must be utilized to detect them reliably. Visually, they are characterized by relatively thicker processes, higher average GFAP intensity, a more ball-like appearance compared to fibers, proximity to a (DAPI+) nucleus, and high orientation diversity among the emanating fibers. We derive a set of quantitative measurements that capture these visual characteristics. For each GFAP+ interest point, we compute its local spatial scale, fiber orientation diversity, GFAP intensity statistics, local shape, and spatial associations with other available imaging channels, as described further below. A machine learning algorithm is then used to identify the root points from these measurements (Padmanabhan et al., 2014).

The spatial scale of each GFAP+ interest point is estimated by fitting a parametric active contour (sphere) with a constant intensity model as described in (Chan and Vese, 2001). For computing measurements of local fiber orientations, we rely on Gabor filters (Kalliomäki and Lampinen, 2007). In order to quantify the fiber orientation diversity, scale-adaptive orientation histograms, which are similar in principle to the intensity-based orientation distribution functions described in Mukherjee (2011), are computed using 3-D steerable logarithmic Gabor filters. A single bin of this histogram, $\Gamma_{\sigma,\theta}$, is written mathematically as follows:

$$\Gamma_{\sigma,\theta} = \frac{1}{K} \sum_{\omega} \left| I_F(\omega) \times \Psi_{\sigma,\theta}(\omega) \right|, \quad \omega \rightarrow (f_x, f_y, f_z), \tag{3}$$

where $\sigma$ indicates the spatial scale, and $\theta$ indicates the spatial orientation. The denominator term $K$ is a normalization constant that is proportional to the sampling density of the frequency space, and $\omega$ corresponds to a specific spatial frequency $(f_x, f_y, f_z)$. $I_F$ denotes the Fourier transform of the input image, and $\Psi_{\sigma,\theta}$ is the rotated logarithmic Gabor kernel corresponding to scale $\sigma$ and orientation $\theta$. These histograms are computed with a kernel size equal to the estimated spatial scale, such that each bin corresponds to the average strength of the filter response along the local orientation. For computational efficiency, the image convolutions are implemented in the Fourier domain with pre-computed rotation kernels. A rich set of orientation diversity features is derived from the orientation histograms including the average orientation, spread of the orientations, total energy of the orientation histograms, and the number and strength of prominent orientations. To this end, the average orientation is estimated by the arithmetic mean, spread of orientations is estimated by the standard deviation, energy is estimated by the L1 norm, whereas the prominent orientations are identified as the local maxima of the histogram with a specified minimum peak separation and minimum peak height. For the computation of local maxima of the orientation histograms, the minimum peak separation was set to 40° and minimum normalized peak height was set to 50%.

Fig. 3. An overview of the main steps in the computational pipeline for GFAP analysis.

Fig. 4. Illustrating the steps involved in astrocyte root detection. (A) Maximum intensity projection of a 3-D image with interest points (yellow) overlaid. (B) Filtered interest points based on scale, shape, texture, orientation diversity, and proximity to nuclei, results in preservation of points that are proximal to the multi-scale basal regions (arrows). (C) Cell classification derived using the filtered interest points and other intrinsic and associative features (red square: astrocytes, red circle: microglia, blue square: neurons, yellow square: other cell types), shown on a single representative optical slice from (A). In addition to the five channels, the figure also shows the filtered interest points (white points). (D) Astrocyte nuclei and root points (yellow) with the arrow pointing to a missed detection. The root points form the required input for astrocyte arbor tracing. Interest/root points in (A, B and D) are dilated for clarity. (A1–D1) Magnified views of the boxed region in each image illustrate the corresponding results in detail. Scale bars in (A and A1) apply to all corresponding panels. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
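The orientation-diversity features described above (mean, standard deviation, L1 energy, and prominent peaks with a 40° minimum separation and a 50% minimum normalized height) can be illustrated on a simplified 1-D histogram. This is a sketch under stated assumptions: the paper works with 3-D orientation histograms, whereas the code below treats the histogram as a 1-D periodic array, reads "mean" and "spread" as statistics of the bin values, and normalizes by the histogram maximum before the height test. The function names are hypothetical.

```python
import numpy as np


def prominent_peaks(hist, bin_width_deg, min_sep_deg=40.0, min_height=0.5):
    """Indices of local maxima that pass the height and separation tests.

    Assumptions: the histogram is 1-D, it is normalized by its maximum
    before the height test, and the bins are treated as circular.
    """
    h = np.asarray(hist, dtype=float)
    n = h.size
    norm = h / h.max() if h.max() > 0 else h
    left, right = np.roll(norm, 1), np.roll(norm, -1)
    candidates = np.flatnonzero(
        (norm > left) & (norm >= right) & (norm >= min_height)
    )
    kept = []
    # Greedy selection: strongest peaks first; reject peaks that crowd a kept one.
    for idx in sorted(candidates, key=lambda i: -norm[i]):
        sep_ok = all(
            min(abs(idx - k), n - abs(idx - k)) * bin_width_deg >= min_sep_deg
            for k in kept
        )
        if sep_ok:
            kept.append(int(idx))
    return sorted(kept)


def orientation_features(hist, bin_width_deg):
    """Summary features of one orientation histogram."""
    h = np.asarray(hist, dtype=float)
    peaks = prominent_peaks(h, bin_width_deg)
    return {
        "mean": float(h.mean()),
        "spread": float(h.std()),
        "energy": float(np.abs(h).sum()),   # L1 norm of the histogram
        "n_peaks": len(peaks),
        "peak_strengths": [float(h[i]) for i in peaks],
    }
```

With 18 bins of 10° each, a strong peak suppresses any weaker peak closer than four bins, so a fiber seen in two neighboring bins is counted as one prominent orientation.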

In addition to features based on fiber orientation diversity, local shape-based features of the fibers at each GFAP+ interest point are computed using a Hessian matrix based method (Frangi et al., 1998). The Hessian matrix at spatial scale $\sigma$ is given by the following formula:

$$H_{\sigma}(\mathbf{x}) = \nabla^2 \left( I(x, y, z) * G_{\sigma}(x, y, z) \right), \tag{4}$$

where $I$ is the input image and $G_{\sigma}$ is the Gaussian convolution kernel with standard deviation $\sigma$. The eigenvalues of this matrix, denoted $\lambda_1$, $\lambda_2$, and $\lambda_3$, are used to compute the standard set of Hessian based measures of the local shape of the fibers close to the GFAP+ interest points, referred to as “ballness” denoted $R_{\mathrm{Ball}}$, “plateness” $R_{\mathrm{Plate}}$ and “vesselness” $V$, and written mathematically as follows:

$$R_{\mathrm{Ball}} = \frac{|\lambda_1|}{\sqrt{|\lambda_2 \lambda_3|}}, \quad R_{\mathrm{Plate}} = \frac{|\lambda_2|}{|\lambda_3|}. \tag{5}$$

The “vesselness” value $V$ is given by the following formula:

$$V_{\sigma} = \begin{cases} \left[1 - \exp\left(-\dfrac{R_{\mathrm{Plate}}^2}{2\alpha^2}\right)\right] \left[\exp\left(-\dfrac{R_{\mathrm{Ball}}^2}{2\beta^2}\right)\right] \left[1 - \exp\left(-\dfrac{S^2}{2\gamma^2}\right)\right] & \text{if } \lambda_2, \lambda_3 < 0, \\ 0 & \text{otherwise,} \end{cases} \tag{6}$$

where $S = \sqrt{\lambda_1^2 + \lambda_2^2 + \lambda_3^2}$ and $|\lambda_1| \le |\lambda_2| \le |\lambda_3|$. The parameters $\alpha$, $\beta$ and $\gamma$ are constants that indicate the weight accorded to each term in the product. The constants $\alpha$, $\beta$ were set to 50% while $\gamma$ was set to 0.25% of the maximum intensity value in the GFAP channel image. In addition, local GFAP intensity statistics such as the mean and variance are computed within a spatial neighborhood proportional to the estimated scale. Finally, cross-channel associations are computed for each interest point by first delineating the nuclear boundaries in the DAPI channel using a graph-cuts based segmentation method (Al-Kofahi et al., 2010) and then computing the distance to the closest nucleus using the Voronoi-based distance transform (Maurer et al., 2003).
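The shape measures of Eqs. (5) and (6) depend only on the three Hessian eigenvalues at a point, so they can be illustrated without any image data. The sketch below is a minimal example. The default `gamma` is a placeholder (the text sets it to 0.25% of the maximum GFAP intensity), and the handling of zero eigenvalues is an assumption added to avoid division by zero; `shape_measures` is a hypothetical name.

```python
import math


def shape_measures(eigenvalues, alpha=0.5, beta=0.5, gamma=1.0):
    """Ballness, plateness and vesselness (Eqs. (5)-(6)) from three eigenvalues.

    gamma is a placeholder; the text sets it to 0.25% of the maximum
    intensity of the GFAP channel image.
    """
    # Order so that |l1| <= |l2| <= |l3|, as Eq. (6) requires.
    l1, l2, l3 = sorted(eigenvalues, key=abs)
    if l3 == 0.0:
        # Flat region: the ratios are undefined, so report no structure.
        return {"ballness": 0.0, "plateness": 0.0, "vesselness": 0.0}
    r_plate = abs(l2) / abs(l3)
    # If l2 is zero then l1 is zero as well; treat the 0/0 ratio as zero.
    r_ball = abs(l1) / math.sqrt(abs(l2 * l3)) if l2 != 0.0 else 0.0
    s = math.sqrt(l1 * l1 + l2 * l2 + l3 * l3)
    if l2 < 0.0 and l3 < 0.0:
        v = ((1.0 - math.exp(-r_plate**2 / (2.0 * alpha**2)))
             * math.exp(-r_ball**2 / (2.0 * beta**2))
             * (1.0 - math.exp(-s**2 / (2.0 * gamma**2))))
    else:
        v = 0.0
    return {"ballness": r_ball, "plateness": r_plate, "vesselness": v}


# A bright tube (one near-zero and two negative eigenvalues) versus a bright blob.
tube = shape_measures((0.0, -1.0, -1.0), gamma=0.5)
blob = shape_measures((-1.0, -1.0, -1.0), gamma=0.5)
dark_tube = shape_measures((0.0, 1.0, 1.0), gamma=0.5)
```

In this example the tube scores higher vesselness than the blob, because the blob's ballness of 1 is penalized by the middle factor of Eq. (6), and the dark tube scores zero because its two large eigenvalues are positive.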

We are interested in identifying the small subset of the GFAP+ interest points that are astrocyte root points. We are simultaneously interested in identifying the DAPI+ cell nuclei that belong to astrocytes, based on the available image cues. For this, we propose a three-step approach, as illustrated in Fig. 4. In the first step, the GFAP+ interest points (Panel A) are analyzed based on the aforementioned features using a machine learning classifier, and interest points that are distal from the basal “root” regions of astrocytes are discarded (Panel B). In the second step, all available fluorescence channels are utilized by the machine learning classifier to identify the types of cells that each DAPI+ cell nucleus corresponds to (Panel C). This enables us to further narrow down the set of GFAP+ interest points to identify the most likely astrocyte root points with a high confidence (Panel D). These points form the starting points for reconstructing the astrocyte arbors. The following paragraphs provide further details.

Machine learning and classification are ordinarily complex tasks in their own right, requiring expert parameter tuning. In addition, when analyzing large datasets with thousands of samples, choosing the appropriate training samples to train an algorithm to classify cells is a skill-intensive task. With the intent of making our method widely usable, we adopted a recently reported active learning based classifier algorithm that is designed to maximize automation, and minimize parameter tuning and training effort (Padmanabhan et al., 2014). The active learning algorithm analyzes the set of multivariate data points representing the features of the GFAP+ interest points, and selects a parsimonious subset of five data points that are presented to the user for visual labeling into two categories: interest points that belong to the basal “root” regions, and other interest points. These points are chosen so as to maximize the information gained (Settles, 2012). Once the user has labeled the chosen data points, the active learning algorithm re-computes the decision boundary separating the classes and computes the information gain based on the examples labeled thus far. It also determines the features that are most informative for the classification problem and requests the next round of points for human labeling based on what it has learned so far. This sequential procedure converges rapidly, and stops when the information gain across successive iterations reaches a plateau indicating that classification boundaries can be identified to within a specified confidence level. This active learning algorithm is based on the D-optimal experimental design criterion (Padmanabhan et al., 2014), and it learns a multi-class logistic classifier with L1-norm based regularization. It not only leads to efficient learning of the decision boundary, but also helps in selecting an optimal subset of the relevant features. In our experience, this algorithm is extremely versatile and usable even with its default internal settings, requires no additional parameter settings from the user, and overall requires very few examples to be labeled manually. Fig. 4B shows the filtered interest points selected by this classifier, and the arrows indicate examples where the classifier successfully selected the interest points that belong to the basal region. Clearly, this method is able to retain the GFAP+ interest points proximal to the cell nuclei. Figs. 5–7 in the Electronic Supplement D further illustrate this result by providing additional results for detection of filtered interest points.
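The query-label-refit cycle described above can be illustrated with a much simpler stand-in. The sketch below is not the D-optimal, multi-class algorithm of Padmanabhan et al. (2014). It swaps in a binary logistic classifier with an L1 penalty, fitted by proximal gradient descent, and selects each batch of five points by plain uncertainty sampling (predicted probability closest to 0.5). All function names and hyperparameters are hypothetical.

```python
import numpy as np


def fit_logistic_l1(X, y, l1=0.01, lr=0.1, n_iter=500):
    """Binary logistic regression with an L1 penalty (proximal gradient descent)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
        # Soft-threshold the feature weights (not the bias) to apply the L1 penalty.
        w[:-1] = np.sign(w[:-1]) * np.maximum(np.abs(w[:-1]) - lr * l1, 0.0)
    return w


def predict_proba(X, w):
    """Predicted probability of the positive class."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ w))


def next_batch(X, labeled_idx, w, batch_size=5):
    """Pick the unlabeled points whose predicted probability is closest to 0.5."""
    unlabeled = np.setdiff1d(np.arange(X.shape[0]), labeled_idx)
    closeness = -np.abs(predict_proba(X[unlabeled], w) - 0.5)
    order = np.argsort(closeness)[::-1]          # most uncertain first
    return unlabeled[order[:batch_size]]
```

In use, one would alternate `fit_logistic_l1` on the points labeled so far with `next_batch` to choose the next five points to show the annotator, stopping when the chosen criterion stops improving. The L1 penalty drives uninformative feature weights to zero, which loosely mirrors the feature selection role described in the text.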

The next step is to identify the DAPI+ cell nuclei that belong toGFAP+ cells. We take advantage of the fact that astrocyte nucleiare present in close proximity of the basal regions. Accordingly, wecompute the distances of each of the filtered GFAP+ interest pointsto the nearest nucleus. From this, we compute the total number offiltered GFAP+ interest points that belong in a spatial neighbor-hood that is proportional to the nuclear size equal to 1.5 timesthe nuclear diameter, and the minimum, maximum, mean andvariance of the distance from each nucleus to the filtered interestpoints within this neighborhood. In addition, we compute measure-ments of cell nuclei, including their location, chromatin texture, andshape, following previously published methods (Bjornsson et al.,2008). Finally, the other available fluorescent signals (typ. Microgliamarker Iba-1, and neuron marker NeuN) that are present in a pre-defined peri-nuclear region are utilized when available. In this case,the width (typ. 8–10 voxels) of the peri-nuclear region is set by theuser based on prior knowledge of the expected proximity betweenthe cell types corresponding to every channel. The combined setof morphological and associative features for each nucleus allowus to re-train and use the same active learning algorithm, but forthe purpose of identifying cell types. The classification results arerecorded in a table, and displayed as color-coded dots in Fig. 4C(red square: astrocytes, red circle: microglia, blue square: neu-rons, yellow square: other cell types). The white points in Panel

C1 illustrate how the filtered interest points contribute towardastrocyte nuclei detection by providing a summarized cue fromthe GFAP channel. Movie 2 in the Electronic Supplement D furtherdemonstrates the result for astrocyte nuclei detection. Finally, for

4 urosc

eppptti

mfnea2oitOldatcttttct2oal

waioabatpGcgtsatta

tGittpsaasc

a

Trace Selection: The best local traces (i.e., the traces belonging toa single cell) that minimize the above cost for each cell, are selectedfor simultaneous propagation within the parallel tracing algorithm.For the parallel implementation, up to N + 1 threads (T1 to TN+1) are

Finally, for each astrocyte nucleus that is detected in the previous step, a root point is selected by choosing the closest filtered GFAP+ interest point. The result, as shown in Fig. 4D with the root points displayed in yellow, is used as an input for the arbor reconstruction to follow (described further below). Figs. 8–9 in the Electronic Supplement D further demonstrate this result on another image.

The detected root points form the starting points for automated arbor reconstruction. Several methods have been proposed for automated tracing of tubular/curvilinear structures, especially neurites and glial processes (Law and Chung, 2008, 2010; Sironi et al., 2014; Breitenreicher et al., 2013; Wang et al., 2011; Bas and Erdogmus, 2011; Rodriguez et al., 2009; Türetken et al., 2012, 2013; Xiao and Peng, 2013). The vast majority of tracing methods are aimed at tracing single neurons accurately, in settings where it is practical to perform detailed proofreading of the reconstructions (Meijering, 2010; Wang et al., 2011; Gala et al., 2014). On the other hand, astrocyte tracing requires reconstruction of large ensembles of cells with varying morphologies, where detailed proofreading of each cellular process would be unaffordable. Tracing performance expectations also differ from those in the automated neuron tracing literature. For reconstructing GFAP+ cells, it is more important to achieve large-scale automation of the tracing, to achieve certain structural guarantees (for example, that the reconstructions will not have any loops/cycles, so that they are topologically representable as trees), and to correctly disambiguate the processes of neighboring cells. Given the sheer number of cells involved (thousands), and our end goal of extracting quantitative arbor measurements for large-scale arbor analytics (Lu et al., 2014), combined with the previously noted limitations of GFAP, our expectations on the pixel accuracy of tracing processes are appropriately less stringent compared to the neuron tracing problem.

These tradeoffs inspired us to develop a novel algorithm, which we term the local priority-based parallel (LPP) tracing algorithm, as described below. In the language of graph theory, this algorithm is based on modeling the arbors as a forest of tree structures, with one tree per cell. The roots of the individual trees in this forest are the root points obtained from the previous steps. The arbor branches of all trees are traced automatically and concurrently for all the cells in the image, starting from the respective root points. In this regard, the GFAP+ interest points form an initial population of points that mostly (but not always) lie on the arbor processes. Weak GFAP+ interest points are filtered out based on a standard spherical saliency cost (Chan and Vese, 2001; Mukherjee, 2011), which is given by $p_i^{sal} = \mu_f - \mu_b$ for an interest point $p_i$, where $\mu_f$ and $\mu_b$ are the estimated mean foreground and background intensities corresponding to a fitted sphere. Interest points with negative saliency are discarded. The remaining interest points provide a conservative and sparse sampling of astrocyte processes, and the automated tracing algorithm is designed to leverage these points efficiently, and in parallel.
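The spherical saliency filter can be sketched as follows. This is an illustration under stated assumptions, not the FARSIGHT implementation: the text does not specify how the sphere is fitted or how the background is sampled, so the sketch assumes a fixed integer radius, takes the foreground as the voxels inside the sphere, and takes the background as the surrounding shell out to twice the radius. The names `spherical_saliency` and `keep_salient_points` are hypothetical.

```python
import math

def spherical_saliency(volume, center, radius):
    """Return mu_f - mu_b for a sphere of integer `radius` at `center`.

    `volume` is a nested list indexed as volume[z][y][x].
    Assumption: the background is the shell radius < d <= 2 * radius.
    """
    cz, cy, cx = center
    r_out = 2 * radius  # assumed outer radius of the background shell
    fg, bg = [], []
    for z in range(max(0, cz - r_out), min(len(volume), cz + r_out + 1)):
        for y in range(max(0, cy - r_out), min(len(volume[0]), cy + r_out + 1)):
            for x in range(max(0, cx - r_out), min(len(volume[0][0]), cx + r_out + 1)):
                d = math.dist((z, y, x), center)
                if d <= radius:
                    fg.append(volume[z][y][x])
                elif d <= r_out:
                    bg.append(volume[z][y][x])
    if not fg or not bg:
        return 0.0  # degenerate case: nothing to compare against
    return sum(fg) / len(fg) - sum(bg) / len(bg)

def keep_salient_points(volume, points, radius):
    """Discard interest points with negative saliency, as in the text."""
    return [p for p in points if spherical_saliency(volume, p, radius) >= 0]
```

A point centered on a bright process yields a positive value, whereas a point sitting in a dark gap surrounded by brighter tissue yields a negative value and is dropped.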

The task of reconstructing the astrocyte arbors proceeds along the following steps. First, we start with the dense set of LoG-based GFAP+ interest points as described above, and trace multiple cells in parallel to the extent possible with the number of computational threads/cores available on the computer. Within every cell, the algorithm proceeds sequentially, based on a local cost-based priority. As the arbor tracing proceeds, we merge traces using a set of rules, with the tracing guided by numerical cost functions that embody our knowledge about astrocyte arbor morphology. Finally, we compute a minimum spanning tree corresponding to the arbor reconstruction for each cell as the final output. These details are explained below.
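The last step, computing a minimum spanning tree per cell, can be illustrated with Prim's algorithm. The text does not state which spanning-tree algorithm or edge weights are used, so this sketch assumes a complete graph over the traced points of one cell with Euclidean edge lengths; `prim_mst` is a hypothetical name. The output is a parent index per point, which guarantees a loop-free tree rooted at the cell's root point.

```python
import math

def prim_mst(points, root=0):
    """Minimum spanning tree over 3-D points, returned as a parent list.

    parent[i] is the index of the parent of point i; the root has -1.
    Assumption: edge weights are Euclidean distances between points.
    """
    n = len(points)
    parent = [-1] * n
    best = [math.inf] * n      # cheapest known connection to the tree
    in_tree = [False] * n
    best[root] = 0.0
    for _ in range(n):
        # Pick the cheapest point that is not yet part of the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return parent
```

Because every non-root point receives exactly one parent, the result has no cycles by construction, which matches the structural guarantee required of the reconstructions.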

Tracing Cost: A secondary population of interest points is generated using the orientation histograms mentioned previously. The cost for generating a new population of interest points (i.e., adding a new interest point, $p$) is defined as:

$$C_{astro} = C_{curv} + C_{length}\,(C_{sal} + C_{vessel} + C_{scale}). \tag{7}$$

The individual components of this cost are designed based on observations of astrocyte arbor morphology. To this end, $C_{curv}$ is a curvature-based cost, which is derived from the observation that astrocyte arbors generally have a low curvature; $C_{length}$ is a length-based cost term that favors longer traces over shorter ones. The terms $C_{sal}$ and $C_{vessel}$ are spherical saliency and vesselness-based costs, respectively, that are designed to discourage the addition of interest points with low spherical saliency and vesselness values. These two costs are designed to encourage the algorithm to trace tubular structures within the image. Finally, $C_{scale}$ is a scale-based cost term that discourages the addition of interest points with abrupt variations in scale (thickness). This cost term is based on the observation that the thickness of astrocyte arbors monotonically decreases away from the root point. Consequently, $C_{scale}$ also contributes to preventing the algorithm from merging traces into adjacent cells.

For a set of adjacent interest points $p_i$ and $p_j$ belonging to $\tau$ (with $\tau_{max} = 20$), where $\tau$ is the number of successors of $p$ that account toward the cost, the aforementioned costs are formulated as follows. The curvature-based cost $C_{curv}$ is given by

$$C_{curv} = \sqrt{\frac{1}{\tau}\sum_{i,j \in \tau} \cos^{-1}\langle \vec{p}_i, \vec{p}_j \rangle - \bar{\theta}},$$

where $\bar{\theta}$ is the average angle included between successive interest points, while the length-based cost $C_{length}$ is given by $C_{length} = 1/\tau$. The saliency and vesselness-based costs are given by $C_{sal} = \sum_{\tau} -\log p_i^{sal}$ and $C_{vessel} = \sum_{\tau} -\log p_i^{vessel}$, respectively, where $p_i^{sal}$ and $p_i^{vessel}$ are the spherical saliency and vesselness values corresponding to the point $p_i$. Finally, the scale-based cost is defined as $C_{scale} = \frac{1}{\tau}\sum_{\tau} \left| p_i^{scale} - p_j^{scale} \right|$, where $p_i^{scale}$ and $p_j^{scale}$ are the scales corresponding to $p_i$ and $p_j$, respectively. In summary, the mathematical formulations for the tracing cost are designed for generating reliable reconstructions of astrocyte arbors based on known characteristics of astrocyte fibrils.
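As a worked illustration of Eq. (7), the following Python sketch combines the component costs for a short run of successor points. It is a sketch under assumptions: the curvature term is passed in as a precomputed number, the saliency and vesselness values are assumed to be positive (for example, normalized to (0, 1]) so that the logarithms are defined, and the scale differences are taken between consecutive points. The function name and argument layout are hypothetical.

```python
import math

TAU_MAX = 20  # maximum number of successor points, as stated in the text

def tracing_cost(c_curv, saliency, vesselness, scale):
    """Combine the component costs as in Eq. (7).

    c_curv     : precomputed curvature cost (assumed to be given)
    saliency   : spherical saliency values of the successor points (> 0)
    vesselness : vesselness values of the successor points (> 0)
    scale      : scale (thickness) values of the successor points
    """
    saliency = saliency[:TAU_MAX]
    vesselness = vesselness[:TAU_MAX]
    scale = scale[:TAU_MAX]
    tau = len(saliency)
    if tau == 0:
        raise ValueError("at least one successor point is required")
    c_length = 1.0 / tau
    c_sal = sum(-math.log(s) for s in saliency)
    c_vessel = sum(-math.log(v) for v in vesselness)
    # Absolute scale change between consecutive points, averaged over tau.
    c_scale = sum(abs(a - b) for a, b in zip(scale, scale[1:])) / tau
    return c_curv + c_length * (c_sal + c_vessel + c_scale)
```

For example, with a curvature cost of 0.5, two points of saliency and vesselness 1.0, and scales 2.0 and 1.0, the cost is 0.5 + 0.5 × (0 + 0 + 0.5) = 0.75; lowering the saliency of the points raises the cost, which discourages their addition.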

Fig. 5. Flowchart summary of the algorithm for astrocyte arbor tracing. The boxes labeled “Hit” imply an intersection of a trace with a previously traced segment.

Fig. 6. Sample astrocyte arbor reconstruction results. (A–D) Maximum intensity projections of confocal images with detected astrocyte nuclei (blue) and arbor traces (random colors, one for each cell). The proposed tracing method is able to cope with complex and multi-scale arbor processes (solid arrows), and tolerates errors (false negatives) from nuclei detection (dashed arrows). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Of the $N + 1$ tracing threads, where $N$ is the total number of cells to be traced in the image, one thread is reserved for each cell, whereas the population of all unlabeled interest points at any given time is managed by the single remaining thread. A combined history of past traces and their labels is maintained using a suitable data structure.

Trace Merging: Traces are merged and relabeled when they intersect another labeled or unlabeled trace. If a labeled trace intersects another labeled trace, the respective traces are terminated. This is the case when a trace segment from one cell begins to “carry over” to a neighboring cell. In all other cases, the traces under consideration are merged and relabeled. Thus, as the algorithm progresses, interest points are moved from the unlabeled thread to one of the labeled threads. The relabeling is performed sequentially for each thread.

Trace Termination: The algorithm terminates when the accumulated cost of all labeled and unlabeled traces reaches a pre-determined value $C_{max}$ (empirically set to 3 in our experiments), or when the maximum number of iterations has been reached (empirically set to 50,000 in our experiments), or when no thread has any interest point left to trace further.

Fig. 5 is a flowchart summary of the proposed algorithm. Fig. 6 demonstrates astrocyte arbor reconstruction results on four different images (Panels A–D) using the proposed tracing approach. Every cell is coded with a different color and the detected astrocyte nuclei are shown in blue. This figure shows that the proposed method is able to overcome the challenges mentioned earlier; specifically, it is able to reconstruct astrocyte arbors exhibiting complex and multi-scale morphologies (solid arrows), and also tolerate errors (false negatives) from nuclei detection (dashed arrows). Movie 3 in the Electronic Supplement D further illustrates the tracing results by showing a 3-D visualization for a single astrocyte cell.
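The merging and termination rules stated above can be condensed into a small Python sketch. It covers only the decision logic, not the parallel tracing itself, and the function names, the use of `None` for an unlabeled trace, and the list-of-queues representation of pending interest points are all assumptions made for illustration.

```python
def resolve_hit(label_a, label_b):
    """Decide what happens when trace a intersects trace b.

    A label is a cell identifier, or None for an unlabeled trace.
    Two labeled traces are terminated; in all other cases the traces
    are merged and take over whichever cell label is available.
    """
    if label_a is not None and label_b is not None:
        return ("terminate", None)
    merged_label = label_a if label_a is not None else label_b
    return ("merge", merged_label)

def should_stop(accumulated_cost, iteration, pending_points,
                max_cost=3.0, max_iterations=50_000):
    """Termination test using the empirical settings quoted in the text.

    `pending_points` holds one collection of untraced interest points
    per thread (assumed representation).
    """
    no_work_left = all(len(queue) == 0 for queue in pending_points)
    return (accumulated_cost >= max_cost
            or iteration >= max_iterations
            or no_work_left)
```

In this sketch, an unlabeled trace that meets a labeled one inherits the label of that cell, which corresponds to moving its interest points from the unlabeled thread to a labeled thread.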

3. Experimental evaluation results

3.1. Experimental evaluations

The LPP method was applied to the aforementioned rat brain datasets and integrated into FARSIGHT (Bjornsson et al., 2008), an open source image analysis toolkit. All computations were performed on a 64-bit Dell Precision desktop computer with an Intel Xeon® X5677 3.47 GHz processor and 72 GB RAM. The overall runtime for the algorithm (starting with preprocessing and ending with arbor tracing) was about 25–30 min for a single confocal image of size 1024 × 1024 × 52 voxels. The output of the arbor reconstruction method consists of standard SWC files that encode the tree-based morphological structure for every cell (Cannon et al., 1998). In order to assess the accuracy of the proposed method, the results produced by the automated algorithms were proofread manually using 3-D visualization and editing tools in FARSIGHT, and the Neuromantic system was used for manual tracing of some arbors (Bjornsson et al., 2008; Luisi et al., 2011; Myatt et al., 2012). The visually detected errors were corrected by editing, and the trail of edits was recorded. Once the human user is visually satisfied with the proofreading results, they are considered acceptable for quantitative analysis. Given the large number of astrocytes in a field, it is only practical to proofread a random subpopulation of cells. The recorded trail of edits is interpreted as a measure of the accuracy of the computational method. We refer to this as an edit-based strategy for evaluating the performance of astrocyte nuclei classification and arbor reconstruction (Lin et al., 2007; Tyrrell et al., 2007; Bjornsson et al., 2008). Edit-based strategies are intended for operational use of automated image analysis systems on a large scale, since the manual effort scales with the (low) error rate of the automated algorithms, and the process provides visual confirmation of correctness. They are appropriate when a comparison of the automatically generated traces against a full manual reconstruction of the arbors is prohibitive.

Fig. 7. Performance of the proposed method (LPP) compared to the fast-marching method (FMM). (A) Accuracy measurements (percentage of true positives) for astrocyte nuclei detection across 20,000 cells from 14 datasets. The average accuracy was found to be 98.8%. (B) The proposed method, on average, resulted in fewer jumps (3.2%) compared to FMM (9.0%). (C) The proposed method also resulted in, on average, fewer missing segments (17.7%) compared to FMM (20.8%). The results in B and C were generated by evaluating 100 randomly selected cells. These results demonstrate the overall superior performance of the proposed multi-stage approach for both astrocyte nuclei detection and arbor reconstruction.

In order to evaluate the accuracy of astrocyte nuclei detection, one image was randomly selected from each of 14 randomly selected datasets out of the 24, where a “dataset” corresponds to all the images acquired from a single animal. In total, the selected images contained about 20,000 cells. The automated image analysis results were inspected and edited by a neuroscientist until a visually acceptable accuracy was reached. The editing operations consist of assigning the correct cell type to the wrongly classified nuclei. The fact that GFAP does not stain astrocyte nuclei makes the visual correlation of a specific nucleus to a set of arbors challenging. To mitigate this issue, the expert was provided with simultaneous visualization of all four fluorescence channels for each image. This helped in eliminating incorrect decisions about other cell types, which may occur in close vicinity of astrocyte basal regions.

The editing operations were performed using the FARSIGHT Nucleus Editor software (Bjornsson et al., 2008; Al-Kofahi et al., 2010). At the end of this process, the accuracy of astrocyte nuclei detection was computed as:

$$\text{Accuracy} = \frac{TP + TN}{\text{Total \# cells}}, \tag{8}$$

where TP and TN are the true positives and true negatives, respectively, and the astrocyte nuclei class is considered to be positive. Our observations of the performance of this algorithm on the reported data suggest that there were many more instances of nuclei falsely identified as astrocyte nuclei than vice versa. Therefore, we only used the TP count in computing the overall accuracy in Eq. (8). Fig. 7A illustrates this accuracy measurement. The average accuracy was found to be 98.8%, suggesting reliable detection of astrocyte cell nuclei. The accuracy depends upon several factors. First, the accuracy of delineating nuclei can degrade when the DAPI staining is weak, when nuclei are tightly clustered, or when the chromatin is loosely textured, making it difficult to discern the boundaries. Second, the accuracy depends upon effective training of the active machine learning algorithm. Although this algorithm selects the samples automatically, the quality of the user’s responses can be deficient, especially in regions where the astrocyte fibers are dense and fenestrated in complex ways and when visual judgment of the root points becomes difficult. Our studies focused on cortical images, in which cells are not too densely packed. Our method can be expected to be less accurate (due to its dependence on accurate delineation of cell nuclei) in regions where cells are densely packed, for example, the Dentate Gyrus or the Nucleus Accumbens.

In evaluating the arbor reconstruction results, we compared the proposed method to a recently published fast marching-based microglia tracing (FMM) approach (Xu et al., 2013). We chose this method as a baseline based on the following considerations. First, fast marching (Sethian, 1996, 1999) based methods have been extensively applied for solving vessel tracing (Liao et al., 2012; Rouchdy and Cohen, 2012) and neuronal fiber tracing problems (Wang et al., 2011; Liao et al., 2012; Mukherjee and Stepanyants, 2012; Gala et al., 2014) alike. Second, and more importantly, they have also been adapted for microglia tracing (Rouchdy et al., 2011, 2008; Wang et al., 2011; Xu et al., 2013), which shares some commonalities with the astrocyte tracing problem described here, and for which software implementations are available for comparison. In order to perform the comparison, 100 cells were randomly selected. In this case, the editing was performed using the Neuromantic software (Myatt et al., 2012) by deleting false segments (i.e., jumps) and adding missing segments until a visually acceptable accuracy was reached. During this process, the numbers of incorrect jumps and missing segments were recorded for every cell. Fig. 7B and C provide a visual comparison of the performance of arbor reconstruction obtained by applying LPP and FMM-based tracing. It can be seen that LPP results in fewer jumps (3.2%) and fewer missing segments (17.7%) compared to the FMM approach, which resulted in 9.0% jumps and 20.8% missing segments. Fig. 8 demonstrates the qualitative superiority of the LPP method (rows 2 and 4) in comparison with the FMM tracing method (rows 1 and 3). The ovals in Fig. 8 indicate areas where the FMM-derived traces have merged with adjacent cell arbors (dashed ovals) or background (solid ovals), whereas LPP is able to avoid such errors. One known limitation of our method is that it will not reconstruct arbors for cells whose root point is missed.

Fig. 8. Visual close-up comparison of astrocyte tracing methods to illustrate the handling of neighboring cell arbors (LPP vs. FMM). (R1 and R3) Examples of tracing using FMM. (R2 and R4) Examples of tracing using LPP. The scale bar in (R1) applies to all panels. In comparison to FMM, the proposed method is better able to avoid erroneous merging between arbor traces for neighboring cells (dashed ovals), and into the background (solid ovals). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

3.2. Quantitative arbor analysis

Fig. 9. Quantitative analysis of GFAP+ cells from a binge alcohol study with 24 animals, four groups (24 fields, 387.07 μm × 387.07 μm × 50 μm were imaged from each animal), using co-clustering of L-measure data. (A) Heat map rendering of the co-clustering result (each row corresponds to an individual cell, each column represents a feature). Four clusters of morphological cell-types were identified (circled in the horizontal tree). (G1–G4) Representative cells from each group. This figure indicates that the population under study consists of cells which have a wide variability in terms of arbor morphology, starting with complex morphology with multiple branches in (G1) to relatively simple structures in (G4). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Although the vital roles of astrocytes in brain development, physiology and pathology are receiving growing recognition (Ransom et al., 2003; Sofroniew and Vinters, 2010), much remains unknown about their quantitative architecture, especially the heterogeneity in astrocyte arbor morphology (Matyash and Kettenmann, 2010). In order to overcome these barriers, it is essential to first quantify the arbor morphology. Unfortunately, no single number is capable of describing a complex three-dimensional cell arbor. For this reason, we adopt the L-measure (Scorcioni et al., 2008), a method that was originally proposed for quantifying neuronal morphology. The L-measure is a high-dimensional multivariate descriptor consisting of over 100 diverse measurements for each cell, including measurements of the soma and the arbors, which are computed from the arbor reconstructions. The arbor reconstructions are first recorded in the standard SWC file format (Cannon et al., 1998), from which the L-measure is computed for each cell.
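To make the SWC step concrete, the sketch below serializes one traced tree into SWC text. SWC stores one node per line as seven whitespace-separated fields (sample id, structure type, x, y, z, radius, parent id), with a parent id of -1 for the root. The choice of type code 1 for the root and 3 for all other nodes is an assumption made for illustration, since the text does not say which type codes are written for astrocyte processes; the function name and the parent-list input are likewise hypothetical.

```python
def to_swc(points, radii, parent):
    """Serialize a tree to SWC text.

    points : list of (x, y, z) coordinates
    radii  : list of node radii
    parent : list of 0-based parent indices, -1 for the root
    Assumption: nodes are ordered so that parents precede their children.
    """
    lines = ["# id type x y z radius parent"]
    for i, ((x, y, z), r, p) in enumerate(zip(points, radii, parent)):
        node_type = 1 if p < 0 else 3        # assumed type codes
        parent_id = -1 if p < 0 else p + 1   # SWC sample ids start at 1
        lines.append(
            f"{i + 1} {node_type} {x:.3f} {y:.3f} {z:.3f} {r:.3f} {parent_id}"
        )
    return "\n".join(lines) + "\n"
```

A two-node tree with a root at the origin and one child one unit away along x would produce the lines `1 1 0.000 0.000 0.000 2.000 -1` and `2 3 1.000 0.000 0.000 0.500 1` when the radii are 2.0 and 0.5.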

With the large number of cells that are present in brain tissue, and the high dimensionality of the L-measure’s feature space, the analysis of the combined data from all of the cells is a non-trivial task. The recently reported harmonic co-clustering (Lu et al., 2014) is a powerful method for unsupervised (exploratory) quantitative arbor analysis. It simultaneously groups the rows and the columns of the L-measure data table at multiple scales. This not only provides an approach for exploratory analysis of the cell population, but also identifies features or groups of features that most effectively distinguish the identified groups of cells. Additionally, it employs a non-linear stochastic diffusion distance metric instead of the traditional Euclidean distance metric to cope with the heterogeneity and high dimensionality of the L-measure feature space. We applied harmonic co-clustering to the astrocyte arbor reconstructions obtained from the 24 datasets used in this study. Fig. 9A demonstrates the co-clustering results in the form of a heat-map representation where each row corresponds to a cell, with a total of about 30,000 cells in this study, and each column corresponds to a feature from the L-measure collection. The horizontal tree on the left side of the heat-map illustrates the grouping structure of the cell population, whereas the tree on top illustrates the grouping structure of feature correlations.

The co-clustering reveals that the GFAP+ cells in the population under study exhibit four major groups, highlighted by circles in the horizontal tree in Fig. 9A. Representative cells from these four groups are shown in columns (G1–G4). These four groups represent cells with variable morphological complexity, with the most complex arbors in Group 1 (G1) and the least complex in Group 4 (G4). Table 2 lists the mean values of the relevant features for each group. It can be observed that the four groups are distinct in terms of the overall sizes of the cells (as indicated by the surface area and volume), the complexity of their arbor morphologies (as indicated by the number of segments, stems, branch points, and bifurcations), and their shapes (as indicated by the skewness values).

Table 2
Features selected by the co-clustering algorithm. Results indicate that the astrocyte cells in the given population vary largely in terms of their size, shape and arbor complexity.

Group   Surface area (μm²)   Volume (μm³)   # Segments   # Stems   # Branch points   # Bifurcations   Skewness
1       201                  223            44           7         17                15               72.1
2       139                  119            23           5         9                 8                41.8
3       63                   72             8            1         4                 3                27.4
4       18                   26             0            0         0                 0                0

Table 3 lists the proportions of the four groups within the population under study. It can be observed that the GFAP+ astrocyte cell population consists largely of cells with high complexity (Group 1) and low complexity (Group 4).

Table 3
Astrocyte cell distribution by group. The population under study is seen to consist mostly of cells with complex arbor morphologies (Group 1).

             Group 1   Group 2   Group 3   Group 4   Total
Population   13,840    4283      4018      8137      30,278
Percentage   45.7%     14.1%     13.3%     26.9%     100.0%

In order to quantify the performance of co-clustering, a set of statistical dispersion indices is computed by the co-clustering algorithm based on the diffusion distance. The within-cluster dispersion index is defined as:

$$p_{intra,k} = \frac{\sum_{i=1}^{N_k} (d_{ik} - r_k)^2}{r_k}, \tag{9}$$

where $p_{intra,k}$ is the intra-cluster dispersion of cluster $k$, $r_k$ is the mean of the distances between all data points and the centroid of cluster $k$, and $d_{ik}$ is the distance between data point $i$ and the centroid of cluster $k$. Similarly, the inter-cluster dispersion index is defined as:

$$p_{inter,pq} = \frac{\sum_{i=1}^{N_q} (d_{iq} - r_{pq})^2 + \sum_{i=1}^{N_p} (d_{ip} - r_{qp})^2}{r_p + r_q}, \tag{10}$$

where $p_{inter,pq}$ is the inter-cluster dispersion of clusters $p$ and $q$, $r_{pq}$ is the mean distance between all data points in cluster $p$ and the centroid of cluster $q$, and $r_{qp}$ is the mean distance between all data points in cluster $q$ and the centroid of cluster $p$. Table 4 shows the intra- and inter-cluster dispersion indices computed for the four groups. Low values of the intra-cluster indices indicate strong intra-cluster homogeneity, while high values of the inter-cluster dispersion indicate strong cross-cluster heterogeneity.

Table 4
Inter- and intra-cluster dispersion values for the four cell groups. Low intra-cluster dispersion indicates strong within-cluster homogeneity, while high inter-cluster dispersion indicates strong between-cluster heterogeneity.

          Group 1   Group 2   Group 3   Group 4
Group 1   1.12      3.01      17.36     29.81
Group 2   3.01      0.79      8.47      15.08
Group 3   17.36     8.47      0.71      6.95
Group 4   29.81     15.08     6.95      0.43
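The dispersion indices of Eqs. (9) and (10) can be sketched in Python as follows. Two assumptions are made and should be read as such: plain Euclidean distance is used here as a stand-in for the diffusion distance employed by the co-clustering algorithm, and each numerator term of Eq. (10) is read as the squared deviations, about the corresponding mean, of the distances from the points of one cluster to the centroid of the other, following the verbal definitions of $r_{pq}$ and $r_{qp}$. All function names are hypothetical.

```python
import math

def centroid(cluster):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def _mean_and_squared_deviation(points, center):
    """Mean distance to `center` and the sum of squared deviations from it."""
    d = [math.dist(p, center) for p in points]
    r = sum(d) / len(d)
    return r, sum((x - r) ** 2 for x in d)

def intra_dispersion(cluster):
    """Eq. (9); undefined if every point coincides with the centroid."""
    r_k, ss = _mean_and_squared_deviation(cluster, centroid(cluster))
    return ss / r_k

def inter_dispersion(cluster_p, cluster_q):
    """Eq. (10), under the reading stated in the lead-in."""
    c_p, c_q = centroid(cluster_p), centroid(cluster_q)
    r_p, _ = _mean_and_squared_deviation(cluster_p, c_p)
    r_q, _ = _mean_and_squared_deviation(cluster_q, c_q)
    _, ss_pq = _mean_and_squared_deviation(cluster_p, c_q)  # p points vs q centroid
    _, ss_qp = _mean_and_squared_deviation(cluster_q, c_p)  # q points vs p centroid
    return (ss_pq + ss_qp) / (r_p + r_q)
```

Under this reading the inter-cluster index is symmetric in the two clusters, which is consistent with the symmetric off-diagonal entries of Table 4.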

Finally, we note that unlike microglia, the morphological analysis of astrocytes is an emerging area of research (Zhang and Barres, 2010; Anderson et al., 2014) and new morphological categories are still being discovered (Matyash and Kettenmann, 2010). Consequently, the interpretation of cell types is largely dependent upon the experimental hypothesis being tested. To this end, the proposed co-clustering method provides a powerful quantitative tool for investigating morphological heterogeneity of astrocyte cell populations.

4. Conclusions and discussion

The proposed method is specifically intended for analyzing GFAP-labeled astrocytes, and for supporting the quantitation needs of hypothesis-driven or discovery/screening-oriented studies. Measurements of astrocyte numbers and location, stem arbor morphology, and morphological diversity can be generated from 3-D confocal images by our method. An advantage of our method is that it is modular: the components are independently useful. For example, the ability to identify astrocyte nuclei using the available image cues provides a method for quantifying cell locations, cell counts, and spatial cell distributions. The automated arbor reconstructions, despite the limitations of GFAP labeling, are valuable for studies that require analysis of arbor morphology. For quantifying arbors, use of the L-measure provides a more comprehensive solution than may be needed for a particular study, and the user can choose to select a subset of the measurements from the L-measure library. The harmonic co-clustering method is a natural approach for profiling the morphological heterogeneity of astrocytes and should be regarded as a first and convenient step, especially since it is built into the FARSIGHT toolkit. Additional statistical analysis can be performed on the quantitative measurements, since the measurements can be exported easily to a spreadsheet.

The use of active machine learning algorithms allows a user to train the system to operate on novel datasets with modest effort. To date, our method has been evaluated on image data from three different laboratories using two different confocal systems, and is expected to perform well on images from other laboratories. In our experience, our method works best when the confocal images are acquired at a sufficiently high axial resolution to preserve the continuity of processes across optical slices. If additional fluorescent markers are available, for example, cell type markers like NeuN, Iba1, endothelial barrier antigen (EBA), and S100β, the machine-learning algorithm is capable of taking them into account quite easily. In addition, the proposed method can be used as part of a more comprehensive analysis of brain cell organization following the method described by Bjornsson et al. (2006).

Our arbor reconstruction algorithms are designed for scalability and accurate reconstruction of astrocyte stem arbors while coping effectively with natural variability, with the implicit guarantee that the reconstructions will have a tree topology and be free of loops. This enables us to compute L-measure data on a large scale directly without any editing. When analyzing fields with hundreds or thousands of cells, the practicality of detailed proofreading and editing of traces must be considered. In principle, it is possible using our method to proofread and edit the reconstruction of each and every process. However, we consider this level of detail to be impractical and unnecessary for most studies, and instead proofread a random subset of cell arbor reconstructions.

Our experimental results demonstrate that our method is capable of coping with the biological variability and visual confounds associated with GFAP+ cells, including the complex, multi-scale processes, structural similarity and close proximity of astrocyte nuclei with other cell types, and connections of astrocyte arbors with other cells. Moreover, we employed an efficient edit-based evaluation to show that our method achieves an accuracy exceeding 95% for astrocyte nuclei detection, and that it is capable of reconstructing arbors with an occurrence of jumps and missing segments of 3.2% and 17.7%, respectively. In this sense, it significantly outperforms the FMM-based tracing method, which results in a 9.0% occurrence of jumps and a 20.8% occurrence of missing segments. As noted above, our arbor tracing algorithm is initiated at root points, and its tracing progress is driven by the combination of geometric and image factors described in Eq. (7). We consider this to be a good approach since it avoids tracing “orphan” fibrils that cannot be tied to a specific root. For this reason, it is not possible to define a simple image-based limit to fibril detection at the scale of individual voxels. Analysis of the very distal processes is beyond the scope of the present study. As the tracing algorithm progresses away from the starting root points, the complexities and ambiguities increase, and we have chosen to terminate the tracing rather than generate false traces. The multi-factor tracing cost (Eq. (7)) was introduced in order to cope with these complexities. Importantly, the trace termination criterion is designed to stop arbor tracing rather than generate false traces. In addition, distal processes are difficult to follow manually due to their three-dimensional complexity and density, and therefore the tracing results are difficult to verify based on GFAP staining alone. In the future, we expect that imaging improvements (higher spatial resolution and contrast, and availability of additional markers) will allow more distal processes to be traced and verified more reliably.

All components of astrocyte quantification presented in this work are implemented in C++ and Python using standard open-source libraries (ITK, VTK, and OpenMP), and are integrated in and distributed through the open source FARSIGHT bio-image analysis toolkit (www.farsight-toolkit.org). The Electronic Supplements include the software with detailed instructions (A), a sample image (B), sample output (C), and additional figures and movies (D). The source code can be downloaded from the toolkit website, and compiled for other computing platforms.

Acknowledgements

The authors wish to thank Dr. Lawrence Carin at Duke University for guidance on the active learning method, and Dr. Ronald Coifman at Yale University for the harmonic co-clustering method. This work was supported by DARPA Grant N66001-11-1-4015.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.jneumeth.2015.02.014.

