Robust low-overlap 3-D point cloud registration for ... · Robust low-overlap 3-D point cloud registration for outlier rejection John Stechschulte, 1Nisar Ahmed,2 and Christoffer

Robust low-overlap 3-D point cloud registration for outlier rejection

John Stechschulte,1 Nisar Ahmed,2 and Christoffer Heckman1

Abstract— When registering 3-D point clouds it is expectedthat some points in one cloud do not have correspondingpoints in the other cloud. These non-correspondences are likelyto occur near one another, as surface regions visible fromone sensor pose are obscured or out of frame for another.In this work, a hidden Markov random field model is usedto capture this prior within the framework of the iterativeclosest point algorithm. The EM algorithm is used to estimatethe distribution parameters and learn the hidden componentmemberships. Experiments are presented demonstrating thatthis method outperforms several other outlier rejection methodswhen the point clouds have low or moderate overlap.

I. INTRODUCTION

Depth sensing is an increasingly ubiquitous technique forrobotics, 3-D modeling and mapping applications. Struc-tured light sensors, lidar, and stereo matching produce pointclouds, or 3-D points on surfaces. Algorithmic registrationof point clouds is needed to use the resulting data, oftenwithout the aid of other measurements. Iterative closest point(ICP) [1] is commonly used for this purpose, although thistime-tested approach has its drawbacks, including failurewhen registering clouds with low overlap [2], [3]. However,many applications of 3-D registration would be more efficientwith low overlap clouds, such as combining partial scansinto a complete 3-D model. To address this gap, we presenta probabilistic model using a hidden Markov random field(HMRF), shown in Fig. 1, for inferring via the EM algorithmwhich points lie in the overlap. A successful alignment ofclouds with low overlap is shown in Fig. 2.

ICP attempts to recover the optimal transformation to aligntwo point clouds. The algorithm recovers the transformationthat moves the “free” point cloud onto the “fixed” point cloudby iteratively:

1) finding the closest point in the fixed cloud to each pointin the free cloud;

2) discarding some of these matches as outliers; and3) computing and applying the rigid transform that op-

timally aligns the remaining points (the inliers), min-imizing some measure of the nearest point distancesfound in step 1;

until convergence. The present work focuses on step 2 of theabove sequence.

1 Department of Computer Science, University of Col-orado, Boulder, [email protected],[email protected]

2 Department of Aerospace Engineering, University of Colorado, Boul-der, [email protected]

JS was supported by the U.S. Department of Defense (DoD) through theNational Defense Science & Engineering Graduate Fellowship (NDSEG)program while conducting this work. CH was supported by DARPA awardno. N65236–16–1–1000.

z13

y13 z9

y9 z5

y5 z1

y1

z14

y14 z10

y10 z6

y6 z2

y2

z15

y15 z11

y11 z7

y7 z3

y3

z16

y16 z12

y12 z8

y8 z4

y4

Fig. 1. Graphical model for nearest fixed point distance, shown for a 4×4grid of pixels in the free depth map. At pixel i, the Zi ∈ {±1} value isthe unobserved inlier/outlier state, and Yi ∈ R≥0 is the observed distanceto the closest fixed point.

Which error measure is minimized in step 3 has been con-sidered extensively [4], and is explored further in Section II.There are closed-form solutions [5] that minimize the point-to-point distance, including using the singular value decom-position [6] and the dual number quaternion method [7].The present work minimizes the sum of squared point-to-point distances, for simplicity, although it is fundamentallyindependent of choice of error metric.

A crucial step in ICP is the rejection of outliers (step 2),generally resulting from non-overlapping volumes of spacebetween two measurements. The original ICP formulation [1]does not discard any points and simply incurs error foreach outlier. A proliferation of strategies have been proposedfor discarding outliers [3], [8], [9], [10], [11], [12]. Thesemethods threshold the residual distance from free points tothe nearest point in the fixed cloud, but do not consider thespatial relations amongst points in the free cloud, and whatinformation might be gained by considering whether a point’sneighbors are inliers or outliers. In the present work we pro-pose an alternative method where the distribution of residualdistances is modeled as a mixture of two distributions: aGaussian for inliers and a logistic distribution for outliers.Using a technique from image segmentation, a point’s inlieror outlier state is modeled as being influenced by the stateof its neighbors through a hidden Markov random field—an expectation we refer to as the “neighbor prior.” The EMalgorithm, with the use of a mean field approximation, allowsfor inference of the hidden state.

By employing the neighbor prior, the HMRF model caninfer which points are outliers in high- and low-overlapcloud pairs. Although exact inference for an MRF model isintractable in applications of reasonable size, the mean fieldapproximation provides sufficient accuracy at a reasonablecomputational expense. Further justification and developmentof the neighbor prior is discussed at length in Sections II

Fig. 2. A successful registration with only 36% overlap. The blue points arethe fixed cloud, and the green and red points together are the free cloud,which has been registered. The green points are inliers and the red areoutliers, as inferred by the HMRF method.

and III. Sections IV and V present experiments and resultson two challenging benchmark dense 3D datasets, showingthis method outperforms two state-of-the-art methods, par-ticularly on point clouds with low overlap. To the authors’knowledge, this is the first example of an HMRF model beingemployed for 3-D registration, and these promising resultsshow that the model is a viable and attractive alternative toother more complex probabilistic scan matching techniques.In particular, the HMRF’s relative simplicity and ease ofdeployment make it scalable and adaptable to differentenvironments where little or no prior knowledge or trainingdata are available.

II. RELATED WORK

Several attempts have been made to improve registra-tion performance in the low-overlap regime. The hybridgenetic/hill-climbing algorithm of Silva, et al. [13] showssuccess with overlaps down to 55% but at the great compu-tational expense of a stochastic global search. Good low-overlap performance is claimed in [14], which defines a“direction angle” on points and then aligns clouds in rotationusing a histogram of these, and in translation using correla-tions of 2-D projections; although not described as such, aManhattan world assumption is made. Rotational alignmentis recovered using extended Gaussian images in [2], andrefined with ICP, showing success with overlap as lowas 45%. HMRF ICP is simpler than these methods, andgenerally performs well at overlaps that are still lower.

Robust statistics or improved error metrics can make ICPmore robust to outliers. The point-to-plane [15] metric, whichmeasures the distance projected onto the surface normal ofthe fixed point, is frequently used and more robust in the caseof limited overlap [16]. Sparse norms are used within ICPin [17]. These additions are mostly orthogonal to the work

presented, and so could be implemented within HMRF ICP,possibly resulting in further performance improvements.

Improvements to ICP have been achieved through betterrejection of outlier correspondences—a thread of researchthat the current work fits into. Several methods work bythresholding the residual distances, and differ simply in howthe threshold is determined: fixed distance [8], residual per-centile [3], standard deviations from the mean residual [9],or median absolute deviations from the median residual [10].These methods inherently assume a large overlap fraction,so are brittle to overlap variations. Zhang [11] presentsan adaptive threshold tuned with a distance parameter thatdoes not have direct physical significance. Fractional ICPdynamically adapts the fraction of point correspondencesto be used [18], although this method explicitly assumes alarge fraction of overlap, with a penalty for smaller overlapfractions. Finally, there are a few outlier-rejection methodsthat cannot be summarized as simple thresholds. Enqvist,et al. [12] leverage the fact that distances between corre-sponding points within a cloud will be invariant under rigidmotion and find the largest set of consistent correspondencesto identify inliers. In [19], points are classified as inliers orone of three classes of outliers: occluded, unpaired (outsidethe frame), or outliers (sensor noise). The current work ismore closely related to the simple threshold methods, butuses a probabilistic model that better captures the expectedresidual distribution.

In contrast to the current method, EM-ICP [20], [21]employs EM to recover point correspondences. The corre-spondence is the hidden variable and the E-step computesassignment probabilities, and the implied rigid transforma-tion is recovered in the M-step. This approach assumes thatevery point in the free cloud corresponds to some point inthe fixed cloud, although it may be straightforward to includean “unassigned” value to the hidden variable states.

Since ICP requires an informed initialization, significantwork has been devoted to achieving global registration orfinding an approximate alignment from any initial state,which is then refined with ICP. In Super 4PCS [22], [23],an initial alignment is found by matching sets of 4 coplanarpoints, using ratios invariant to rigid transformation; goodperformance with low overlap is claimed, and is used asa basis for comparison in this work. Fast global registrationbetween two or more point clouds is achieved in [24] by find-ing correspondences only once based on point features, thenusing robust estimators to mute the impact of spurious cor-respondences. Branch-and-bound algorithms can guaranteeglobal optimality, and several variations have been applied tothe registration problem to search over transformations [25]or over point or feature correspondences [26]. With theexception of Super 4PCS, low-overlap performance is notaddressed in these works, and many require final refinement,thereby offering an opportunity to apply HMRF ICP.

Finally, Ramos, et al. [27] and Sun, et al. [28] useconditional random fields to discriminatively match 2-D lidarscans in an ICP setting with impressive and reliable results.Although extending the method to 3-D registration is theoret-

ically straightforward, the computational cost would increasesignificantly. These methods use an AdaBoost classifier tocombine several derived features, which requires training theclassifier with ground truth alignments. The classifier maynot generalize well to new observations that are not drawnfrom the same distribution as the training data. In contrast,the generative neighbor prior leveraged in this work is adirect consequence of sensor geometry, and thus does notrequire training data.

Few existing methods specifically address the problem ofaligning clouds with low overlap—more often, high overlapis assumed. By using an appropriate probabilistic model thatcaptures the neighbor prior, this assumption need not bemade, and the resulting method works equally well with highand low overlap.

III. PROBABILISTIC NEIGHBOR PRIOR MODEL

Let Bi ∈ R4 be a point in the free cloud in homogeneouscoordinates. In the first step of ICP, we find the closest pointto Bi in the fixed cloud, which we will denote Cj ∈ R4.The distance between them, yi = ‖Bi−Ci‖, is the observedresidual. Y will denote all Yi random variables, with specificinstantiations represented as y and yi, respectively.

Many sources of depth data generate data with a 2-Dlattice structure—the pixel grid—in which each pixel hasfour nearest neighbors. We exploit these neighbor relationsto model the distribution of the closest point distances Y .Neighbor relations can also be defined on unstructured pointclouds, as discussed below, although a grid topology is moreintuitive.

Given observed distances y, we wish to decide whichpoints are inliers. Our prior beliefs are: (i) inliers willgenerally lie closer to their respective closest point thanoutliers; and (ii) neighbors of inliers are likely inliers, andneighbors of outliers are likely outliers—the neighbor prior.To capture these priors, we model the distribution of Y asa mixture of two distributions, one for inliers and one foroutliers, where a point’s mixture membership is dependenton its nearest neighbors. That is, we capture the second priorusing a hidden Markov random field on the inlier/outlier stateof a point.

A. Probabilistic model

The graphical model for data with a grid topology is shownin Fig. 1. The distribution of Y conditionally depends on thehidden field Z, which is Gibbs distributed with a parameterβ that controls the strength of the neighbor influence. TheGibbs distribution is calculated based on the energy of agiven configuration, PG(Z) = W−1 exp(−H(Z)), whereW is a normalization term called the “partition function”,W =

∑z exp(−H(z)).

We use the energy function H(z) = −β∑i′∼i wi,i′zizi′

where wi,i′ is the edge weight, and β ≥ 0 is a parametercontrolling the interaction strength. If all neighbor relationsare considered equally, such as in a grid topology with noprivileged edges, the weights are all 1. More generally, theycan capture the strength of a particular neighbor interaction.

Function ICP(Initial transform Tinit, clouds B and C,field parameter β, thresholds):

KDtree = BuildKdTree(C)T = TinitB = T ×BI,y = KDtree.NearestNeighbors(B)Initialize z to 1 except for highest 10% of y, which

are initialized to -1.do

doθ = M-step(y, z)z = E-step(y, z, θ, β)

while some z value changed signTstep = localize(B,C, z)T = Tstep × TB = Tstep ×BI,y = KDtree.NearestNeighbors(B)

while Tstep sufficiently largereturn T

Function E-step(y, z, θ, β):Calculate update, z ← E[Z|y, z, θ]return z

Function M-step(y, z):Calculate MLE of θ from complete data log

likelihood, assuming E[Z] = zreturn θ

Algorithm 1: The full HMRF ICP algorithm. Pyramidinghas been omitted for clarity, as well as iteration limits onloops.

Note that calculation of W , and therefore exact calculationof PG(z), requires a sum over all possible configurations, sois exponential in the number of nodes. For a nontrivial fieldthis is intractable. We avoid this issue with the mean fieldapproximation [29], which assumes a fixed configuration z =E[PG(Z|β)] and approximates the Gibbs distribution withindependent components conditioned on z,

PG(Z|β) ≈∏i

Pmf(Zi|β, z). (1)

As the components are independent, it is no longer necessaryto exhaust over z configurations. The components alsodepend only on local information:

Pmf(zi|β, z) =exp (β

∑i′∼i wi,i′zizi′)

exp (β∑i′∼i wi,i′(+1)zi′) + exp (β

∑i′∼i wi,i′(−1)zi′)

.

(2)We assume that inliers are normally distributed and out-

liers are logistically distributed (as discussed further below).The approximate complete data likelihood can be written,

fmf(Y ,Z|β, θ, z) =∏i

Pmf(zi|β, z)

(N(yi|µ+1, σ+1))(1+zi)/2(L(yi|µ−1, s−1))

(1−zi)/2, (3)

with parameters β, µ−1, s−1, µ+1, σ+1. We will use θ torepresent all Gaussian and logistic parameters, that is, θ ={µ−1, s−1, µ+1, σ+1}.

B. Applying the EM algorithm

The maximum likelihood model parameters and hiddenstate are estimated using the EM algorithm, similarly tothe image segmentation model presented in [30], [31], [32].In the E-step, the current estimate of the normal and lo-gistic parameters, along with the current mean field andthe observed residuals, are used to find the expected valueof the hidden field. Then, in the M-step, the estimates forthe normal and logistic parameters are updated using themaximum likelihood estimates from the observed data andthe expected value of the hidden field. The full HMRF ICPalgorithm is shown in Algorithm 1.

The complete data log likelihood can be bounded be-low by taking the expectation with respect to the hiddenZ values, due to Jensen’s inequality. Because of linearityof expectation, and since zi appears linearly in the com-plete data log likelihood, this amounts to replacing zi withE[zi|yi, z, θ, β] = 2P (zi = 1|yi, z, θ, β) − 1 throughoutEquation 3. The EM algorithm iteratively calculates theseprobabilities in the E-step, and then chooses parameters tomaximize the resulting expected likelihood in the M-step.

1) The E-step: The mean field approximation allows fora closed-form E-step, where we calculate,

P (zi = +1|yi, z, θ, β) ∝

exp

(β∑i′∼i

wi,i′ zi′

)1√

2πσ2+1

exp

((yi − µ+1)

2

2σ2+1

)(4)

P (zi = −1|yi, z, θ, β) ∝

exp

(−β∑i′∼i

wi,i′ zi′

)exp

(−yi−µ−1

s

)s(1 + exp

(−yi−µ−1

s

))2(5)

The necessary normalization factor is the same for the twocalculations, so is simply their sum.

2) The M-step: Now, the expected hidden values can beused to update the mean field and problem parameters inthe M-step using maximum likelihood estimates. Updatingthe mean field is simply adopting the expected zi valuescalculated in the E-step. The MLEs of the normal and logisticdistribution parameters are calculated in the normal way, butthe data are weighted by the probabilities that they camefrom the given distribution. For the inliers,

n+1 =∑i

(1 + zi

2

)(6)

µ+1 =1

n+1

∑i

1 + zi2

yi (7)

σ+1 =

√1

n+1

∑i

1 + zi2

y2i − µ2+1. (8)

For the outliers,

n−1 =∑i

(1− zi

2

)(9)

µ−1 =1

n−1

∑i

1− zi2

yi (10)

s−1 =

√3

π

√1

n−1

∑i

1− zi2

y2i − µ2−1. (11)

C. Fitting the outlier distribution

It was observed that some outlier distributions had multiplemodes or heavy tails, as distant regions were observed. Tobetter model this distribution, outlier residuals for 100 pairsof frames from each of the three datasets described in Sec-tion IV were analyzed. A Kolmogorov-Smirnov statistic wasused to compare the empirical CDF with the normal distribu-tion and six heavy-tail distributions: the gamma, logistic, log-logistic, log-normal, Rayleigh, and Student’s t distributions.No distribution outperforms all others in all cases, but thelogistic distribution shows good average performance and issimple to estimate within the EM framework. Nevertheless,the worst cloud pairs still fail to align—in particular, thosewith multimodal outlier residuals. A mixture model wouldlikely be more effective in these cases, but would also besignificantly more computationally expensive.

D. Accelerating EM convergence

The EM algorithm is accelerated via a pyramid method:we downsample the Z and Y fields, allow EM to convergefor this smaller model, then upsample the converged Z toinitialize the larger model. For grid topologies, each pyramidlevel has a quarter the number of pixel sites as the levelbelow it (half in each dimension). For unstructured clouds,discussed below, each pyramid level has half the numberof points as the level below it. Although pyramiding isa common technique in computer vision, such as in thepyramidal Lucas Kanade optical flow algorithm [33], thisis the first application to initializing a Markov random field.

E. Unstructured clouds

The pixel grid provides an intuitive structure on whichto define the Markov random field, but the method canbe extended to any point cloud by defining neighbor rela-tions independent of underlying topology. We measure the“neighborliness” of two points by their distance, and applya Gaussian kernel, exp

(−‖Bi −Bj‖/2σ2

), to generate an

appropriate edge weight. The σ parameter is set to half theaverage nearest-neighbor distance.

Calculating weights for every pair of points would requireN2 work, but only the nearest neighbors to a given pointwill have non-negligible values, by design. Thus, neighborweights are only calculated for the k-nearest neighbors ofeach point, with 6 ≤ k ≤ 10 giving satisfactory experimentalresults. Applying the pyramid acceleration to this neighbor-hood structure, however, is not straightforward. The obviousdownsampling and upsampling methods available with a gridcannot be used for an arbitrary graph. Instead, we sample half

Fig. 3. Successful registration of two lidar spins, taken from the KITTIodometry dataset [34]. On the left, the initial registration from the RTK GPSposes. On the right, the registration refined by HMRF ICP. No informationabout the data topology was used.

the points at random to build a new layer on the pyramid.The new layer needs neighbor weights, as well, which wecalculate by squaring the neighbor matrix for the lower level,with diagonal elements set to 0. This captures the two-hopweight between points in the lower neighbor matrix. Finally,once EM converges in a pyramid layer, the resulting Z fieldis upsampled using the lower-level neighborhood matrix toinitialize the points that were not sampled in the higher-levelneighborhood.

Figure 3 shows the successful alignment of two lidarspins, taken from the KITTI odometry dataset [34]. Theground truth poses in this dataset are from RTK GPS andIMU measurements, which have high accuracy over longsequences, but are not sufficiently accurate for the purposeof our experiments, described below.

IV. EXPERIMENTS

The HMRF method is compared with Go-ICP [35] andSuper 4PCS [23], as well as ICP with five other outlierrejection methods: (1) no outlier rejection, (2) keeping 90%of the points with smallest residuals, (3) keeping pointswhose residuals were less than 2.5 standard deviations abovethe mean, (4) keeping points whose residuals were lessthan 5.2 median absolute deviations above the median (theso-called “X84” criterion), and (5) the dynamic thresholdfrom Zhang [11]. All of the ICP implementations (includingHMRF ICP) use the point-to-point sum of squares distancemetric.

The method is applied to several datasets: the sharksequence is a tabletop scene, taken by an Asus Xtion Pro;the remaining experiments were run on ten publicly availableRGB-D SLAM datasets [36] from the TUM Computer VisionGroup (specifically, those datasets in the “Handheld SLAM”category). Ground truth poses are known for both sequences.The sensor in the desk and room sequences is trackedusing an external motion tracking system. The poses in theshark sequence are those estimated in the tracking stage of

InfiniTAM[37] during the reconstruction of the scene. Thecamera calibrations are known and used to appropriatelyunproject the depth maps and generate point clouds. In theshark sequence, 100 pairs of frames were selected at random,but stratified to include a variety of overlap ratios. Similarly,90 frames were selected from the TUM datasets: one eachfrom each dataset and overlap decile, except 0–10%. Theoverlap is estimated using the ground truth poses and therelative distance to each free point’s nearest neighbor in itsown cloud versus in the fixed cloud. The threshold for thiscomparison was chosen to err on the side of overestimatingthe overlap ratio.

The frame pairs were aligned using ground truth poses,and translated so that the coordinate origin was at thecentroid of the fixed frame. Then, the free cloud alignmentwas perturbed by rotating 18 degrees about a random axisthrough the origin. The same initialization was used for allmethods for a given frame pair (including the global meth-ods, Go-ICP and Super 4PCS). HMRF ICP was configuredwith 4 pyramid levels. Before the first ICP transformationwas calculated, EM was limited to 150 iterations at eachpyramid level, although this limit was rarely met; afterapplying the first transformation, EM was limited to 5 stepsat each pyramid level. All experiments were executed inMATLAB on a workstation with 8 Intel R© Xeon R© E5620CPUs at 2.40GHz, and with 48 GB RAM. The Go-ICPand Super 4PCS implementations are available from theauthors [35], [23]. Both are written in C++ and are single-threaded. Parameters for these methods were chosen basedon the provided demos, and their execution time was limitedto 120 seconds. The Super 4PCS algorithm requires a priorestimate of the cloud overlap, which was provided from theoverlap estimate described above. All code can be found athttps://github.com/JStech/ICP.

V. RESULTS

The results were noisy and no method was consistentlybest. Complete results are available in the supplementary ma-terials. HMRF ICP performed best among the ICP variants,so we compare it to Go-ICP and Super 4PCS here.

Fig. 4 shows summary plots of the rotation error, trans-lation error, and elapsed time for each method. At highoverlaps, Go-ICP often recovered the transformation mostaccurately. However, its performance deteriorated quickly asthe overlap decreased: only 7 Go-ICP tests with overlapsbelow 70% ran to completion within the time limit andreturned transformations. Super 4PCS performed well atmoderate overlaps, and often recovered the most accuratetransformation in rotation and translation for overlaps around50%. At very low overlaps (below 20%), no method per-formed reliably. Super 4PCS was very fast at moderate-to-high overlaps, even running single-threaded. However, ithad a higher variance than HMRF ICP, in both accuracyand elapsed time. Note that the elapsed time of HMRFICP cannot be compared directly with that of Go-ICP andSuper 4PCS, as HMRF ICP was implemented entirely inMATLAB (and was thus able to take advantage of built-in

0 0.2 0.4 0.6 0.8 1

0

0.5

1

1.5Sh

ark

scen

eE

rror

(met

ers)

Translation error

0 0.2 0.4 0.6 0.8 1

0

0.5

1

1.5

Err

or(r

adia

ns)

Rotation error

0 0.2 0.4 0.6 0.8 1

0

20

40

Ela

psed

time

(sec

)

Elapsed time

0 0.2 0.4 0.6 0.8 1

0

0.5

1

1.5

Overlap fraction

TU

Msc

enes

Err

or(m

eter

s)

0 0.2 0.4 0.6 0.8 1

0

0.5

1

1.5

Overlap fraction

Err

or(r

adia

ns)

0 0.2 0.4 0.6 0.8 1

0

20

40

Overlap fraction

Ela

psed

time

(sec

)

Go-ICP Maximum

Super 4PCSQ3MedianQ1

HMRF Minimum

Fig. 4. Error and elapsed time plots for shark scene (top) and all TUM scenes (bottom). For each method, the data are aggregated by overlap decile, andthe minimum, first quartile, median, third quartile, and maximum are shown. A method is only plotted if it ran successfully for at least half of the casesin the decile: hence the truncated Go-ICP and Super 4PCS plots.

multithreading optimizations), whereas Go-ICP and Super4PCS were single-threaded in C++ (although then had theadvantage of being compiled). However, the performancetrends can still be reliably understood: all methods performmore slowly as overlap fraction drops, although HMRF ICP’sperformance does not deteriorate as quickly as the othermethods.

Fig. 5. Example Y , z(1), and z(375) (at convergence). This frame has 36%overlap with the fixed frame. The left image shows the observed distanceto the nearest fixed point in the initial configuration; yellow is farther, blueis nearer, and the white areas are unobserved. In the right two images, bluepixels are outliers and red pixels are inliers, green pixels are unobserved.

VI. DISCUSSION AND CONCLUSIONS

The good performance of HMRF ICP at low overlaps canbe understood by considering the example frame with 36%overlap shown in Fig. 5. The three images all represent valuesbefore the first transformation is applied to the free cloud:the first is the residual distance to the nearest fixed point, thesecond is the initial setting of the z field, and the third is theconverged z field before the first iteration of ICP. The HMRFmodel flexibly adapts to the small proportion of inlier points,in particular it eliminates outliers across the top of the image.

The pixels that are unobserved (that is, the sensor returns nodepth measurement) occupy 31% of the image. In the initialz field, 90% of observed pixels are considered inliers, andthe remaining 10% are considered outliers. After 375 initialEM iterations, the z field has converged, and now has only54% inliers and 46% outliers. By eliminating these outliersbefore the first transformation is calculated, divergence fromthe nearby optimum is avoided. An example alignment isshown at https://youtu.be/w4eVOgd7Zes.

Despite the use of the logistic distribution to modeloutliers, there were still cases where alignment failed becauseof multimodal outlier residuals. To address this failure mode,a still better representation of the residual distributions foroutliers is necessary. For instance, modeling the outliers asa Gaussian mixture of several components could allow thefield to fit the distant outliers with one Gaussian, and thenearer outliers as another. Using robust norms [17], [38] andthe point-to-plane metric [15] would also make HMRF ICPmore robust to imperfect estimations of which points lie inthe overlap of the two clouds.

The HMRF model for the overlap of point clouds beingaligned via ICP has demonstrated advantages at low overlapwithout sacrificing performance at high overlap. The HMRFmodel captures the neighbor prior and describes observedinlier/outlier behavior well, and so can adapt to the particularclouds being aligned. This would prove useful in the con-struction of models from 3-D scanner data, as fewer scanswould be required, or aligning depth readings at a low framerate, allowing greater differences between frames.

REFERENCES

[1] P. J. Besl, N. D. McKay, et al., “A method for registration of 3-D shapes,” IEEE Transactions on Pattern Analysis and MachineIntelligence, vol. 14, no. 2, pp. 239–256, 1992.

[2] A. Makadia, A. Patterson, and K. Daniilidis, “Fully automatic registra-tion of 3D point clouds,” in Computer Vision and Pattern Recognition,2006 IEEE Computer Society Conference on, vol. 1, pp. 1297–1304,IEEE, 2006.

[3] D. Chetverikov, D. Svirko, D. Stepanov, and P. Krsek, “The trimmediterative closest point algorithm,” in Pattern Recognition, 2002. Pro-ceedings. 16th International Conference on, vol. 3, pp. 545–548, IEEE,2002.

[4] S. Rusinkiewicz and M. Levoy, “Efficient variants of the ICP algo-rithm,” in 3-D Digital Imaging and Modeling, 2001. Proceedings.Third International Conference on, pp. 145–152, IEEE, 2001.

[5] D. W. Eggert, A. Lorusso, and R. B. Fisher, “Estimating 3-D rigid bodytransformations: a comparison of four major algorithms,” MachineVision and Applications, vol. 9, no. 5-6, pp. 272–290, 1997.

[6] K. S. Arun, T. S. Huang, and S. D. Blostein, “Least-squares fittingof two 3-D point sets,” IEEE Transactions on Pattern Analysis andMachine Intelligence, no. 5, pp. 698–700, 1987.

[7] M. W. Walker, L. Shao, and R. A. Volz, “Estimating 3-D locationparameters using dual number quaternions,” CVGIP: Image Under-standing, vol. 54, no. 3, pp. 358–367, 1991.

[8] G. Turk and M. Levoy, “Zippered polygon meshes from range images,”in Proceedings of the 21st Annual Conference on Computer Graphicsand Interactive Techniques, pp. 311–318, ACM, 1994.

[9] T. Masuda, K. Sakaue, and N. Yokoya, “Registration and integrationof multiple range images for 3-D model construction,” in PatternRecognition, 1996., Proceedings of the 13th International Conferenceon, vol. 1, pp. 879–883, IEEE, 1996.

[10] A. Fusiello, U. Castellani, L. Ronchetti, and V. Murino, “Modelacquisition by registration of multiple acoustic range views,” ComputerVisionECCV 2002, pp. 558–559, 2002.

[11] Z. Zhang, “Iterative point matching for registration of free-form curvesand surfaces,” International Journal of Computer Vision, vol. 13, no. 2,pp. 119–152, 1994.

[12] O. Enqvist, K. Josephson, and F. Kahl, “Optimal correspondencesfrom pairwise constraints,” in Computer Vision, 2009 IEEE 12thInternational Conference on, pp. 1295–1302, IEEE, 2009.

[13] L. Silva, O. R. P. Bellon, and K. L. Boyer, “Precision range imageregistration using a robust surface interpenetration measure and en-hanced genetic algorithms,” IEEE Transactions on Pattern Analysisand Machine Intelligence, vol. 27, no. 5, pp. 762–776, 2005.

[14] Y. Ma, Y. Guo, J. Zhao, M. Lu, J. Zhang, and J. Wan, “Fast andaccurate registration of structured point clouds with small overlaps,”in The IEEE Conference on Computer Vision and Pattern Recognition(CVPR) Workshops, June 2016.

[15] Y. Chen and G. Medioni, “Object modeling by registration of multiplerange images,” in Robotics and Automation, 1991. Proceedings., 1991IEEE International Conference on, pp. 2724–2729, IEEE, 1991.

[16] J. Salvi, C. Matabosch, D. Fofi, and J. Forest, “A review of recentrange image registration methods with accuracy evaluation,” Imageand Vision Computing, vol. 25, no. 5, pp. 578–596, 2007.

[17] S. Bouaziz, A. Tagliasacchi, and M. Pauly, “Sparse iterative closestpoint,” in Computer Graphics Forum, vol. 32, pp. 113–123, WileyOnline Library, 2013.

[18] J. M. Phillips, R. Liu, and C. Tomasi, “Outlier robust ICP forminimizing fractional RMSD,” in 3-D Digital Imaging and Modeling,2007. 3DIM’07. Sixth International Conference on, pp. 427–434,IEEE, 2007.

[19] T. Masuda and N. Yokoya, “A robust method for registration andsegmentation of multiple range images,” Computer Vision and ImageUnderstanding, vol. 61, no. 3, pp. 295–307, 1995.

[20] S. Granger and X. Pennec, “Multi-scale EM-ICP: A fast and robust ap-proach for surface registration,” in European Conference on ComputerVision, pp. 418–432, Springer, 2002.

[21] J. Hermans, D. Smeets, D. Vandermeulen, and P. Suetens, “Robustpoint set registration using EM-ICP with information-theoreticallyoptimal outlier handling,” in Computer Vision and Pattern Recognition(CVPR), 2011 IEEE Conference on, pp. 2465–2472, IEEE, 2011.

[22] D. Aiger, N. J. Mitra, and D. Cohen-Or, “4-points congruent sets forrobust pairwise surface registration,” ACM Transactions on Graphics(TOG), vol. 27, no. 3, p. 85, 2008.

[23] N. Mellado, D. Aiger, and N. J. Mitra, “Super 4PCS fast globalpointcloud registration via smart indexing,” in Computer GraphicsForum, vol. 33, pp. 205–215, Wiley Online Library, 2014.

[24] Q.-Y. Zhou, J. Park, and V. Koltun, “Fast global registration,” inEuropean Conference on Computer Vision, pp. 766–782, Springer,2016.

[25] C. Papazov and D. Burschka, “Stochastic global optimization forrobust point set registration,” Computer Vision and Image Understand-ing, vol. 115, no. 12, pp. 1598–1609, 2011.

[26] N. Gelfand, N. J. Mitra, L. J. Guibas, and H. Pottmann, “Robust globalregistration,” in Symposium on Geometry Processing, vol. 2, p. 5, 2005.

[27] F. T. Ramos, D. Fox, and H. F. Durrant-Whyte, “CRF-matching: Con-ditional random fields for feature-based scan matching,” in Robotics:Science and Systems, 2007.

[28] Z. Sun, J. Van de Ven, F. Ramos, X. Mao, and H. Durrant-Whyte,“Inferring laser-scan matching uncertainty with conditional randomfields,” Robotics and Autonomous Systems, vol. 60, no. 1, pp. 83–94,2012.

[29] D. Chandler, Introduction to modern statistical mechanics. OxfordUniversity Press, 1987.

[30] Z. Kato and T.-C. Pong, “A Markov random field image segmentationmodel for color textured images,” Image and Vision Computing,vol. 24, no. 10, pp. 1103–1114, 2006.

[31] J. Besag, “On the statistical analysis of dirty pictures,” Journal ofthe Royal Statistical Society. Series B (Methodological), pp. 259–302,1986.

[32] G. Celeux, F. Forbes, and N. Peyrard, “EM procedures using meanfield-like approximations for Markov model-based image segmenta-tion,” Pattern Recognition, vol. 36, no. 1, pp. 131–144, 2003.

[33] J.-Y. Bouguet, “Pyramidal implementation of the affine Lucas Kanadefeature tracker,” Intel Corporation, vol. 5, no. 1-10, p. 4, 2001.

[34] A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomousdriving? the kitti vision benchmark suite,” in Conference on ComputerVision and Pattern Recognition (CVPR), 2012.

[35] J. Yang, H. Li, D. Campbell, and Y. Jia, “Go-ICP: a globally optimalsolution to 3D ICP point-set registration,” IEEE Transactions onPattern Analysis and Machine Intelligence, vol. 38, no. 11, pp. 2241–2254, 2016.

[36] J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers, “Abenchmark for the evaluation of RGB-D SLAM systems,” in Proc.of the International Conference on Intelligent Robot Systems (IROS),Oct. 2012.

[37] O. Kahler, V. A. Prisacariu, C. Y. Ren, X. Sun, P. Torr, and D. Murray,“Very high frame rate volumetric integration of depth images onmobile devices,” IEEE Transactions on Visualization and ComputerGraphics, vol. 21, no. 11, pp. 1241–1250, 2015.

[38] P. Mavridis, A. Andreadis, and G. Papaioannou, “Efficient sparse ICP,”Computer Aided Geometric Design, vol. 35, pp. 16–26, 2015.

Documents

Robust low-overlap 3-D point cloud registration for ... · Robust low-overlap 3-D point cloud registration for outlier rejection John Stechschulte, 1Nisar Ahmed,2 and Christoffer