

Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography, Vol. 37, No. 5, 331-341, 2019
https://doi.org/10.7848/ksgpc.2019.37.5.331

Updating Smartphone’s Exterior Orientation Parameters by Image-based Localization Method Using Geo-tagged Image Datasets and 3D Point Cloud as References

Wang, Ying Hsuan 1) · Hong, Seunghwan 2) · Bae, Junsu 3) · Choi, Yoonjo 4) · Sohn, Hong-Gyoo 5)

Abstract

With the popularity of sensor-rich environments, smartphones have become one of the major platforms for obtaining and sharing information. Since GNSS (Global Navigation Satellite System) is difficult to utilize in areas with many buildings, the localization of a smartphone in such cases is a challenging task. To resolve this problem, a four-step image-based localization method and procedure is proposed. To improve the localization accuracy of smartphone datasets, MMS (Mobile Mapping System) data and Google Street View were utilized as references. In our approach, first, candidate matching images are searched for with the smartphone query image using its GNSS observation. Second, SURF (Speeded-Up Robust Features) image matching between the smartphone image and the reference dataset is performed, and wrong matching points are eliminated. Third, the geometric transformation is estimated from the matching points with a 2D affine transformation. Finally, the smartphone location and attitude are estimated by the PnP (Perspective-n-Point) algorithm. The location error of the smartphone GNSS observation is improved from the original 10.204 m to a mean error of 3.575 m. The attitude estimation error is lower than 25 degrees for 92.4% of the adjusted images, with an average of 5.9173 degrees.

Keywords : Smartphone, Image-based Localization, MMS (Mobile Mapping System), Google Street View


ISSN 1598-4850 (Print)
ISSN 2288-260X (Online)
Original article

1. Introduction

Recently, smartphones have become an important platform for sharing information. Current smartphones are equipped with high-resolution cameras, an IMU (Inertial Measurement Unit), GNSS (Global Navigation Satellite System), and other sensors added with new releases. With the wide availability of sensor-rich environments, the smartphone changes not only the way people communicate but also the way people interact with society. Mobile crowdsensing technology, which uses ubiquitous mobile devices to collect information about human activity and the surrounding environment from the sensors integrated on a mobile device, has been growing.

Information can be collected in the form of images, texts, and videos by smartphones. Moreover, a smartphone with GNSS and IMU sensors can offer location information and the user’s perspective of the text, image, and video.

Received 2019. 09. 24, Revised 2019. 10. 07, Accepted 2019. 10. 11
1) Dept. of Civil and Environmental Engineering, Yonsei University (E-mail: [email protected])
2) Member, Stryx Inc. (E-mail: [email protected])
3) Member, Dept. of Civil and Environmental Engineering, Yonsei University (E-mail: [email protected])
4) Member, Dept. of Civil and Environmental Engineering, Yonsei University (E-mail: [email protected])
5) Corresponding Author, Member, Dept. of Civil and Environmental Engineering, Yonsei University (E-mail: [email protected])

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.


Even though most smartphones are equipped with a GNSS receiver, the location accuracy of a smartphone is limited, especially when the GNSS receiver is located in or between urban areas. The overall accuracy of a smartphone’s GNSS observation is about 12.5 m in the outdoor environment (Zandbergen and Barbeau, 2011). The inaccurate location and perspective information sometimes make the information collected by the smartphone less useful.

Considering that smartphones are equipped with high-resolution cameras, image-based localization becomes an alternative solution for correcting the location and attitude of the smartphone. Image-based localization has been applied in various fields of study, including robot localization (Agarwal et al., 2015), visual SLAM (Simultaneous Localization And Mapping) (Fuentes-Pacheco et al., 2015), augmented reality (Schönberger et al., 2018), and user localization (Liu et al., 2012; Wu et al., 2018). Image-based localization is usually performed in two steps: (1) place recognition and (2) calculation of the camera location and attitude by the PnP (Perspective-n-Point) algorithm. Place recognition, the task of determining the location depicted in a query image by retrieving it from a given geo-tagged image database, is critical for the success of image-based localization. In previous research on place recognition, geo-tagged images have played an important role. Datasets such as the INRIA Holidays dataset (Jégou et al., 2008), San Francisco street view image datasets (Sattler et al., 2015; Jiang et al., 2012; Liu et al., 2012; Sattler et al., 2016; Kim et al., 2017), and Google Street View image datasets (Zamir and Shah, 2010; Agarwal et al., 2015; Verstockt et al., 2015; Sadeghi et al., 2016) have been used. However, when it comes to image-based localization with street view images, the low spatial density and the geopositional accuracy of street view image datasets limit the accuracy of localization results, which is about 12 m in the case of Google Street View images (Salarian et al., 2015).

To improve the localization accuracy of the smartphone, the utilization of mobile mapping technology is proposed in this study. MMS (Mobile Mapping System) can offer not only street view images of both high resolution and high spatial density but also 3D point cloud information from LiDAR (Light Detection and Ranging). The reference data utilized in this study, including street view images and 3D point cloud information, were collected by a Leica Pegasus: Two. Since the IOPs (Interior Orientation Parameters) of the smartphone camera may differ from the specification offered by the manufacturer, calibration of the smartphone camera is performed before capturing images. The proposed image-based localization is performed in four parts: automatic searching for the candidate image query, image matching between the smartphone image and the reference image, geometric transformation estimation, and the improvement of smartphone position and attitude estimation.

To show the potential of applying the proposed image-based localization method to a publicly available street view image dataset, Google Street View images are also used as a reference dataset. The structure of the paper is as follows: Section 2 describes the proposed smartphone image-based localization method, and Section 3 describes the experiments and results.

2. Image-based Localization

The proposed image-based localization method uses two different kinds of geo-tagged reference images: street view images collected by MMS and Google Street View images. The query images are taken with the smartphone camera. The image-based localization of the smartphone is performed in four steps in this study. First, matching candidate images are queried from the reference image dataset. Second, image matching is done by the SURF (Speeded-Up Robust Features) algorithm. Third, a 2D affine transformation is applied for the geometric transformation between the smartphone image and the reference image. Finally, the PnP algorithm is used for refining the smartphone position and attitude estimation. The procedure of the proposed image-based localization method is depicted in Fig. 1.

Fig. 1. Workflow of the proposed image-based localization method


2.1 Searching for matching candidate image

In this study, the smartphone image is used as a query image. In order to query the matching candidate image, the position and attitude information recorded by the GNSS and IMU sensors while taking pictures is used. With this information, the best matching candidates for the query image are selected automatically from the reference dataset.
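As an illustration of this retrieval step, the sketch below (our own, with a hypothetical `geotag` schema, not code from the paper) selects the reference images whose geo-tags lie nearest the smartphone's GNSS fix, using the haversine ground distance.

```python
import math

def geo_distance_m(lat1, lon1, lat2, lon2):
    """Approximate ground distance in metres via the haversine formula."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def select_candidates(query_fix, reference_images, k=3):
    """Return the k reference images geo-tagged closest to the query GNSS fix.

    `query_fix` is a (lat, lon) tuple; each reference image is assumed to
    carry a `geotag` attribute with the same structure (hypothetical schema).
    """
    return sorted(
        reference_images,
        key=lambda img: geo_distance_m(*query_fix, *img.geotag),
    )[:k]
```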

2.2 SURF image matching

Image matching is the most essential step in performing image-based localization. In this study, the SURF algorithm is used to perform image matching. The SURF algorithm consists of three parts: feature point detection, feature point description, and feature point matching (Bay et al., 2008).

The steps of SURF are as follows. First, the integral image is computed to obtain the 2nd-order derivative approximation used in feature point detection. Then, feature points are detected by calculating the determinant of the Hessian matrix from the previously obtained integral image. To guarantee scale and rotation invariance, the non-maximum suppression method is applied and Haar wavelet responses are calculated. To create the feature descriptor, a 20×20 region around each feature point is taken and separated into 4×4 sub-regions. The Haar wavelet response of each sub-region is calculated to create a 4D descriptor. Since there are 16 sub-regions, a descriptor vector of length 64 is created for each feature point. Finally, the Euclidean distance is applied to match the feature points between two images.
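A minimal matching sketch follows, assuming an OpenCV build with the contrib `xfeatures2d` module enabled (SURF is patent-encumbered and excluded from default builds); the ratio test at the end is our addition for discarding ambiguous matches, not a step named in the paper.

```python
import cv2

def surf_match(query_img, ref_img, hessian_threshold=400):
    """SURF detection, description, and Euclidean-distance matching.

    Requires opencv-contrib with the xfeatures2d module; SURF is not
    available in default OpenCV builds.
    """
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=hessian_threshold)
    kp1, desc1 = surf.detectAndCompute(query_img, None)
    kp2, desc2 = surf.detectAndCompute(ref_img, None)

    # Brute-force matching of the 64-D descriptors with L2 (Euclidean) distance;
    # the ratio test (our addition) filters ambiguous matches before RANSAC.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(desc1, desc2, k=2)
    good = [m[0] for m in matches
            if len(m) == 2 and m[0].distance < 0.7 * m[1].distance]
    return kp1, kp2, good
```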

2.3 Geometric transform estimation

To estimate the geometric transformation more correctly, the RANSAC (RANdom SAmple Consensus) algorithm is applied to eliminate wrong matching points from the previous SURF image matching result. The threshold used for RANSAC is the difference in distance between matching points in each image. The geometric transform between the query image and the reference image can then be estimated from the inlier matching point pairs. The geometric transform is modeled as a 2D affine transformation comprising four elements: rotation, scale, shear, and translation. The 2D affine transformation can be expressed as Eq. (1).

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} s_x & sh_x \\ sh_y & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix} \quad (1)$$

where $(x, y)$ and $(x', y')$ are the image coordinates before and after the transformation, $\theta$ the rotation angle, $s_x$ and $s_y$ the scale factors along the $x$ and $y$ axes, $sh_x$ and $sh_y$ the shear factors along the $x$ and $y$ axes, and $t_x$ and $t_y$ the displacements along the $x$ and $y$ axes.
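In practice, Eq. (1) can be fitted directly from the matched points; as one possible realization, OpenCV's `estimateAffine2D` estimates a full 2D affine transform under RANSAC with a pixel distance threshold (a sketch under the assumption that the matches are given as NumPy arrays; variable names are ours).

```python
import cv2
import numpy as np

def estimate_affine(pts_ref, pts_query, ransac_threshold=10.0):
    """Fit the 2D affine transform of Eq. (1) with RANSAC outlier rejection.

    pts_ref, pts_query: (N, 2) arrays of matched image coordinates; the
    returned transform maps reference-image points onto the query image.
    The threshold is the maximum point-to-point distance in pixels for a
    match to count as an inlier (Section 3.4.1 reports 10 px working best).
    """
    pts_ref = np.asarray(pts_ref, dtype=np.float32)
    pts_query = np.asarray(pts_query, dtype=np.float32)
    affine, inlier_mask = cv2.estimateAffine2D(
        pts_ref, pts_query,
        method=cv2.RANSAC,
        ransacReprojThreshold=ransac_threshold,
    )
    return affine, inlier_mask  # affine is the 2x3 matrix [A | t]
```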

2.4 Estimation of smartphone position and attitude

In this study, the PnP algorithm, a method of estimating the position and attitude of a calibrated camera, is applied. To apply the PnP algorithm, a set of three-dimensional (3D) points in world coordinates and the image coordinates of their corresponding points should be known. Also, the smartphone camera is calibrated before performing the PnP algorithm. The model of the PnP algorithm is shown in Fig. 2, and the mathematical model is expressed in Eqs. (2) and (3).

Fig. 2. Model of PnP algorithm

$$s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \left[ R \mid t \right] \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \quad (2)$$

$$K = \begin{bmatrix} f_x & \gamma & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \quad (3)$$

where $s$ is a scale factor for the image point, $(u, v)$ a set of 2D image coordinate points, $(X, Y, Z)$ the corresponding set of 3D world coordinate points, $K$ the matrix of the camera's interior orientation parameters including the camera's focal length ($f_x$, $f_y$), principal point ($c_x$, $c_y$), and skew parameter ($\gamma$), $R$ the desired 3D rotation matrix, and $t$ the desired 3D translation vector of the camera, respectively.

In the PnP algorithm, when there are three pairs of image point coordinates and 3D world point coordinates, four possible solutions are calculated (Fischler and Bolles, 1981). A fourth pair of image point and 3D world point coordinates is used to find the best of the four possible solutions, the one with the smallest reprojection error. However, an incorrect pose may be estimated if there are outliers in the set of points. To make the position and attitude estimation of the camera more robust, RANSAC is used to eliminate the outliers from the set of points.

In this study, the 2D image coordinates are obtained from the inlier matching points in the smartphone image. To obtain the corresponding 3D points in world coordinates, the MMS 3D point cloud collected by LiDAR is projected onto the smartphone image using the estimated 2D affine transformation. Since the MMS point cloud is already projected onto the MMS image, the point cloud coordinates of the inlier matching points are also obtained automatically. With the corresponding 3D world coordinates and two-dimensional (2D) image coordinates, the smartphone position and attitude can be estimated.
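A hedged sketch of this step using OpenCV's `solvePnPRansac` follows; the variable names and the assumption that lens distortion has been removed by the prior calibration are ours, not from the paper.

```python
import cv2
import numpy as np

def estimate_pose(points_3d, points_2d, camera_matrix, dist_coeffs=None):
    """Estimate camera attitude and position from 2D-3D correspondences.

    points_3d: (N, 3) world coordinates sampled from the MMS point cloud.
    points_2d: (N, 2) matched pixel coordinates in the smartphone image.
    camera_matrix: the 3x3 K of Eq. (3), from the calibration in Section 3.1.
    RANSAC discards outlier correspondences, as described above.
    """
    if dist_coeffs is None:
        dist_coeffs = np.zeros(5)  # assume distortion removed by calibration
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(points_3d, dtype=np.float32),
        np.asarray(points_2d, dtype=np.float32),
        camera_matrix, dist_coeffs,
    )
    if not ok:
        raise RuntimeError("PnP failed: too few consistent correspondences")
    rotation, _ = cv2.Rodrigues(rvec)     # 3x3 rotation matrix R of Eq. (2)
    camera_position = -rotation.T @ tvec  # camera centre in world coordinates
    return rotation, camera_position, inliers
```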

2.5 Evaluation

The result of the proposed image-based localization method is evaluated using manually selected GCPs (Ground Control Points); the reference locations are obtained by a manual PnP method in which 3D point cloud data acquired from the MMS are used as GCPs. The error of the raw GNSS observation is defined as the absolute difference between the localization result of the manual PnP method and the raw GNSS observation (Eq. (4)). The error of the proposed image-based localization method is defined as the absolute difference between the localization result of the manual PnP method and the result of the proposed method (Eq. (5)).

$$\text{Error of Raw GNSS Observation} = \left| \text{Manual PnP Method Coordinate} - \text{Raw GNSS Observation Coordinate} \right| \quad (4)$$

$$\text{Error of Proposed Method} = \left| \text{Manual PnP Method Coordinate} - \text{Proposed Method Coordinate} \right| \quad (5)$$

The performance of the proposed image-based localization method is separated into three different levels: better, similar, and bad (Fig. 3). When the error of the raw GNSS observation is bigger than the error of the proposed method, we classify the result as better performance. When the error of the proposed method is slightly greater than the error of the raw GNSS observation, but exceeds it by no more than 2.5 m, we classify the result as similar performance. The remaining results of the proposed method, which contain large errors, are classified as bad performance.

Fig. 3. Three different performance levels of our proposed method
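The classification rule above can be written down directly; the following small sketch uses the thresholds from the text, with the function name being ours.

```python
def classify_performance(err_gnss_m: float, err_method_m: float) -> str:
    """Three-level evaluation of Section 2.5, based on Eqs. (4) and (5).

    err_gnss_m: error of the raw GNSS observation, Eq. (4).
    err_method_m: error of the proposed method, Eq. (5).
    """
    if err_method_m < err_gnss_m:
        return "better"   # proposed method beats raw GNSS
    if err_method_m - err_gnss_m <= 2.5:
        return "similar"  # at most 2.5 m worse than raw GNSS
    return "bad"          # substantially worse than raw GNSS
```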

3. Experiments and Results

3.1 Smartphone query image acquisition and camera calibration

The query images of the smartphone are collected with a Samsung Galaxy S9 Plus. The detailed specifications of the Samsung Galaxy S9 Plus are summarized in Table 1. To compare the position and attitude estimates before and after applying the proposed image-based localization method, an Android-based application was created for capturing images and sensor observations simultaneously. The observations obtained from the IMU and positioning sensors in the smartphone include acceleration, rotation, angular velocity, GNSS location from satellites, and network location from Wi-Fi and cell towers. The interface and the self-defined 3-axis coordinate system of our application are shown in Fig. 4.

Table 1. Specifications of the Samsung Galaxy S9 Plus (Samsung US, 2019)

Item                      Contents
Size                      15.81 × 7.38 × 0.85 cm
Weight                    189 g
Camera sensor
  Pixel size              1.4 µm
  Focal length            4.3 mm
  Field of view           77°
IMU sensor
  Accelerometer           LSM6DSL 3D Accelerometer
  Gyroscope               LSM6DSL 3D Gyroscope
  Magnetometer            AK09916C Magnetic Sensor
GNSS                      GPS, GLONASS, GALILEO


Fig. 4. Interface and 3-axis coordinate system of smartphone application (Wang, 2019)

The query images of the smartphone are collected under three assumptions: (1) the user faces the screen and holds the smartphone vertically at a height of about 1.5 m, (2) the GNSS signal is always available, and (3) the query image overlaps with a reference image. The smartphone query dataset, consisting of 152 images taken with the smartphone camera and 152 sets of simultaneously recorded raw IMU and GNSS sensor observations, was collected in the region of Yonsei University and Yeonhui-Dong. The spatial distribution of the smartphone query images is shown in Fig. 5.

Fig. 5. Spatial distribution of query images of smartphone

To obtain better results with the proposed image-based localization method, smartphone camera calibration is done with MATLAB's Single Camera Calibrator App (Mathworks, 2019). Table 2 summarizes the IOPs of the smartphone camera after calibration.

Table 2. IOPs of the smartphone camera after calibration

Content                            Value
Focal length                       f_x = 4.4953 mm, f_y = 4.4746 mm
Displacement of principal point    X = 0.0304 mm, Y = -0.0636 mm
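The paper calibrates with MATLAB's Single Camera Calibrator App; for reference, an equivalent checkerboard calibration can be sketched in OpenCV (the board size and square size below are example values, not from the paper).

```python
import glob

import cv2
import numpy as np

def calibrate_camera(image_glob, board_size=(9, 6), square_mm=25.0):
    """Checkerboard calibration, an OpenCV alternative to the MATLAB app.

    board_size is the count of inner corners (columns, rows) and square_mm
    the printed square size; both are illustrative assumptions.
    """
    # 3D corner positions of the flat board in its own coordinate system
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2)
    objp *= square_mm

    obj_points, img_points, shape = [], [], None
    for path in glob.glob(image_glob):
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, board_size)
        if found:
            obj_points.append(objp)
            img_points.append(corners)
            shape = gray.shape[::-1]

    # K holds fx, fy, cx, cy; dist holds the lens distortion coefficients
    rms, K, dist, _, _ = cv2.calibrateCamera(
        obj_points, img_points, shape, None, None)
    return K, dist, rms
```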

3.2 Reference image

The proposed method is applied to two different geo-tagged reference image datasets: street view images collected by a mobile mapping system and Google Street View images. The MMS reference dataset was collected with a Leica Pegasus: Two in Yonsei University and Yeonhui-Dong. The Google Street View reference images were obtained free of charge from the Google Maps Platform via the Google Street View API (Google, 2018). The collected Google Street View images cover the region of Yeonhui-ro.

3.2.1 Reference dataset of mobile mapping system

MMS is a technology that efficiently collects three-dimensional point cloud information and geo-tagged street-view-level images of the road environment. MMS has become more popular as the need for high-resolution 3D maps increases. 3D information is produced by a two-dimensional laser scanner mounted on the system, with the movement of the vehicle providing the third dimension. The mobile mapping system used here is equipped with a 2D LiDAR, six road cameras, and IMU sensors. The MMS image dataset, collected from the 6 road cameras, contains 10,343 images (4.12 GB). The MMS point cloud dataset, collected by the 2D LiDAR, contains 477,556,491 points (15.1 GB).

3.2.2 Reference dataset of Google Street View images

Since Google Street View imagery is not offered inside Yonsei University, only the Google Street View images in the region of Yeonhui-ro are used in this study. To acquire a Google Street View image from the Google Street View API, four parameters are required: image location, resolution, field of view, and heading direction. The resolution of the collected Google Street View images is 640 × 640 pixels, and the field of view is 70 degrees. At each location where the Google Street View car captured imagery, street view images are obtained at heading intervals of 30 degrees. Fig. 6(a) shows an example set of Google Street View images with different headings at the same location, and Fig. 6(b) shows the spatial distribution of the Google Street View image dataset.

Fig. 6. (a) An example set of Google Street View images; (b) Spatial distribution of Google Street View images as a reference dataset
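The four parameters map directly onto the query string of the Street View Static API; a sketch of the request construction follows (the API key and the coordinates are placeholders for illustration).

```python
import urllib.parse

def street_view_url(lat, lon, heading_deg, api_key,
                    fov_deg=70, size="640x640"):
    """Build a Street View Static API request with the four parameters
    used in this study: location, resolution, field of view, and heading.
    `api_key` must be a valid Google Maps Platform key (placeholder here)."""
    params = {
        "location": f"{lat},{lon}",
        "size": size,            # 640 x 640 pixels, as in Section 3.2.2
        "fov": fov_deg,          # 70-degree field of view
        "heading": heading_deg,  # sampled every 30 degrees per location
        "key": api_key,
    }
    return ("https://maps.googleapis.com/maps/api/streetview?"
            + urllib.parse.urlencode(params))

# Example: twelve headings at 30-degree intervals for one illustrative location
urls = [street_view_url(37.5662, 126.9388, h, "YOUR_API_KEY")
        for h in range(0, 360, 30)]
```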

3.4 Result and evaluation

3.4.1 Reference dataset of MMS

1) Image matching with MMS reference image dataset

Image matching between the smartphone query image and the MMS reference image is done by the following procedure. First, the GNSS observation of a smartphone query image is used to select candidate MMS reference images by calculating the geometric distance between the location of the smartphone image and the MMS images. The MMS collected street view images from 6 road cameras at the same time. In this study, the three closest sets of MMS images are selected as candidates, taking into account calculation time and smartphone GNSS observation error. Then, the candidate MMS image with the most matching points is chosen from the 18 candidate MMS images for estimating the geometric transformation between the two images. However, an image matching result with wrong matching points leads to a wrong geometric transform estimation. The RANSAC algorithm is applied to remove the wrong matching points before estimating the geometric transformation; a threshold of 10 pixels, which gave the best results, was used for RANSAC. Fig. 7(a) shows the result of image matching by the SURF algorithm, and Fig. 7(b) shows the image matching result after eliminating wrong matching points with the RANSAC algorithm.

Fig. 7. (a) Image matching result by SURF algorithm; (b) Image matching result after eliminating wrong matching points by RANSAC algorithm
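As a sketch of the candidate selection (reusing the hypothetical `surf_match()` helper from the Section 2.2 sketch), the reference with the largest number of matches among the 18 candidates (3 nearest epochs × 6 road cameras) can be chosen as follows.

```python
def pick_best_reference(query_img, candidate_imgs):
    """Choose, among the candidate MMS images, the one with the most SURF
    matches to the query image; `surf_match` is the hypothetical helper
    sketched in Section 2.2, not a function from the paper."""
    best_img, best_matches = None, []
    for cand in candidate_imgs:
        _, _, matches = surf_match(query_img, cand)
        if len(matches) > len(best_matches):
            best_img, best_matches = cand, matches
    return best_img, best_matches
```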

2) Image-based localization with MMS reference image dataset

Since the proposed image-based localization method is performed automatically, images without enough matching points fail to yield the location and attitude of the smartphone camera. The location and attitude of 145 (95.4%) of the 152 smartphone query images are obtained successfully by the proposed method. According to the evaluation method described in Section 2.5, 126 images (86.9%) show better performance after applying the proposed method. The position accuracy of the proposed image-based localization method is 1.496 m, 2.119 m, and 1.670 m in the X, Y, and Z directions, respectively, while the accuracy of the GNSS observations for the same 126 images is 3.534 m, 3.543 m, and 8.024 m. The position accuracy improves from an average error of 10.204 m to 3.575 m. Of the remaining images, 7 give similar performance and only 12 show bad performance. Table 3 compares the localization results of the proposed image-based localization method and the smartphone's raw GNSS observations.


In addition to the location, the attitude of the smartphone camera is also estimated by the proposed image-based localization method. Since image content can exist in any direction, the heading direction the smartphone was facing is also important information. However, the heading measurement from the smartphone is not reliable, since the accuracy of the IMU sensor on a smartphone is low. An example of an inaccurate smartphone heading observation is shown in Fig. 8.

Fig. 8. Inaccuracy of smartphone heading direction

The evaluation of the attitude estimation is done by the same process described in Section 2.5, with the same set of manually selected GCPs. Since the smartphone heading observations are unreliable, the results are separated only into good performance and bad performance. Comparing with the ground truth information, we classify a result as good performance when its error is smaller than 25 degrees. 134 images have errors of less than 25 degrees, with an average error of 5.9173 degrees; the remaining 11 images result in large errors. In contrast, only 18 images have errors of less than 25 degrees when the heading is measured directly by the smartphone. Table 4 summarizes the heading estimation performance of the proposed method and the smartphone observations.
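Heading angles wrap around at 360 degrees, so such errors should be computed as the minimal angular difference; a small sketch (ours, not from the paper) of the error measure and the 25-degree cut-off:

```python
def heading_error_deg(estimated, reference):
    """Minimal absolute angular difference between two headings in degrees,
    accounting for wrap-around at 360 (e.g. 350 vs 10 gives 20, not 340)."""
    diff = abs(estimated - reference) % 360.0
    return min(diff, 360.0 - diff)

# An estimate counts as "good performance" when the error is below 25 degrees
is_good = heading_error_deg(123.4, 118.2) < 25.0
```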

3) Discussion

Even with the proposed image-based localization method, some images still give poor results when the location and attitude are estimated. The images that fail to yield location and attitude information mostly suffer from too few correct matching points and low-quality MMS reference images. The characteristics of the images with bad location and attitude results can be summarized as follows. Three major problems lead to wrong estimation of location and attitude: (1) wrong matching points, (2) bad distribution of matching points, and (3) unstable natural features being taken as matching points. Fig. 9 shows example image sets with wrong estimation of location and attitude.

Table 3. Localization results of the proposed image-based localization method and the smartphone's raw GNSS observations

                         Average Error of Image-based Localization (m)    Average Error of Smartphone Observation (m)
                         X      Y      Z      Distance                    X      Y      Z      Distance
Better Performance       1.496  2.119  1.670  3.575                       3.534  3.543  8.024  10.204
Similar Performance      5.530  7.047  6.101  12.925                      5.642  4.216  7.637  11.430
Better + Similar         1.798  2.378  1.903  4.067                       3.645  3.578  8.004  10.269
Bad Performance          6.873  6.396  8.658  15.454                      1.987  2.823  5.653  7.450
Average calculation time: 17.124 sec

Table 4. Heading estimation results of image-based localization and smartphone heading observations

                    Image-based Localization (degree)    Smartphone Observation (degree)
                    Min      Max       Average           Min      Max       Average
Good Performance    0.0375   23.5561   5.9173            2.3147   178.4380  86.1789
Bad Performance     25.7257  118.185   48.0888           2.1562   140.6191  90.0195


Fig. 9. Image sets with wrong estimation of location and attitude

3.4.2 Google Street View reference image dataset

1) Image matching with Google Street View reference image dataset

The proposed image-based localization method is also applied to the Google Street View reference image dataset, since its usage should not be limited to the MMS reference dataset. However, the reference dataset is relatively smaller, because the Google Street View API does not offer street-view-level images inside Yonsei University. Therefore, the smartphone query set is reduced to the 83 images taken in Yeonhui-Dong. When the proposed image-based localization method is applied to the Google Street View reference dataset, image matching is again done by the SURF algorithm, and wrong matching points are again eliminated with the RANSAC algorithm. Fig. 10(a) shows the result of image matching by the SURF algorithm, and Fig. 10(b) shows the image matching result after eliminating wrong matching points.

Fig. 10. (a) Image matching result by SURF algorithm; (b) Image matching result after eliminating wrong matching points by RANSAC algorithm

2) Image-based localization with Google Street View reference image dataset

Since the corresponding location and heading direction are provided with every Google Street View image, as mentioned in Section 3.2.2, the location and attitude of the smartphone can be adjusted by the proposed image-based localization method once image matching is done. 67 images (80.7%) out of 83 are matched successfully. However, an accurate location and attitude of the smartphone cannot be obtained, since 3D coordinate information for the matching points is not available in the Google Street View reference dataset. Fig. 11 compares the localization error of the Google Street View reference dataset and the smartphone raw GNSS observations against the MMS ground truth.

Fig. 11. Comparison of localization error by Google Street View reference image dataset and smartphone raw GNSS observations

On the other hand, the adjusted heading results can be shown from both visual and numerical aspects. The heading adjustment is shown visually in Fig. 12, where the first-row images are the smartphone images and the second-row images are the corresponding Google Street View images retrieved with the smartphone sensors' observations. The first- and second-row images have totally different contents, since the heading observations of the smartphone contain errors. The third-row images are retrieved from Google Street View with the heading direction adjusted by the proposed method. The significant improvement in heading direction is visible when comparing the smartphone images with the retrieved Google Street View images. To show the improvement numerically, Fig. 13 compares the heading direction error of the Google Street View image-based localization and the smartphone compass observations against the MMS ground truth.

Fig. 12. Heading direction adjustment by Google Street View image-based localization

Fig. 13. Comparison of heading direction error by Google Street View reference image dataset and smartphone heading observations

3) Discussion

The proposed method adjusted the location and heading direction using the Google Street View image reference dataset. The adjusted location is not ideal, since the Google Street View API provides a street view image only about every 12 m, while the error of the smartphone's raw GNSS observation may be smaller (Salarian et al., 2015). However, the overall improvement of the adjusted heading direction is significant after applying the proposed method.

4. Conclusion

A new image-based localization method and procedure is proposed in this study to correct smartphone location and attitude. The proposed method was applied to an MMS reference dataset and a Google Street View image reference dataset. The proposed image-based localization consists of four steps: (1) matching candidate image query, (2) SURF image matching, (3) geometric transform estimation, and (4) smartphone position and attitude estimation.

From the first experiment, using the MMS reference dataset, the following contributions can be concluded. First, the localization result of the smartphone with the MMS reference dataset is improved to a mean error of 3.575 m from 10.204 m when compared with the accuracy of the smartphone's raw GNSS observations. Second, a significant improvement of the attitude estimation is also shown after applying the proposed method: 92.4% of attitude estimations have errors of less than 25 degrees over the whole smartphone query image dataset, with a mean error of 5.9173 degrees, while the mean error of the smartphone compass observations is 86.1789 degrees.

In the second experiment, the Google Street View image dataset is utilized as a reference dataset to substitute for the MMS reference dataset. By performing the proposed image-based localization, the location and heading direction of 80.7% of the smartphone images are adjusted. The localization accuracy with Google Street View images is similar to that of the smartphone's raw GNSS observations because of the low spatial density of Google Street View images. However, the heading direction is adjusted well from both the visual and the numerical point of view.

The results of the experiment using the MMS reference dataset were very good. However, the MMS reference dataset has the disadvantage of not being easily accessible. In that case, it was still possible to adjust the heading direction of the smartphone camera through the freely accessible Google Street View images.

For future work, we plan to apply the proposed image-based localization method to different data types, such as video and CCTV, in larger experiment areas, and to embed the proposed method into the smartphone application.

Acknowledgment

This research was supported by a grant (2019-MOIS32-015) of the Disaster-Safety Industry Promotion Program funded by the Ministry of the Interior and Safety (MOIS, Korea).

References

Agarwal, P., Burgard, W., and Spinello, L. (2015), Metric localization using Google Street View, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3111-3118.

Bay, H., Ess, A., Tuytelaars, T., and Van Gool, L. (2008), Speeded-up robust features (SURF), Computer Vision and Image Understanding, Vol. 110, No. 3, pp. 346-359.

Fischler, M.A., and Bolles, R.C. (1981), Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Communications of the ACM, Vol. 24, No. 6, pp. 381-395.

Fuentes-Pacheco, J., Ruiz-Ascencio, J., and Rendón-Mancha, J.M. (2015), Visual simultaneous localization and mapping: a survey, Artificial Intelligence Review, Vol. 43, No. 1, pp. 55-81.

Google. (2019), Street View API - Developer Guide, https://developers.google.com/maps/documentation/streetview/intro (last date accessed: May 22, 2019).

Jégou, H., Douze, M., and Schmid, C. (2008), Hamming embedding and weak geometry consistency for large scale image search-extended version, European Conference on Computer Vision, pp. 304-317.

Kim, H.J., Dunn, E., and Frahm, J.M. (2017), Learned contextual feature reweighting for image geo-localization, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3251-3260.

Leica Geosystems. (2019). Leica Pegasus: Two Mobile Sensor Platform | Leica Geosystems, https://leica-geosystems.com/products/mobile-sensor-platforms/capture-platforms/leica-pegasus_two (last date accessed : March 29, 2019)

Li, Y., Snavely, N., Huttenlocher, D., and Fua, P. (2012), Worldwide pose estimation using 3d point clouds, European conference on computer vision, Berlin, Heidelberg, pp. 15-29.

Liu, H., Mei, T., Luo, J., Li, H., and Li, S. (2012), Finding perfect rendezvous on the go: accurate mobile visual localization and its applications to routing, Proceedings of the 20th ACM international conference on Multimedia, pp. 9-18.

Mathworks (2019), Single Camera Calibrator App, https://kr.mathworks.com/help/vision/ug/single-camera-calibrator-app.html (last date accessed: August 10, 2019)

Sadeghi, H., Valaee, S., and Shirani, S. (2016), 2DTriPnP: A robust two-dimensional method for fine visual localization using Google streetview database, IEEE Transactions on Vehicular Technology, Vol. 66, No. 6, pp. 4678-4690.

Salarian, M., Manavella, A., and Ansari, R. (2015), Accurate localization in dense urban area using google street view images, 2015 SAI intelligent systems conference (IntelliSys), pp. 485-490.

Samsung US. (2019). Samsung Galaxy S9 & S9+ Specifications - S9 Specs & Features | Samsung US., https://www.samsung.com/us/smartphones/galaxy-s9/specs/ (last date accessed: June 10, 2019)

Sattler, T., Havlena, M., Radenovic, F., Schindler, K., and Pollefeys, M. (2015), Hyperpoints and fine vocabularies for large-scale location recognition, Proceedings of the IEEE International Conference on Computer Vision, pp. 2102-2110.

Sattler, T., Havlena, M., Schindler, K., and Pollefeys, M. (2016), Large-scale location recognition and the geometric burstiness problem, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1582-1590.

Schönberger, J.L., Pollefeys, M., Geiger, A., and Sattler, T. (2018), Semantic visual localization, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6896-6906.

Verstockt, S., Gerke, M., and Kerle, N. (2015), Geolocalization of Crowdsourced Images for 3-D Modeling of City Points of Interest, IEEE geoscience and remote sensing letters, Vol. 12, No. 8, pp. 1670-1674.

Wang, Y.H. (2019), Estimation of Object Location from Smartphone Images and Positioning Sensors Using MMS Datasets as Reference, Master’s thesis, Yonsei University, Seoul, Korea, 32p.

Wu, T., Liu, J., Li, Z., Liu, K., and Xu, B. (2018), Accurate smartphone indoor visual positioning based on a high-precision 3D photorealistic map, Sensors, Vol. 18, No. 6, pp. 1974.

Zamir, A.R., and Shah, M. (2010), Accurate image localization based on google maps street view, European Conference on Computer Vision, Berlin, Heidelberg, pp. 255-268.

Zandbergen, P.A., and Barbeau, S.J. (2011), Positional accuracy of assisted GPS data from high-sensitivity GPS-enabled mobile phones, The Journal of Navigation, Vol. 64, No. 3, pp. 381-399.