REGISTRATION TECHNOLOGIES AND THEIR CLASSIFICATION IN AUGMENTED REALITY: KNOWLEDGE-BASED REGISTRATION, COMPUTER VISION-BASED REGISTRATION AND TRACKER-BASED REGISTRATION TECHNOLOGY


4.2 Camera Model

Registration based on camera calibration is a common method of computer vision-based registration technology. This method first places special markers in the real environment, recognizes them by computer vision, then computes the position and orientation of the camera relative to the markers, and finally computes the exact position and orientation of the virtual objects. The camera model can be separated into the frame differential technique, detection and tracking of feature points, and calculation of camera position information [4].

    4.2.1 Frame Differential Technique

The frame differential technique, based on image processing, is used to find the accurate location of objects using two images captured with a camera. It detects the changed parts using the differences in pixels between the two frames. This method is simple and appropriate for real-time use at construction sites [4].
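A minimal frame-differencing sketch with OpenCV follows. It is not the authors' code; the file names and the threshold value are illustrative assumptions.

    # Frame differencing: highlight pixels that changed between two frames.
    import cv2

    prev = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input
    curr = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

    # Absolute per-pixel difference between the two frames.
    diff = cv2.absdiff(prev, curr)

    # Keep only pixels whose brightness changed by more than 25 levels.
    _, changed = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)

    cv2.imwrite("changed_regions.png", changed)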

4.2.2 Detection and Tracking of Feature Points

As the features in the captured images must be detected in the same background image in each frame, it is generally more useful to detect corner points, which have a large differential coefficient because brightness changes sharply there. This paper describes three methods for detection and tracking of feature points: the Shi-Tomasi method, the Harris corner detection method, and the Lucas-Kanade Tracker (LKT) [4].

Figure 1. Geometric relationship between features and an image taken by the camera [4]

    4.2.2.1 Shi-Tomasi Method

The Shi-Tomasi method detects feature points that are useful for tracking, such as corner points. It determines that a pixel is a good object for tracking if the smaller of the two eigenvalues of the Hessian matrix is greater than a specified critical value. In order to extract the geometric measurements of a detected feature point, real-valued coordinates must be used instead of integer coordinates. When peaks in an image are searched for, they are rarely located at the centers of pixels. To detect them, sub-pixel corner detection is used: the coordinates of peaks that lie between pixels are found by fitting curves, such as parabolas [4].
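A minimal sketch of this step with OpenCV's goodFeaturesToTrack (which implements Shi-Tomasi scoring) followed by sub-pixel refinement is given below; it is not the authors' code, and the parameter values are illustrative assumptions.

    # Shi-Tomasi detection with sub-pixel corner refinement.
    import cv2

    gray = cv2.imread("desk_scene.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input

    # qualityLevel acts as the critical value applied to the smaller eigenvalue,
    # expressed as a fraction of the strongest corner's score.
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=200,
                                      qualityLevel=0.01, minDistance=7)

    # Refine each corner to real-valued (sub-pixel) coordinates by local fitting.
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.01)
    corners = cv2.cornerSubPix(gray, corners, winSize=(5, 5),
                               zeroZone=(-1, -1), criteria=criteria)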

4.2.2.2 Harris Corner Detection Method

The Harris corner detection method [4] uses the second derivatives of the image brightness. A pixel in the image is regarded as a corner point if all the eigenvalues of the Hessian matrix consisting of the second derivatives at that pixel are large. As second-derivative images have no response in regions of uniform gradient, they are useful for detecting corners.
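A minimal Harris sketch with OpenCV follows (not from the paper; blockSize, ksize, k, and the 1% threshold are illustrative assumptions).

    # Harris corner detection via the response map.
    import cv2
    import numpy as np

    gray = cv2.imread("desk_scene.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

    # Corner response map; large values indicate two large eigenvalues.
    response = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)

    # Keep pixels whose response exceeds 1% of the strongest response.
    corners = np.argwhere(response > 0.01 * response.max())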

4.2.2.3 Lucas-Kanade Tracker (LKT)

The Lucas-Kanade Tracker (LKT) [4] is used to track the detected feature points. It is a sparse optical flow method that uses only the local information obtained from a small window covering predefined pixels; the points to be tracked are specified beforehand. LKT is based on three assumptions: brightness constancy, temporal persistence, and spatial coherence [4].

4.2.2.3.1 Brightness Constancy

Brightness constancy assumes that the brightness values of specific continuous pixels are constant across frames at different times. Thus, the total derivative of the brightness along the time axis t is zero, as in expression (1) below [4]:

    Ix u + Iy v + It = 0     (1)

By tracking the areas that have the same brightness value in this way, the speed between consecutive frames can be calculated. Ix, Iy, and It are the partial derivatives along the x, y, and time axes, respectively, and u and v are the coordinate changes along the x and y axes, respectively.

    4.2.2.3.2 Temporal Persistence

Temporal persistence assumes that the pixels around a specific pixel whose position changes over time will have consistent changes of coordinates. As the two variables u and v cannot be calculated from the single constraint in expression (1), it is assumed that the 25 neighboring pixels undergo the same change. The values of u and v can then be calculated with the least-squares method of expression (2) below [4]:

    [u v]^T = (A^T A)^(-1) A^T b     (2)

where each of the 25 rows of A holds the gradients [Ix Iy] at one neighboring pixel and the corresponding entry of b is -It at that pixel. The computation speed of LKT is fast because it uses a small local window area under the assumption of temporal persistence, that is, that coordinate changes are small compared to time changes. However, its disadvantage is that it cannot calculate movements larger than the small local window. To address this problem, a Gaussian image pyramid is used: the pyramid is created from the original image, tracking starts at the smallest level, and the tracking changes are gradually accumulated toward the larger levels. Thus, sharp coordinate changes can be detected even if a window of limited size is used [5][6].
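A minimal sketch of pyramidal LKT with OpenCV's calcOpticalFlowPyrLK follows (not the authors' code; the window size and pyramid depth are illustrative assumptions).

    # Sparse pyramidal Lucas-Kanade tracking of pre-detected feature points.
    import cv2

    prev_gray = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)
    next_gray = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

    # Points to track are specified beforehand (sparse optical flow).
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
                                 qualityLevel=0.01, minDistance=7)

    # winSize is the small local window; maxLevel enables the Gaussian pyramid
    # so that motions larger than the window can still be recovered.
    p1, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, p0, None,
                                               winSize=(21, 21), maxLevel=3)

    tracked = p1[status.ravel() == 1]  # keep only successfully tracked points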


    4.2.3 Camera Position Information 

Three-dimensional object coordinates are calculated by back-projecting two-dimensional (2D) coordinates into three-dimensional (3D) space. The proposed system is preceded by an initialization step in which the 2D coordinates of the feature points detected in the captured images are converted to the coordinates of a 3D space. As the coordinates detected from the camera input images are 2D, the depth (z-axis) is assumed to be zero when they are converted to 3D space. One problem is that if the detected feature pixels are not located on the same plane, errors may be generated. Because of this, the proposed method requires the precondition that it be applied to relatively flat background images, such as desktop images. The camera position information is then calculated through the relationship between the 2D coordinates of the feature points acquired from consecutive frames and the 3D coordinates acquired from the initialization step.

As shown in Figure 2 [4], let the homogeneous coordinates of the 3D point M = (X, Y, Z) and the 2D image point m = (x, y) be M̄ = [X Y Z 1]^T and m̄ = [x y 1]^T. The projection relationship between them is then defined by the 3×4 camera matrix P as in expression (3):

    λ m̄ = P M̄ = K [R | t] M̄     (3)

where λ is the scale variable of the projection matrix P, R is the 3×3 matrix of the rotational displacement of the camera (r_i denotes the i-th column of R), and t is the 3×1 translation vector that represents the camera movement. In addition, the 3×3 non-singular matrix K is the camera calibration matrix, which has the intrinsic parameters of the camera as its elements and is generally defined as in expression (4):

    K = [ fx  s   x0 ]
        [ 0   fy  y0 ]     (4)
        [ 0   0   1  ]

where fx and fy are the scale values along the coordinate axes of the image, s is the skew parameter of the image, and (x0, y0) is the principal point of the image. To obtain the camera matrix of expression (4), a separate camera calibration is generally needed. In this study, the camera matrix was obtained using the calibration method proposed by Zhang.

    Figure 2. Projective geometry of input image sequences.


Zhang's calibration method calculates the Image of the Absolute Conic (IAC) ω, which is projected into the images, using the invariance under isometric transformation that characterizes absolute points on the plane at infinity. The intrinsic parameter matrix of the camera is then obtained from the relationship ω^(-1) = K K^T. Therefore, in order to apply Zhang's method, three or more images of the same plane, taken from different directions and locations, are required.
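For illustration, plane-based calibration in the spirit of Zhang's method is available in OpenCV. The sketch below is an assumed setup (chessboard size and file names are hypothetical), not the calibration code used in the study.

    # Plane-based (Zhang-style) calibration from several views of one plane.
    import cv2
    import glob
    import numpy as np

    pattern = (9, 6)  # inner-corner grid of a hypothetical chessboard target
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

    obj_points, img_points, size = [], [], None
    for path in glob.glob("calib_*.png"):   # three or more views of the plane
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        size = gray.shape[::-1]
        found, corners = cv2.findChessboardCorners(gray, pattern)
        if found:
            obj_points.append(objp)
            img_points.append(corners)

    # K is the 3x3 intrinsic matrix of expression (4); dist holds lens distortion.
    ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, size, None, None)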

5. TRACKER-BASED REGISTRATION TECHNOLOGY

    Tracking technologies may be grouped into three categories [7]:

active-target, passive-target, and inertial.

    5.1 Active-target

Active-target systems incorporate powered signal emitters and sensors placed in a prepared and calibrated environment. Examples of such systems use magnetic, optical, radio, and acoustic signals.

Limitation: The signal-sensing range, as well as man-made and natural sources of interference, limits active-target systems.

    5.2 Passive-target

Passive-target systems use ambient or naturally occurring signals. Examples include compasses sensing the Earth's magnetic field and vision systems sensing intentionally placed fiducials (e.g., circles, squares) or natural features.

Limitation: Passive-target systems are also subject to signal degradation; for example, poor lighting or proximity to steel in buildings can defeat vision and compass systems.

    5.3 Inertial

Inertial systems are completely self-contained, sensing physical phenomena created by linear acceleration and angular motion.


Limitation: Inertial sensors measure acceleration or motion rates, so their signals must be integrated to produce position or orientation. Noise, calibration error, and the gravity field impart errors on these signals, producing accumulated position and orientation drift. Position requires double integration of linear acceleration, so the accumulated position drift grows as the square of elapsed time. Orientation requires only a single integration of rotation rate, so its drift accumulates linearly with elapsed time.
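To make these growth rates concrete, here is a toy numerical sketch (not from the paper): a constant accelerometer bias, integrated once and twice, reproduces the linear and quadratic drift just described. The step size and bias value are illustrative.

    # Dead-reckoning drift from a constant accelerometer bias.
    dt = 0.01      # integration step in seconds (illustrative)
    bias = 0.01    # constant accelerometer bias in m/s^2 (illustrative)

    velocity = 0.0
    position = 0.0
    for step in range(1, 10001):       # 100 seconds of dead reckoning
        velocity += bias * dt          # single integration: error grows ~ bias*t
        position += velocity * dt      # double integration: error grows ~ bias*t^2/2

    # After t = 100 s the single-integration drift is bias*t = 1.0 m/s, while
    # the doubly integrated position drift is already ~50 m (= bias*t^2/2).
    print(velocity, position)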

6. PROPOSED METHOD FOR TRACKER-BASED REGISTRATION

    6.1 Setup 

Here a planar canvas that has a specific shape is used; the shape serves as the cue for the vision-based tracking. The system configuration is illustrated in Figure 3 [8]. We also use a binocular video see-through HMD, which enables users to perceive depth. The HMD is connected to a video capture card, which captures input videos from the cameras built into the HMD. An NVIDIA GeForce GTS 250 graphics processor is used for image processing. The positions and orientations of the HMD and the brush device are tracked using Polhemus LIBERTY, a six-DOF tracking system that uses magnetic sensors. A transmitter is also used as a reference point for the sensors. The brush device is connected to the main PC through an input/output (I/O) box, which retrieves data from the devices and sends them to the main PC [8].

We used OpenGL and the OpenGL Utility Toolkit (GLUT) as the graphics API. In creating the MR space, we first set the videos captured by the Osprey-440 as the background and then create a virtual viewing point in OpenGL by obtaining the position and orientation of the HMD from Polhemus LIBERTY. In doing so, users feel as if they are manipulating virtual objects in the real world [8].
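As a rough illustration of this compositing, the sketch below (PyOpenGL, not the authors' code) draws a captured video frame as the background and then loads the tracked HMD pose as the OpenGL view matrix. The texture id, the 16-element column-major pose, and the projection parameters are all assumed inputs.

    # Compose MR frame: video background first, then the tracked viewpoint.
    from OpenGL.GL import *
    from OpenGL.GLU import *

    def draw_mr_frame(frame_texture, hmd_view_matrix):
        """frame_texture: GL texture holding the captured video frame;
        hmd_view_matrix: 16 floats, column-major, from the magnetic tracker."""
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)

        # 1. Draw the video frame as a full-screen background quad.
        glMatrixMode(GL_PROJECTION); glLoadIdentity()
        gluOrtho2D(0, 1, 0, 1)
        glMatrixMode(GL_MODELVIEW); glLoadIdentity()
        glDisable(GL_DEPTH_TEST)
        glEnable(GL_TEXTURE_2D)
        glBindTexture(GL_TEXTURE_2D, frame_texture)
        glBegin(GL_QUADS)
        for u, v in ((0, 0), (1, 0), (1, 1), (0, 1)):
            glTexCoord2f(u, v)
            glVertex2f(u, v)
        glEnd()
        glDisable(GL_TEXTURE_2D)

        # 2. Set the virtual viewpoint from the HMD position and orientation.
        glEnable(GL_DEPTH_TEST)
        glMatrixMode(GL_PROJECTION); glLoadIdentity()
        gluPerspective(60.0, 4.0 / 3.0, 0.1, 100.0)  # FOV/aspect assumed
        glMatrixMode(GL_MODELVIEW)
        glLoadMatrixf(hmd_view_matrix)
        # Virtual objects drawn after this appear anchored in the real scene.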

    Figure 3. System configuration.


    6.2 Contour tracker

    6.2.1 Descriptor 

The first step in our method is extracting the region of the canvas. A popular recent method such as MSER can be applied. We can also use simple color segmentation for a simple outline, as illustrated in Figure 4 [8].

Figure 4. Keypoint extraction

The next step is extracting the outline using contour estimation. The outline, which may contain many points, is simplified using Douglas-Peucker (DP) polygon simplification, and the remaining points are used as keypoints. We then compute the descriptor r (a relevance measure) from three consecutive points of the simplified polygon [8]. r is defined in terms of the lengths l1 and l2 of the two connected segments (lines) and the angle β between them; r depends on β and on the ratio of the two segments. We assume these properties will not change drastically under scale and rotation changes: more specifically, β will not change under rotation, and the ratio will not change under scale changes. However, these properties may change under perspective change, which we handle in the tracking step explained later in the registration subsection. Computing the descriptors of the shape is done both off-line and on-line. In the off-line process, the descriptors are stored in the database, whereas in the on-line process, the descriptors of an unknown shape are matched against the descriptors in the database (see Figure 5) [8]. A sketch of this extraction pipeline is given after Figure 5.

Figure 5. Descriptor matching
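The sketch below illustrates the extraction steps just described (color segmentation, contour estimation, DP simplification) with OpenCV. It is not the authors' code; the hue thresholds and the simplification epsilon are illustrative assumptions.

    # Canvas outline extraction and Douglas-Peucker keypoint selection.
    import cv2
    import numpy as np

    frame = cv2.imread("canvas_frame.png")                  # hypothetical input
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # Simple color segmentation of the canvas region (hue range is assumed).
    mask = cv2.inRange(hsv, (90, 60, 60), (130, 255, 255))

    # Contour estimation on the segmented region (OpenCV 4 return convention).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    outline = max(contours, key=cv2.contourArea)

    # Douglas-Peucker simplification; the remaining vertices become keypoints.
    eps = 0.01 * cv2.arcLength(outline, True)
    keypoints = cv2.approxPolyDP(outline, eps, True).reshape(-1, 2)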


    6.2.2 Registration

During shape registration, the keypoints of an unidentified shape are extracted. The sequences of relevance measures are then computed, and the corresponding tuple (region or shape id, keypoint id) is looked up in the hash table (see Figure 5). Since the hash table is a many-to-one relationship, histogram matching (voting) is performed in order to obtain the matched region and keypoint. This process yields keypoint correspondences between a shape captured by the camera and a shape in the database. We choose three neighboring relevance-measure values to represent a keypoint of a shape; in this case, one keypoint is actually described by its four neighboring keypoints. For example, r0, r1, r2 represents the keypoint with id = 1. In our implementation, we increase the number of relevance measures to four in order to create a distinctive representation of a keypoint, for example r0, r1, r2, r3 for the keypoint with id = 1. During runtime, when a shape is matched to one in the database, the sequences of relevance measures of the matched shape are added to the hash table so that the registration becomes robust against perspective change.
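A minimal sketch of this lookup-and-vote step follows. It is not the authors' code; in particular, the quantization of the real-valued relevance sequences into hash keys is an assumed detail that the paper does not specify.

    # Hash-table lookup with histogram voting over relevance-measure sequences.
    from collections import Counter, defaultdict

    db = defaultdict(list)  # hash table: key -> list of (shape_id, keypoint_id)

    def key_of(rs, step=0.05):
        """Quantize a sequence of relevance measures into a hashable key."""
        return tuple(round(r / step) for r in rs)

    def register(shape_id, keypoint_id, rs):
        """Off-line: store a keypoint's relevance sequence in the database."""
        db[key_of(rs)].append((shape_id, keypoint_id))

    def match(sequences):
        """On-line: vote over all keypoint sequences of the unknown shape."""
        votes = Counter()
        for rs in sequences:
            for shape_id, _ in db.get(key_of(rs), []):
                votes[shape_id] += 1
        return votes.most_common(1)[0][0] if votes else None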

    6.2.3 Pose estimation

The camera pose is estimated using a homography calculated from at least four keypoint correspondences resulting from the shape registration. Outliers among the keypoints are removed using the inverse homography. The camera pose is then optimized with Levenberg-Marquardt [9] by minimizing the re-projection error, i.e., the distance between the keypoints projected from the shape database and the keypoints extracted in the captured frames. The camera pose is then refined by considering the keypoint correspondences to the shape detected in the previous frame. These two optimizations produce a stable camera pose [8].
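The sketch below outlines this step with OpenCV. It is not the authors' implementation: RANSAC stands in for the inverse-homography outlier test, the Levenberg-Marquardt refinement is omitted, and K is assumed to come from the calibration described in Section 4.2.3.

    # Camera pose from a homography of a planar (z = 0) shape.
    import cv2
    import numpy as np

    def estimate_pose(db_pts, img_pts, K):
        """db_pts, img_pts: Nx2 float32 arrays of matched keypoints
        (database shape on the z = 0 plane, and the captured frame)."""
        # Homography from at least four correspondences; RANSAC rejects outliers.
        H, inliers = cv2.findHomography(db_pts, img_pts, cv2.RANSAC, 3.0)

        # For a z = 0 plane, H = s * K [r1 r2 t]; undo K and the scale factor.
        A = np.linalg.inv(K) @ H
        s = 1.0 / np.linalg.norm(A[:, 0])
        r1, r2, t = s * A[:, 0], s * A[:, 1], s * A[:, 2]
        R = np.column_stack([r1, r2, np.cross(r1, r2)])
        return R, t, inliers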

    6.2.4 Unifying the coordinate system

One issue in switching from a magnetic sensor to a visual tracker is the unification of coordinate systems. Since in vision-based tracking the coordinate system is independent of the sensor coordinate system, a transformation from the camera coordinate system into the sensor coordinate system is necessary. This unification is also required to enable the painting interaction, since the brush and eraser devices are located in the sensor coordinate system.

The unification is quite straightforward, since the tracking is done using the camera attached to the HMD. Note that a magnetic sensor is also attached to the HMD, which makes the unification simple. The unification is therefore done by computing the transformation matrix for any object in the camera coordinate system. The transformation matrix is computed by multiplying the view matrix of the HMD by the rotation and translation matrix retrieved from the homography as the result of the shape registration. As a result, the virtual canvas can be painted using the brush device, which has a magnetic sensor on it. The result of the painting can be seen in Figure 6 [8].
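A minimal numpy sketch of this matrix composition follows, under the assumption (not stated in this form in the paper) that view_hmd is the 4x4 HMD view matrix built from the magnetic sensor and that R and t come from the homography decomposition above.

    # Unify the camera and sensor coordinate systems by one matrix product.
    import numpy as np

    def unify(view_hmd, R, t):
        """Map an object tracked in the camera frame into the sensor frame."""
        # 4x4 rigid transform of the canvas in the camera coordinate system.
        cam_T_canvas = np.eye(4)
        cam_T_canvas[:3, :3] = R
        cam_T_canvas[:3, 3] = t

        # View matrix of the HMD times the camera-frame pose (Section 6.2.4).
        return view_hmd @ cam_T_canvas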


    Figure 6. Painting result on a canvas that has a specific shape.

    7. CONCLUSION 

In this paper we described three registration technologies in augmented reality: knowledge-based registration technology, computer vision-based registration technology, and tracker-based registration technology. For computer vision-based technology, we described how frames are differenced and how feature points are detected and tracked for image registration purposes. For tracker-based technology, we described three types of tracking techniques, namely active-target, passive-target, and inertial, and our next plan is to use a hybrid of these concepts in augmented reality. We also presented a setup and a contour-tracker method in which an asymmetric 2D object serves both as the tracking marker and as the canvas; in the future, we plan to extend our method to 3D objects. Using 3D objects as the marker, it would be possible to paint on any object, provided of course that the object is detectable.

REFERENCES

[1] Wikipedia, "Augmented reality". Available at: http://en.wikipedia.org/wiki/augmented_reality (Accessed 20 February 2012).

[2] "Augmented Reality Technology and Art: Analysis and Visualization of Evolving Conceptual Models," IEEE, 2012, DOI 10.1109/IV.2012.77.

[3] Li Yi-bo, Kang Shao-peng, Qiao Zhi-hua, Zhu Qiong, "Development Actuality and Application of Registration Technology in Augmented Reality," IEEE, 2008, DOI 10.1109/ISCID.2008.120.

[4] Jae-Young Lee, Jong-Soo Choi, Oh-Seong Kwon, Chan-Sik Park, "A Study on Construction Defect Management Using Augmented Reality Technology," IEEE, 2012.

[5] J. Barron, N. Thacker, "Computing 2D and 3D Optical Flow," Tina-Vision, 2005.

[6] G. Bradski, A. Kaehler, "Learning OpenCV: Computer Vision with the OpenCV Library," O'Reilly, 2008.

[7] Suya You, Ulrich Neumann, Ronald Azuma, "Hybrid Inertial and Vision Tracking for Augmented Reality Registration".

[8] Sandy Martedi, Maki Sugimoto, Hideo Saito, Mai Otsuki, Asako Kimura, Fumihisa Shibata, "A Tracking Method for 2D Canvas in MR-based Interactive Painting System," IEEE, 2013, DOI 10.1109/SITIS.2013.128.

[9] M. Lourakis, "levmar: Levenberg-Marquardt nonlinear least squares algorithms in C/C++," http://www.ics.forth.gr/~lourakis/levmar/, Jul. 2004 (Accessed 23 Aug. 2013).


    AUTHORS

Prabha Shreeraj Nair was born in Chhattisgarh, India, in 1973. She received the B.E. degree in Computer Technology from Nagpur University in 1996 and the Masters in Computer Science and Engineering from Kakatiya University in 2007. She has been working as an Assistant Professor at Galgotias University, Greater Noida, Uttar Pradesh, India, and has 18 years of teaching experience. She is engaged in research on software engineering and augmented reality.

Vijayalakshmi S was born in 1975. She received the B.Sc. degree in Computer Science from Bharathidasan University, Tiruchirappalli, India, in 1995, the MCA degree from the same university in 1998, and the M.Phil. degree from the same university in 2006. She received her doctorate in 2013. She has been working as an Assistant Professor (Grade III) at Galgotias University, Greater Noida, Uttar Pradesh, India, and has 17 years of teaching experience and 10 years of research experience. She has published many papers in the area of image processing, especially in medical imaging.