
EUROGRAPHICS '94 / M. Daehlen and L. Kjelldahl (Guest Editors), Blackwell Publishers © Eurographics Association, 1994

Volume 13 (1994), number 3

A Hand Control and Automatic Grasping System for Synthetic Actors

Ramon Mas Sanso

Department of Mathematics and Computer Science, Balearic Islands University, Cra. de Valldemossa, km 7.5, E-07071 Palma de Mallorca, Spain

Daniel Thalmann

Computer Graphics Lab, Swiss Federal Institute of Technology, CH 1015 Lausanne, Switzerland

Abstract

In the computer animation field, interest in grasping has arisen with the development of synthetic actors. Based on a grasp taxonomy, we propose a completely automatic grasping system for synthetic actors. In particular, the system can decide to use a pinch when the object is too small to be grasped by more than two fingers, or a two-handed grasp when the object is too large. The system also offers both direct and inverse kinematics to control the articulations. To ensure realistic-looking closing of the hand, several of the joints are constrained. A brief description of the system and results are also presented.

Keywords: Hand control, automatic grasping, synthetic actor, animation, kinematics

1. Introduction

Interest in the study of human grasping has several motivations, among them medical surgery and the design of artificial prostheses to correct human deficiencies due to accidents or diseases. With the introduction of the concept of synthetic actors interacting with their environment, this interest has also appeared in the computer animation field.

Two approaches are usually taken to study the human grasping mechanism [Cut89]: an empirical approach, where the knowledge acquired by observing how humans grasp objects is applied, and an analytical approach, in which physical laws are used to model the grasping process. In the former, it is often difficult to deduce, from observation of human behavior, principles that can be applied to approximate models. In the latter, the models often need to be simplified to the point that the approach is only valid for a very limited set of situations. Our final goal is to obtain "realistic looking" images, not to accurately model the physical world. Therefore, we base our study on the observation of human behavior when a grasping process takes place.

Our goal is the creation of a system in which a synthetic actor is able to take objects from the surrounding world. A set of primitives is used to represent the different parts where an object can be grasped.


1.1. Previous Work

We can find the most significant efforts to understand human grasping in both the medical and the robotics literature. A categorization proposed by Schlesinger [Sch19] distinguishes six different grasps: cylindrical, fingertip, hook, palmar, spherical and lateral. McBride [McB42] proposes a classification based on the parts of the hand which take part in the task (whole hand, thumb-finger, palm-digits). A widely used taxonomy is the one introduced by Napier [Nap56]. He noticed that grasps are highly related to the task requirements and divided them into power grasps, required in tasks where strength is needed, and precision grasps, used when fine control is wanted. In [AIL85] the concept of "virtual finger" is introduced: a virtual finger is one or more real fingers which work together in a task. Lyons [Lyo85] introduces three grasps, depending on the level of interaction with the object: the encompass grasp, in which the object is firmly held; the lateral grasp, in which rotational movements can be transmitted to the object; and the precision grasp, where arbitrarily small motions of the object are allowed.

Iberall [Ibe87] uses the opposing forces applied by virtual fingers to classify human grasps. He considers a set of three "oppositions": the pad opposition, between the pads of the thumb and the fingers, which offers flexibility at the expense of stability; the palm opposition, between the palm and the digits, which sacrifices flexibility in favor of stability; and the side opposition, between the thumb pad and the side of the index finger, which is a compromise between flexibility and stability. Grasps are then classified according to the oppositions they apply. A complete taxonomy of human grasps has been proposed by Cutkosky [Cut89] from the empirical observation of manufacturing machinists' operations; the grasps are classified according to the size and shape of the object and the precision required by the task.

In the computer animation field, interest in human grasping arose with the introduction of synthetic actors. In [MaT88] we find one of the first attempts to facilitate the task of animating actors' interaction with their environment. The authors consider this approach semi-automatic: it is the animator who has to position the hand and decide the contact points of the hand with the object. The flexion angle of an articulation that brings the finger into contact with the object is then computed semi-automatically, by finding the angle which corresponds to a given distance using a dichotomy search.
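
A minimal sketch of such a dichotomy (bisection) search, under a toy stand-in for the finger/object distance (a single link of length LINK_LEN flexing towards a horizontal plane); all names and constants here are illustrative, not taken from [MaT88]:

```c
#include <math.h>
#include <stdio.h>

#define LINK_LEN  0.03   /* phalanx length in metres (illustrative) */
#define PLANE_Y  -0.02   /* object surface: the horizontal plane y = PLANE_Y */

/* Toy stand-in for the real finger/object distance: the fingertip of a single
 * link rotating about the origin; flexing (increasing theta) lowers the tip
 * towards the plane, so the distance decreases monotonically. */
static double fingertip_object_distance(double theta)
{
    double tip_y = -LINK_LEN * sin(theta);
    return tip_y - PLANE_Y;
}

/* Bisect the flexion interval [lo, hi] until the contact angle is bracketed
 * to within tol; assumes the finger is free at lo and in contact at hi. */
static double find_flexion_for_contact(double lo, double hi, double tol)
{
    while (hi - lo > tol) {
        double mid = 0.5 * (lo + hi);
        if (fingertip_object_distance(mid) > 0.0)
            lo = mid;   /* still above the surface: flex further */
        else
            hi = mid;   /* contact or penetration: back off */
    }
    return 0.5 * (lo + hi);
}

int main(void)
{
    double theta = find_flexion_for_contact(0.0, M_PI / 2.0, 1e-5);
    printf("contact flexion: %.4f rad\n", theta);
    return 0;
}
```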

[RiG91] presents a full description of a grasping system that allows both an automatic and an animator-chosen grasp. They start from the strategy first introduced by [Tom87] in the robotics literature; the main idea is to approximate the objects with simple primitives. The mechanisms to grasp the primitives are known in advance and constitute what the authors call a knowledge database. Grasping is divided into three phases: a task initialization phase, where the target object is classified as a single primitive to determine the grasping strategy; a target approach phase, in which the final grasp is decided; and a grasp execution phase, where the hand closes around the object. However, in their approach, Rijpkema and Girard consider the grasp mode (palmar or lateral) as a decision to be taken by the animator. This is a correct approach if the goal is to facilitate the animator's task. But when the main goal is to give "synthetic life" to a "synthetic actor", the animator's role is reduced to procedural instructions such as "take the glass on the table". The dependence of the actor on the animator has to be reduced.

2. The Human Hand Model

The human hand is composed of 27 bones, which are divided into three groups: the carpals or wrist bones, the metacarpals or palm bones, and the phalanges or finger bones (fig 1). In our skeleton we model the distal and proximal interphalangeal joints of all the fingers as one-degree-of-freedom joints (flexion). To model the other joints we have to make a distinction between the thumb and the rest of the fingers, given the higher versatility of the former. The thumb has been modeled with a pivoting rotation around the axis formed by the index finger, a flexion and an abduction. The metacarpophalangeal joints of the other fingers are modeled with two degrees of freedom: a flexion and a pivot. The joint values have been limited according to Table 1.
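
One possible representation of such a skeleton is sketched below; the degree-of-freedom layout follows the text, but the enum names and limit values are illustrative placeholders, since Table 1's numbers are not reproduced here:

```c
/* A sketch of a jointed hand skeleton with limited degrees of freedom. */
typedef enum { FLEXION, ABDUCTION, PIVOT } DofType;

typedef struct {
    DofType type;
    double  value;      /* current angle in radians */
    double  min, max;   /* joint limits, as tabulated in Table 1 */
} Dof;

typedef struct {
    Dof dofs[3];        /* up to three rotational DOFs per joint */
    int ndofs;
} Joint;

/* e.g. a metacarpophalangeal joint of a non-thumb finger: flexion + pivot
 * (the numeric limits below are illustrative, not Table 1's values). */
static const Joint mcp_example = {
    { { FLEXION, 0.0, -0.26, 1.57 },
      { PIVOT,   0.0, -0.26, 0.26 } },
    2
};
```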


In Figure 2 we show a graphic representation of our hand skeleton model. The fingers and the palm use a geometric representation (fig 3) composed of cylinders and spheres: the spheres represent the joints and the cylinders represent the links between joints.

Table 1: Finger joint values

Figure 1. The human hand

Figure 2. The Skeleton model. Links are represented by segments and joints by nodes


Figure 3. A solid representation of the hand model

3. The Hand Control Subsystem

The system allows the user to define new hand postures that will be used to describe a grasp. To generate a hand posture, both direct and inverse kinematics can be used to control all the articulations of the hand.

3.1. Direct kinematics

In direct kinematics, the values of the articulations are known and the problem is to determine the Cartesian position of the end of the articulated chain.
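
As an illustration, here is a minimal planar sketch of this computation for one finger, assuming three pure flexion joints and illustrative phalanx lengths (none of these values come from the paper):

```c
#include <math.h>
#include <stdio.h>

#define NJOINTS 3

int main(void)
{
    const double len[NJOINTS]   = { 0.045, 0.025, 0.018 };  /* metres */
    const double theta[NJOINTS] = { 0.30, 0.45, 0.30 };     /* flexions, rad */

    /* Each joint's rotation is transmitted to the rest of the chain, so the
     * angles accumulate while we walk from the knuckle to the fingertip. */
    double x = 0.0, y = 0.0, acc = 0.0;
    for (int i = 0; i < NJOINTS; i++) {
        acc += theta[i];
        x += len[i] * cos(acc);
        y += len[i] * sin(acc);
    }
    printf("fingertip at (%.4f, %.4f)\n", x, y);
    return 0;
}
```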

Using direct kinematics, the user can rotate the desired articulation by a number of degrees; the resulting rotation is automatically transmitted to the following joints in the chain, updating the skeleton structure. The affected articulation can be selected either by using a two-dimensional representation of the skeleton data structure [Bou93] or by directly picking the articulation in the graphics window. The first approach can be tedious if the user is not used to the representation and is unable to associate a 2D node with the wanted articulation. The drawback of the second approach is that every joint can have up to three rotational degrees of freedom, all of them placed together and represented by the same graphical entity. The most effective way to select an articulation is to pick it in the graphics window and use the up and down arrows to move through the two-dimensional representation. To optimize the selection mechanism, a graphic entity is associated with the most frequently manipulated degree of freedom, the flexion.

Once the articulation has been selected, the angular value can be adjusted either by displacing a slider in the control panel, where the name and current value of the articulation are continuously displayed, or by dragging the cursor in the graphics window.

3.2. Inverse kinematics

In the inverse kinematics problem, we have to determine the joint values which bring the end-effector of the articulated chain (the main task), corresponding in our case to the tips of the fingers, to a desired position and orientation. The problem consists in solving the equation

X = f(Θ)    (1)

where X is the position and orientation of the end-effector at a given instant of time (m-dimensional) and Θ is the vector describing the current angular configuration of the articulations (n-dimensional).

A general solution to the previous equation is given by

ΔΘ = J⁺ ΔX    (2)

where ΔΘ is the solution in the articulated space (n-dimensional), ΔX is the desired variation of the position and orientation of the end-effector in Cartesian space (m-dimensional), and J is the Jacobian matrix describing the linearization of the problem: it expresses the variation of the Cartesian coordinates with respect to the joints of the chain. J⁺ is the unique pseudo-inverse of J giving the solution of minimum norm for the main task.

The dimension m of the main task has to be lower than or equal to the articular dimension n of the chain to be sure that a solution exists.
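
As a concrete illustration, the sketch below iterates equation (2) for the planar three-joint finger of the earlier example (n = 3, m = 2), computing the pseudo-inverse J⁺ = Jᵀ(JJᵀ)⁻¹ analytically; the lengths, start posture and goal are illustrative, and the real system of course works on the full three-dimensional chain:

```c
#include <math.h>
#include <stdio.h>

#define N 3                                    /* number of joints */
static const double len[N] = { 0.045, 0.025, 0.018 };

static void tip_position(const double th[N], double *x, double *y)
{
    double a = 0.0; *x = 0.0; *y = 0.0;
    for (int i = 0; i < N; i++) {
        a += th[i];
        *x += len[i] * cos(a);
        *y += len[i] * sin(a);
    }
}

/* Column i of J is d(tip)/d(theta_i) = z x (tip - joint_i) in the plane. */
static void jacobian(const double th[N], double J[2][N])
{
    double a = 0.0, px[N + 1], py[N + 1];
    px[0] = py[0] = 0.0;
    for (int i = 0; i < N; i++) {
        a += th[i];
        px[i + 1] = px[i] + len[i] * cos(a);
        py[i + 1] = py[i] + len[i] * sin(a);
    }
    for (int i = 0; i < N; i++) {
        J[0][i] = -(py[N] - py[i]);
        J[1][i] =   px[N] - px[i];
    }
}

int main(void)
{
    double th[N] = { 0.2, 0.2, 0.2 }, goal[2] = { 0.05, 0.03 };

    for (int it = 0; it < 100; it++) {
        double x, y, J[2][N];
        tip_position(th, &x, &y);
        double dx = goal[0] - x, dy = goal[1] - y;
        if (hypot(dx, dy) < 1e-6) break;

        jacobian(th, J);
        /* A = J J^T is 2x2; invert it analytically. */
        double a = 0, b = 0, d = 0;
        for (int i = 0; i < N; i++) {
            a += J[0][i] * J[0][i];
            b += J[0][i] * J[1][i];
            d += J[1][i] * J[1][i];
        }
        double det = a * d - b * b;
        if (fabs(det) < 1e-12) break;          /* near-singular: stop */
        double u = ( d * dx - b * dy) / det;   /* (J J^T)^-1 dX */
        double v = (-b * dx + a * dy) / det;
        for (int i = 0; i < N; i++)            /* dTheta = J^T (J J^T)^-1 dX */
            th[i] += J[0][i] * u + J[1][i] * v;
    }
    double x, y; tip_position(th, &x, &y);
    printf("tip: (%.4f, %.4f)\n", x, y);
    return 0;
}
```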

Using inverse kinematics, the user can interactively select any of the fingers and position it at the desired place. This feature is especially useful for postures where tip contact between several fingers is desired.

In the case of the fingers (little, ring, middle and index), we also want to constrain several of the joints to ensure realistic-looking closing of the hand. As noted in [RiG91], there is a nearly linear relation between the distal and proximal interphalangeal joints (fig 4). Another constraint concerns the flexion of the metacarpophalangeal joints: when the fingers are flexed together, they reduce their adduction in order to avoid buttressing each other (fig 5).

Figure 4. Relation between the distal and proximal interphalangeal joints

Figure 5. Constraining the flexion of the metacarpophalangeal joints
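
A sketch of how these two constraints could be expressed; the 2/3 coupling ratio and the linear fade-out of the adduction are assumed forms for illustration, not values taken from the paper:

```c
/* DIP flexion follows PIP flexion nearly linearly (fig 4); 2/3 is a commonly
 * used approximation, not a value from the paper. */
double dip_from_pip(double pip)
{
    return (2.0 / 3.0) * pip;
}

/* Reduce a finger's adduction as its metacarpophalangeal flexion grows, so
 * neighbouring fingers do not buttress each other (fig 5); mcp_max is the
 * flexion at which the adduction is assumed to vanish. */
double constrained_adduction(double adduction, double mcp_flexion,
                             double mcp_max)
{
    double s = 1.0 - mcp_flexion / mcp_max;
    return adduction * (s < 0.0 ? 0.0 : s);
}
```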


During direct kinematics, the constraints are applied while the joint values are being modified; during inverse kinematics, the influence of the coupled articulations is introduced in the corresponding columns of the Jacobian matrix.
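
For the planar example above, folding the assumed DIP = (2/3)·PIP coupling into the Jacobian could look as follows: by the chain rule, the coupled column is absorbed into the independent one and the coupled joint drops out as an unknown:

```c
/* Fold a coupled joint into its driving joint's Jacobian column.
 * J is the 2x3 Jacobian of the planar sketch; pip and dip are column
 * indices; ratio is the assumed coupling dDIP/dPIP (here 2/3). */
void fold_coupled_column(double J[2][3], int pip, int dip, double ratio)
{
    for (int row = 0; row < 2; row++) {
        J[row][pip] += ratio * J[row][dip];  /* chain rule contribution */
        J[row][dip]  = 0.0;                  /* DIP no longer independent */
    }
}
```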

4. The Grasping Subsystem

We consider the basic primitives proposed by [Tom87] the most adequate to help in the grasp choice. The selected primitives are the cylinder, the sphere and the box. These primitives are used as feature-based descriptions of the objects: there is no direct association between a primitive and an object geometry [MaG91], but an association between a primitive and a part of the object with an important role in determining the grasp. The objects are stored together with the primitives corresponding to their graspable parts. We use the grasp taxonomy proposed by [CuH90], a revision of [CuW86]. The grasps are classified depending on the degree of precision of the task and on the geometrical characteristics of the objects. We relate the precision of the task to a combination of the geometrical attributes of the primitive and its mass.
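
A sketch of such a feature-based description; all field names are illustrative:

```c
/* Each graspable part of an object carries one primitive with its own
 * local coordinate frame, extents and mass. */
typedef enum { PRIM_BOX, PRIM_SPHERE, PRIM_CYLINDER } PrimType;

typedef struct {
    PrimType type;
    double   frame[4][4];   /* local frame: position and orientation */
    double   size[3];       /* box extents, or radius/length as needed */
    double   mass;          /* used when estimating the required grasp */
} GraspPrimitive;

typedef struct {
    const char     *name;
    GraspPrimitive *parts;  /* one primitive per graspable part */
    int             nparts; /* e.g. a racket: cylinder (handle) + box (head) */
} GraspableObject;
```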

The grasping process can be divided into the following steps:

1. User task specification.
2. Grasp selection.
3. Geometric calculation of the hand position.
4. Grasp execution.

4.1. User Task Specification

First of all, we have to decide which object is going to be grasped. The user directly selects, in the graphics window, the primitive corresponding to the part that has to be grasped. In a second step, the user can select the hand to be used; if none is indicated, the right hand is used by default, provided later computations decide that one hand is enough to handle the object; if not, both hands are used.

4.2. Automatic grasp selection

The adaptation of the Cutkosky and Howe taxonomy to the primitives scheme is summarized in Table 2.

Table 2: Adaptation of the Cutkosky and Howe taxonomy to the primitives scheme

In all cases, when the primitive is too small to be grasped by more than two fingers, the grasp applied is a pinch. In the same way, if the object is too large to be grasped with one hand, a two-handed grasp is applied, with the hands placed on opposite sides of the object.

To determine whether a parameter is small or large, we compare it with the size of the hand: a sphere has a large diameter, for instance, if its diameter exceeds the width of the hand. The mass is compared with a pre-defined, user-adjustable threshold.
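
A sketch of how this selection logic might look, reusing the PrimType enum from the primitive sketch above; the thresholds and the primitive-to-grasp mapping are illustrative stand-ins for Table 2, whose entries are not reproduced here:

```c
typedef enum { GRASP_PINCH, GRASP_TRIPOD, GRASP_CYLINDER_WRAP,
               GRASP_SPHERE_WRAP, GRASP_DISK_WRAP, GRASP_TWO_HANDED } Grasp;

/* Pick a grasp by comparing the primitive's size with the hand's size;
 * the 2x factors are illustrative thresholds. */
Grasp select_grasp(PrimType type, double diameter,
                   double hand_width, double finger_width)
{
    if (diameter < 2.0 * finger_width)   /* too small for >2 fingers */
        return GRASP_PINCH;
    if (diameter > 2.0 * hand_width)     /* too large for one hand */
        return GRASP_TWO_HANDED;
    switch (type) {
    case PRIM_CYLINDER: return GRASP_CYLINDER_WRAP;
    case PRIM_SPHERE:   return GRASP_SPHERE_WRAP;
    default:            return GRASP_DISK_WRAP;   /* box and the rest */
    }
}
```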


4.3. Geometric Calculations of the Hand Position

In our system we use a local coordinate system for the reference point in the hand and a local coordinate system for the primitive. The position and orientation of the primitive coordinate system determine how the object will be grasped: it is used as an inverse kinematics goal for the reference coordinate system of the hand. The axes of a primitive are oriented as follows: the y-axis points upwards, the z-axis points towards the palm of the hand and the x-axis is the cross product of y and z.
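
The frame is completed by a plain cross product, as sketched here:

```c
typedef struct { double x, y, z; } Vec3;

Vec3 cross(Vec3 a, Vec3 b)
{
    Vec3 r = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return r;
}

/* x_axis = cross(y_axis, z_axis) completes the primitive's frame. */
```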

Block Reference

We call the face of the block intersected by the z-axis the goal face; the faces where the fingers touch the block are called the contact faces. A block is used to describe the parts of the object that have to be grasped using a pad opposition [Ibe87]. To determine the grasp position we only need to select the appropriate pair of opposed faces of the block (the contact faces). To reduce the number of plausible goal faces we use the following heuristics:

1. From human observation we have noticed that we very often place our thumb on a visible face of the object to be grasped. With this simple test we can reduce the number of grasp positions from 24 to 12.

2. Directing the palm toward the exterior side of the body while grasping an object results in a very unnatural grasp position. This allows us to eliminate some of the remaining possibilities.

The application of the first heuristic reduces the possible contact faces to three pairs. For every pair of contact (opposite) faces there are four plausible goal faces, which makes a total of 12 possible grasp positions. After applying the second heuristic, some further postures of the hand can be eliminated. From the reduced set we choose the contact faces as those minimizing a weighted sum of the translation and rotation distances between the initial and final hand positions [MaG91]. Knowing the opposing faces gives us the orientation of the reference axes. The position of the block reference will be the center of mass of the primitive, so the first attempt to grasp a block is to close the involved fingers around its center of mass. However, there are cases where this is not possible. To correctly find the reference position we proceed as follows:

1. Set the center of mass of the block as the initial position of the axes.
2. Transform the block to the hand coordinate reference system and check whether it intersects the initial hand posture.
3. If no intersection is found, we have a valid position.
4. If an intersection is found, the position of the primitive reference has to be moved toward the border of the object. A dichotomy search (see the sketch after this list) finds the point nearest to the center where no intersection is detected, so that the reference axes are located as near to the center as possible.
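
A sketch of the dichotomy search of step 4, under the assumption that the border point itself is collision-free; `intersects` stands in for the system's hand/block collision test, applied with the hand posed at the candidate reference point:

```c
typedef struct { double x, y, z; } Vec3;

static Vec3 lerp3(Vec3 a, Vec3 b, double t)
{
    Vec3 r = { a.x + t * (b.x - a.x),
               a.y + t * (b.y - a.y),
               a.z + t * (b.z - a.z) };
    return r;
}

/* Slide the reference from the center of mass (t = 0) towards the border
 * (t = 1) until the open hand no longer intersects the block, keeping the
 * reference as near to the center as possible. */
Vec3 place_reference(Vec3 center, Vec3 border, double tol,
                     int (*intersects)(Vec3))
{
    double lo = 0.0, hi = 1.0;
    while (hi - lo > tol) {
        double mid = 0.5 * (lo + hi);
        if (intersects(lerp3(center, border, mid)))
            lo = mid;            /* still colliding: move outwards */
        else
            hi = mid;            /* free: try nearer to the center */
    }
    return lerp3(center, border, hi);
}
```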

Sphere and Cylinder Reference

The orientation of the cylinder reference coordinate system is deduced from the orientation of its axis; the position of the reference is the center of the axis. In the case of the sphere we have a much wider orientation space. We solve the problem by defining an axis for the sphere primitive; the same method as the one applied to the cylinder can then be used to find the sphere reference. The information implicitly contained in the primitive references, such as the orientation, can be used to better describe the parts of the object that can be grasped.

Hand References

The coordinate system of the hand is grasp dependent, so a different reference coordinate system is initialized for every grasp. In the case of the pinch grasp, where the thumb and the index are considered the most involved fingers, the position of the hand reference is the middle point of the line joining the tips of the thumb and the index when they are at a distance corresponding to the width of the goal face. This point is found by a simple interpolation (fig 6). The z-axis is perpendicular to the palm of the hand, the x-axis is perpendicular to the fingers, and the y-axis is their cross product. For the disk and sphere wraps, the position of the reference is the center of the open hand; for the cylinder wrap, it is at the index metacarpophalangeal joint. The tripod uses as a reference the meeting point of the thumb, index and middle fingers (fig 7).

Figure 6. Hand reference for the pinch grasp

Figure 7. Hand references for (a) the cylinder wrap, (b) the disk wrap, (c) the tripod and (d) the sphere
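
A sketch of the pinch-reference interpolation, assuming the tip separation varies roughly linearly between the closed and open pinch postures (both postures taken as given by the hand model; all names are illustrative):

```c
#include <math.h>

typedef struct { double x, y, z; } Vec3;

static double dist3(Vec3 a, Vec3 b)
{
    return sqrt((a.x - b.x) * (a.x - b.x) +
                (a.y - b.y) * (a.y - b.y) +
                (a.z - b.z) * (a.z - b.z));
}

static Vec3 lerp3(Vec3 a, Vec3 b, double t)
{
    Vec3 r = { a.x + t * (b.x - a.x),
               a.y + t * (b.y - a.y),
               a.z + t * (b.z - a.z) };
    return r;
}

/* Given thumb/index tip positions at the closed (0) and open (1) pinch,
 * pick the opening whose tip separation matches the goal-face width, and
 * return the midpoint of the two interpolated tips as the hand reference. */
Vec3 pinch_reference(Vec3 thumb0, Vec3 thumb1, Vec3 index0, Vec3 index1,
                     double face_width)
{
    double d0 = dist3(thumb0, index0), d1 = dist3(thumb1, index1);
    double t = (fabs(d1 - d0) < 1e-12) ? 1.0 : (face_width - d0) / (d1 - d0);
    if (t < 0.0) t = 0.0; else if (t > 1.0) t = 1.0;
    Vec3 thumb = lerp3(thumb0, thumb1, t), index = lerp3(index0, index1, t);
    return lerp3(thumb, index, 0.5);     /* midpoint between the two tips */
}
```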

4.4. Grasp Execution

Now we have all the prerequisites to execute the grasp. We use inverse kinematics to make the reference axes of the hand match, in orientation and position, the reference axes of the primitive. When the goal has been reached, an interpolation process takes place between the initial position of the hand and the rest position associated with the grasp. Collisions between the hand elements and the object are detected to determine the exact final position of every finger. The final position of the hand is stored together with the final configuration of the arm, and a visualization of the full process is then produced by interpolating from the initial positions of the arm and the hand.
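
A sketch of this closing phase: each finger's posture is swept from the initial position towards the grasp's rest position and frozen at the first contact; `finger_collides` stands in for the system's hand/object collision test:

```c
#define NFINGERS 5
#define NSTEPS   50

/* finger_collides(f, t) reports contact with the object for finger f posed
 * at interpolation parameter t (0 = initial posture, 1 = rest posture). */
void close_hand(double t_final[NFINGERS],
                int (*finger_collides)(int finger, double t))
{
    for (int f = 0; f < NFINGERS; f++) {
        t_final[f] = 1.0;                    /* default: close completely */
        for (int s = 1; s <= NSTEPS; s++) {
            double t = (double)s / NSTEPS;
            if (finger_collides(f, t)) {     /* first contact: freeze finger */
                t_final[f] = (double)(s - 1) / NSTEPS;
                break;
            }
        }
    }
}
```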

4.5. Results

The grasping system has been implemented in C on SGI IRIS Indigo-2 workstations. In the following pages we show some images of the resulting grasps when applied to block (Fig. 8), cylinder (Fig. 9 and Fig. 10* in the Color Section) and sphere (Fig. 11) primitives of several sizes. Both single-handed and two-handed grasps are shown. The last example is a sequence where two primitives are used to describe the parts of a tennis racket that can be grasped (Fig. 12), together with the resulting grasp when the actor is asked to take the cylindrical one.

Figure 8. Block grasping using fingers, one hand and two hands

Figure 9. Cylinder grasping using two or more fingers

* See page C-526 for Figure 10.


Figure 11. Sphere grasping using fingers, one hand and two hands

Figure 12. Racket grasping


Acknowledgments

The research was partly supported by the Swiss National Research Foundation and OFES and is part of the ESPRIT Project HUMANOID.

References

[AIL85] Arbib, M.A., Iberall, T. and Lyons, D., Coordinated Control Programs for Movements of the Hand, in: Goodwin, A.W. and Darian-Smith, I. (eds), Hand Function and the Neocortex, Springer-Verlag, Berlin (1985).

[Bou93] Boulic, R., Huang, Z., et al., Human Data Structure, User Reference and Implementation Manual, ESPRIT Project HUMANOID: A Real-time and Parallel System for the Simulation of Virtual Humans, EPFL DI-LIG (1993).

[CuH90] Cutkosky, M.R. and Howe, R.D., Human Grasp Choice and Robotic Grasp Analysis, in: Venkataraman, S.T. and Iberall, T. (eds), Dextrous Robot Hands, Springer-Verlag (1990).

[Cut89] Cutkosky, M.R., On Grasp Choice, Grasp Models, and the Design of Hands for Manufacturing Tasks, IEEE Transactions on Robotics and Automation, 5(3), pp. 269-279 (1989).

[CuW86] Cutkosky, M.R. and Wright, P.K., Modeling Manufacturing Grips and Correlations with the Design of Robotic Hands, Proceedings 1986 IEEE International Conference on Robotics and Automation (1986).

[Ibe87] Iberall, T., The Nature of Human Prehension: Three Dextrous Hands in One, Proceedings 1987 IEEE International Conference on Robotics and Automation, Raleigh, NC, pp. 396-401 (1987).

[Lyo85] Lyons, D., A Simple Set of Grasps for a Dextrous Hand, Proceedings 1985 IEEE International Conference on Robotics and Automation, St. Louis, MO, pp. 588-593 (1985).

[MaT88] Magnenat-Thalmann, N., Laperriere, R. and Thalmann, D., Joint-Dependent Local Deformations for Hand Animation and Object Grasping, Proceedings Graphics Interface '88, Edmonton (1988).

[McB42] McBride, E.D., Disability Evaluation, Third Edition, J.B. Lippincott Co. (1942).

[Nap56] Napier, J.R., The Prehensile Movements of the Human Hand, Journal of Bone and Joint Surgery, 38B(4), pp. 902-913 (1956).

[RiG91] Rijpkema, H. and Girard, M., Computer Animation of Knowledge-Based Human Grasping, Computer Graphics (Proceedings SIGGRAPH '91), 25(4), pp. 339-348 (1991).

[Sch19] Schlesinger, G., Der mechanische Aufbau der künstlichen Glieder, in: Borchardt, M. et al. (eds), Ersatzglieder und Arbeitshilfen für Kriegsbeschädigte und Unfallverletzte, Springer, Berlin (1919).

[Tom87] Tomovic, R., Bekey, G.A. and Karplus, W.J., A Strategy for Grasp Synthesis with Multifingered Robot Hands, Proceedings 1987 IEEE International Conference on Robotics and Automation, pp. 83-89 (1987).

[Tur90] Turner, R., Gobbetti, E., Mangili, A., Thalmann, D. and Magnenat Thalmann, N., An Object-Oriented Methodology Using Dynamic Variables for Animation and Scientific Visualization, Proceedings CGI '90, Springer-Verlag, pp. 317-327 (1990).