HAL Id: hal-00646405
https://hal.inria.fr/hal-00646405
Submitted on 19 Feb 2013

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

To cite this version: Christophe Lino, Mathieu Chollet, Marc Christie, Rémi Ronfard. Computational Model of Film Editing for Interactive Storytelling. ICIDS 2011 - International Conference on Interactive Digital Storytelling, Nov 2011, Vancouver, Canada. pp.305-308, 10.1007/978-3-642-25289-1_35. hal-00646405


Computational Model of Film Editing for Interactive Storytelling

Christophe Lino1,2, Mathieu Chollet1,3, Marc Christie1, Rémi Ronfard3

1 MIMETIC, INRIA Rennes Bretagne - Atlantique, France
2 School of Computing Science, Newcastle University, UK
3 IMAGINE, INRIA Grenoble Rhône-Alpes, France

Abstract. Generating interactive narratives as movies requires knowledge in cinematography (camera placement, framing, lighting) and film editing (cutting between cameras). We present a framework for generating a well-edited movie from interactively generated scene contents and cameras. Our system computes a sequence of shots by simultaneously choosing which camera to use, when to cut in and out of the shot, and which camera to cut to.

Keywords: Camera planning, Virtual Cinematography

1 Introduction

In interactive storytelling, it is useful to present 3D animation in a cinematic style, which means selecting appropriate cameras and appropriate inter-cutting between cameras to properly convey the narrative. We propose an optimization framework for selecting shots and cuts while the narrative unfolds, based on a relatively simple scoring scheme driven by working practices of film and television [3, 6]. We cast the problem of film editing as selecting a path inside an editing graph which consists of a collection of evolving film takes (a take is a continuous sequence of images from a given camera) and precisely deciding when to cut in and out of film takes. In contrast to related work, we also account for a precise enforcement of pacing (the rhythm at which cuts are performed). We propose an algorithm suitable for online editing which uses an efficient best-first search technique. The algorithm relies on short-term anticipation to improve quality in cuts and produce movies consistent with the rules of cinematography and editing, including shot composition, continuity editing and pacing.

The paper is organized in two parts. The first part describes the score functions used to evaluate shots, transitions and pacing, illustrated by a number of examples. The second part explains the search process for exploring the editing graph during the storytelling process with a very minimal lookahead.

2 Film Grammar Rules

In our system the score of a movie is built up incrementally as the sum of the scores of its shot fragments and transitions. The cost per fragment (a fragment is a part of a take of duration $\Delta t$) is evaluated as a weighted sum of all violations of the rules of frame composition, $w_k \times C^S_k$. Similarly, the cost of a transition (or cut) is evaluated as a weighted sum of all violations of the rules of editing, $w_l \times C^T_l$ (see equation 1).

Fig. 1. Action costs. Left: Three shots of a drinking action. Right: Three shots of a pouring action. Panel costs $C^S_A$, in reading order: 0.8, 0.3, 0.2 (drinking); 0.7, 0.5, 0.2 (pouring).

We compute a complete sequence $s$ as a sequence of shots $s_i$ of durations $d_i$ and cuts between shots. Each shot $s_i$ is processed as the concatenation of fragments $f(t)$ where $t$ is a time interval of length $\Delta t$. We then assume that the cost of $s$ is the sum of the costs for all of its fragments and cuts, plus a function $C^P$ of the durations of shots:

$$C(s) = \sum_t \Big( \sum_k w_k \times C^S_k(f(t), t) + \sum_l w_l \times C^T_l(f(t), f(t+1)) \Big) + \sum_i C^P(d_i) \tag{1}$$
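As an illustration of how equation (1) accumulates, here is a minimal Python sketch. It is not the authors' implementation: the function name `sequence_cost`, the representation of each fragment by the label of the take it comes from, and the callback signatures are all assumptions made for the example. The cost terms themselves are passed in as plain functions.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical types: each term is a (weight, cost function) pair.
ShotTerm = Tuple[float, Callable[[str, int], float]]  # (w_k, C^S_k(f(t), t))
CutTerm = Tuple[float, Callable[[str, str], float]]   # (w_l, C^T_l(f(t), f(t+1)))


def sequence_cost(
    takes: Sequence[str],
    shot_terms: Sequence[ShotTerm],
    cut_terms: Sequence[CutTerm],
    pacing_cost: Callable[[float], float],
    dt: float = 1.0,
) -> float:
    """Evaluate equation (1) for a sequence given as one take label per fragment.

    Assumption: a shot is a maximal run of consecutive fragments taken
    from the same take, so its duration d_i is (run length) * dt.
    """
    total = 0.0
    run_length = 0
    for t, take in enumerate(takes):
        # Weighted composition violations for fragment f(t).
        total += sum(w * cost(take, t) for w, cost in shot_terms)
        run_length += 1
        nxt = takes[t + 1] if t + 1 < len(takes) else None
        if nxt is not None:
            # Weighted transition violations between f(t) and f(t+1).
            total += sum(w * cost(take, nxt) for w, cost in cut_terms)
        if nxt != take:
            # The current shot ends here: add its pacing cost C^P(d_i).
            total += pacing_cost(run_length * dt)
            run_length = 0
    return total
```

In this sketch the transition terms are evaluated for every pair of consecutive fragments, as in equation (1), so a transition cost function is expected to return zero when both fragments belong to the same take.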

2.1 Shot composition

The cost of a shot fragment integrates the violation of three terms: action, visibility, and composition. The action term $C^S_A(t)$ measures the amount of the scene action which is missed in the given fragment, computed as a sum over all actions $a$ occurring during the fragment:

$$C^S_A(t) = \sum_a \mathrm{imp}(a) \times M_A[\mathrm{type}(a), \mathrm{size}(a), \mathrm{angle}(a)] \tag{2}$$

where $\mathrm{type}(a)$ is the type of action, $\mathrm{imp}(a)$ its importance in the narrative, $\mathrm{size}(a)$ the screen size of its protagonist and $\mathrm{angle}(a)$ the profile angle of its protagonist. The action matrix $M_A$ contains empirical preferences for shot framings as a function of action types. Figure 1 illustrates the preferences for shot sizes and profiles for the special case of two action types: pouring and drinking.
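Equation (2) amounts to a table lookup weighted by importance, which the following Python sketch illustrates. The matrix entries, the discrete size and angle categories, the default cost for unlisted framings and all names (`Action`, `ACTION_MATRIX`, `action_cost`) are hypothetical; the paper does not list the contents of $M_A$.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Action:
    type: str          # type(a), e.g. "drink"
    importance: float  # imp(a), importance in the narrative
    size: str          # size(a), discretized screen size of the protagonist
    angle: str         # angle(a), discretized profile angle of the protagonist


# Hypothetical action matrix M_A: cost of framing an action type with a
# given shot size and profile angle (lower means the action is better shown).
ACTION_MATRIX: Dict[Tuple[str, str, str], float] = {
    ("drink", "close-up", "front"): 0.2,
    ("drink", "long", "front"): 0.8,
    ("pour", "medium", "profile"): 0.2,
}

# Assumption: framings missing from the table are treated as fully missed.
DEFAULT_COST = 1.0


def action_cost(actions: Iterable[Action]) -> float:
    """Sum imp(a) * M_A[type(a), size(a), angle(a)] over the fragment's actions."""
    return sum(
        a.importance * ACTION_MATRIX.get((a.type, a.size, a.angle), DEFAULT_COST)
        for a in actions
    )
```

A fragment with no ongoing action gets a cost of zero under this sketch, since the sum is empty.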

2.2 Shot transitions

A transition between shots causes a visual discontinuity. The art of the editor is to minimize the perception of discontinuity by selecting appropriate shots and moments for cutting [1, 2, 5]. In this work, we compute the cost of a cut as a sum of terms measuring discontinuities in the screen positions, gaze directions and motion directions of actors. Moreover, we weight each term with the screen size $\mathrm{size}(i)$ of each actor $i$, so that continuity in the foreground receives a larger reward than in the background.

Fig. 2. Gaze continuity. Left: the gaze direction of the main character changes, resulting in a poor cut. Right: keeping the gaze directions consistent results in a better cut.

Screen continuity. Screen continuity prevents actors in successive shots from appearing to jump around the screen. Because the actor's eyes are the most important center of attention, we favor transitions which maintain the actor's eyes at the same screen location.

Gaze continuity. When watching a movie, the gaze direction of actors should not change from one shot to the next. We thus use a cost function that penalizes camera changes that cause apparent reversals in the actor's gaze directions.

$$C^T_{GAZE} = \sum_{i \in \mathrm{screen}(f_1) \cap \mathrm{screen}(f_2)} \mathrm{size}(i) \times \delta\big(\mathrm{sign}(x^G_{i,f_1}) - \mathrm{sign}(x^G_{i,f_2})\big) \tag{3}$$

where $\mathrm{screen}(f_k)$ represents the actors that project on the screen during fragment $f_k$, $x^G_{i,f_k}$ is the horizontal on-screen coordinate of the gaze direction for actor $i$ in fragment $f_k$, and $\delta$ is the Kronecker symbol. Figure 2 shows two cuts with increasing gaze continuity scores.

Motion continuity. The motion direction of actors in two successive shots should not change either. We thus use a cost function that penalizes camera changes that cause apparent reversals in the actor's motion, defined as:

$$C^T_{MOTION} = \sum_{i \in \mathrm{screen}(f_1) \cap \mathrm{screen}(f_2)} \mathrm{size}(i) \times \delta\big(\mathrm{sign}(x^M_{i,f_1}) - \mathrm{sign}(x^M_{i,f_2})\big) \tag{4}$$

where $x^M_{i,f_k}$ is the screen motion of the actor's eyes in fragment $f_k$ measured in the horizontal on-screen direction, and $\delta$ is the Kronecker symbol.
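The gaze and motion terms share the same form, so a single helper can sketch both. The Python below is an illustration under a stated assumption: following the prose ("penalizes ... apparent reversals"), a term contributes $\mathrm{size}(i)$ when the two signs differ, which is one possible reading of the $\delta$ factor. The function names and the dictionary-based inputs are hypothetical.

```python
from typing import Mapping


def sign(x: float) -> int:
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def reversal_cost(
    size: Mapping[str, float],
    x_before: Mapping[str, float],
    x_after: Mapping[str, float],
) -> float:
    """Size-weighted count of actors whose horizontal direction flips across a cut.

    x_before / x_after map each actor visible in f1 / f2 to the horizontal
    on-screen component of its gaze (for C^T_GAZE) or motion (for C^T_MOTION).
    Assumption: an actor is penalized when the two signs differ.
    """
    shared = x_before.keys() & x_after.keys()  # screen(f1) ∩ screen(f2)
    return sum(
        size[i] for i in shared if sign(x_before[i]) != sign(x_after[i])
    )
```

Only actors visible on both sides of the cut enter the sum, and a large foreground actor whose direction flips costs more than a small background one.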

2.3 Shot durations

To control the pace of the editing, we introduce a duration cost per shot, measuring the deviation from a log-normal law, where $d$ is the duration of the shot, and $\mu$ and $\sigma$ are respectively the mean and the standard deviation of the log-normal law (that is, of $\log d$):

$$C^P(d) = \frac{(\log(d) - \mu)^2}{2\sigma^2} \tag{5}$$

The log-normal distribution is a compact and discriminative representation of shot durations in movies as well as sentence lengths in natural language [4], and its two parameters can be used as a signature of film editing or writing styles.
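Equation (5) translates directly into code. The short Python sketch below is only an illustration; the function name `pacing_cost` and the input validation are additions made for the example, and the natural logarithm is assumed.

```python
import math


def pacing_cost(d: float, mu: float, sigma: float) -> float:
    """Duration cost C^P(d) = (log(d) - mu)^2 / (2 * sigma^2).

    d is the shot duration; mu and sigma are the mean and standard
    deviation of log(d) under the target log-normal law.
    """
    if d <= 0:
        raise ValueError("shot duration must be positive")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (math.log(d) - mu) ** 2 / (2.0 * sigma ** 2)
```

The cost is zero for a shot of duration $e^{\mu}$ and grows symmetrically in $\log d$, so a shot twice as long as $e^{\mu}$ is penalized as much as one half as long.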

3 Film editing as path finding

The computation of an optimal sequence of shots consists in searching for the path of least cost in our editing graph. Exact and efficient algorithms exist for computing a solution offline. For interactive storytelling applications, we instead describe an approximate method that chooses shots and cuts incrementally as the story unfolds and runs at interactive frame rates. At a given depth in the search process (i.e. advancement in time over the fragments), a decision needs to be made whether to stay within the current shot or perform a cut to a shot in another take. We use a short observation window over the next $W$ fragments to compute the best moment for transition. Given that the current shot is $s_c$, for a given time $t$ in the observation window and for each shot $s_i \neq s_c$, we compute the cost $C_{CUT}$ of a possible transition from shot $s_c$ to shot $s_i$, and we compare it to the cost $C_{NOCUT}$ of staying in the current shot.

If $C_{NOCUT}(s_c) \leq \min_i C_{CUT}(s_c, s_i)$ (i.e. the cost of staying in $s_c$ is the minimal cost), we extend the duration of $s_c$ by $\Delta t$ and the observation window is shifted a fragment ahead. If there exists a shot $s_i$ such that $C_{CUT}(s_c, s_i) < C_{NOCUT}(s_c)$ at time $t$, we need to know whether to cut at the current time $t$ to shot $s_i$, or to wait for a better moment. To implement this, we scan successive fragments at $t + \Delta t, t + 2\Delta t, \ldots, t + W\Delta t$ in the observation window until a cost lower than $C_{CUT}(s_c, s_i)$ is found. In that case, the best cut occurs later and the observation window can be shifted a fragment ahead. Otherwise, $t$ represents the best moment for a transition and a cut is performed towards shot $s_i$. Results are presented here http://sites.google.com/site/christophelino/work/film editing.
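The decision rule described above can be sketched in Python as follows. This is a simplified illustration rather than the authors' algorithm: the names `decide_cut` and `edit_online`, the callback signatures for $C_{CUT}$ and $C_{NOCUT}$, indexing time by fragment number, and choosing the cheapest candidate as $s_i$ are all assumptions. The sketch also assumes the cost callbacks can be evaluated up to $W$ fragments ahead, which corresponds to the short-term anticipation mentioned in the introduction.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical callbacks, indexed by fragment number t:
CutCost = Callable[[str, str, int], float]  # C_CUT(s_c, s_i) evaluated at t
StayCost = Callable[[str, int], float]      # C_NOCUT(s_c) evaluated at t


def decide_cut(
    current: str,
    takes: Sequence[str],
    t: int,
    window: int,
    cut_cost: CutCost,
    stay_cost: StayCost,
) -> Optional[str]:
    """Return the take to cut to at fragment t, or None to stay in `current`."""
    candidates = [s for s in takes if s != current]
    if not candidates:
        return None
    # Assumption: among the candidate shots, consider the cheapest one.
    best = min(candidates, key=lambda s: cut_cost(current, s, t))
    best_cost = cut_cost(current, best, t)
    if stay_cost(current, t) <= best_cost:
        return None  # staying is at least as good as any cut
    # Scan the observation window for a cheaper moment to make the same cut.
    for k in range(1, window + 1):
        if cut_cost(current, best, t + k) < best_cost:
            return None  # a better moment occurs later: wait
    return best  # t is the best moment in the window: cut now


def edit_online(
    takes: Sequence[str],
    n_fragments: int,
    window: int,
    cut_cost: CutCost,
    stay_cost: StayCost,
    start: str,
) -> List[str]:
    """Choose one take per fragment, shifting the window one fragment at a time."""
    current = start
    chosen: List[str] = []
    for t in range(n_fragments):
        target = decide_cut(current, takes, t, window, cut_cost, stay_cost)
        if target is not None:
            current = target
        chosen.append(current)
    return chosen
```

Because each step inspects at most $W$ future fragments for a single candidate, the work per fragment stays small, which is consistent with the goal of running at interactive frame rates.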

4 Conclusion

We have introduced a novel framework for virtual cinematography and editing which adds an evaluation function to previous approaches. Preliminary results demonstrate that our approach is efficient in separating correct from incorrect shot sequences in complex narratives with many actors and actions, and is thus appropriate for future research in film-mediated interactive storytelling.

This work has been funded (in part) by the European Commission under grant agreement IRIS (FP7-ICT-231824).

References

1. Berliner, T., Cohen, D.J.: The illusion of continuity: Active perception and the classical editing system. Journal of Film and Video 63(1), 44–63 (2011)
2. D'Ydewalle, G., Vanderbeeken, M.: Perceptual and cognitive processing of editing rules in film (1990)
3. Murch, W.: In the blink of an eye (1986)
4. Salt, B.: Film Style and Technology: History and Analysis. Starword (2003)
5. Smith, T.J.: An Attentional Theory of Continuity Editing. Ph.D. thesis, University of Edinburgh (2005)
6. Thompson, R.: Grammar of the Edit. Focal Press (1993)