1

www.q2s.ntnu.no

Parametric Modeling of Human Response to a Sudden Tempo Change

129th AES Convention, Paper 8308

November 7, 2010, San Francisco

Nima Darabi1, U. Peter Svensson1, Jon Forbord1

Centre for Quantifiable Quality of Service in Communication Systems (Q2S)

Norwegian University of Science and Technology (NTNU)

2

Sensorimotor synchronization (SMS)

Definitions:

• Rhythmic synchronization between a timed sensory stimulus and a motor activity.

• Rhythmic coordination of perception and action.

• Referential behavior: behavior of any kind in which a stimulus (input) is used to produce a response (output).

Goal: To model human tapping behavior to a recurring external (auditory/visual) source.

3

The Subjective Test Setup

• Task: The subjects were asked to perform a basic SMS task: tapping one-to-one (1:1) in synchronization with an auditory stimulus.

• Mediums: Hand clapping / Finger tapping

• Tempos:
– A = 100 bpm
– B = 140 bpm
– C = 180 bpm

• Tempo steps:
– Positive: AB, BC, AC
– Negative: BA, CB, CA

• Number of repetitions: 10

• Example of a generated sequence: BCBACBCABACBCBCBACACABABCABACABCB-ACBCABACBCBCBACACABABCABACAB
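The slides do not say how the tempo sequence was generated. As a minimal sketch, assuming the only requirement is that each of the six tempo steps occurs the same number of times, such a sequence can be built as a randomized Eulerian circuit over the three tempos (Hierholzer's algorithm). The function name `tempo_sequence` and its parameters are hypothetical.

```python
import random

def tempo_sequence(labels=("A", "B", "C"), reps=10, seed=None):
    """Return a string of tempo labels in which every ordered step
    between two different tempos occurs exactly `reps` times."""
    rng = random.Random(seed)
    # remaining[u] holds the targets of the not-yet-used steps leaving tempo u.
    remaining = {u: [v for v in labels if v != u] * reps for u in labels}
    for targets in remaining.values():
        rng.shuffle(targets)
    # Iterative Hierholzer: follow unused steps, backtrack when stuck.
    stack = [rng.choice(labels)]
    path = []
    while stack:
        u = stack[-1]
        if remaining[u]:
            stack.append(remaining[u].pop())
        else:
            path.append(stack.pop())
    path.reverse()
    return "".join(path)

sequence = tempo_sequence(seed=1)
```

With three tempos and 10 repetitions this gives 6 × 10 = 60 steps, so the sequence contains 61 tempo segments and starts and ends on the same tempo.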

4

The Subjective Test Setup

Right: A subject taking the interactive sensorimotor test while being monitored by the examiner

Left: Test application interface

5

Objectivity – Tapping sessions

For the tapping sessions, the subjects were encouraged to lead the discrete action with the wrist, to make an abrupt, pulsed movement, and to suddenly release the downward force on the MacBook's space bar. Compared with smooth, continuous movements, such discrete actions decrease the asynchrony [1].

[1] M.T. Elliott, A.E. Welchman, A.M. Wing, “Being discrete helps keep to the beat”, Exp Brain Res (2009) 192: 731-737

6

Objectivity – Clapping Sessions

For the clapping trials, among the eight different configurations that Repp has proposed [2][3], participants were instructed to clap in the fingers-to-palm position, with the right hand at a right angle to the left hand and with a natural curvature. This common, natural configuration is called the A3 clapping mode and has the highest central frequency [4].

[2] Bruno H. Repp, “The sound of two hands clapping: An exploratory study”, Journal of the Acoustical Society of America 81 (4), April 1987

[3] Leevi Peltola, Cumhur Erkut, Perry R. Cook, Vesa Välimäki, “Synthesis of Hand Clapping Sounds”, IEEE Transactions on Audio, Speech, and Language Processing, Vol. 15, No. 3, March 2007

[4] Antti Jylhä, Cumhur Erkut, “Inferring the hand configuration from hand clapping sounds”, Proc. of the 11th Int. Conference on Digital Audio Effects, Espoo, Finland, September 1-4, 2008

7

Post processing: Averaging and enrolling the missed claps

Because data points are intrinsically scarce and the experiment is noisy, the analysis is highly sensitive to missed claps and to wrongly detected ones.

• Secondary click: Whenever a user-generated impulse was detected, a second, different click sound was played back immediately, in addition to the stimulus click.

• Monitoring: An examiner watched the application's interface to make sure that no claps were missed and no additional impulses were detected.

• Post-processing: An algorithm was also applied to remove possible additional impulses and to add potentially missed ones.

• Averaging: Because of the noisy and inaccurate nature of the experiment, 10 repetitions of each test were measured and averaged.
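The slides do not describe the post-processing algorithm itself. The following is only a sketch of the idea, assuming the expected inter-onset interval `period` is known and roughly constant within the segment being cleaned. The function `clean_taps` and the threshold `tol` are hypothetical.

```python
def clean_taps(taps, period, tol=0.5):
    """Remove spurious impulses and fill in missed taps.

    taps   -- sorted tap times in seconds
    period -- expected inter-onset interval in seconds (assumed known)
    tol    -- gaps shorter than tol * period are treated as spurious
    """
    cleaned = []
    for t in taps:
        if not cleaned:
            cleaned.append(t)
            continue
        prev = cleaned[-1]
        gap = t - prev
        if gap < tol * period:
            continue  # too close to the previous tap: drop as an extra impulse
        # A gap of about k periods suggests k - 1 missed taps.
        n_missing = round(gap / period) - 1
        step = gap / (n_missing + 1) if n_missing > 0 else gap
        for k in range(1, n_missing + 1):
            cleaned.append(prev + k * step)  # evenly spaced fill-in
        cleaned.append(t)
    return cleaned

# Hypothetical recording: one double trigger at 0.52 s, one missed tap near 1.5 s.
print(clean_taps([0.0, 0.5, 0.52, 1.0, 2.0, 2.5], period=0.5))
```

Filling gaps with evenly spaced taps is itself an assumption about the subject's behavior, which is one reason the examiner's live monitoring remains useful.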

8

Averaged Tempo Step Response

Top: The measured inter-onset interval (IOI) response, averaged over all subjects, to a tempo step of type BC (from 140 to 180 bpm) in the finger-tapping task. Each data point is the average of 120 trials (12 subjects × 10 repetitions).

Bottom: The corresponding tempo response, upsampled by staircase, linear, shape-preserving cubic, and cubic spline interpolation.
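The relation between the two panels can be sketched in a few lines: an IOI of T seconds corresponds to a tempo of 60/T beats per minute, and the averaged response is the position-by-position mean over trials. The tap times below are hypothetical values, not measurements from the experiment, and the function names are illustrative.

```python
def inter_onset_intervals(tap_times):
    """Return the intervals (in seconds) between successive taps."""
    return [b - a for a, b in zip(tap_times, tap_times[1:])]

def average_over_trials(trials):
    """Average IOI sequences position by position.
    zip() stops at the shortest trial, so trials are assumed equally long."""
    n_trials = len(trials)
    return [sum(column) / n_trials for column in zip(*trials)]

def ioi_to_bpm(ioi_seconds):
    """Convert an inter-onset interval in seconds to beats per minute."""
    return 60.0 / ioi_seconds

# Two hypothetical repetitions of the same tempo step (tap times in seconds).
trial_1 = inter_onset_intervals([0.0, 0.43, 0.85, 1.22, 1.56])
trial_2 = inter_onset_intervals([0.0, 0.41, 0.83, 1.18, 1.52])
mean_ioi = average_over_trials([trial_1, trial_2])
tempo = [ioi_to_bpm(x) for x in mean_ioi]
```

For reference, the three stimulus tempos of 100, 140, and 180 bpm correspond to IOIs of 0.6 s, about 0.429 s, and about 0.333 s.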

9

Upsampling and Interpolation

• The main assumption is that a human subject has a central timekeeper that represents his or her continuous internal tempo.

• In our subjective test we sample this timekeeper only at irregular instants, namely at the times of the subject's handclaps.

• The resulting challenge is that only a few data points are available for each step response.

• To obtain a high-resolution, regularly sampled signal suitable for reliable identification, we reconstruct the step response at a considerably higher artificial sampling frequency.

• Each interpolation method imposes its own assumption about the intersample behavior of this central timekeeper.
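How the choice of interpolation fixes the assumed intersample behavior can be illustrated with the two simplest methods. This is a minimal pure-Python sketch, not the authors' implementation: the function `upsample` and its arguments are hypothetical, and the two cubic methods are left out for brevity.

```python
from bisect import bisect_right

def upsample(times, values, fs, kind="linear"):
    """Resample irregular samples (times, values) on a regular grid at fs Hz.

    kind="staircase" holds the last measured value until the next tap;
    kind="linear" assumes the tempo drifts linearly between taps.
    """
    if kind not in ("staircase", "linear"):
        raise ValueError("kind must be 'staircase' or 'linear'")
    # Small epsilon guards against floating-point truncation of the grid length.
    n = int((times[-1] - times[0]) * fs + 1e-9) + 1
    out = []
    for k in range(n):
        t = min(times[0] + k / fs, times[-1])
        # Index of the interval [times[i], times[i + 1]] that contains t.
        i = min(bisect_right(times, t) - 1, len(times) - 2)
        if kind == "staircase":
            out.append(values[i] if t < times[i + 1] else values[i + 1])
        else:
            w = (t - times[i]) / (times[i + 1] - times[i])
            out.append(values[i] + w * (values[i + 1] - values[i]))
    return out

# Hypothetical tempo samples (bpm) at irregular tap times (seconds).
tap_times = [0.0, 0.43, 0.84, 1.20, 1.54]
tap_tempo = [140.0, 143.0, 160.0, 174.0, 179.0]
held = upsample(tap_times, tap_tempo, fs=100.0, kind="staircase")
ramped = upsample(tap_times, tap_tempo, fs=100.0, kind="linear")
```

Both outputs agree at the tap instants and differ only in between, which is exactly where no measurement exists.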

10

Retrievable Band

• From an information-theoretic point of view, upsampling gives a misleading impression of additional degrees of freedom.

• In the Bode plot of the upsampled signal, the data points in the frequency domain can contain no more information about the system than our limited vectors of averaged timestamps.

• Different interpolation methods, representing different intersample behaviors, produce different artifacts at frequencies outside a certain range.

• The retrievable band is the frequency range of the Bode plot in which retrievable information about the underlying system exists.
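The slides do not state how the edges of the retrievable band were chosen. One plausible bound, offered here purely as an assumption, takes the lower edge from the duration of the recorded response and the upper edge from half the average tap rate, in the spirit of the Nyquist limit for the original (not the upsampled) samples. The function `retrievable_band` is hypothetical.

```python
def retrievable_band(tap_times):
    """Estimate (f_low, f_high) in Hz from the original tap times.

    Assumptions, not taken from the slides:
    f_low  = 1 / duration          (slowest component one record can resolve)
    f_high = 0.5 / mean interval   (Nyquist-like limit of the tap rate)
    """
    duration = tap_times[-1] - tap_times[0]
    mean_ioi = duration / (len(tap_times) - 1)
    return 1.0 / duration, 0.5 / mean_ioi

# Hypothetical record: 11 taps spaced 0.5 s apart.
f_low, f_high = retrievable_band([0.5 * k for k in range(11)])
```

Because the taps are irregularly spaced in practice, such a limit is only approximate; it indicates where interpolation artifacts should be expected to dominate.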

11

Retrievable Band

Upsampling the recorded signal with different interpolation methods affects the amplitude and the phase of the Bode plot outside the retrievable band, which is marked by vertical solid lines.

12

System Identification

• Model parameters are estimated with the MATLAB System Identification Toolbox, using the iterative prediction-error minimization method.

• The FIT ratio:

$$\mathrm{FIT} = \left(1 - \frac{\sum (Y - \hat{Y})^2}{\sum (Y - \bar{Y})^2}\right) \times 100\%$$
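The FIT ratio can be sketched as a small function, taking Y as the measured output, Ŷ as the model output, and Ȳ as the mean of Y. The default follows the sums of squared deviations that appear in the formula. The `use_norm` flag is an added assumption: it takes the square root of the ratio, which corresponds to the norm-based fit percentage documented for MATLAB's System Identification Toolbox. The function name is hypothetical.

```python
import math

def fit_percent(y, y_hat, use_norm=False):
    """Fit between measured output y and model output y_hat, in percent.

    use_norm=False: 1 - sum((y - y_hat)^2) / sum((y - mean(y))^2)
    use_norm=True:  1 - ||y - y_hat|| / ||y - mean(y)||
    """
    mean_y = sum(y) / len(y)
    residual = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    spread = sum((a - mean_y) ** 2 for a in y)
    ratio = residual / spread
    if use_norm:
        ratio = math.sqrt(ratio)
    return (1.0 - ratio) * 100.0

measured = [1.0, 2.0, 3.0]
modeled = [1.0, 2.0, 4.0]
print(fit_percent(measured, modeled))
print(fit_percent(measured, modeled, use_norm=True))
```

In both variants a perfect model scores 100%, and a model that only predicts the mean of the data scores 0%.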

13

Results, Modeled Step Response

• Model parameters are estimated with MATLAB's System Identification Toolbox, using the iterative prediction-error minimization method.

14

Results, Modeled Step Response

Tempo response, upsampled by the cubic spline method, plotted for the six tempo steps in the clapping trials (12 subjects, 10 repetitions).

15

Results, Modeled Step Response

Tempo response, upsampled by the cubic spline method, plotted for the six tempo steps in the tapping trials (12 subjects, 10 repetitions).

16

Results, Pole-Zero Plot

Eight different identified systems for the same signal, visualized in the z-plane.

17

Results, Extracted Parameters

Eight different identified systems for the same signal, reported numerically.

18

Pole/Zero Plot

19

Choice of the Interpolation

For all the step responses, linear interpolation gives the best fit, slightly better even than the shape-preserving cubic and cubic spline methods.

20

Choice of the Model

• P2DU is the simplest model that fits well. If another parameter is to be included, at the cost of increased complexity, an additional real pole or a zero helps.

• Adding a third pole (P3DU) improves the fit more than adding a zero (P2DUZ).

• Including both gives a slightly better fit still, but the P3DUZ model may be more complex than needed in many applications of this modeling.
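The model names match the process-model naming of MATLAB's System Identification Toolbox, in which P2 or P3 is the number of poles, D a time delay, U an underdamped (complex) pole pair, and Z a zero. Assuming that standard parameterization was used (the slides do not write the models out), the four candidates read as follows, with static gain $K_p$, time constant $T_w$, damping ratio $\zeta$, delay $T_d$, third-pole time constant $T_{p3}$, and zero time constant $T_z$:

```latex
\[
\begin{aligned}
\text{P2DU:}\quad  G(s) &= \frac{K_p}{1 + 2\zeta T_w s + (T_w s)^2}\, e^{-T_d s} \\
\text{P2DUZ:}\quad G(s) &= \frac{K_p\,(1 + T_z s)}{1 + 2\zeta T_w s + (T_w s)^2}\, e^{-T_d s} \\
\text{P3DU:}\quad  G(s) &= \frac{K_p}{\bigl(1 + 2\zeta T_w s + (T_w s)^2\bigr)(1 + T_{p3} s)}\, e^{-T_d s} \\
\text{P3DUZ:}\quad G(s) &= \frac{K_p\,(1 + T_z s)}{\bigl(1 + 2\zeta T_w s + (T_w s)^2\bigr)(1 + T_{p3} s)}\, e^{-T_d s}
\end{aligned}
\]
```

Each step from P2DU to P3DU or P2DUZ adds one free parameter, and P3DUZ adds two, which is the complexity trade-off weighed in the bullets above.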

21

P2DU, Linear interpolation

22

Model Parameters

P2DU model parameters estimated from the linearly interpolated data, averaged over 120 trials (12 subjects × 10 repetitions).
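To show how such a parameter set translates into a predicted tempo curve, the sketch below evaluates the closed-form unit step response of a delayed, underdamped second-order system. It assumes the standard P2DU form with gain `kp`, time constant `tw`, damping ratio `zeta`, and delay `td`. The numerical values are placeholders chosen for illustration, not the parameters estimated in the paper.

```python
import math

def p2du_step(t, kp, tw, zeta, td):
    """Unit step response at time t of Kp * exp(-Td s) / (1 + 2 zeta Tw s + (Tw s)^2)."""
    if not 0.0 <= zeta < 1.0:
        raise ValueError("the closed form assumes an underdamped model, 0 <= zeta < 1")
    tau = t - td
    if tau <= 0.0:
        return 0.0  # nothing happens before the delay has elapsed
    wn = 1.0 / tw                            # natural frequency in rad/s
    wd = wn * math.sqrt(1.0 - zeta ** 2)     # damped frequency in rad/s
    envelope = math.exp(-zeta * wn * tau)
    oscillation = math.cos(wd * tau) + (zeta * wn / wd) * math.sin(wd * tau)
    return kp * (1.0 - envelope * oscillation)

def tempo_response(t, bpm_from, bpm_to, kp, tw, zeta, td):
    """Predicted tapping tempo t seconds after the stimulus tempo steps."""
    return bpm_from + (bpm_to - bpm_from) * p2du_step(t, kp, tw, zeta, td)

# Placeholder parameters for a BC step (140 to 180 bpm).
params = dict(kp=1.0, tw=0.5, zeta=0.6, td=0.2)
curve = [tempo_response(0.1 * k, 140.0, 180.0, **params) for k in range(60)]
```

With a gain close to 1 the predicted tempo settles at the new stimulus tempo, and a damping ratio below 1 produces the overshoot typical of an underdamped response.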

23

Future Work

1. Other scenarios:

• Process: Period correction vs. phase correction

• Task: in-phase, anti-phase, 1:1, 2:1, 1:2, 1:3, etc.

• Sensory: Auditory, visual, tactile

• Motor: finger tapping, hand clapping, musical instruments, game devices, hitting different weights against a pad, movement without contact

24

Future Work

2. Identifying sensory and acting systems separately: At present, the whole individual-medium chain from sensation to action in a sensorimotor synchronization task is identified with a black-box approach.

This chain could be broken down into a sensory part (auditory system, brain, and nervous system) and a motor part (muscles and external equipment) to reveal the transfer function of each of these two blocks.

25

Future Work

3. Model validation: Comparing the model output with the observed output, even when the fit is the best achievable, does not guarantee that the model will work for other, similar sets of observations.

Statistical approaches that verify the model by exposing it to as much data as possible are not covered in this paper.

The similarity of the estimated SMS model parameters for the two tasks, hand clapping and finger tapping, is nevertheless an indication that the results are reliable.

26

Questions?