CONFERENCE REPORT

AES 54th International Conference

Audio Forensics

12–14 June 2014

London, UK

622 J. Audio Eng. Soc., Vol. 62, No. 9, 2014 September


In June 2014 the city of London welcomed the audio community for the latest international audio forensics conference. Participants from all over the world gathered to share information on research and practice in forensic science. The 54th AES International Conference, Audio Forensics—Techniques, Technologies, and Practice, was the most recent AES event focusing on the field of audio forensic analysis and interpretation. The sequence of AES audio forensic conferences began in 2005 with the 26th AES Conference held in Denver. The 33rd Conference returned to Denver in 2008, followed in 2010 by the 39th Conference in Hillerød, Denmark, and back to Denver for the 46th Conference in 2012.

As was true for the prior AES audio forensics meetings, the 54th Conference brought together an outstanding combination of practitioners, researchers, law enforcement professionals, attorneys, and many other experts and students, all sharing an interest in the latest developments and contributions to audio forensic science made by AES members. The 62 registered participants included representatives from 15 countries.



The two years of planning prior to the 54th Conference involved members of the AES Technical Committee on Audio Forensics and the AES headquarters staff. Mark Huckvale and Jeff M. Smith were conference cochairs, while Mike Brookes and Durand R. Begault served as papers cochairs. Workshop sessions were organized by Catalin Grigoras, and Patrick Naylor organized the exhibition of audio forensics products. The committee’s work was supplemented by student volunteers from University College London who assisted with registration and presentation logistics.

The conference was hosted at the Holiday Inn – Bloomsbury, located in central London near Russell Square. The Bloomsbury district of London is noted for its connections to education, the arts, literature, and medicine. Former residents of the area include Charles Dickens, T. S. Eliot, Virginia Woolf, John Maynard Keynes, and Dorothy Sayers, among many others. This vibrant part of the city is also home to University College London, the British Museum, the British Library, and the National Hospital for Neurology and Neurosurgery. The hotel meeting area and accommodations were ideally suited to the conference, and enabled the delegates to enjoy one-on-one interaction and small-group discussions that are among the key features of all AES international conferences. The meeting schedule included group lunches, coffee breaks, and other unstructured social events that were filled with lively conversations.

CONFERENCE OPENING
The conference opened on Thursday morning, 12 June, with clear skies and very comfortable temperatures in London. Conference cochair Mark Huckvale of University College London welcomed the delegates and provided an overview of the conference. Huckvale also shared some historical insights about the Bloomsbury district of London, and invited the delegates to enjoy the sights, sounds, and flavors of the community. Cochair Jeff Smith, associate director of the National Center for Media Forensics of the University of Colorado Denver, added his words of welcome to the participants, and thanked the sponsors and exhibitors, led by platinum sponsor CEDAR Audio Ltd. Other exhibitors and sponsors included AGNITiO, iZotope, Oxford Wave Research, Salient Sciences, and Voxalys. The exhibitors were present throughout the conference for demonstrations and discussion. The cochairs also thanked the local organizing committee and the volunteers for arranging the conference venue and special events.

KEYNOTE LECTURE
Itiel Dror, a noted cognitive neuroscience researcher from University College London, presented an engaging, provocative, and entertaining keynote address for the 54th Conference entitled “Cognitive Bias in the Interpretation of Forensic Evidence.” Dror studies the performance of experts in real interpretive situations, and his work reinforces the fact that even experts can show unintended bias when presented with evidence from an investigation. While much of Dror’s work has involved tactical decision-making in real time by doctors, soldiers, and others who must make split-second decisions, he pointed out that forensic science is generally carried out without immediate time pressure, so bias in forensic investigations can be addressed in a more comprehensive and systematic manner.

Dror explained that the human brain is built to interpret a highly filtered and inherently biased version of the world around us, and that “expertise” can actually represent a very specialized type of bias: the ability to focus on specific details and meanings while ignoring many other aspects of a particular circumstance or scene.

In forensic analysis, one of the ways in which cognitive bias often occurs is by a detective or another investigator sharing side-information about the case that is inevitably prejudicial to the expert’s interpretation. Dror emphasized that audio forensic experts should never see themselves as “part of the team,” trying to “make the case” for one side or the other. His advice is for experts to acknowledge that cognitive bias is always present and, rather than treating this as a fault or weakness, consider ways to help the court or jury understand the potential uncertainty that is present even in an expert’s testimony.

EXHIBITOR INTRODUCTIONS
Among the features of many AES conferences are exhibits and hands-on demonstrations provided by companies and product developers. The 54th Conference was no exception, and the end of the Thursday morning technical session included some time for the exhibitors to introduce themselves and the products being shown at the event.

Itiel Dror engages delegates to the conference during his keynote.

Jeff Smith, left, thanks Gordon Reid of CEDAR, platinum sponsor.

Mark Huckvale welcomes everyone to the event.

Jeff Smith thanks the conference sponsors.

CEDAR Audio Ltd. of Cambridge, UK, platinum sponsor of the conference, was represented by Gordon Reid, who explained the history of CEDAR and its noise-reduction and quality-enhancement products, and described a wide variety of applications in the entertainment and audio forensics fields.

Oxford Wave Research, of Oxford, UK, was introduced by Anil Alexander. Alexander explained his company’s work in automatic speaker diarization (processing a recorded dialog into separate segments containing the utterances of each participant in the recorded conversation), time alignment and removal of known audio, and automatic talker recognition.

Jonas Lindh introduced Voxalys AB, of Göteborg, Sweden. Voxalys is a forensic speech and acoustic lab that offers consultation and services regarding forensic phonetics, audio, training, and software.

Salient Sciences, Durham, North Carolina, USA, was represented by Don Tunstall. Salient Sciences was formed recently by a merger of Digital Audio Corporation and Salient Stills. The company provides audio/video analysis and enhancement services for law enforcement and forensic analysis, including hardware, VST plug-ins, and other software components. The company also provides training, as well as custom engineering and system development.

Rounding out the outstanding group of exhibitors was Antonio Moreno, technical sales director of AGNITiO, a company from Madrid, Spain, specializing in voice-identification products and technology.

The exhibition area was located in the lobby immediately adjacent to the main meeting room for the conference, providing numerous opportunities for the delegates to examine the products and services offered by the sponsoring companies.

TECHNICAL PROGRAM—DAY 1
Following a break for lunch, the first afternoon of the conference included a tutorial session and presentation of the first group of technical papers.

TUTORIAL 1: SPEAKER COMPARISON
Gordon Reid of CEDAR served as the chair for a tutorial session covering current aspects of forensic automatic speaker recognition from degraded speech material, and also from recorded speech that had been passed through a stage of noise reduction or quality enhancement processing.

Reid introduced Hermann Künzel, University of Marburg, Germany, who since 2009 has studied the question of whether or not audio enhancement processing can be simultaneously suitable for audition by human listeners and for automatic speaker recognition by software. Künzel also introduced Paul Alexander of CEDAR, who is involved with testing and evaluating forensic audio algorithms.

Antonio Moreno of AGNITiO described the basic principles of automatic speaker recognition using the classical method of phonetic analysis. The approach used in AGNITiO’s Batvox software is to use a closed comparison of a particular example speech recording to a set of potential matches. The analysis uses mel-frequency cepstral coefficients (MFCCs) and a Gaussian Mixture Model (GMM) as the matching method.

One important issue is that a given speech recording will include not only the spectro-phonetic characteristics of the talker, but will also be convolved acoustically with the characteristics of the recording system and communications channel, plus additive noise and distortion. This means that it is necessary to attempt to separate the speaker variability from the channel variability so that the comparison is of the talker’s characteristics and not the channels’ attributes.

Antonio Moreno explains automatic speaker recognition.

Hermann Künzel looks into signal enhancement for improving speech recognition.

CONFERENCE EXHIBITION IN FULL SWING

Künzel then described an experiment conducted to evaluate the speaker recognition performance for undisturbed (clean) speech and speech degraded with various types of added noise and reverberation. The experiment further involved three different types of signal enhancement for the degraded speech samples so that recognition performance could be compared following the enhancement processes. The results showed that the enhancement improved the recognition performance with some types of noise, but also caused worse performance with other types of noise and enhancement settings. A paper describing the experiment was published recently in the AES Journal (Künzel, Hermann, and Alexander, Paul, “Forensic Automatic Speaker Recognition with Degraded and Enhanced Speech,” vol. 62, no. 4, April 2014, pp. 244–253).
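The closed-set comparison idea Moreno described can be sketched in a few lines. This is a deliberately simplified illustration, not AGNITiO’s Batvox implementation: it models each reference speaker with a single diagonal-covariance Gaussian rather than a full GMM, and uses synthetic random vectors as stand-ins for real MFCC features.

```python
import numpy as np

def fit_diag_gaussian(feats):
    # feats: (n_frames, n_coeffs) array of MFCC-like feature vectors
    return feats.mean(axis=0), feats.var(axis=0) + 1e-6

def avg_log_likelihood(feats, mean, var):
    # mean per-frame log density under a diagonal-covariance Gaussian
    ll = -0.5 * (np.log(2 * np.pi * var) + (feats - mean) ** 2 / var)
    return ll.sum(axis=1).mean()

def closed_set_compare(questioned, references):
    # score the questioned recording against each reference speaker model
    scores = {name: avg_log_likelihood(questioned, *fit_diag_gaussian(r))
              for name, r in references.items()}
    return max(scores, key=scores.get), scores

rng = np.random.default_rng(0)
spk_a = rng.normal(0.0, 1.0, (200, 12))       # stand-in for speaker A's features
spk_b = rng.normal(2.0, 1.0, (200, 12))       # stand-in for speaker B's features
questioned = rng.normal(0.0, 1.0, (100, 12))  # drawn from the same "voice" as A
best, scores = closed_set_compare(questioned, {"A": spk_a, "B": spk_b})
```

In a real system the channel-compensation problem described above (separating talker variability from channel variability) is the hard part; nothing in this sketch addresses it.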

PAPER SESSION 1: FORENSIC SPEAKER COMPARISON
Following the afternoon break for tea, coffee, discussion, and conversation in the exhibition area, the Thursday afternoon technical paper session was introduced by Durand Begault, papers cochair.

The first paper, “A Study of F0 as a Function of Vocal Effort,” by Eddy B. Brixen of EBB-consult, Denmark, described his investigation of the fundamental frequency (F0) of speech under variations in vocal effort such as casual conversation, loud talking, and yelling. The empirical work showed a general trend toward increasing F0 with increasing vocal effort, which may lead to future investigation of how an examiner might compensate for such changes when comparing utterances recorded under circumstances of differing vocal exertion.

Michael Jessen of the Bundeskriminalamt (Federal Criminal Police Office), Germany, presented a paper entitled “Forensic Voice Comparisons in German with Phonetic and Automatic Features Using VOCALISE Software.” Jessen described his work in collaboration with Anil Alexander and Oscar Forth of Oxford Wave Research Ltd. to use the commercial forensic acoustical analysis package VOCALISE (Voice Comparison and Analysis of the Likelihood of Speech Evidence) for their experiments. Jessen described how the German Federal Police use the software for forensic voice comparisons. The software is found to be particularly convenient for extracting spectral and phonetic parameters from speech examples, although the issues of how to determine the various analysis settings and interpret the results in the presence of channel effects remain as challenges for the forensic examiner.

The concluding paper of the first session was “Evaluation Results of Speaker Verification for VoIP Transmission with Packet Loss,” presented by Jörg Bitzer of Fraunhofer IDMT. Bitzer examined the degree to which speaker verification could be affected by the common packet loss and error concealment strategies employed in unreliable communications systems such as Voice over Internet Protocol (VoIP). Under the conditions used in the experiment, Bitzer found that the influence of packet loss on automatic speaker verification can be neglected. Challenges still remain, however, for other types of degradation, such as background noise, echoes, and reverberation.

Upon the conclusion of the successful first day of the conference the attendees were invited to reconvene for an informal reception at the Grant Museum of Zoology and Comparative Anatomy, located on the campus of University College London, just a short walk from the conference site. The Grant Museum dates back to its founding in 1827 by Robert Edmond Grant (1793–1874), who set about creating a vast collection of animal specimens from around the world for use in his university courses. While the modern fields of zoology and biology increasingly rely upon DNA and other molecular analysis techniques for categorizing species, the Grant zoological collection provides a fascinating look at micro and macro fauna gathered over nearly 200 years. The 54th Conference delegates enjoyed champagne, hors d’oeuvres, and conversation while strolling among exhibits including the bones of the extinct Dodo and Quagga, thousands of microscopic specimen slides, and the entire intact skeleton of a five-meter-long Anaconda snake. The participants relished the opportunity to relax and unwind in the Museum’s unique surroundings while discussing the day’s topics and presentations.

TECHNICAL PROGRAM—DAY 2
The second day of the conference dawned in London with comfortable temperatures and clear skies. Delegates from Brazil were happy about the overnight news of their team winning the opening game of the World Cup in São Paulo, while those from Mexico, Spain, and the Netherlands were anticipating word of their own teams’ results during the Friday competitions.

A reception at the Grant Museum enables the assembled company to discuss audio forensics in the company of stuffed animals.

From left, Doug Lacey, Jeff Smith, and Rob Maher

TUTORIAL 2: DEREVERBERATION
Catalin Grigoras introduced the opening event of the day, a tutorial session on experimental methods for dereverberation.

Patrick Naylor gives his tutorial on the various methods for removing reverberation. Patrick also organized the exhibition.

Patrick Naylor of Imperial College London provided a clear and complete introduction to the topic, including the acoustical origins of reverberation and the various approaches that have been developed over the years to try to “de-convolve” the original sound from a reverberant recording. Naylor and his students, Alastair Moore and Christine Evers, described several experimental techniques for enhancement of reverberant speech and for blind system identification. Dereverberation is considered an unsolved problem, and many of the difficult theoretical and practical considerations remain interesting topics for ongoing exploration.
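A minimal numerical illustration of why the non-blind version of the problem is tractable while the blind one is not: when the room impulse response is known, “de-convolving” reduces to solving a linear least-squares system. The forensic difficulty is that the impulse response is usually unknown (blind deconvolution). The impulse response and dry signal here are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(4)
dry = rng.normal(size=500)                 # stand-in for the original "dry" signal
ir = np.zeros(80)
ir[0], ir[20], ir[55] = 1.0, 0.5, 0.25     # sparse synthetic room impulse response
wet = np.convolve(dry, ir)                 # the reverberant recording

# With the impulse response *known*, build the convolution matrix H so that
# H @ dry == wet, and recover the dry signal by linear least squares.
n = len(dry)
H = np.zeros((len(wet), n))
for i in range(n):
    H[i:i + len(ir), i] = ir
est, *_ = np.linalg.lstsq(H, wet, rcond=None)
err = np.linalg.norm(est - dry) / np.linalg.norm(dry)
```

With the impulse response in hand the relative recovery error is essentially numerical noise; blind methods must estimate `H` from `wet` alone, which is where the open research problems lie.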

PAPER SESSION 2: ACOUSTICS
The chair of the morning papers session, Mike Brookes, introduced Ephraim Gower of the Botswana International University of Science and Technology, Palapye, Botswana. Gower spoke about “Exploiting Sparsity for Source Separation Using the Sliding Ratio Signal Algorithm,” a time-domain approach to solving convolutional mixing. The technique approaches the signal separation problem by exploiting the low likelihood that the energy from multiple concurrent signals is present at the same time and same frequency. To the extent that the signals are disjoint in this fashion, the concept is to create an approximation to the separated signals in each time segment of a sliding discrete Fourier transform (SDFT).

The next paper, presented by Hamza Javed of Imperial College London, was entitled “Evaluation of an Extended Reverberation Decay Tail Metric as a Measure of Perceived Reverberation.” Javed explained the need for a quantitative way to estimate the reverberation level in recorded speech and the potential usefulness of the reverb “decay tail” in comparing the improvement produced by dereverberation algorithms.

The morning session concluded with a paper by Alastair Moore, also of Imperial College London, entitled “Room Identification Using Roomprints.” The goal of the work was to have a way to identify characteristics of the room in which a recording was made based on the recording itself. Moore’s experiment involved 22 room impulse responses that were filtered with five different filterbank resolution settings (1/2, 1/3, 1/4, 1/8, and 1/12 octave) to determine if they could be discriminated reliably. The results indicated that 1/4-octave resolution with lowest frequency between 100 Hz and 300 Hz is sufficient to achieve at least 95.7% identification accuracy on the test database.
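The roomprint idea can be illustrated with a toy version: reduce an impulse response to a normalized vector of band energies, then identify a questioned measurement by nearest fingerprint. The band edges, the crude FFT-bin “filterbank,” and the synthetic impulse responses below are illustrative stand-ins for the fractional-octave analysis in Moore’s paper.

```python
import numpy as np

def roomprint(ir, fs, bands):
    # energy per frequency band of a room impulse response (a crude
    # FFT-bin substitute for a fractional-octave filterbank)
    spec = np.abs(np.fft.rfft(ir)) ** 2
    freqs = np.fft.rfftfreq(len(ir), 1 / fs)
    e = np.array([spec[(freqs >= lo) & (freqs < hi)].sum() for lo, hi in bands])
    return e / e.sum()            # normalize so overall level does not dominate

def identify(query_ir, known_irs, fs, bands):
    # nearest-fingerprint classification by Euclidean distance
    q = roomprint(query_ir, fs, bands)
    dists = {room: np.linalg.norm(q - roomprint(ir, fs, bands))
             for room, ir in known_irs.items()}
    return min(dists, key=dists.get)

fs = 8000
bands = [(100, 200), (200, 400), (400, 800), (800, 1600)]  # illustrative edges
t = np.arange(2048) / fs
rng = np.random.default_rng(1)
# two synthetic "rooms": decaying noise, one of them low-pass colored
room1 = np.exp(-t * 30) * rng.normal(size=t.size)
room2 = np.exp(-t * 30) * np.convolve(rng.normal(size=t.size),
                                      np.ones(8) / 8, "same")
known = {"room1": room1, "room2": room2}
query = room2 + 0.01 * rng.normal(size=t.size)  # noisy re-measurement of room2
```

The normalization step mirrors the motivation in the paper: the fingerprint should capture the room’s spectral coloration rather than the recording level.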

TUTORIAL 3: FORENSIC AUDIO AUTHENTICATION WORKSHOP
Eddy Brixen introduced Catalin Grigoras, 54th Conference workshops chair and director of the National Center for Media Forensics at the University of Colorado Denver. Grigoras described many key principles for handling digital evidence, especially in the context of authentication. The audio forensic examiner must take great care to avoid contamination of the digital material through careless or inadvertent handling. Grigoras recommended that the examiner carefully document the chain of custody and the processing steps involved so that there would be no question later about the integrity of the audio material or the conclusions drawn from it.
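One common way to support the documentation Grigoras recommends is to hash the evidence file at acquisition, so that any later change is detectable. This is a minimal sketch, not a prescribed workflow from the workshop; the record fields and the throwaway demonstration file are illustrative.

```python
import datetime
import hashlib
import os
import tempfile

def acquisition_record(path):
    # hash the evidence file on receipt; SHA-256 of the raw bytes
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            h.update(block)
    return {"file": path,
            "sha256": h.hexdigest(),
            "logged": datetime.datetime.now(datetime.timezone.utc).isoformat()}

def verify(record, path):
    # True only if the file's current hash matches the acquisition record
    return acquisition_record(path)["sha256"] == record["sha256"]

# demonstration on a throwaway file standing in for an evidence recording
tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".wav")
tmp.write(b"RIFF....stand-in evidence bytes")
tmp.close()
rec = acquisition_record(tmp.name)
ok_before = verify(rec, tmp.name)
with open(tmp.name, "ab") as f:
    f.write(b"\x00")                 # simulate an inadvertent modification
ok_after = verify(rec, tmp.name)
os.unlink(tmp.name)
```

Even a single appended byte flips the verification result, which is exactly the property that makes a logged hash useful in a chain-of-custody record.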

PAPER SESSION 3: AUDIO AUTHENTICATION
Following a break for tea and conversation, the afternoon technical program resumed with a paper session chaired by Daniel Rappaport of CACI Digital Forensics Lab, Alexandria, Virginia, USA. Rappaport introduced Bruce Koenig of BEK TEK LLC, Clifton, Virginia, who presented the paper “Forensic Authenticity Analyses of the Metadata in Re-Encoded WAV Files.” Koenig explained the basic format of a Resource Interchange File Format (RIFF) file containing audio material, typically referred to as a WAV file. The WAV file uses embedded tags and logical information containers, commonly called “chunks,” to hold not only the audio data itself, but also descriptive information (metadata) about the data format, audio parameters, and possibly proprietary manufacturer-specific information inserted by the particular recording device or software package used to create or edit the WAV file. Koenig explained that when an original WAV file is opened by a file editing program and simply saved again without deliberate alteration, it is still possible that the editing program may insert new or altered metadata into the newly saved WAV file. Based on an experiment with nine different audio recording devices, Koenig suggested that audio forensic examiners become familiar with the particular WAV file chunks produced by various devices and file editors as one way to identify possible tampering or copying, since simply opening a file and saving it unaltered can result in changes to the chunk contents in some cases.

Catalin Grigoras explains the principles of handling digital evidence.

A crowded conference hall with delegates listening attentively during one of the many fascinating paper sessions.

Christine Evers describes techniques for the enhancement of reverberant speech and blind system identification.

Alastair Moore speaks on “roomprints.”

Bruce Koenig deals with metadata in RIFF files.

Next, Luca Cuccovillo of Fraunhofer IDMT, Germany, presented his group’s work entitled “A Multi-Codec Audio Dataset for Codec Analysis and Tampering Detection.” The work has involved creating a dataset of “tampered” and untampered audio excerpts, where the tampering consists of a change in the encoding parameters, such as encoder type, encoding bit rate, and the framing alignment of the encoder. The alterations are made in such a manner that the changes should not be audible, but may still be detected by identifying underlying changes in the encoding that result in various encoding-dependent artifacts. The alterations of interest are situations in which an original encoded stream is decoded, edited, and then re-encoded by the same or a different codec. Full annotation is provided with the dataset so that the type and extent of the tampering is known. The dataset is available for use by anyone under the Creative Commons license, and the developers are hopeful that it will be a useful resource for developing tampering-detection strategies and for testing software analysis systems.

Anibal Ferreira of the University of Porto, Portugal, spoke about “Real-Time Monitoring of ENF and THD Quality Parameters of the Electrical Grid in Portugal.” The experiment was to measure the electrical grid power quality at several separate locations in Portugal’s power grid. As expected, the Electrical Network Frequency (ENF) was consistent across the entire grid, showing drift and macro changes due to load and generation management. The quality of the sinusoidal electrical power waveform varied from place to place on the grid, as measured by Total Harmonic Distortion (THD). While the ENF is governed by exactly balancing the electrical generation capacity with the instantaneous electrical load on the grid, the THD depends upon distortion introduced by nonlinearities in the power system, such as solid-state switching power supplies, triac power controls, and industrial switching devices.
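The core operation behind ENF monitoring, estimating the mains frequency over time from a signal that contains grid hum, can be sketched as a narrow-band FFT peak search. The 50 Hz nominal frequency, frame length, and synthetic drift below are illustrative choices, not parameters from Ferreira’s system.

```python
import numpy as np

def enf_track(signal, fs, nominal=50.0, search=1.0, frame_s=1.0):
    # estimate the mains component frame by frame with a zero-padded FFT
    # peak search restricted to a narrow band around the nominal frequency
    n = int(fs * frame_s)
    nfft = 8 * n                              # zero-pad for a finer frequency grid
    freqs = np.fft.rfftfreq(nfft, 1 / fs)
    band = (freqs >= nominal - search) & (freqs <= nominal + search)
    window = np.hanning(n)
    track = []
    for start in range(0, len(signal) - n + 1, n):
        spec = np.abs(np.fft.rfft(signal[start:start + n] * window, nfft))
        track.append(freqs[band][np.argmax(spec[band])])
    return np.array(track)

fs = 400
t = np.arange(5 * fs) / fs
inst_freq = 50.0 + 0.2 * np.sin(2 * np.pi * t / 10)   # slow synthetic grid drift
hum = np.sin(2 * np.pi * np.cumsum(inst_freq) / fs)   # frequency-modulated "hum"
track = enf_track(hum, fs)
```

The recovered track follows the slow drift around the nominal frequency, which is the same drift-and-macro-change behavior the grid measurements showed.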

Concluding Friday’s technical papers was “Quantization Level Analysis for Forensic Media Authentication,” presented by Catalin Grigoras. Grigoras proposed that one way to detect possible audio file modification, in the case of an original low-resolution file (e.g., 8-bit quantization) that has been converted for editing at a higher resolution (e.g., 16-bit), is to look for tell-tale indications that there are discrete quantization levels remaining in the data stream. Similarly, PCM files that have had a change in quantization level followed by amplitude fading or interpolation may show tell-tale linear interpolation effects between the original discrete levels. The quantization level analysis is appropriate when the file is normalized when converted to higher resolution, meaning that the least-significant bits are zero. If the file is converted while maintaining absolute value so that the most-significant bits are filled with the sign bits, the detection strategy will not be effective.
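The first indicator can be checked mechanically: if an 8-bit original was promoted to 16 bits by zero-filling the least-significant bits, every sample value is a multiple of 256. A minimal sketch on synthetic sample streams (the helper name and the test data are illustrative):

```python
import numpy as np

def looks_upconverted(samples_16bit, orig_bits=8):
    # if an 8-bit file was converted to 16 bits by left-shifting (zero
    # LSBs), every sample sits on a multiple of 2**(16 - orig_bits)
    step = 2 ** (16 - orig_bits)
    return bool(np.all(samples_16bit % step == 0))

rng = np.random.default_rng(2)
native16 = rng.integers(-32768, 32767, 1000)                  # genuine 16-bit data
from8bit = rng.integers(-128, 127, 1000).astype(np.int64) * 256  # 8-bit, zero-filled
```

As the paper notes, this only works for the normalized (zero-filled LSB) conversion; sign-extended conversions leave no such grid in the low-order bits.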

FRIDAY SOCIAL EVENT: BLOOMSBURY WALKING TOUR AND WINE BAR VISIT
With the delegates ready to unwind after two fine days of audio forensic content, the conference organizing committee thoughtfully provided the opportunity for an informal social outing on Friday evening. The social time began with two expert guides leading groups on a fascinating walking tour of the Bloomsbury area, including Russell, Tavistock, Red Lion, Bloomsbury, and Garden Squares, plus Coram’s Fields and the British Museum. The weather cooperated beautifully, and the tour guides’ insights and observations made London’s 18th-century streetscapes and modern plazas meld into a charming display of what makes London a world-class city. All too soon the tour came to an end, but, thankfully, the culminating destination was Truckles Wine Bar, located just down the block from the British Museum. The conference delegates were treated to a fine selection of hors d’oeuvres, ales, wines, and other beverages derived from fruits and grains, while mingling in the spacious courtyard with Londoners preparing for the World Cup matches to be televised later that evening.

Delegates gather in front of a statue of Gandhi during a walking tour of Bloomsbury.

Luca Cuccovillo speaks on codec analysis.

Anibal Ferreira looks into the electrical network frequency.

TECHNICAL PROGRAM—DAY 3
The final day of the 54th Conference featured a tutorial and three technical paper sessions. Prior to the session, the hallway discussions among some of the delegates included the Spain vs. Netherlands World Cup match, in which the Dutch prevailed 5–1.

TUTORIAL 4: SPEECH INTELLIGIBILITY
Speech intelligibility is an important aspect of audio forensics. Many audio forensic investigations involve understanding and transcribing speech utterances, and understanding the ways in which audio processing algorithms can positively or negatively affect intelligibility is a key area of interest. The tutorial, presented by Gaston Hilkhuysen of the University of Marseille, France, covered the experimental results of human testing with speech presented as isolated words and as meaningful sentences. Speech-enhancement algorithms tend to work better at high signal-to-noise ratios, but with high SNR the intelligibility is often very good to begin with. In some studies it appears that simple spectral shaping, such as light high-pass filtering, can give as much of an intelligibility improvement as more complicated methods. Hilkhuysen explained that this may be due to the high-pass filtering simply limiting the upward spread of masking, since from a psychoacoustical standpoint, noise and interfering sounds tend to mask high-frequency content to a greater degree than low-frequency content.

Hilkhuysen emphasized that most speech intelligibility and speech quality tests show that performance is better with steady, stationary noise, like the background hum within a moving automobile or train, and worse for fluctuating or time-varying noise, such as speech babble or interfering music. Considerable effort continues in the intelligibility and enhancement fields, and both theoretical and practical breakthroughs are needed.

PAPER SESSION 4: SPEECH ENHANCEMENT
Richard Stanton of Imperial College London presented an interesting paper entitled “A Differentiable Approximation to Speech Intelligibility Index with Applications to Listening Enhancement.” The Speech Intelligibility Index (SII) is a standardized objective measurement that is intended to be a reliable estimate of intelligibility using a weighted sum of signal-to-noise calculations in a set of subbands. The SII requires separate access to both the “clean” speech and to the “noise,” and then produces an estimate of the expected intelligibility for the speech and noise mixture. Stanton explained that the SII standard results in a discontinuous function across frequency, which is therefore not mathematically differentiable. Determining a differentiable version of the SII would provide a way to do mathematical minimization and optimization, so his work has been to develop an approximation to the SII that is smooth, continuous, and analytically differentiable. The results have been very good, and have been shown to allow an optimized filter that improved the SII of noisy speech segments while simultaneously constraining signal power.
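The flavor of the idea can be sketched by contrasting a clipped band-SNR index with a smooth surrogate. The band weights, the 30 dB audibility range, and the logistic substitution below are illustrative only; the actual SII band procedure and Stanton’s approximation are more elaborate.

```python
import numpy as np

BAND_WEIGHTS = np.array([0.2, 0.3, 0.3, 0.2])   # illustrative importance weights

def band_snr_db(speech_pow, noise_pow):
    return 10 * np.log10(np.asarray(speech_pow) / np.asarray(noise_pow))

def sii_clipped(speech_pow, noise_pow):
    # SII-style index: band SNR clipped to [-15, +15] dB, mapped to [0, 1];
    # the hard clip is what makes the standard non-differentiable
    aud = np.clip((band_snr_db(speech_pow, noise_pow) + 15) / 30, 0.0, 1.0)
    return float(BAND_WEIGHTS @ aud)

def sii_smooth(speech_pow, noise_pow):
    # differentiable surrogate: replace the hard clip with a logistic sigmoid
    aud = 1.0 / (1.0 + np.exp(-band_snr_db(speech_pow, noise_pow) / 7.5))
    return float(BAND_WEIGHTS @ aud)

speech = np.array([4.0, 4.0, 4.0, 4.0])   # per-band speech power
noise_0db = speech.copy()                 # 0 dB SNR in every band
noise_m10db = speech * 10                 # -10 dB SNR in every band
```

Both versions agree at 0 dB band SNR and rank conditions the same way, but only the sigmoid form has a derivative everywhere, which is what permits the gradient-based filter optimization Stanton described.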

PAPER SESSION 5: FORENSIC MUSICOLOGY
Forensic musicology deals with disputes about similarities in musical works that could be considered copyright infringement, authentication of musical recordings, determining the origin of melodic material, and similar questions that can benefit from systematic audio forensic investigation. Durand Begault of the Audio Forensic Center of Charles M. Salter Associates, San Francisco, provided a description of recent developments in the field in his paper “Forensic Musicology – An Overview.” Begault explained that the field of forensic musicology is traditionally highly subjective, and forensic experts in the field are, more often than not, asked to provide subjective analyses based primarily on their own impressions or “golden ear” assessments rather than a clear and objective scientific approach. Begault warned against a reliance on subjective impressions, since human observers are innately able to locate patterns and similarities even in unstructured and random shapes such as clouds and ink blots. He also recommended that forensic musicological examiners eschew reliance on elaborate sequences of sophisticated steps and transformations that tend to exaggerate the potential similarity of musical material. In conclusion, Begault recommended that forensic examiners avoid highly subjective and pseudo-scientific approaches, and continue to develop and publish new techniques in forensic musicology that will benefit the forensic community and, ultimately, the legal system.

PAPER SESSION 6: GUNSHOT ACOUSTICS
Following the final lunch break and active discussions in the exhibition area, papers cochair Durand Begault rounded up the delegates to return to the main conference room for the last technical paper session of the conference, joking that it was “time to conclude the meeting with a bang: two papers on forensic gunshot acoustics.”

From left, Mark Huckvale, Jeff Smith, Mike Brookes, Durand Begault, and Catalin Grigoras of the conference committee

Gaston Hilkhuysen tackles the topic of speech intelligibility.


The first paper was "Gunshot Recordings from Digital Voice Recorders," presented by Rob Maher of Montana State University, Bozeman, Montana. Maher explained that the audio gunshot recordings commonly presented for forensic examination are increasingly made using digital voice recorders whose microphones, amplifiers, and speech coding algorithms are not designed to capture the high intensity and brief impulses of gunfire. Maher described the characteristics of typical muzzle blast sounds from firearms, emphasizing that the duration of the muzzle blast itself is only a few milliseconds, while most forensic gunshot recordings include significant amounts of reflected and reverberant sonic energy that lingers for hundreds or thousands of milliseconds depending upon the acoustical surroundings. Amplitude clipping and temporal distortion from the coding algorithm and the recorder's automatic gain control are often present in gunshot recordings, making it difficult to be confident about waveform integrity. He presented several experimental results obtained under controlled conditions, and also several examples of gunshot audio from actual forensic investigations. Maher recommended that audio forensic examiners become aware of the strengths and weaknesses of gunshot recordings obtained from digital voice recorders, and utilize the evidence while keeping in mind the inherent limitations of the recording devices.

Douglas Lacey of BEK TEK LLC presented the final technical paper of the conference, entitled "The Effect of Sample Length on Cross-Correlation Comparisons of Recorded Gunshot Sounds." Lacey reported on his team's investigation of techniques to determine a quantitative measure of similarity between two recorded gunshots. A cross-correlation technique has previously been explored for this purpose, and Lacey compared correlations calculated with time segment lengths varying between 2 and 50 milliseconds to an expert's subjective impression of similarity. The results showed that the objective cross-correlation results were in general agreement with the subjective comparison, and that observing the cross-correlation for a range of segment lengths gave useful insights into the details and acoustical characteristics of the gunshot samples.
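The flavor of such a segment-length comparison can be sketched in a few lines of Python. This is a hypothetical illustration, not code from Lacey's paper: it computes an energy-normalized, zero-lag cross-correlation coefficient between the onsets of two gunshot waveforms for several segment lengths between 2 and 50 ms. The sample rate and the particular set of segment lengths are assumptions made for the example.

```python
# Hypothetical sketch of a segment-length cross-correlation comparison
# (illustrative only; not the method as implemented in the paper).
import math

def normalized_xcorr(a, b):
    """Energy-normalized zero-lag correlation of two equal-length segments.

    Returns a value in [-1, 1]; a full analysis would also scan time lags.
    """
    ea = math.sqrt(sum(x * x for x in a))
    eb = math.sqrt(sum(x * x for x in b))
    if ea == 0.0 or eb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (ea * eb)

def similarity_vs_segment_length(shot_a, shot_b, fs, lengths_ms=(2, 5, 10, 20, 50)):
    """Compare the onsets of two gunshot waveforms at several segment lengths.

    fs is the sample rate in Hz; lengths_ms are analysis windows in milliseconds.
    Returns {segment_length_ms: correlation_coefficient}.
    """
    results = {}
    for ms in lengths_ms:
        n = int(fs * ms / 1000)  # segment length in samples
        results[ms] = normalized_xcorr(shot_a[:n], shot_b[:n])
    return results
```

With identical waveforms the coefficient is 1.0 at every segment length; the informative behavior for forensic comparison is how the coefficient between two different gunshots changes as the segment grows to include more of the reflected and reverberant energy.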

CONFERENCE BEST PAPER AWARD
The organizing committee arranged a Best Paper Award competition with a cash prize made possible by conference Platinum Sponsor, CEDAR Audio Ltd., and selected by CEDAR's Engineering Director Christopher Hicks. The paper selected for the award was "A Differentiable Approximation to Speech Intelligibility Index with Applications to Listening Enhancement," authored by Richard Stanton, Nikolay Gaubitch, Patrick Naylor, and Mike Brookes. Richard Stanton received a hearty ovation and congratulations as he accepted the award on behalf of the authors.

AES AUDIO FORENSICS: CHARTING THE FUTURE
The AES 54th Conference continued the tradition established at the four prior AES forensics conferences by treating the participants to a superior slate of audio forensics papers and workshops. AES has proven its place as the leading professional group in the field of forensic audio analysis and interpretation. Conference cochairs Jeff Smith and Mark Huckvale, together with AES deputy director Roger Furness, concluded the conference by thanking the volunteer conference committee and commending all of the conference delegates for their effort in presenting their work for the AES audience.

It continues to be an outstanding time to be involved in the growing field of audio forensics. The delegates at the 54th Conference left the wonderful city of London with many new and exciting concepts, and all look forward to the next opportunity to attend a future AES conference on Forensic Audio.


The Best Paper Award is presented to Richard Stanton, right, by Chris Hicks

Jos Vermeulen, left, and Antonio Moreno get down to brass tacks over a thorny point in audio forensics.

Editor’s note: a USB drive or downloadable PDF of the conference papers can be purchased online at www.aes.org/publications/conf.cfm

Rob Maher explains some issues about gunshot recordings.