
In June 2012 the city of Denver, Colorado, welcomed the AES audio forensics community back to the Mile-High City for another outstanding conference. Participants from all over the world gathered to share information on research and practice in forensic science. The AES 46th International Conference, Audio Forensics—Recording, Recovery, Analysis, and Interpretation, was the most recent AES event focusing on the field of audio forensic analysis and interpretation. The sequence of AES audio forensic conferences began in 2005 with the 26th AES Conference held in Denver. The 33rd Conference returned to Denver in 2008, followed by the 39th Conference that convened in 2010 in Hillerød, Denmark.

The 85 registered participants included representatives from more than 15 countries. The conference brought together an outstanding combination of practitioners, researchers, law-enforcement professionals, attorneys, and many other experts and students all sharing an interest in the latest developments and contributions to audio forensic science made by AES members.

CONFERENCE REPORT

718 J. Audio Eng. Soc., Vol. 60, No. 9, 2012 September

AES 46th International Conference

14–16 June 2012, Denver, CO, USA

Audio Forensics—Recording, Recovery, Analysis, and Interpretation


The Warwick Hotel, adjacent to the National Center for Media Forensics.

Statue of a cowboy riding a bucking bronco portrays Denver’s pioneering spirit.


The many months of planning prior to the 46th Conference involved members of the AES Technical Committee on Audio Forensics, the AES Colorado Section, the AES University of Colorado Denver Student Section, and the AES headquarters staff. The 46th Conference emulated the success of the prior audio forensics meetings by recruiting an outstanding organizing committee. Jeff M. Smith, conference chair, was joined by papers cochairs Catalin Grigoras and Durand Begault, workshops cochairs Eddy Bøgh Brixen and Christopher Peltier, treasurer Joe Erickson, facilities and registration coordinator Leah Haloin, social coordinator Kellyn Smith, and local committee chairs Wanda Newman and Ann Sanders. The committee’s superlative efforts created a top-notch technical program and a fun and collegial conference atmosphere.

The conference venue was the Warwick Denver Hotel, located in bustling downtown Denver next door to the National Center for Media Forensics and only about three blocks from the Colorado State Capitol building. The accommodations and meeting facilities were well-appointed and comfortable. The attendees took advantage of the traditional AES international conference format, which is deliberately designed to provide opportunities for small-group interaction and easy face-to-face discussion among the participants. It was clear that the attendees made full use of the conference venue’s fine features.

CONFERENCE OPENING

The conference opened on Thursday morning, 14 June, with a very pleasant mid-summer day. Despite the visibility of smoke from the High Park wildfire in the mountains 120 km (75 miles) northwest of Denver, the conditions downtown in the Mile High City were clear and very comfortable. The formal program began with introductory remarks by Jeff Smith, conference chair. Smith, associate director of the National Center for Media Forensics of the University of Colorado Denver (NCMF UCD), gave a warm welcome and introduction to the participants, sponsors, and special guests and provided a compelling overview of the conference. Smith thanked the local organizing committee and volunteers from the AES Colorado Section and the UCD AES Student Section. Smith introduced Roderick Nairn, UCD provost and vice chancellor for academic and student affairs, who added his official words of welcome on behalf of the University of Colorado Denver campus.

Key sponsors and exhibitors for the AES 46th Conference included Agnitio, Blue Collar Audio, Cognitech, Digital Audio Corporation, and AES Sustaining Member iZotope, Inc. In addition to brief opening presentations, the exhibitors each provided information and hands-on demonstrations throughout the conference in the poster presentation area immediately adjacent to the main conference room.

Jeff Smith acknowledged Geoffrey Stewart Morrison, who had provided a special pre-conference evening event entitled “Workshop on Validity and Reliability in Forensic Voice Comparison.” He also acknowledged the NCMF staff for providing special pre- and postconference training courses entitled “Forensic Authentication of Digital Audio” and “Forensic Audio Enhancement.”

KEYNOTE LECTURE

Phil Mellinger, chief scientist at Trusted Knight Corporation, provided a special opening keynote presentation based upon his published article from Forensic Magazine (vol. 8, no. 1, Feb/Mar 2011, pp. 19–24) about his recent work investigating the infamous “18½ minute gap” in one of the key Watergate tapes recorded June 20, 1972, from the Nixon White House. In 2004, Mellinger became interested in some of the unresolved mysteries of the Watergate era, such as the unknown (at that time) identity of the individual known as “Deep Throat” who had provided information to Washington Post reporters Bob Woodward and Carl Bernstein. Even after Deep Throat’s identity was revealed in 2005 to be W. Mark Felt, FBI associate director (1971–1973), the Watergate topics remained of interest to Mellinger, and he explained how he became particularly curious about the mystery of how the 18½ minute gap occurred. Various theories have been proposed over the years about who might have been involved in the erasure besides President Nixon’s secretary, Rose Mary Woods. Mellinger presented his evidence that the “gap” tape was erased in three separate stages: first a 4-minute-35-second portion using a recorder in Rose Mary Woods’ office, a second 12-minute-46-second erasure that took place somewhere other than Woods’ office, and a final 1-minute-9-second erasure while again in Woods’ office.

TECHNICAL PROGRAM

The conference papers co-chairs, Catalin Grigoras of the National Center for Media Forensics, Denver, CO, and Durand Begault of Charles M. Salter Associates, San Francisco, CA, put together a wide range of paper sessions covering audio forensic research and practice. The conference workshops co-chairs, Eddy Brixen of EBB-Consult, Denmark, and Christopher Peltier of Charles M. Salter Associates, San Francisco, did an equally fine job developing a fascinating slate of topics and workshop presenters.


Phil Mellinger during his keynote address on the Watergate tapes.

Jeff Smith, conference chair, opens the event.

Roger Furness, AES deputy director, welcomes delegates to Denver.


Workshop 1: Forensic Audio Enhancement

The technical portion of the conference began with a special workshop session on audio enhancement, moderated by Christopher Peltier. The first presenter in the workshop was Mark Huckvale of University College London, London, UK, on the topic of enhancement of speech in noise. Huckvale described several research projects conducted by the Centre for Law Enforcement Audio Research (CLEAR), a joint research center of University College London and Imperial College London. Among their findings was a way to describe the effect of audio enhancement processing as a “noise” shift in the psychometric function, which helps describe the common situation in which enhanced audio is judged to be of better perceptual quality than the original signal, but the intelligibility of the speech is actually degraded. Researchers at the CLEAR center performed a study with twelve enhancement modules from five commercial audio enhancement systems over the full range of each system’s control parameters and found that all had at least one set of parameters that slightly increased intelligibility (up to 2 dB improvement), but the modules also had settings that degraded intelligibility by an even greater amount (up to 3.8 dB degradation). Based on these findings, Huckvale’s group recommends that forensic examiners be very careful about choosing the appropriate product and parameter settings to suit the signal and the processing scenario.

The second workshop presenter was Eddy Brixen, who provided a very interesting set of recommendations regarding the preparation of audio material for presentation to untrained listening panels, such as juries in court proceedings. Listeners in these situations are generally not screened for hearing impairment, and the playback circumstances and acoustical surroundings are typically uncontrolled. Brixen’s recommendations include the need to explain carefully what is to be heard, to make sure the playback level is appropriate, and to review the examples using the same equipment that will be used in court.

Paper Session 1: Enhancement of Forensic Audio I

After the opening lunch break on Thursday, the conference paper presentations began with the topic of audio enhancement. The first paper in the session concerned error concealment in audio signals. The work by Stephan Preihs, Fabian-Robert Stöter, and Jörn Ostermann of Leibniz Universität Hannover was entitled “Low Delay Error Concealment for Audio Signals.” The presentation described two model-based methods suitable for real-time extrapolation of missing audio samples in a stream using Kalman filtering or variable-order linear prediction. Although not specifically tied to audio forensics, the error concealment concepts could be applied to recover degraded audio from forensic sources.

The second enhancement paper was entitled “Music and Noise Fingerprinting and Reference Cancellation Applied to Forensic Audio Enhancement,” by Anil Alexander and Oscar Forth of Oxford Wave Research, and Donald Tunstall of Digital Audio Corporation. The authors described an interesting method to cancel interfering sound in a forensic audio recording if the unwanted sound is from a known source, such as a commercial music recording or an archived broadcast. Their approach is to identify the interfering sound material, time-align the known material with the forensic recording, and then use a least-mean-square (LMS) algorithm to model adaptively the acoustical effects of the room in which the recording was made. The approach can also be applied if simultaneous recordings are made in the same room using spatially separated microphones or even using completely separate recording systems (e.g., two smartphones). As long as the time-varying time alignment of the recordings can be estimated, the cancellation quality can be very good.
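The adaptive cancellation step can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a textbook LMS filter, assumes the reference is already time-aligned with the recording, and runs on synthetic data. The function name `lms_cancel`, the three-tap "room" response, and the step size are all assumptions chosen for illustration.

```python
import math
import random


def lms_cancel(recording, reference, num_taps=8, step=0.05):
    """Subtract an adaptively filtered copy of `reference` from `recording`.

    Returns the residual, i.e. the recording with the known interference reduced.
    Assumes both signals are already time-aligned and equally sampled.
    """
    weights = [0.0] * num_taps   # adaptive estimate of the room response
    history = [0.0] * num_taps   # most recent reference samples, newest first
    residual = []
    for observed, ref_sample in zip(recording, reference):
        history = [ref_sample] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = observed - estimate
        # LMS update: move the weights along the instantaneous error gradient.
        weights = [w + step * error * h for w, h in zip(weights, history)]
        residual.append(error)
    return residual


def mean_square(values):
    return sum(v * v for v in values) / len(values)


# Synthetic demonstration (all values below are illustrative assumptions).
rng = random.Random(0)
n = 4000
reference = [rng.uniform(-1.0, 1.0) for _ in range(n)]                # known "music"
room = [0.6, 0.3, -0.1]                                               # hypothetical room response
speech = [0.1 * math.sin(2 * math.pi * 0.01 * i) for i in range(n)]   # wanted signal
interference = [
    sum(room[k] * reference[i - k] for k in range(len(room)) if i - k >= 0)
    for i in range(n)
]
recording = [s + v for s, v in zip(speech, interference)]

residual = lms_cancel(recording, reference)

# Compare interference power before and after, over the last 1000 samples.
tail = slice(n - 1000, n)
before = mean_square(interference[tail])
after = mean_square([r - s for r, s in zip(residual[tail], speech[tail])])
```

In this toy setting the residual interference power (`after`) ends up far below the original (`before`) once the filter has converged. A real recording would also need the time-varying alignment that the paper emphasizes, which this sketch leaves out.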

Poster Session 1: Miscellaneous Techniques

A scheduled break in the middle of the afternoon provided an opportunity for the participants to enjoy some snacks and beverages while visiting with the conference exhibitors and the first poster session authors.

A poster entitled “Tone Removal Using a Band Focus Speech Reconstruction Algorithm” by Darren M. Haddad and Andrew J. Noga of the Air Force Research Laboratory, Rome, NY, presented a method to remove very narrowband interference from audio recordings using a super-resolution spectrum-analysis technique.

Two of the posters dealt with the observation that some digital audio recorders and coding algorithms might introduce a systematic DC offset that might be useful for audio authentication. “The Effects of Audio Compression Algorithms on DC Offset,” by Daniel Fuller, and “The Gain Effect: How Does it Affect the DC?” by Sean Jacobson, both of the National Center for Media Forensics, Denver, CO, reported on experiments to measure and classify the offset characteristics.

Authors, from top: Mark Huckvale, Richard Conners, Daniel Rappaport, Geoffrey Stewart Morrison, Keith McElveen


The fourth poster in session 1 was “Designing an Automated Gunshot Detection and Image Response System,” by Jordan R. Graves, also of NCMF, Denver. Graves was unfortunately unable to attend the session, but his poster described plans for a camera system to be steered by acoustic detectors and triangulation.

Paper Session 2: Enhancement of Forensic Audio II

Following the enjoyable break and poster session, the late afternoon technical session convened with two paper presentations on the topic of audio enhancement. The first paper, “Enhancing Low SNR Speech Corrupted by Non-Stationary Tonal Noises,” by Scott Nordlund and J. Keith McElveen, covered a preprocessing technique to model the tonal peaks in the noise spectrum under the assumption that the noise characteristics vary more slowly than the desired speech spectrum. The process flattens (whitens) the noise spectrum by reducing the spectral peaks. The resulting signal is then more suitable for subsequent processing with spectral subtraction or some other noise reduction technique.

The next paper, entitled “Effects of Replay on the Intelligibility of Noisy Speech,” was presented by Mark Huckvale of University College London. The paper described a very interesting experiment to test the typical intuition that repeated listening to a noisy speech recording will improve its intelligibility. The results indicate that listeners do get improvement by listening more than once, but the improvement does not continue to increase after four or five repetitions, even though the listeners think that their performance continues to increase upon hearing even more repetitions. Huckvale explained that the intelligibility improvement attributable to the multiple repetitions is approximately the same as a 1.5 dB improvement in signal-to-noise ratio (SNR).

Upon the conclusion of the first day of the conference, the attendees gathered for conversation, snacks, and beverages at a cocktail reception in the 15th floor ballroom. Amid views of the east-central Denver neighborhoods, the AES conference participants enjoyed the opportunity to relax and unwind while engaging one another in lively discussions of the day’s topics and presentations. The combination of first-time attendees and AES members who have attended numerous AES conferences provided an excellent setting for collegial discussion.

Paper Session 3: ENF Analysis I

Friday, June 15, 2012, opened with a fine informal buffet breakfast and coffee service. The technical sessions began with a set of technical papers covering various aspects of Electrical Network Frequency (ENF) analysis. ENF refers to the tell-tale presence of “hum” in an audio recording due to leakage or coupling of the alternating current (AC) signal from the electrical power grid into the audio circuits. If an ENF signature is detected in the audio, it is possible to compare the subtle frequency fluctuations in the ENF signal to a database of electrical grid frequency history information and thereby estimate the date and time at which the audio recording was made. The electrical grid frequency tends to vary around its nominal value (e.g., 60 Hz in the U.S., 50 Hz in Europe) due to the normal, random variation in generator capacity and electrical load changes on the grid.

The first ENF paper, “Phase & Amplitude Analysis of the ENF for Digital Audio Authentication,” was by Sean Coetzee of Prism Forensics, Los Angeles, CA. Coetzee described his work in examining the phase of the extracted ENF signal to detect discontinuities or phase “jumps” that might indicate tampering. Next, Richard Conners of Virginia Tech, Blacksburg, VA, presented a paper entitled “Effects of Oscillator Errors on ENF Analysis,” which investigated the effect of crystal oscillator discrepancies between the recording device and the ENF reference database. If the recorder’s clock crystal is inaccurate, a systematic frequency offset will be present between the ENF extracted from the audio signal and the separately determined grid reference. Conners’ research team has developed a reliable procedure to detect and compensate for the discrepancies attributable to the crystal oscillator frequency errors.

The third paper in the ENF session was by Harrison Archer of the National Center for Media Forensics, Denver. Archer’s paper, “Quantifying Effects of Lossy Compression on ENF Signals,” reported on his MS thesis work that examined the effects of ten compression algorithms on 100 different hours of recorded ENF signals sampled over a five-month period. Archer found that the ENF information appeared satisfactory for automated matching with eight of the ten tested coding algorithms. The test was done on ENF signals alone, and therefore future work will be needed to determine the effect on the low-frequency ENF band when regular audio material is present, especially in the case of perceptual audio codecs.

The final paper in the Friday morning session was “A Study of the Accuracy and Precision of Quadratic Frequency Interpolation for ENF Estimation,” presented by Richard Conners of Virginia Tech. The research involved the use of parabolic (second-order) interpolation of discrete Fourier transform (DFT) magnitude spectra to refine the frequency estimate for underlying signal components presumed to be quasi-sinusoidal, such as the ENF component in an audio recording. The work largely duplicated and confirmed the results from prior studies in sinusoidal analysis dating back to the 1980s.
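Parabolic peak interpolation is compact enough to show as a sketch. The version below is a generic textbook form, not the procedure evaluated in the paper: it assumes a Hann window and fits the parabola to log magnitudes (a common variant), and the names `dft_bin` and `estimate_peak_frequency` are hypothetical. A direct DFT over a narrow search band keeps the example dependency-free.

```python
import cmath
import math


def dft_bin(samples, k):
    """Direct evaluation of one DFT bin (slow, but needs no libraries)."""
    n_total = len(samples)
    return sum(
        s * cmath.exp(-2j * math.pi * k * n / n_total)
        for n, s in enumerate(samples)
    )


def estimate_peak_frequency(samples, sample_rate, f_low, f_high):
    """Refine the strongest spectral peak in [f_low, f_high] by fitting a parabola.

    Assumes f_low is high enough that bin (k_low - 1) is non-negative and that
    the true peak lies inside the search band.
    """
    n_total = len(samples)
    # Hann window to reduce leakage before measuring the peak.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / n_total) for n in range(n_total)]
    windowed = [s * w for s, w in zip(samples, window)]

    k_low = int(math.floor(f_low * n_total / sample_rate))
    k_high = int(math.ceil(f_high * n_total / sample_rate))
    log_mag = {
        k: math.log(abs(dft_bin(windowed, k)) + 1e-12)
        for k in range(k_low - 1, k_high + 2)
    }

    k_peak = max(range(k_low, k_high + 1), key=lambda k: log_mag[k])
    a, b, c = log_mag[k_peak - 1], log_mag[k_peak], log_mag[k_peak + 1]
    # Vertex of the parabola through the three points, in bins relative to k_peak.
    offset = 0.5 * (a - c) / (a - 2 * b + c)
    return (k_peak + offset) * sample_rate / n_total


# One second of a 60.3 Hz tone sampled at 1 kHz: the DFT bins are 1 Hz apart,
# so the raw peak bin alone would report 60 Hz.
fs = 1000.0
tone = [math.sin(2 * math.pi * 60.3 * n / fs) for n in range(1000)]
estimate = estimate_peak_frequency(tone, fs, 55.0, 65.0)
```

On this clean synthetic tone the interpolated estimate lands within a few hundredths of a hertz of 60.3 Hz, much closer than the 1 Hz bin spacing. Accuracy on real hum depends on noise, window choice, and frame length, which is the kind of question the paper studied.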

Delegates concentrate hard during a presentation on Electrical Network Frequency analysis.


Workshop 2: WinHex for Forensic Audio Analysis

The Friday morning program concluded with a special workshop presentation by Doug Lacey of BEK TEK LLC, Clifton, VA. Lacey provided an overview and simple tutorial of digital forensics features of the WinHex software package. WinHex is a product of X-Ways Software Technology AG of Cologne, Germany. WinHex provides a wide variety of software features for displaying binary digital file contents in hexadecimal form, giving a visual representation of the file contents. Lacey explained several forensic analysis situations in which the interpretation of digital file contents can be helpful, such as interpretation of unknown file formats and proprietary coding methods.

Paper Session 4: ENF Analysis II

After an enjoyable lunch break, the attendees reconvened for the Friday afternoon paper presentations to hear about additional results on ENF analysis. The first paper, “Advances in Electric Network Frequency Acquisition Systems and Stand Alone Probe Applications for the Authentication of Digital Media,” was presented by Chris Jenkins of Blue Collar Audio, Denver, CO. Jenkins described several important considerations necessary to ensure the quality and integrity of ENF data acquisition and storage systems for use as reference databases. In particular, Jenkins pointed out the issue of time base and sampling synchronization, since the digital sample clock timing in most analog-to-digital conversion systems may not be sufficiently accurate for long-term timing reliability.

The second paper in the session was another presentation by Richard Conners of Virginia Tech. His presentation, “Using Simple Monte Carlo Methods and a Grid Database to Determine the Operational Parameters for the ENF Matching Process,” described an investigation of how reliable the matching process may be when comparing an ENF record extracted from an audio recording with the reference database. Conners pointed out that the extracted ENF will have noise and distortion that could reduce the confidence of finding a match between the evidentiary recording and the database. Additional work will be needed to quantify the typical noise characteristics of extracted ENF sequences, and to determine the degree to which the noise affects a forensic examiner’s interpretation of the ENF results.

The ENF session concluded with a paper authored by Catalin Grigoras and Jeff Smith of the National Center for Media Forensics entitled “Advances in ENF Analysis for Digital Media Authentication.” Grigoras explained a series of investigations into the quality and reliability of ENF extraction and analysis. A potentially large range of candidate matches can occur when comparing an extracted ENF segment with a reference database using cross-correlation or using mean quadratic difference. The length of the evidentiary recording and the criteria used to establish a match can have a large effect on the reliability of the process.
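The mean-quadratic-difference comparison can be sketched as a sliding search that ranks candidate offsets. This is an illustration under stated assumptions, not the authors' procedure: the "database" is a synthetic random walk around 60 Hz, the function name `rank_candidates` is hypothetical, and the noise levels are arbitrary.

```python
import random


def rank_candidates(extracted, reference, top=3):
    """Slide `extracted` along `reference` and rank the offsets.

    Returns the `top` (offset, mean_squared_difference) pairs, best match first.
    """
    length = len(extracted)
    scores = []
    for offset in range(len(reference) - length + 1):
        window = reference[offset:offset + length]
        msd = sum((e - r) ** 2 for e, r in zip(extracted, window)) / length
        scores.append((offset, msd))
    return sorted(scores, key=lambda pair: pair[1])[:top]


# Synthetic stand-in for a grid database: a slow random walk around 60 Hz.
rng = random.Random(1)
reference = [60.0]
for _ in range(1999):
    reference.append(reference[-1] + rng.gauss(0.0, 0.002))

# "Evidentiary" ENF: a 60-sample excerpt starting at index 700, plus noise.
true_offset = 700
extracted = [
    f + rng.gauss(0.0, 0.0005)
    for f in reference[true_offset:true_offset + 60]
]

candidates = rank_candidates(extracted, reference)
```

Here the best candidate is the true offset, with its neighbors as runners-up. Shortening the excerpt or raising the noise makes the ranking less decisive, which mirrors the point about recording length and match criteria.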

Poster Session 2: ENF Analysis

The mid-afternoon break for refreshments, exhibits, and posters included three presentations on ENF analysis. A poster by Nicholas Ng of the National Center for Media Forensics described an assessment of the integrity and consistency of the two ENF databases currently being recorded in Denver and in Las Vegas. Alireza Sanaei of Anglia Ruskin University, Cambridge, UK, presented his work entitled “Tuning and Optimization of an ENF Extraction Algorithm,” which involved optimizing ENF extraction by using the network frequency harmonics, not just the fundamental frequency. Given a certain frequency measurement precision, Sanaei showed that the fundamental frequency estimate is improved by looking at the harmonic frequencies and then dividing by the harmonic number. The third poster in the session was presented by Anthony Nash of Charles M. Salter Associates, Inc., of San Francisco, CA. Nash’s presentation, coauthored by Durand Begault and Christopher Peltier, considered how best to quantify the accuracy and precision of ENF measurements, particularly the ability to trace the frequency measurement to a standard time reference. Nash expressed his recommendation that the ultimate usefulness of ENF databases will require quantifying the measurement uncertainty of frequency counters and frequency measurement algorithms that are calibrated and traceable back to a national frequency standard.
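The harmonic-division argument in Sanaei’s poster can be written out as a short worked relation. It is a simplified reading, assuming the measurement error $\epsilon$ has about the same size at every harmonic and that the $k$-th harmonic is strong enough to measure. The symbols $f_0$, $\hat{f}_k$, and $\epsilon$ are introduced here for illustration only.

$$
\hat{f}_0^{(k)} \;=\; \frac{\hat{f}_k}{k} \;=\; \frac{k f_0 + \epsilon}{k} \;=\; f_0 + \frac{\epsilon}{k}
$$

Under these assumptions, a 0.01 Hz measurement error at the third harmonic (180 Hz on a 60 Hz grid) corresponds to an error of about 0.0033 Hz in the fundamental.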

Paper Session 5: Audio Authentication

As the conference momentum continued to build, the final Friday paper session comprised a collection of four papers in the area of audio interpretation and authentication. The first paper, “Analytical Framework for Digital Audio Authentication,” was presented by Daniel Rappaport of the National Center for Media Forensics. Rappaport gave an interesting overview of the authentication opportunities and challenges in the era of digital audio recording, editing, and processing. The second session paper was by Hafiz Malik of the University of Michigan-Dearborn, Dearborn, MI, who presented “Microphone Identification Using Higher-Order Statistics.” Malik’s work involved a set of experiments to determine if there are sufficient unique artifacts to be found in an audio recording that are attributable to the microphone used to make the recording. The goal is to have a method to detect forgeries or fabricated evidence by noting inconsistencies between the recorded signal and the microphone characteristics. Malik explained that this is a difficult problem because it is not easy to separate the distortion of the microphone from the spectral characteristics of the signal itself. Further work will be needed to deal with the many variables involved in the signal chain of even a seemingly simple audio recording, and to handle the nonlinear aspects that may influence the higher-order statistical methodology.

The third presentation was entitled “Using Ripple Signals for the Authentication of Audio,” by Dagmar Boss of the Bayerisches Landeskriminalamt (Bavarian State Criminal Police Office), München (Munich), Germany. Boss explained that special signaling tones are used in the electrical power system in some countries such as Germany, Australia, New Zealand, South Africa, and the United Kingdom. The special signaling tones, known as ripple signals, are in the 100 Hz to 2 kHz range, and are used to provide load management of the electrical grid. Devices attached to the power grid can detect the ripple signals and adjust their load behavior, for example, by shutting down nonessential systems during peak load conditions. Just like the ENF signals, the ripple signals can be coupled into an audio recording system and may be detectable in the recorded audio data. Boss suggested that the ripple signals could potentially provide an additional piece of evidence for assessing digital audio authenticity.

Interested delegates study the poster on ENF by Nash, Begault, and Peltier.

The final paper in the Friday session was “Evaluation of the Average DC Offset Values for Nine Small Digital Audio Recorders,” by Bruce Koenig of BEK TEK LLC, Clifton, VA. Koenig described the results of an experiment to see if recordings made with nine different off-the-shelf digital audio recorders might exhibit DC offset characteristics that would be meaningful for evaluating forensic audio authenticity. Unfortunately, the preliminary results were found to have inconsistencies and a sufficiently large variance that would likely rule out this approach for authenticity assessment.

Friday Social Event: Banjo Billy and Katie Mullen

Eager to wind down after a couple of days of intense learning and discussion, the conference attendees welcomed the opportunity to enjoy an informal social event on Friday evening. The conference organizers chartered an excursion through downtown Denver with Banjo Billy, a local Denver personality who guides historic tours on a bus modified to look like the ramshackle home of a hillbilly. The bus picked everyone up at the hotel and then drove around the downtown area while Banjo Billy shared Denver history, jokes, ghost stories, and other local lore while the riders enjoyed a selection of beverages obtained from several Colorado breweries and wineries.

Fortunately, the all-too-soon conclusion of the bus tour did not signal the end of the festivities, as the group next enjoyed a fine reception and dinner at Katie Mullen’s Irish Restaurant and Pub, located on the fashionable 16th Street Mall in the Sheraton Denver Downtown Hotel. Following dinner, many conference participants headed for a stroll along the 16th Street Mall to enjoy the sights and sounds of a comfortable Friday evening, before walking a few blocks back to the conference venue at the Warwick Hotel.

Paper Session 6: Miscellaneous Techniques

Saturday, June 16, 2012, marked the third and final day of the conference. With the Friday evening social activities still fresh in everyone’s memory, the breakfast buffet and a large mug of coffee or tea provided a nice way to gear up for the technical sessions.

The opening session on Saturday included four presentations on miscellaneous audio forensics techniques. The first presenter was Mark Huckvale of University College London, who gave a lively and informative talk entitled “Effectiveness of Electronic Voice Disguise Between Friends.” The paper dealt with the audio forensic situation in which the identity of a witness is to be disguised to protect him or her from potential retribution. The challenge is to maintain intelligibility while concealing the witness’s identity even if the defendant is actually well-acquainted with the witness. Huckvale described an experiment in which a group of students who all knew each other very well were asked to identify classmates based on audio recordings of each student reading a short script. The students had a 90% success rate in identifying one another with undisguised playback of the recordings.

The researchers found that even with rather extreme pitch and simulated vocal tract length changes, such as a shift of 8 semitones (frequency shift factor of 1.6) and a tract length change of 120%, identification performance was still better than chance, and certain distinctive talkers were found to be very recognizable. Based on these results, Huckvale suggested that there is a significant risk of identifying a familiar talker even with extensive vocal disguise, and this risk should be considered very carefully by the prosecuting authority.
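The quoted frequency shift factor follows from the usual equal-tempered definition of a semitone, assuming that is the definition intended. With $n$ the shift in semitones and $r$ the resulting frequency ratio (symbols introduced here for illustration):

$$
r = 2^{n/12}, \qquad n = 8 \;\Rightarrow\; r = 2^{8/12} \approx 1.587 \approx 1.6
$$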

Next, Keith McElveen of Wave Sciences Corporation, Charleston, SC, presented a paper entitled “GMM-Based Efficient Language Identification,” which described the use of Gaussian Mixture Models (GMMs) for statistical classification and identification from a recording of speech uttered in an unknown language. McElveen explained the use of a perceptual minimum variance distortionless response (PMVDR) algorithm for the processing front-end. The system successively compares the input speech segment to multiple GMMs, with each GMM trained for a particular language, then selects the best match. The results were very good using the 1994 Oregon Graduate Institute (OGI) Multilanguage Telephone Speech database, with somewhat lower performance using the African Speech Technology (AST) telephone speech database. McElveen suggested that the lower AST performance was due to a mixture of dialects present in each of the AST language groups.

Exhibitors iZotope (top) and Cognitech (bottom) show their analysis and capture tools to delegates.


The third paper, “Automatic Search and Classification of Sound Sources in Long-Term Surveillance Recordings,” was authored by Robert C. Maher and Joseph Studniarz of Montana State University, Bozeman, MT. Maher explained the challenges involved in forensic interpretation of audio surveillance recordings that are days, weeks, or even months in duration. Maher’s research has employed automatic search methods based on spectral templates and two-dimensional (time–frequency) filtering to identify changes in the sonic texture. Maher showed an example experiment that found a probable gunshot within a 30-hour recording.
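As a rough illustration of searching with a spectral template, the sketch below flags analysis frames whose spectral shape resembles a reference spectrum. Cosine similarity is used here as a stand-in measure; it is an assumption for illustration, not the method described in the paper, and the four-band spectra and threshold are invented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two magnitude spectra (1.0 means identical shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_matches(frames, template, threshold=0.9):
    """Indices of frames whose spectral shape is close to the template."""
    return [
        i for i, frame in enumerate(frames)
        if cosine_similarity(frame, template) >= threshold
    ]


# Hypothetical 4-band magnitude spectra, one list per analysis frame.
frames = [
    [1.0, 0.9, 0.1, 0.1],  # low-frequency background
    [1.0, 1.0, 0.1, 0.0],
    [0.8, 0.9, 1.0, 0.9],  # broadband burst
    [1.0, 0.8, 0.2, 0.1],
]
broadband_template = [1.0, 1.0, 1.0, 1.0]
print(find_matches(frames, broadband_template))  # [2]
```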

The final paper of the Saturday morning session was “Carving and Reorganizing Fragmented MP3 Files Using Syntactic and Spectral Information,” presented by Sascha Zmudzinski of the Fraunhofer Institute for Secure Information Technology, Darmstadt, Germany. File carving refers to the recovery and reconnection of deleted or damaged files in a digital storage system, such as a hard disk drive. Modern computer operating systems typically are able to utilize noncontiguous sectors on the disk for storing large files, so a single file may have sections stored in many different physical locations on the disk, which is known as file fragmentation. If the file table is deleted or damaged, it may be difficult to find and piece together the fragmented information. Zmudzinski’s team has developed several techniques to perform file carving for MP3 files, including locating the MP3 header frame codes, applying file structure and frame length rules, and then using spectral matching from the MP3 data to infer the file, frame, and sector continuity.
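The first two carving steps mentioned above (locating frame headers and applying frame length rules) can be sketched as follows. This is a simplified illustration, not the team's tool: it handles only MPEG-1 Layer III headers, skips free-format streams, and leaves out the spectral matching step entirely. The function names are hypothetical.

```python
BITRATES_KBPS = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
SAMPLE_RATES_HZ = [44100, 48000, 32000]


def parse_frame_length(header: bytes):
    """Return the frame length in bytes for an MPEG-1 Layer III header, else None."""
    if len(header) < 4 or header[0] != 0xFF or (header[1] & 0xE0) != 0xE0:
        return None  # 11-bit sync word missing
    version_bits = (header[1] >> 3) & 0x03
    layer_bits = (header[1] >> 1) & 0x03
    if version_bits != 0x03 or layer_bits != 0x01:
        return None  # not MPEG-1 or not Layer III
    bitrate_index = header[2] >> 4
    rate_index = (header[2] >> 2) & 0x03
    if bitrate_index in (0, 15) or rate_index == 3:
        return None  # free-format and reserved values are skipped in this sketch
    padding = (header[2] >> 1) & 0x01
    bitrate_bps = BITRATES_KBPS[bitrate_index] * 1000
    return 144 * bitrate_bps // SAMPLE_RATES_HZ[rate_index] + padding


def find_frame_chains(data: bytes, min_frames: int = 2):
    """Offsets where at least min_frames valid headers follow at the predicted spacing."""
    hits = []
    for start in range(len(data) - 3):
        pos, count = start, 0
        while count < min_frames:
            length = parse_frame_length(data[pos:pos + 4])
            if length is None:
                break
            count += 1
            pos += length
        if count >= min_frames:
            hits.append(start)
    return hits


# Synthetic data: one 417-byte frame (128 kbit/s at 44.1 kHz) repeated three times.
frame = bytes([0xFF, 0xFB, 0x90, 0x00]) + bytes(413)
blob = b"\x00" * 10 + frame * 3
print(find_frame_chains(blob))  # [10, 427]
```

Requiring a chain of headers at the predicted spacing, rather than a single sync word, is what rejects most accidental matches in unrelated data.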

Workshop 3: Expert Testimony

The third workshop for the conference commenced after a brief morning break for refreshments and conversation. Jeff Smith moderated the workshop, entitled “Forensic Science and Expert Testimony: Insights and Strategies from the Advocate’s Perspective,” and introduced the three workshop presenters.

The first presentation was in the form of a pre-recorded video produced by Peter Weinberg, a trial and litigation consultant with Litigare Litigation & Trial Design Consulting, LLC, of Denver, CO. In the video, Weinberg described the role of the expert witness in legal proceedings and emphasized the difference between advocacy for one side or the other in the legal arguments versus being an advocate for the truth. He recommended that the expert witness serve in the role of educator: working to be an effective teacher for the court and for the jury. He expressed his opinion that a good educator must be credible, trusted, and authoritative on technical matters, while also appearing friendly, interested, and engaged. An expert witness who is aloof, argumentative, or seems to be acting as an advocate for the client can often be discounted by the jury.

The second presenter for the expert witness workshop was Joseph Mathews, director of government relations for Milliman Care Guidelines, Seattle, WA. Mathews was unable to be in Denver, but he provided his presentation via an interactive Skype hookup. Mathews spoke about the importance of expert witnesses in civil litigation. Attorneys very quickly think about securing expert testimony when they begin working on a new case. Depositions are very important in civil litigation, and Mathews suggested that expert witnesses be carefully prepared and rehearsed prior to a tough deposition, and that they maintain a consistent tone and demeanor during both direct and cross examination.

The final presenter for the Expert Testimony workshop was Henry R. Reeve, Denver District Attorney’s Office, who spoke about the role of expert testimony in criminal cases. Reeve emphasized the high stakes involved in any criminal proceeding: the court must determine whether or not the government will be allowed to deprive an individual of his or her basic rights and freedom. Reeve recommended developing a clear understanding of each person’s role in the legal process. An expert witness must not try to “out-lawyer” the attorneys, as the roles of the expert and the attorney are different in the court proceeding and judges and juries are not interested in having expert witnesses spar with the legal counsel. He also explained that in most criminal legal venues in the United States the notes prepared by an expert working on a criminal case are generally discoverable, meaning that the expert’s notes must be provided to the prosecuting and defense attorneys.

Paper Session 7: Laboratory Practices

After a relaxing Saturday lunch break, the attendees reconvened for the final afternoon of the conference. David Hallimore of the Houston Police Department and Michael Piper of the U.S. Secret Service provided information and remarks about the Scientific Working Group on Digital Evidence (SWGDE) and its current effort to define a core-competencies document for forensic audio examiners. Hallimore and Piper explained that the core-competencies document is being developed in response to the National Academy of Sciences’ 2009 report “Strengthening Forensic Science in the United States: A Path Forward,” which strongly criticizes the forensic sciences in general for a lack of enforceable standards and accredited education and training opportunities. SWGDE is seeking additional feedback from the audio forensics community as the core-competencies document is revised and finalized.

SOCIAL EVENTS

Banjo Billy’s Bus Tour as enjoyed by many delegates to the 46th conference.

Paper Session 8: Speaker Recognition and Comparison I

The afternoon presentations continued with a paper session on speaker recognition. The first presentation was “Comparing Automatic Forensic Voice Comparison Systems Under Forensic Conditions,” by Timo Becker of the Bundeskriminalamt (Federal Criminal Police Office), Germany. The experiment compared seven custom and commercial voice-comparison systems using audio material from the German Speech Database produced by the Bavarian State Police. The results showed that the seven systems all performed about equally, but each system reacted differently to changes in channel characteristics, spoken dialect, and speaker accent. Becker recommended that human forensic examiners be involved as much as possible in assessing and evaluating the results of the automatic systems.

CONFERENCE COMMITTEE MEMBERS

Top row, from left, Catalin Grigoras, Durand Begault, and Daniel Fuller (volunteer)

Middle row, from left, Wanda Newman and Ann Sanders, Jeff Smith, and Leah Haloin

Bottom row, Eddy Brixen (left) and Christopher Peltier (right), flanking author Mark Huckvale

Next, Geoffrey Morrison of the University of New South Wales, Australia, presented a paper under the intriguing title “What Did Bain Really Say? A Preliminary Forensic Analysis of the Disputed Utterance Based on Data, Acoustic Analysis, Statistical Models, Calculation of Likelihood Ratios, and Testing of Validity.” The audio material was from a notorious 1995 murder case and 2009 retrial in New Zealand. The dispute centered on a marginally intelligible utterance in an emergency-call-center recording that had highly conflicting interpretations by the prosecution and the defense. Morrison described his work to develop a likelihood ratio comparing the probability that the word “shot” was uttered (prosecution theory) with the probability that the word “can’t” was uttered (defense theory), using acoustical analysis of that portion of the audio recording. Morrison concluded that it was 31,000 times more likely that the defendant had uttered “can’t” than that the word “shot” was spoken. He closed his presentation with several recommendations for future investigations of this type and shared some concerns about ensuring the validity of the testing process.
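For readers unfamiliar with the term, a likelihood ratio compares how well two competing hypotheses explain the same measurement. The sketch below is a generic illustration that uses a single Gaussian model per hypothesis. The models, the measured value, and the units are invented for the example and have no connection to the data or statistical models used in the paper.

```python
import math


def gaussian_pdf(x, mean, std):
    """Probability density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def likelihood_ratio(x, model_a, model_b):
    """p(x | hypothesis A) / p(x | hypothesis B) for one acoustic measurement."""
    return gaussian_pdf(x, *model_a) / gaussian_pdf(x, *model_b)


# Hypothetical (mean, std) of some acoustic measurement under each candidate word.
word_a_model = (700.0, 60.0)
word_b_model = (550.0, 60.0)

lr = likelihood_ratio(680.0, word_a_model, word_b_model)
print(lr > 1)  # True: this measurement is better explained by the word A model
```

A ratio above 1 favors the first hypothesis and a ratio below 1 favors the second; the size of the ratio expresses the strength of the evidence.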

Poster Session 3: Speaker Recognition and Comparison

Following the Session 8 paper presentations, the afternoon break and poster session continued the same theme, with four fascinating posters covering topics in voice comparison and voice recognition.

“Development of System for Forensic Applications Using Spanish Words,” presented by Jose Benito Trangol, Universidad Nacional Autonoma de Mexico, Mexico City.

“Case Studies from MPRJ (District Attorney’s Office of Rio de Janeiro, Brazil): Voice Disguise and Automatic Speaker Recognition (ASR),” presented by Eline Portela, Maria Gargaglione, and Mônica Azzariti, CSI/Ministério Público do Estado, Rio de Janeiro, Brazil.

“Influence of Recording Distance on Voice Quality for Use in Speaker Recognition,” by Brian Prendergast, National Center for Media Forensics, Denver.

“Synthetic Voice Forgery and Voice Comparison,” by Guillaume Galou, Gendarmerie Nationale Criminal Research Institute, France.

The presenters provided many key insights into the voice comparison field, and the attendees enjoyed many interesting and informative conversations in the poster area and overflowing into the lobby corridor.

Paper Session 9: Speaker Recognition and Comparison II

Wrapping up the afternoon break and posters, Jeff Smith and the organizing committee invited everyone to bring their beverages and snacks back to the main conference room and reconvened for the final paper session of the conference.

Anibal Ferreira, University of Porto, Portugal, explained his work analyzing speech recordings for tell-tale phase changes that may be useful in speaker identification. His presentation, “Speaker Identification Using Phonetic Segmentation and Normalized Relative Delays of Source Harmonics,” reported on the use of phonetic segmentation and phase extraction for speaker ID purposes, particularly using the vowel portions of the speech segments. Ferreira has found that including phase information can result in better classification performance.

The final paper, “Securing Speaker Verification System Against Replay Attack,” was presented by Hafiz Malik of the University of Michigan-Dearborn. Malik explained that some types of speaker verification systems are used to grant access to restricted facilities or online data transactions, but a potential weakness of such systems is a replay attack in which a prior recording of the authorized talker is played back in an attempt to fool the verification system. Malik’s preliminary work has been focused on modeling microphone nonlinearities and higher-order statistics to detect differences between a live-spoken phrase and a phrase played back from a prior recording.

AES AUDIO FORENSICS: SETTING THE STANDARD

The AES 46th Conference continued the tradition from the three prior AES forensics conferences with superior audio forensics papers and workshops. AES reaffirmed its place as the leading professional group in the field of forensic audio analysis and interpretation. Jeff Smith, conference chair, and Roger Furness, AES deputy director, expressed thanks and praise to the conference committee, and encouraged all participants to be active AES members and to continue presenting their work at future AES events.

It is a great time to be involved in the emerging field of audio forensics. The 46th Conference attendees headed home from Denver invigorated with new and innovative ideas and in ardent anticipation of the next AES Conference on Forensic Audio.

Editor’s note: You can purchase a copy of the conference proceedings at www.aes.org/publications/conferences. Copies of individual papers are available at www.aes.org/e-lib.

Brian Prendergast of the National Center for Media Forensics expounds the ideas on his poster.