Holly5 VoiceXML Developer Guide Holly Voice Platform 5.1
Document number: hvp-vxml-0009
Version: 1-0
Issue date: December 22 2009
Confidential
Final
Holly5 VoiceXML Developer Guide, v1-0, 22 December 2009
2/61
Copyright
© Copyright 2013 West Corporation. These documents are confidential and contain proprietary
information. No part of these documents may be reproduced, published or disclosed in whole or part,
by any means: mechanical, electronic, photocopying, recording or otherwise without the prior written
permission of West Corporation or Holly Australia Pty Ltd.
The information contained in this document is strictly commercial in confidence and can only be
provided to persons who have signed a non-disclosure agreement. This document is not to be copied
without prior written consent.
Control
| Version | Date | Change Notes | Author |
|---|---|---|---|
| 1-0 | 22 Dec 2009 | Approved for release | A Hunt |
Related Documents
| Document Title | Doc Number |
|---|---|
| Holly Voice Platform Release Notes – HVP Release 5.1 | hvp-rpt-141 |
| Holly Management System User Guide – HVP Release 5.1 | hvp-hms-0013 |
| Holly Voice Platform Operations Guide – HVP Release 5.1 | hvp-0028 |
| McGlashan et al., Voice Extensible Markup Language (VoiceXML) Version 2.0, W3C Recommendation 16 March 2004, http://www.w3.org/TR/2004/REC-voicexml20-20040316/ | VoiceXML 2.0 |
| Oshry et al., Voice Extensible Markup Language (VoiceXML) 2.1, W3C Recommendation 19 June 2007, http://www.w3.org/TR/2007/REC-voicexml21-20070619/ | VoiceXML 2.1 |
| D. Kristol and L. Montulli, “HTTP State Management Mechanism”, RFC 2965, October 2000 | RFC2965 |
| T. Berners-Lee, R. Fielding and L. Masinter, “Uniform Resource Identifiers (URI): Generic Syntax”, RFC 2396, August 1998 | RFC2396 |
| McGlashan & Hunt, Speech Recognition Grammar Specification Version 1.0, W3C Recommendation 16 March 2004, http://www.w3.org/TR/2004/REC-speech-grammar-20040316/ | SRGS 1.0 |
| Burnett, Walker & Hunt, Speech Synthesis Markup Language (SSML) Version 1.0, W3C Recommendation 7 September 2004, http://www.w3.org/TR/2004/REC-speech-synthesis-20040907/ | SSML 1.0 |
| Van Tichelen & Burke, Semantic Interpretation for Speech Recognition (SISR) Version 1.0, W3C Working Draft 3 November 2006, http://www.w3.org/TR/2006/WD-semantic-interpretation-20061103/ | SISR 1.0 |
Table of Contents
1. Introduction
2. VoiceXML Documents and Execution
2.1 Standards Compliance
2.1.1 VoiceXML 2.0 & 2.1 Compliance
2.1.2 Compatibility of VoiceXML 2.0 & 2.1
2.1.3 VoiceXML Extensions
2.1.4 Holly Voice Platform Dependent Behaviors
2.2 VoiceXML Document Handling
2.2.1 VoiceXML Document Headers
2.2.2 Header and Parsing
2.2.3 Scope of VoiceXML Properties
2.2.4 Character Sets and Supported Languages
2.3 Events
2.4 Errors
2.4.1 Fetch Errors
2.4.2 Other Errors
2.5 Fetching Behaviors
2.5.1 Initial Document Failover
2.5.2 HTTP
2.5.3 HTTPS
2.6 HTTP User-Agent Header
2.6.1 DNS Resolving Behavior
2.6.2 Cookies
2.6.3 Caching
2.6.4 Disabling Caching
2.7 Default Property Values
2.8 Access Control on <data>
2.9 Browser Protections
3. Input: Speech Recognition and DTMF
3.1 Selecting a Speech Recognizer
3.1.1 ASR Engine Switching
3.1.2 Allowed ASR Engines
3.2 ASR-Specific Behaviors
3.2.1 Engine-Specific Properties
3.2.2 vLingo
3.2.3 Loquendo
3.3 ASR Sessions
3.4 Timeouts (speech)
3.5 Confidence Scores
3.6 N-Best
3.7 Record
3.8 Recording User Utterances during Recognition
3.9 Re-Recognition from Recorded Utterances
3.10 Grammars
3.10.1 Standard Grammars
3.10.2 Proprietary Grammars
3.10.3 External Grammars
3.10.4 Pre-built and Binary Grammars
3.10.5 Universal Command Grammars
3.10.6 Builtin Support
3.10.7 Grammar Fetch Behavior
3.11 DTMF
3.11.1 interdigittimeout and termtimeout
3.11.2 termchar
3.12 DTMF Buffering
3.13 Holly DTMF Recognizer v2
4. Output: Prompting and TTS
4.1 Selecting a TTS Engine
4.1.1 TTS Switching
4.1.2 Using a Non-default TTS Voice
4.1.3 SSML
4.2 Audio Files
4.2.1 Throwing Errors on Audio Fetch Failures
4.3 <mark> Element
5. Telephony: Session & Transfers
5.1 Session Variables
5.1.1 Session Variables for Outbound
5.2 Session.connection.aai Example
5.3 Passing Data Between Sessions
5.4 Transfers
5.4.1 Transfer Types
5.4.2 Destination URIs
5.4.3 Transfer CLID
5.4.4 Recognition During Transfer
5.4.5 Whisper Transfer
6. Logging
6.1 Events
6.1.1 Configuring Event Logging
6.2 <log> Element
6.2.1 Label on <log>
6.2.2 Changing the Event Type
6.2.3 Objects and Arrays
6.2.4 ECMAScript Log Function
6.3 Call Record: LOG_CALLS
6.4 Log Suppression
6.4.1 Exceptions to Suppression
6.4.2 Record of Suppression
6.4.3 Logging Masked Data
6.5 Raising Alarms
A. Appendix: Application Parameters
A.1 VoiceXML
A.2 Speech Recognition
A.3 DTMF
A.4 Text to Speech
A.5 Logging
A.6 Telephony
B. Appendix: Re-Recognition from Recorded Utterance
B.1 Re-recognition in VoiceXML Applications
B.2 Prompts and Barge-in
B.3 ASR Configuration
C. Appendix: Holly DTMF Recognizer v2
C.1 SRGS+XML
C.2 Sample grammars
1. Introduction
The purpose of this document is to provide a guide for VoiceXML developers who are constructing
VoiceXML applications to run on the Holly Voice Platform. It documents the characteristics of the Holly
VoiceXML implementation and details the supported VoiceXML extensions to the standard.
This document does not provide a full introduction to programming in VoiceXML and some knowledge of
the VoiceXML 2.0 & 2.1 standards is assumed. See
VoiceXML 2.0: http://www.w3.org/TR/2004/REC-voicexml20-20040316/ (16 March 2004)
VoiceXML 2.1: http://www.w3.org/TR/2007/REC-voicexml21-20070619/ (19 June 2007)
2. VoiceXML Documents and Execution
2.1 Standards Compliance
2.1.1 VoiceXML 2.0 & 2.1 Compliance
Holly is a VoiceXML 2.0 and VoiceXML 2.1 conformant platform according to the following specifications.
VoiceXML 2.0: http://www.w3.org/TR/2004/REC-voicexml20-20040316/ (16 March 2004)
VoiceXML 2.1: http://www.w3.org/TR/2007/REC-voicexml21-20070619/ (19 June 2007)
The Holly Voice Platform supports all required capabilities defined in these standards and supports
most of the optional features as documented in this guide.
In terms of the VoiceXML architectural model, the Holly Voice Browser comprises a VoiceXML
Interpreter and a VoiceXML Interpreter Context integrated with the Holly Voice Platform.
2.1.2 Compatibility of VoiceXML 2.0 & 2.1
VoiceXML 2.1 is fully compatible with VoiceXML 2.0 so there is no requirement for application
migration. Further, Holly enables VoiceXML 2.0 and 2.1 applications to co-reside on the same platform
and even allows a single call or application to mix VoiceXML 2.0 and 2.1 content.
2.1.3 VoiceXML Extensions
Holly sees clear benefit to customers by providing faithful and compliant implementation of open
standards. In addition to having a Certified 100% Compliant implementation of the VoiceXML standard,
Holly has implemented limited extensions where required to provide customers with functionality that
cannot be directly implemented through the standard. These extensions are clearly labeled as such in
this document.
2.1.4 Holly Voice Platform Dependent Behaviors
In implementing the VoiceXML specification, Holly delivers a range of value-add capabilities that
enhance its utility for development and operations staff whilst maintaining full compatibility with the
standard.
Note: Some properties depend on the configuration of the Holly Voice Platform as described
elsewhere in this document. Platform administrators may choose to switch off support for
some features and therefore properties may have no effect despite being set correctly.
2.2 VoiceXML Document Handling
2.2.1 VoiceXML Document Headers
When using VoiceXML 2.1 features, the following declaration must be present in the document header:
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
Following is a sample VoiceXML 2.1 header:
<vxml xmlns="http://www.w3.org/2001/vxml"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2001/vxml
                          http://www.w3.org/TR/voicexml20/vxml.xsd"
      version="2.1">
Following is a sample VoiceXML 2.0 header:
<vxml xmlns="http://www.w3.org/2001/vxml"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2001/vxml
                          http://www.w3.org/TR/voicexml20/vxml.xsd"
      version="2.0">
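For orientation, the sample headers above only open the root element. A minimal complete document built on the VoiceXML 2.1 header might look like the following sketch. The XML declaration, the form id and the prompt text are illustrative assumptions, not taken from Holly documentation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal VoiceXML 2.1 document; form id and prompt text are illustrative only -->
<vxml xmlns="http://www.w3.org/2001/vxml"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2001/vxml
                          http://www.w3.org/TR/voicexml20/vxml.xsd"
      version="2.1">
  <form id="hello">
    <block>
      <prompt>Hello from a VoiceXML 2.1 document.</prompt>
    </block>
  </form>
</vxml>
```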
2.2.2 Header and Parsing
For conforming validation of input VoiceXML documents, the ‘xmlns’ attribute must be supplied on the
root element of the document with the value “http://www.w3.org/2001/vxml” as required by
VoiceXML 2.0.
The handling of the ‘xmlns’ attribute is documented in the following table.
| ‘xmlns’ attribute value | Document Treatment |
|---|---|
| http://www.w3.org/2001/vxml | Indicates that the document is a Conforming VoiceXML 2.0 document. The Holly Voice Platform performs strict document checking according to the standard. |
| <not defined> | The Holly Voice Platform assumes a VoiceXML 2.0 document and performs loose validation of the document content. |
| <other namespaces> | The Holly Voice Platform throws an ‘error.badfetch’ event. |
The application parameter ‘nonstrictvxml’ can be set to “true” to achieve loose validation of
document content even when the ‘xmlns’ attribute is present on the root element of documents. This
parameter affects the parsing of all documents loaded during a call, and also causes the ECMAScript
interpreter to ignore references to undefined properties.
The ‘nonstrictvxml’ parameter can only be set as an application parameter via the Holly Management
System; it cannot be set as a VoiceXML property.
With ‘nonstrictvxml’ set to “true”, the parser uses the DTD rather than the VoiceXML schema.
2.2.3 Scope of VoiceXML Properties
The VoiceXML property behaviors are fully implemented following Section 6.3 of VoiceXML 2.0.
Properties may be declared with application scope (in the root document), with document scope
(within a <vxml> element), or for a particular <menu>, <form>, or form item.
Properties apply to their parent element and all the descendants of the parent. A property at a lower
level overrides a property at a higher level. When different values for a property are specified at the
same level, the last one in document order applies. Properties specified in the application root
document provide default values for properties in every document in the application; properties
specified in an individual document override property values specified in the application root document.
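The override order described above can be sketched with the standard ‘timeout’ property set at three levels. The values, form and field names are illustrative, and the built-in field types are used only to keep the example short; they are assumptions rather than Holly recommendations.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <!-- Document scope: applies to every dialog in this document -->
  <property name="timeout" value="5s"/>

  <form id="account">
    <!-- Form scope: overrides the document-level value inside this form -->
    <property name="timeout" value="8s"/>

    <field name="pin" type="digits">
      <!-- Form-item scope: overrides the form-level value for this field only -->
      <property name="timeout" value="12s"/>
      <prompt>Please enter your PIN.</prompt>
    </field>

    <field name="confirm" type="boolean">
      <!-- No local property here, so the form-level value of 8s applies -->
      <prompt>Is that correct?</prompt>
    </field>
  </form>
</vxml>
```

A value set in the application root document would sit one level above the document-scope property shown here and would be overridden by it.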
Additionally, the Holly Voice Platform provides the ability to set properties at the session level
(interpreter context). Session properties are set as an Application Parameter on the Applications page
in the Holly Management System (administrators should refer to the Holly Management System User
Guide for details on modifying application parameters). The session scope is above the application
scope as defined in VoiceXML 2.0, section 5.1.2.
2.2.4 Character Sets and Supported Languages
By default the Holly Voice Platform assumes a language identifier of ‘en-AU’. Other values may be
specified in VoiceXML documents using the ‘xml:lang’ attribute, but they will result in
‘error.unsupported.language’ events being thrown unless they are also listed in the ‘Supported
Languages’ configuration parameter. Note that ASR and TTS engines may require additional
configuration to support other languages.
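As a sketch of the behavior above, a prompt can request a non-default language with ‘xml:lang’, and a handler can catch the resulting event if that language is not configured. The language code “fr-FR”, the prompt text and the fallback handling are illustrative assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <!-- Runs if fr-FR is not listed in the 'Supported Languages' parameter -->
  <catch event="error.unsupported.language">
    <log>Unsupported language: <value expr="_message"/></log>
    <prompt>Sorry, that language is not available.</prompt>
    <exit/>
  </catch>

  <form id="greeting">
    <block>
      <!-- fr-FR is an example value; the platform default is en-AU -->
      <prompt xml:lang="fr-FR">Bienvenue.</prompt>
    </block>
  </form>
</vxml>
```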
HVP 5.1 has support for the Latin-1 character set (ISO-8859-1) in grammar files, TTS strings, ECMAScript,
inline grammar content, and similar functionality, e.g. “é” as in “café”.
HVP 5.1 supports UTF-8 in recognizers, TTS, VoiceXML applications, grammars and reporting.
2.3 Events
Errors on Document Transition
The Holly Voice Platform handles document transition errors, typically ‘error.badfetch’, in the scope of
the calling document.
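Because the error is raised in the scope of the calling document, a handler placed there can recover from a failed transition. The following is a minimal sketch; the target URI “next.vxml” and the prompt wording are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form id="main">
    <!-- Handles a failed transition in this (calling) document -->
    <catch event="error.badfetch">
      <prompt>Sorry, that service is unavailable right now.</prompt>
      <exit/>
    </catch>

    <block>
      <!-- next.vxml is a placeholder URI -->
      <goto next="next.vxml"/>
    </block>
  </form>
</vxml>
```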
2.4 Errors
This section documents common errors generated by the Holly Voice Browser as viewable in the event
logs through the Holly Management System. For errors not described in this section, please contact
Holly Customer Support.
2.4.1 Fetch Errors
Every fetch error results in a fetch failure, and an ‘error.badfetch’ event is thrown within the application.
The web fetching architecture of the Holly Voice Browser is implemented in the four layers shown
below. Fetch errors propagate through these four layers resulting in multiple error reports for a single
anomalous event. Typically for any fetch anomaly there will be an error from each layer (in a few cases
there may be two from layer 3).
| Layer | Error numbers |
|---|---|
| 1 High level I/O layer | 203, 211, 576 |
| 2 High level network I/O layer | 204, 206, 228 |
| 3 HTTP protocol layer | 217, 219, 221, 236, 237, 244, 249 |
| 4 Low level socket and SSL layer | 241, 500, 700 |
| Error # | Distinguishing error text | Cause | Does Holly terminate call | Comments |
|---|---|---|---|---|
| 203 | Unable to open URI | An attempt to fetch a VoiceXML document failed with one of the errors documented above | On first document triggers failover; on subsequent documents handled by application | The specific error condition will be reflected in other error messages. |
| 204 | Open error + <URL> | An error occurred while attempting to open a connection to the server | No | Investigate a network or application server issue. |
| 206 | Read error + rc=51 | The Holly Voice Browser timed out during a read operation on an HTTP(S) connection | No | Investigate a network or application server issue. |
| 211 | Audio fetch failed + <URL> | The audio resource could not be retrieved | No | This is a “top level” error message; other messages give more specific explanation. |
| 217 | open failed (internal error) + <URL> | Failed to open a connection to the server denoted by the URL | No | Investigate a network or application server issue. |
| 221 | Failed to get HTTP status | The status line in the HTTP response could not be parsed | No | More specific errors will also be reported; investigate a network or application server issue. |
| 228 | Fetch operation timed out | An attempt to fetch a resource exceeded fetchtimeout while creating or reading from a connection | No | Investigate a network or application server issue, or change the fetchtimeout property. |
| 236 | Fetch timeout exceeded during Open | An attempt to open a connection exceeded fetchtimeout | No | Investigate a network or application server issue. |
| 237 | Fetch timeout exceeded during read | A connection was created successfully, but while attempting to read from it fetchtimeout was exceeded | No | Investigate a network or application server issue. |
| 241 | Socket error – timeout connecting to socket | Failed to open a connection to a particular IP address | No | Investigate a network or application server issue. |
| 244 | Socket error – cannot connect + <URL> | An error, other than a timeout, occurred while connecting to a server | No | More specific errors will also be reported; investigate a network or application server issue. |
| 249 | No data available on socket for reading | An error occurred while trying to read the status line in the HTTP response | No | Error 221 and more specific errors will also be reported; investigate a network or application server issue. |
| 500 | Socket operations: System call failed + errno=131 | Socket read failed because the connection was closed by the server | No | Typically caused by a network timeout on an idle persistent connection. |
| 576 | Grammar fetch failed + <URL> | An attempt to fetch an external grammar document failed with one of the errors documented above | No | The specific error condition will be reflected in other error messages. |
| 700 | SSL operation failed + SSL_get_error=5 | Premature close of an SSL connection | No | There are two underlying causes of the error: the remote end is not strictly adhering to the SSL protocol, or an idle persistent connection has timed out. In the first case the error can be ignored. |
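Errors 228, 236 and 237 are tied to the standard VoiceXML ‘fetchtimeout’ property. As a sketch of how an application might adjust it, the property can be set for a whole document and overridden on an individual fetch. The timeout values and the URIs “welcome.wav” and “orders.vxml” are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <!-- Allow up to 10 seconds for fetches made from this document (example value) -->
  <property name="fetchtimeout" value="10s"/>

  <form id="menu">
    <block>
      <prompt>
        <!-- The fetchtimeout attribute overrides the property for this audio fetch -->
        <audio src="welcome.wav" fetchtimeout="3s">Welcome.</audio>
      </prompt>
      <!-- A slower back end may need a longer limit for this transition -->
      <goto next="orders.vxml" fetchtimeout="15s"/>
    </block>
  </form>
</vxml>
```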
2.4.2 Other Errors
| Error # | Distinguishing error text | Cause | Does Holly terminate call | Comments |
|---|---|---|---|---|
| 205 | Unable to parse contents of URI. | The VoiceXML document does not consist of valid VoiceXML. | On first document triggers failover; on subsequent document fetches a VoiceXML error is raised and can be handled by the application | If the document dump call event is enabled for the application, Holly Management System will show the errors in the document. Correction to the VoiceXML document is required. |
| 212 | Licence Manager connection failed | The Holly Voice Browser was unable to communicate with any of the configured Holly Licence Managers | Yes | Check that Holly Licence Managers configured for Holly Voice Browser are correct. Check that Holly Licence Managers are running. |
| 517 | Error activating grammar | The ASR engine returned an error when the Holly Voice Browser attempted to activate a grammar | No | There may be a semantic error in the grammar (for example, incorrect tag format), or if the default grammar fetch style is used there may be character encoding problems with XML files. Check the ASR engine logs for details of the grammar error. If the default grammar fetch style is used, set ‘com.holly.grammarfetchstyle’ to “absolute”. |
| 550 | OSBrec: Recognition failed + reason=grammar error | Unknown or illegal grammar not caught as an activation error | The Holly Voice Browser throws the event ‘error.recognition’. If the application is unable to handle the event the call terminates. | Review the grammar or the ASR engine diagnostic logs for more information. |
| 550 | OSBrec: Recognition failed + reason=recognition error | Unknown or illegal grammar not caught as an activation error. Or, other error raised by the speech recognizer. | The Holly Voice Browser throws the event ‘error.recognition’. If the application is unable to handle the event the call terminates. | Review the grammar or the ASR engine diagnostic logs for more information. |
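Since an unhandled ‘error.recognition’ event ends the call, an application may want a document-level handler for it. The following is a minimal sketch; the form, the field and the recovery behavior (log, apologize, exit) are illustrative assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <!-- Document-level handler so a recognition error is handled by the application -->
  <catch event="error.recognition">
    <log>Recognition error: <value expr="_message"/></log>
    <prompt>Sorry, we are unable to process your request.</prompt>
    <exit/>
  </catch>

  <form id="confirmation">
    <field name="answer" type="boolean">
      <prompt>Would you like to continue?</prompt>
    </field>
  </form>
</vxml>
```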
2.5 Fetching Behaviors
2.5.1 Initial Document Failover
The initial URL for an application is configured through the Application page in HMS. Holly allows
multiple initial URLs to be specified. The platform starts by attempting to execute from the first URL.
Upon failure to execute the first URL the platform will fail over to the second defined URL, then to the
third and so on.
Failover of the initial URL is triggered by any of the following:
• Failure to fetch an application’s initial document or its root document
• Failure to fetch the application’s initial document or root document before a timeout defined on
the Application page in HMS
• Failure to parse either the application’s initial document or root document.
If failover exhausts all declared initial URIs then the call is rejected. The default behavior is to play
the following error message: "We are currently experiencing technical difficulties. Please try again
later."
The default error message is configurable through the default error handler in the defaults document
specified by ‘uriplatformdefaults.browser’ (URI Platform Defaults). Other rejection behaviors can be
configured through HMS (see the Holly Operations Manual).
2.5.2 HTTP
The Holly Voice Platform supports HTTP/1.0 and HTTP/1.1.
HVP implements HTTP persistence for both standard HTTP and secure HTTPS.
HTTP Persistence
It is sometimes necessary to align the persistence of HTTP sessions of the VoiceXML browser and
application server. The platform administrator may configure one of three persistence modes
with the 'httppersistencelevel' configuration property.
• 10 Request: connection is maintained for the duration of a single HTTP request
• 20 Session (default): connection is maintained for the duration of a VoiceXML session
• 30 Indefinite: connection is maintained indefinitely and may be used in subsequent calls
The application parameter ‘inet.sessionpersistence’ may be set to override the browser configuration.
A value of “true” is equivalent to the Session level. A value of “false” is equivalent to the Indefinite level.
HTTP Retries
The browser will always attempt a retry of HTTP or HTTPS GET or POST requests to a server if no data
has been sent to the server while attempting the connection. This is a safe retry as the server is
unaware of the browser's intention to send a request and thus the server doesn't change state of the
call (if any).
Application parameters control the retry policy in the situation when the browser was able to send a
partial or complete request to the server or received a partial or complete reply from the server but
subsequent communication failed on the TCP/IP level or there was a problem parsing the HTTP
response from the server. The browser will retry the request if a relevant parameter permits it.
| Parameter | Description | Values |
|---|---|---|
| com.holly.retryget | Control HTTP GET request retry policy. | true, false (default) |
| com.holly.retrypost | Control HTTP POST request retry policy. | true, false (default) |
In the cases of both safe and unsafe retries the browser will attempt the retry only once.
In the case of an unsafe retry the browser won't attempt retrying if it fails when reading the body of a
response from the server.
2.5.3 HTTPS
The Holly Voice Platform supports the ‘https’ URI scheme.
The HVP fully supports accessing VoiceXML documents over HTTP as well as HTTPS where a
secure/encrypted connection is required.
The Holly Voice Browser supports certificate-based authentication of a server and certificate-based
client authentication (the latter is at the server’s request). When configuring an HTTPS-based
VoiceXML application server for use with the HVP, please note that SSL certificate validation is
dependent on the application parameter ‘ssl.authenticateserver’ (“false” = no certificate validation,
“true” = perform certificate validation) which can be set on the HMS Applications page. The SSL
certificate chosen for use can be self-signed or signed by a Certificate Authority.
2.6 HTTP User-Agent Header
The Holly Voice Browser sets the ‘User-Agent’ header in HTTP requests to “HVP/5.1”. It is configurable
via the ‘HTTP User Agent’ configuration parameter. The User-Agent request-header field contains
information about the user agent originating the request. Refer to RFC 2616, Section 14.43,
http://www.w3.org/Protocols/rfc2616/rfc2616.html.
2.6.1 DNS Resolving Behavior
The default behavior is for all fetches to resolve DNS when performing the fetch.
Setting the application parameter “resolvehostnames=true” on the HMS Applications page resolves
hostnames on failover and all subsequent fetches will use the IP address(es) obtained from DNS
resolution.
2.6.2 Cookies
The Holly Voice Platform supports cookies as described in [RFC 2965]. Multiple cookies are sent as
separate Cookie request headers, not as a list in a single Cookie request header.
If the application parameter ‘singlecookieheader’ is set, and there is more than one cookie for an HTTP
request, the browser sends the cookies folded into a single HTTP Cookie header, as described in RFC
2965, section 3.3.4.
2.6.3 Caching
Document caching on the Holly Voice Platform is determined by the HTTP cache control headers
supplied by the application server. The VoiceXML cache control properties ‘documentmaxage’,
‘documentmaxstale’, ‘grammarmaxage’, ‘grammarmaxstale’, ‘datamaxage’, ‘objectmaxage’,
‘objectmaxstale’, ‘scriptmaxage’, and ‘scriptmaxstale’ can be used to control caching from within
VoiceXML applications. Refer to [VoiceXML 2.0] section 6.3.5.
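As a sketch of the cache control properties in use, the following document sets freshness limits for documents and grammars and overrides them on one transition. Values are in seconds, following VoiceXML 2.0; the specific numbers and the URI “balance.vxml” are illustrative assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <!-- Accept cached documents up to 60 seconds old, and never use an expired copy -->
  <property name="documentmaxage" value="60"/>
  <property name="documentmaxstale" value="0"/>
  <!-- Grammars are assumed to change rarely in this hypothetical application -->
  <property name="grammarmaxage" value="3600"/>

  <form id="main">
    <block>
      <!-- maxage on a single fetch overrides the property for that request -->
      <goto next="balance.vxml" maxage="0"/>
    </block>
  </form>
</vxml>
```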
HTTP headers provide fine-grained control over caching. Voice applications, CGI scripts, or the Web server may
generate them in response to HVP browser requests (see the diagram above).
A typical HTTP response header might look like this:
HTTP/1.1 200 OK
Date: Thu, 22 Jun 2006 15:37:45 GMT
Server: Apache/1.3.3 (Unix)
Cache-Control: max-age=3600
Expires: Fri, 23 Jun 2006 21:30:45 GMT
Last-Modified: Mon, 19 Jun 2006 10:07:15 GMT
ETag: "3e95-520-33c6faaf"
Content-Length: 2040
Content-Type: audio/x-wav
Cache related HTTP headers
Header
Semantic
max-age=seconds
Specifies the maximum time, in seconds, for which a cached copy is considered fresh.
This directive is relative to the time of the request. Setting this parameter to zero
(max-age=0) suppresses caching.
Example:
Cache-Control: max-age=60
s-maxage=seconds
Same as “max-age”. “s-maxage” takes precedence over “max-age” if both directives are
present.
Example:
Cache-Control: s-maxage=60
no-store
Instructs the cache not to keep a copy of the resource under any conditions.
Example:
Cache-Control: no-store
no-cache
The cache does not use the response without revalidating it with the origin server. This
prevents caching or forces revalidation of the resource on every request.
Example:
Cache-Control: no-cache
Pragma: no-cache
Same as “no-cache”. Used in the HTTP/1.0 protocol.
Example:
Pragma: no-cache
Expires: date
This field gives the date/time after which the response is considered stale. The cache
does not return a stale cache entry without revalidation with the origin server.
This header may be useful in some situations, but it has certain limitations. First, it is
easy to forget to update this header, which suppresses caching after the set date. Second,
if the server clock and the HVP clock are not synchronized, this header may have
undesired effects. The “max-age” directive, if specified, overrides the “Expires” header.
Example:
Expires: Fri, 23 Jun 2006 21:30:45 GMT
Last-Modified: date
Indicates the date and time at which the origin server believes the resource was last
modified.
If none of “Expires”, “max-age”, or “s-maxage” appears in the response, and the
response does not include other restrictions on caching, the cache computes a freshness
lifetime using a heuristic. The cached copy expires after a period equal to 10% of the
difference between the current time and the Last-Modified time. For example, if
Last-Modified is 60 seconds before the current time, the entry becomes stale
approximately 6 seconds in the future.
Servers should send the Last-Modified header whenever feasible to provide a means of
validating resources.
Example:
Last-Modified: Mon, 19 Jun 2006 10:07:15 GMT
private
Indicates that the response is intended for a single user and must not be cached by a
shared cache. The HVP cache does not store the response.
Example:
Cache-Control: private
ETag: entity-tag
Provides the current value of the entity tag. If a cached copy of the resource is stale,
the ETag value may be used to validate the resource.
The entity tag must change whenever the associated resource changes in any way. Servers
should send an entity tag unless it is not feasible to generate one.
Example:
ETag: "3e95-520-33c6faaf"
If no cache-related headers (excluding the “ETag” header) are specified in the response then the cache
treats it as not cacheable and does not save a copy of the response.
2.6.4 Disabling Caching
A common requirement during development is to disable caching so that all content is always fetched
from the application server. To achieve this, set the maxage properties (‘audiomaxage’,
‘documentmaxage’, ‘grammarmaxage’, ‘scriptmaxage’) to zero.
These can be set as application parameters via the Holly Management System.
Alternatively, they can be set in the VoiceXML application using the <property> element (typically in the
application root document so that it affects overall behavior).
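As a minimal sketch of the <property> approach, an application root document might set the four maxage properties named above to zero. The document skeleton is illustrative only; the property names come from this section, and the values are expressed in seconds as in VoiceXML 2.0.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Development only: treat every cached resource as stale immediately -->
  <property name="audiomaxage" value="0"/>
  <property name="documentmaxage" value="0"/>
  <property name="grammarmaxage" value="0"/>
  <property name="scriptmaxage" value="0"/>
</vxml>
```

Leaf documents that name this file in their ‘application’ attribute inherit these settings, so the properties can be removed in one place before going to production.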
2.7 Default Property Values
The following are the factory-default settings for VoiceXML properties. The platform defaults may be
changed in the browser configuration (by the Administrator, via HMS Configuration). The defaults for
individual applications may be changed by setting an Application Parameter (via HMS Application
Configuration). Applications may also change the properties using the VoiceXML <property> element.
Key                  Default Value
audiofetchhint       prefetch
bargein              true
bargeintype          speech
confidencelevel      0.5
documentfetchhint    safe
fetchaudiodelay      2s
fetchaudiominimum    5s
fetchtimeout         7s
grammarfetchhint     prefetch
inputmodes           dtmf voice
maxnbest             1
objectfetchhint      prefetch
scriptfetchhint      prefetch
sensitivity          0.5
speedvsaccuracy      0.5
termchar             #
termtimeout          0s
universals           none
2.8 Access Control on <data>
Access control on <data> (see Section 5 of VoiceXML 2.1) allows an application server to indicate that
XML content is authorized for use by only selected applications. The Holly Voice Platform implements
this behavior to complement its multi-tenancy.
An access control instruction may specify virtual hosts as well as IP addresses and hostnames. A virtual
host is specified using the Application, Affiliate and Service Provider names separated by dots. The
attribute is case insensitive.
<?access-control allow="application.affiliate.service_provider.host"?>
There are two exceptions to the VoiceXML 2.1 specification:
• The '.com' part of the fully qualified name is omitted
• If there is no access control instruction then access is allowed by default
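For illustration, the processing instruction is placed in the prolog of the XML document that the application server returns to a <data> request. The sketch below reuses the placeholder value shown above; the <quote> payload and its element names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Placeholder virtual host in the form shown above; substitute real names -->
<?access-control allow="application.affiliate.service_provider.host"?>
<quote>
  <ticker>EXMP</ticker>
  <last>30.00</last>
</quote>
```

An application outside the allowed virtual host that fetches this document with <data> would be refused access, whereas a document with no access control instruction at all is readable by any application on this platform.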
2.9 Browser Protections
The Holly Voice Browser imposes the following constraints on running applications to ensure a rogue
application does not reduce the platform’s capacity to service other applications. Each constraint is
configurable via the Holly Management System (administrators should refer to the Holly Management
System User Guide for details on modifying configuration parameters).
Configuration Parameter (Default Value)
Description
JavaScript Max Branches (100000)
A count is kept of each time a script jumps backward or returns from a function; the count is not
permitted to exceed this value.
ECMAScript Max Object Depth (10)
When serializing objects (for example, for logging or transferring between execution contexts on
return from a subdialog) the browser will generate an error if objects are nested to a depth greater
than this value.
Maximum Documents (500)
The browser is not permitted to fetch more document instances than this value.
Maximum Event Count (12)
This value provides an upper limit to the number of events that can be thrown by a particular
condition within a single form; this value is included to assist in the detection of infinite loops or bugs.
Maximum Event Rethrows (6)
This value provides an upper limit to the number of times a particular event can be rethrown.
Maximum Execution Stack Depth (5)
This value effectively limits the depth of subdialog calls within an application.
Maximum Loop Iterations (100)
This value limits the number of iterations of the form interpretation algorithm on a single form.
Maximum Dialogs with no User Input (10)
The number of transitions between forms without entering a wait state is limited to this value.
3. Input: Speech Recognition and DTMF
3.1 Selecting a Speech Recognizer
The Holly Voice Platform supports a broad range of speech recognition products. Developers should
contact the platform administrator to confirm which ASR products are available. Administrators should
refer to the ASR Configuration information in the Holly Voice Platform Operations Guide.
A single installation may be configured to support many speech recognizers. Holly allows for each
Application to select its preferred speech recognizer or it may use the platform’s default.
The default speech recognizer for an Application is determined by the value of the ‘asrengine’ property
(a VoiceXML property extension). The Application default may be set as an Application Parameter
through HMS or can be set using a VoiceXML <property> element with appropriate scope.
For example:
<property name="asrengine" value="dtmf"/>
The table shows the list of speech recognition products and the corresponding ‘asrengine’ value.
Vendor     ASR Engine                                    Value
Holly      Holly DTMF (Direct API)                       dtmf
IBM        IBM Websphere Voice Server 5.1.3 (MRCP v1)    wvs513-mrcp1
Loquendo   Loquendo 7.8 ASR (MRCP v1)                    loquendo-mrcp1
LumenVox   LumenVox                                      lumenvox-mrcp1
Nuance     Nuance 8.5 ASR (Direct API)                   nuance
Nuance     Nuance 8.5 ASR (MRCP v1)                      nuance85-mrcp1
Nuance     Nuance 9.0 ASR (MRCP v1)                      nuance90-mrcp1
Nuance     OSR - Open Speech Recognizer (Direct API)     scansoft
Nuance     SpeechWorks Media Server 4.0 (MRCP v1)        swms40-mrcp1
Siemens    Siemens (MRCP v1)                             siemens
Telisma    Telisma 1.3 Patch 1 (MRCP v1)                 telisma-mrcp1
vLingo     vLingo Network ASR Service                    vlingo
Note: The Nuance 8.5 ASR (via direct API) recognition interface is not available on Linux HVP
deployments.
Note: The ASR ‘asrengine’ values may be customized at a platform installation. The table shows the
default values which may not apply if customized.
3.1.1 ASR Engine Switching
By default, the Holly Voice Platform performs all recognitions (voice or DTMF) using the default ASR
engine. It also permits switching between ASR engines within a call and within a single VoiceXML
document. To switch engines, set the ‘asrengine’ property in a <property> element with the appropriate
scope (e.g. field, form or document scope).
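The scoping rules can be sketched as follows. The ‘asrengine’ values are taken from the table in section 3.1 and may differ on a customized installation; the grammar file name city.grxml and the dialog content are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-AU">
  <!-- Document scope: recognitions use Nuance 9.0 (MRCP v1) unless overridden -->
  <property name="asrengine" value="nuance90-mrcp1"/>
  <form id="order">
    <field name="city">
      <grammar src="city.grxml" type="application/srgs+xml"/>
      <prompt>Which city are you calling from?</prompt>
    </field>
    <field name="pin" type="digits">
      <!-- Field scope: switch to the Holly DTMF recognizer for this field only -->
      <property name="asrengine" value="dtmf"/>
      <property name="inputmodes" value="dtmf"/>
      <prompt>Please enter your PIN.</prompt>
    </field>
  </form>
</vxml>
```

Because the second property is declared inside the ‘pin’ field, the document-level engine applies again to any recognition outside that field.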
3.1.2 Allowed ASR Engines
The Holly Voice Platform allows an Administrator to restrict an application to the use of specific named
speech recognizers. This is achieved by setting ‘gw.asrallowed’ as an Application Parameter. The value
is a comma-separated list of allowed engines.
For example, setting “gw.asrallowed=nuance90-mrcp1,dtmf” means the application can only switch
between Nuance 9.0 ASR (MRCP v1) and the Holly DTMF Recognizer.
3.2 ASR-Specific Behaviors
This section documents configuration and behaviors that are specific to the ASR products supported by
the Holly Voice Platform.
3.2.1 Engine-Specific Properties
The Holly Voice Platform allows for any Nuance property defined in the respective reference manuals
to be set using the VoiceXML <property> element and passed to the engine.
Recognizers: Nuance Recognizer 9, Nuance OSR 3.0.x, Nuance SWMS
Property Prefixes: swirec_, swiep_
Example:
<property name="swirec_state_beam" value="-20"/>
<property name="swiep_audio_environment" value="'channel=cellular'"/>
A full property list is available in the Reference Guides for these products.
Recognizers: Nuance 8.5, Nuance 8.5 MRCP
Property Prefixes: nuance.core.rec, nuance.core.ep
Example:
<property name="nuance.core.rec.GenEpFeedback" value="'TRUE'"/>
<property name="nuance.core.ep.EndSeconds" value="'1.50'"/>
A full property list is available in Nuance 8.5 Documentation.
The developer must ensure that each property value is valid for the underlying speech recognizer.
Standard VoiceXML properties are submitted to the speech recognizer before ASR-specific
properties, so ASR-specific properties will override any parameter mapping made from a VoiceXML
property to an ASR property.
3.2.2 vLingo
vLingo is a network-hosted speech recognition service that facilitates “open grammar” recognition with
very large vocabularies. Unlike the other speech products supported by Holly it does not provide
declarative grammar support (using SRGS or any other grammar standard).
The platform administrator can contact Holly Connects Support to obtain information on licensing
vLingo and documentation on writing VoiceXML applications with vLingo.
3.2.3 Loquendo
To support DTMF-only input with Loquendo ASR (inputmodes=dtmf) the system administrator must
set the parameter enableDiscontinuousInboundStream=true in the Loquendo Management Console.
To support hash input with termination the system administrator must set
dtmfNoMatchIfOnlyTermCharPressed=enable in the Loquendo Management Console. This parameter can
be found under Configuration | Advanced | MRCPv1Server.
3.3 ASR Sessions
ASR session IDs and the relevant recognizer name are logged as events in the Holly Management
System; this ID can be used to correlate ASR logs with HVP logs.
The event is logged on the first use of the speech recognizer (or DTMF) and each time the application
switches ASR engine.
The format of the log event is:
asrengine=<name>|sessionid=<id>|address=<host:port>|server=<server>|endpoint=<address>
3.4 Timeouts (speech)
As required by [VoiceXML 2.0], section 6.3.2, the default behavior of the Holly Voice Platform is to
use the larger of the two properties ‘completetimeout’ and ‘incompletetimeout’ as the actual value of
the end of utterance timeout during recognition.
For recognizers that enable the platform to distinguish ‘completetimeout’ from ‘incompletetimeout’,
the ‘com.holly.distincttimeout’ property can be set to “true” to permit the timeouts to be treated
differently.
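A minimal sketch of distinct timeouts follows, assuming the selected recognizer can distinguish the two cases. The timeout values, grammar file name account.grxml and prompt are illustrative only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-AU">
  <form id="account">
    <field name="account_number">
      <!-- Ask the platform to treat the two timeouts separately -->
      <property name="com.holly.distincttimeout" value="true"/>
      <!-- Silence allowed after speech that already matches the grammar -->
      <property name="completetimeout" value="0.5s"/>
      <!-- Silence allowed after speech that is only a partial match -->
      <property name="incompletetimeout" value="1.5s"/>
      <grammar src="account.grxml" type="application/srgs+xml"/>
      <prompt>Please say your account number.</prompt>
    </field>
  </form>
</vxml>
```

Without ‘com.holly.distincttimeout’, the platform would apply the larger value (1.5s here) in both cases.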
3.5 Confidence Scores
The Holly Voice Platform sets ‘name$.confidence’ to the utterance confidence in a range of 0.0 to 1.0
as required by VoiceXML 2.0 (see Section 2.3.1). Although the Holly Voice Platform normalizes the
confidence value received from ASR engines to the VoiceXML 2.0 range, the specific interpretation of
values is relative to the ASR engine.
The default confidence level is 0.5. This value can be changed using the standard VoiceXML 2.0
<property> tag to set the ‘confidencelevel’ property (see VoiceXML 2.0, Section 6.3.2). The value can
also be changed for each application by setting ‘confidencelevel’ as an Application Parameter through
HMS.
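The sketch below shows one way an application might combine the two mechanisms: ‘confidencelevel’ rejects low-scoring results outright, while the ‘name$.confidence’ shadow variable lets the dialog confirm borderline ones. The thresholds 0.6 and 0.8 and the grammar file name are arbitrary examples, and suitable values depend on the ASR engine in use.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-AU">
  <form id="travel">
    <field name="city">
      <!-- Results scoring below 0.6 are treated as nomatch -->
      <property name="confidencelevel" value="0.6"/>
      <grammar src="city.grxml" type="application/srgs+xml"/>
      <prompt>Which city are you travelling to?</prompt>
      <filled>
        <!-- Accepted but not certain: read the result back to the caller -->
        <if cond="city$.confidence &lt; 0.8">
          <prompt>I think you said <value expr="city"/>.</prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```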
3.6 N-Best
Following the VoiceXML 2.0 Specification, the N-best list contains a list of recognition results matching
all active grammars, ordered by their confidence score (highest confidence to lowest). Active
grammars include those explicitly specified by the VoiceXML application plus application-requested
platform grammars such as links, universals, menus and field options.
For application-defined grammars the slot filling follows the explicit specification of the application-
supplied grammar. For platform-generated grammars there is no standard for how slots should be
filled (or whether any slots are filled at all), so it is recommended that applications do not rely on
slots for those grammars.
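As an illustration, the sketch below raises ‘maxnbest’ from its default of 1 and logs each entry of the standard ‘application.lastresult$’ array. It assumes the VoiceXML 2.1 <foreach> element is available to the application; the grammar file name is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-AU">
  <form id="travel">
    <field name="city">
      <!-- Request up to three hypotheses instead of the default single result -->
      <property name="maxnbest" value="3"/>
      <grammar src="city.grxml" type="application/srgs+xml"/>
      <prompt>Which city are you travelling to?</prompt>
      <filled>
        <!-- Entries are ordered from highest to lowest confidence -->
        <foreach item="hyp" array="application.lastresult$">
          <log>
            utterance=<value expr="hyp.utterance"/>
            confidence=<value expr="hyp.confidence"/>
          </log>
        </foreach>
      </filled>
    </field>
  </form>
</vxml>
```

The example reads only ‘utterance’ and ‘confidence’, which avoids depending on slot values from platform-generated grammars.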
3.7 Record
Recordings are stored as WAV (RIFF header) 8kHz 8-bit mono mu-law single channel.
The ‘maxtime’ attribute on the ‘<record>’ element defaults to 300 seconds.
The Holly Voice Platform does not currently support speech recognition on <record>.
DTMF recognition during record is supported, but the behavior is determined by the currently selected
speech recognizer as follows:
• The MRCP recognizers handle arbitrary DTMF grammars
• The API recognizers handle a single digit DTMF grammar
• The Holly DTMF Recognizer handles arbitrary DTMF grammars
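A minimal <record> sketch follows, using only standard VoiceXML 2.0 attributes. The 60 second ‘maxtime’ overrides the 300 second default noted above; the other attribute values and the prompt wording are illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-AU">
  <form id="leave_message">
    <!-- dtmfterm="true" lets any key press end the recording -->
    <record name="message" beep="true" maxtime="60s"
            finalsilence="3s" dtmfterm="true" type="audio/x-wav">
      <prompt>Please leave your message after the beep, then press any key.</prompt>
      <filled>
        <!-- message$.duration is the recording length in milliseconds -->
        <log>Recorded <value expr="message$.duration"/> ms</log>
      </filled>
    </record>
  </form>
</vxml>
```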
3.8 Recording User Utterances during Recognition
The Holly Voice Platform permits user utterances to be recorded during recognition as per VoiceXML
2.1 (see section 7).
Recordings are stored as WAV (RIFF header) 8kHz 8-bit mono mu-law single channel.
Note: Recording of utterances must be enabled by the administrators of the Holly Voice Platform.
Administrators of the platform should refer to the Holly Voice Platform Operations Guide for
more information.
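The following sketch is modelled on the pattern in VoiceXML 2.1 section 7: the ‘recordutterance’ property switches capture on, and the audio is read from ‘application.lastresult$.recording’. It assumes utterance recording has been enabled by the administrator; the upload URL and grammar file name are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-AU">
  <form id="capture">
    <!-- Record the caller's utterance for every recognition in this form -->
    <property name="recordutterance" value="true"/>
    <field name="city">
      <grammar src="city.grxml" type="application/srgs+xml"/>
      <prompt>Which city are you travelling to?</prompt>
      <nomatch>
        <!-- Send the unrecognized audio to the application server for review -->
        <var name="the_recording" expr="application.lastresult$.recording"/>
        <submit next="http://example.com/upload" method="post"
                enctype="multipart/form-data" namelist="the_recording"/>
      </nomatch>
    </field>
  </form>
</vxml>
```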
3.9 Re-Recognition from Recorded Utterances
HVP 5.1 introduces the ability for an application to perform “re-recognition” from a recorded
waveform. Audio recorded from caller input using <record>, or an utterance recorded during
recognition, may be stored by the application and then declared as input to a subsequent
recognition attempt, typically using a different set of grammars.
The full description of application development using re-recognition is provided in Appendix B.
3.10 Grammars
3.10.1 Standard Grammars
The Holly Voice Platform supports the standard XML form of SRGS as per the Speech Recognition
Grammar Specification Version 1.0, W3C Recommendation 16 March 2004.
The content type for SRGS grammars should be specified as “application/srgs+xml”.
Recognizers                          Grammar Format                  Semantic Format
Nuance Recognizer 9                  SRGS 1.0 XML                    SISR 1.0; Others – look up doc
Nuance OSR 3.0.x, Nuance SWMS XX     SRGS 1.0 XML                    SISR 1.0 and SWI extensions
Nuance 8.5                           SRGS 1.0 XML Draft, GSL         GSL
IBM WVS 5.1.3                        SRGS 1.0 XML, SRGS 1.0 ABNF     SISR 1.0
LumenVox 8                           SRGS 1.0 XML, SRGS 1.0 ABNF     SISR 1.0
Loquendo                             SRGS 1.0 XML                    SISR 1.0
vLingo                               Custom                          Custom
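For reference, a small SRGS 1.0 XML grammar with SISR 1.0 semantic tags might look like the sketch below. It is a generic example rather than a Holly-supplied grammar, and it applies only to recognizers listed above as supporting SISR 1.0; the rule name and vocabulary are arbitrary.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-AU" mode="voice" root="yesno"
         tag-format="semantics/1.0">
  <!-- Root rule: returns the string "yes" or "no" as the interpretation -->
  <rule id="yesno" scope="public">
    <one-of>
      <item>yes <tag>out = "yes";</tag></item>
      <item>no <tag>out = "no";</tag></item>
    </one-of>
  </rule>
</grammar>
```

The application server should return such a file with the content type “application/srgs+xml”, as stated above.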
3.10.2 Proprietary Grammars
Holly supports the use of proprietary grammar formats. These formats are ASR engine-specific and
developers should refer to the relevant product documentation.
Proprietary grammar types, such as Nuance 8.5 GSL grammars, must be put inside CDATA sections when
used as inline VoiceXML grammars.
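A hedged sketch of an inline GSL grammar wrapped in a CDATA section follows. The media type "application/x-nuance-gsl" is an assumption (this guide does not state which type value the platform expects for GSL), and the slot name ‘answer’ is hypothetical; check both against the Nuance 8.5 documentation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-AU">
  <form id="confirm">
    <field name="answer">
      <!-- CDATA keeps the GSL angle brackets from being parsed as XML markup -->
      <grammar type="application/x-nuance-gsl" mode="voice"><![CDATA[
        [
          [yes yeah] {<answer "yes">}
          [no nope]  {<answer "no">}
        ]
      ]]></grammar>
      <prompt>Is that correct?</prompt>
    </field>
  </form>
</vxml>
```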
3.10.3 External Grammars
Note that external grammars declared as ISO-8859-1 do not support non-ASCII (i.e. Latin-1) characters
in “default” mode. In order to use Latin-1 characters in external grammars the property
‘com.holly.grammarfetchstyle’ should be set to “absolute”.
3.10.4 Pre-built and Binary Grammars
Holly supports pre-compiled and binary grammar formats for most recognizers. For example, Holly
supports pre-compiled grammars for OSR and Nuance 9 (these are .gram files created with the “sgc”
Grammar Compiler tool).
Contact Holly for information on using Nuance 8.5 static grammar packages. Note that this is not
recommended in Virtual IVR systems.
3.10.5 Universal Command Grammars
The Holly Voice Platform supports the optional default grammars for ‘help’, ‘cancel’, and ‘exit’. These
are controlled by the VoiceXML 2.0 ‘universals’ property; see [VoiceXML 2.0], section 6.3.6. These are
available in English only.
The platform administrator may define new universal command grammars for an MRCP recognizer by
setting the following platform configurations: uriuniversalcancel, uriuniversalexit, and uriuniversalhelp.
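Since the factory default for ‘universals’ is “none”, an application that wants these grammars must switch them on. A minimal sketch, with a hypothetical grammar file and prompts:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-AU">
  <!-- Activate the 'help' and 'cancel' universal grammars; 'exit' stays off -->
  <property name="universals" value="help cancel"/>
  <form id="travel">
    <field name="city">
      <grammar src="city.grxml" type="application/srgs+xml"/>
      <prompt>Which city are you travelling to?</prompt>
      <help>
        Say the name of a city, for example Sydney.
        <reprompt/>
      </help>
    </field>
  </form>
</vxml>
```

The value “all” activates all three commands at once.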
3.10.6 Builtin Support
The Holly Voice Platform supports the builtin grammar types as summarized in the following table.
Recognizer / Language / Status
Nuance 8.5
– en-AU: All builtin grammar types supported as listed in Appendix P of [VoiceXML 2.0].
– Other languages: Contact Holly Customer Support.
Nuance 9
– en-AU: All builtin grammar types supported as listed in Appendix P of [VoiceXML 2.0].
– Other languages: Implemented as Nuance 9 builtin grammars. See the relevant Nuance 9 Language
Supplement.
Nuance OSR (Scansoft)
– en-AU: All builtin grammar types supported as listed in Appendix P of [VoiceXML 2.0]. Additional
non-standard types supported as per the ScanSoft OSR documentation.
– Other languages: Implemented as OSR builtin grammars. See the relevant OSR Language Supplement.
Holly DTMF Recognizer
– Any language: All builtin grammar types as listed in Appendix P of [VoiceXML 2.0]. DTMF only.
Holly supports the ‘type’ attribute on the <field> element with the builtin grammar types defined in the
table above. Developers may also reference a builtin grammar by specifying a grammar ‘src’ attribute of
“builtin:<name>”. The syntax “builtin:grammar/<type>” and “builtin:dtmf/<type>” can be used to
specify the input mode for a particular builtin grammar type; see [VoiceXML 2.0], section 2.3.1.2.
Note: This method can also be used for non-standard builtin grammars provided by a particular ASR
engine.
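The three ways of referencing a builtin grammar can be sketched as follows, using standard types and parameters from Appendix P of [VoiceXML 2.0]. Availability of each type depends on the recognizer and language, as listed in the table above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-AU">
  <form id="payment">
    <!-- 'type' attribute on the field -->
    <field name="amount" type="currency">
      <prompt>How much would you like to pay?</prompt>
    </field>
    <!-- Explicit speech-mode builtin grammar -->
    <field name="confirmed">
      <grammar src="builtin:grammar/boolean"/>
      <prompt>Is that correct?</prompt>
    </field>
    <!-- Explicit DTMF-mode builtin grammar with a parameter -->
    <field name="pin">
      <grammar src="builtin:dtmf/digits?length=4"/>
      <prompt>Please enter your four digit PIN.</prompt>
    </field>
  </form>
</vxml>
```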
3.10.7 Grammar Fetch Behavior
By default the Holly Voice Platform fetches external grammars and passes them to ASR engines. This
behavior can be changed by means of the ‘com.holly.grammarfetchstyle’ property.
The possible values of the property are “default”, “absolute”, and “relative”. There is no one method
of handling grammars that will work for all cases, for instance “absolute” or “relative” may be more
efficient if big grammars are frequently passed to ASR.
Note: Caching may not work with “absolute” or “relative” as VXML side cache control information is
not passed to the ASR engine. Also, ASR may not have its own caching facility, and even if ASR
is able to use HTTP cache control mechanism to store grammars in its own cache this may
cause problems in a virtual environment (multi-tenancy).
Value: default
Description: The Holly Voice Platform fetches grammars.
Comments:
“default” requires that the browser fetch and manipulate the grammar. The browser converts these to
multi-byte strings. Proper cookies (if any) are passed to the application server and cache control
directives specified in the VXML document are honoured.
“default” reverts to “absolute” if the grammar URI uses anything other than the root rule (i.e. the
grammar URI contains a fragment) or if the grammar is a binary grammar (determined by the file name
extension, either .gram or .ngo).
“default” does not work if there are additional URIs specified in the grammar passed to ASR.
Note: The HVP sends the contents of the grammar in a buffer to the ASR engine; this may affect
performance.
Value: absolute
Description: The Holly Voice Platform resolves relative URI references and passes the absolute URI to
the ASR.
Comments:
“absolute” requires both the platform and the ASR engine to have access to the application server.
External grammars in proprietary formats (such as Nuance binary grammars) are always processed as
though the ‘com.holly.grammarfetchstyle’ property were set to “absolute”. The absolute URI is passed
to the ASR engine to fetch.
Support for HTTPS is vendor specific.
Grammar URIs with query parameters or fragment identifiers are also passed to the recognizer as
“absolute” URIs regardless of the property setting.
If cookies or session identifiers are required the ASR may not be able to fetch the grammars. Check
with your network owner if the platform’s load balancing configuration is able to pass cookies.
Value: relative
Description: The Holly Voice Platform passes the value of the ‘src’ attribute to the ASR (it does not
resolve relative URI references).
Comments:
“relative” is typically used with OSR where large grammar files are stored in the OSR grammar root
directory, which is controlled by the property SWIsvcRootGrammarDirectory. For example, the file
street-names.gram can be copied to $SWISRSDK/config (the default value of the grammar root) and
referenced in VoiceXML as “street-names.gram”. OSR fetches the file from disk rather than HTTP.
If cookies or session identifiers are required the ASR may not be able to fetch the grammars. Check
with your network owner if the platform’s load balancing configuration is able to pass cookies.
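The OSR case described for “relative” might be written as in the sketch below. It assumes street-names.gram has already been copied to the OSR grammar root directory and that the ‘scansoft’ engine value from section 3.1 applies on the installation; the prompt is illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-AU">
  <form id="address">
    <field name="street">
      <property name="asrengine" value="scansoft"/>
      <!-- Pass the 'src' value to the recognizer unresolved -->
      <property name="com.holly.grammarfetchstyle" value="relative"/>
      <!-- OSR loads this file from its grammar root directory on disk -->
      <grammar src="street-names.gram"/>
      <prompt>Which street do you live on?</prompt>
    </field>
  </form>
</vxml>
```

Setting the property at field scope keeps the platform's “default” fetch behavior for every other grammar in the application.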
3.11 DTMF
3.11.1 interdigittimeout and termtimeout
The Holly Voice Platform supports the properties ‘interdigittimeout’ and ‘termtimeout’. The HVP uses
the larger of these two values as the actual end of DTMF timeout; the default value of this timeout
is 2 seconds.
For the Nuance 8.5 ASR engine ‘termtimeout’ should be disabled (i.e. “termtimeout=0”) as Nuance
does not provide sufficient information in its DTMF parse results to determine whether a parse is
complete, incomplete, invalid or a valid prefix.
Use ‘interdigittimeout’ to control DTMF recognition timing. This should generally be set per recognition
state (usually a field) to an appropriate value for the input required. The ‘interdigittimeout’ property
can also be set globally for the application by setting an application parameter. Use 0s for a menu
application that accepts a single digit (e.g. “interdigittimeout=0”); use something larger, e.g. 2s, when
collecting information that callers will have to check such as a credit card number.
Note: The default value of ‘termtimeout’ is “0s” in HVP 5.0 to comply with the VoiceXML standard.
Note: The Holly DTMF recognizer and ScanSoft OSR support both interdigittimeout and termtimeout.
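The per-state settings recommended above can be sketched as follows: a single-digit menu with no inter-digit wait, and a card number field that gives callers 2 seconds between digits. Dialog names and prompts are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-AU">
  <menu id="main">
    <!-- Single-digit menu: act on the first key press without waiting -->
    <property name="interdigittimeout" value="0s"/>
    <prompt>To pay by card, press 1. To end the call, press 2.</prompt>
    <choice dtmf="1" next="#card">pay by card</choice>
    <choice dtmf="2" next="#goodbye">end the call</choice>
  </menu>
  <form id="card">
    <field name="cardnumber" type="digits">
      <!-- Long input that callers read from a card: allow pauses between digits -->
      <property name="interdigittimeout" value="2s"/>
      <prompt>Please enter your card number.</prompt>
    </field>
  </form>
  <form id="goodbye">
    <block><exit/></block>
  </form>
</vxml>
```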
3.11.2 termchar
The VoiceXML property termchar is set to “#” by default and can be changed by setting the <property>
element in the VoiceXML application. For example to set termchar to empty:
<property name="termchar" value=" "/>
Note: VoiceXML <property> scoping rules apply. The allowable scopes are application, document,
form, menu and form item; see VoiceXML 2.0, Section 6.3.
The default termchar property can be modified by creating an Application Parameter through HMS.
3.12 DTMF Buffering
Interactive DTMF parsing is a mode of DTMF treatment available in HVP 5.0 and later. It can be
disabled for compatibility with HVP 4.1 and earlier.
At a global level Interactive DTMF parsing can be enabled or disabled for an ASR engine by setting the
parameter ‘dtmfinteractive’ to “true” or “false” under the appropriate ASR Plug-in component on the
Holly Configuration page via the Holly Management System.
Interactive DTMF parsing can be enabled or disabled at the application level via the
‘sr.dtmfinteractive’ application parameter on the Applications page in the Holly Management System
(administrators should refer to the Holly Management System User Guide for details on modifying
application parameters).
The DTMF buffer is cleared whenever prompts that have been queued are played to completion before
continuing processing. One such scenario is when prompts are queued with bargein disabled; another
is when prompts are queued before a fetch that specifies fetchaudio. In the latter case there is an
ambiguity in the VoiceXML 2.0 specification about the handling of DTMF input during the playing of the
prompts and the fetch.
By default, the Holly Voice Browser clears the DTMF buffer after playing the fetchaudio, so any DTMF
collected during the prompt playback or during the fetch will be lost. If the proprietary VoiceXML
property ‘com.holly.fetchaudiodtmf’ is set, the Holly Voice Browser will not clear the buffer after
playing the fetchaudio, and any DTMF collected during the prompt playback or during the fetch is fed
to the next recognition (unless some other action such as a non-bargeinable prompt clears the buffer
first).
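A sketch of the type-ahead case follows. The text above says only that the property must be “set”, so the value “true” is an assumption; the lookup URL and audio file name are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-AU">
  <!-- Assumed value: keep DTMF typed during fetchaudio for the next recognition -->
  <property name="com.holly.fetchaudiodtmf" value="true"/>
  <form id="lookup">
    <field name="account" type="digits">
      <prompt>Please enter your account number.</prompt>
      <filled>
        <prompt>One moment while I look that up.</prompt>
        <!-- Keys pressed while hold-music.wav plays are not discarded -->
        <submit next="http://example.com/lookup" namelist="account"
                fetchaudio="hold-music.wav"/>
      </filled>
    </field>
  </form>
</vxml>
```

Experienced callers who key ahead during the hold audio would then have those digits applied to the first recognition in the fetched document.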
The DTMF buffer is also cleared when the ASR engine switches. If the Holly Voice Browser parameter
‘asrengine’ is used to make the switch this will occur immediately before the first recognition in the
scope of the property. If this switch is performed using the VoiceXML ‘asrengine’ property, it won’t
take effect until the first recognition in the application and any DTMF digits collected to that point are
lost. This can be prevented by setting the VoiceXML application parameter ‘asrengine’ to the same
value as the Holly Voice Gateway component configuration parameter ‘srdefault’ so that the ASR
engine switch takes place at the very beginning of the call (administrators should refer to the Holly
Management System User Guide for details on modifying application and configuration parameters).
3.13 Holly DTMF Recognizer v2
The Holly DTMF Recognizer v2 is a conforming XML form grammar processor, as specified in SRGS
section 5.4, except that it does not support references to rules defined in external grammars.
Other notes on its implementation:
• Full schema validation of SRGS+XML documents is not performed
• Recursive grammars are not supported
• xml:lang attributes are ignored (following the SRGS specification)
• Grammars with mode attribute of “voice” are ignored. Only grammar documents that explicitly
set the mode to “dtmf” are processed
• The <record> element is supported.
Tokens
The tokens (terminal symbols) supported by the Holly DTMF Recognizer v2 are shown in the following
table; entries in the same row are synonymous. Uppercase or lowercase alphabetic characters may be
used.
1, one, dtmf-1
2, two, dtmf-2
3, three, dtmf-3
4, four, dtmf-4
5, five, dtmf-5
6, six, dtmf-6
7, seven, dtmf-7
8, eight, dtmf-8
9, nine, dtmf-9
*, dtmf-*, star, dtmf-star
#, dtmf-#, hash, pound, dtmf-hash, dtmf-pound
Further information on the Holly DTMF Recognizer v2 is available in Appendix B.
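A grammar that stays within the constraints listed above (explicit “dtmf” mode, local rule references only, no recursion) might look like this sketch. It accepts exactly two key presses and deliberately mixes three synonymous token spellings from the table; the rule names are arbitrary.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         mode="dtmf" root="code">
  <!-- Public root rule: a two-key code built from the local 'option' rule -->
  <rule id="code" scope="public">
    <item repeat="2">
      <ruleref uri="#option"/>
    </item>
  </rule>
  <!-- Local rule: the keys 1, 2 and 3 written with different synonyms -->
  <rule id="option">
    <one-of>
      <item>1</item>
      <item>two</item>
      <item>dtmf-3</item>
    </one-of>
  </rule>
</grammar>
```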
4. Output: Prompting and TTS
4.1 Selecting a TTS Engine
The Holly Voice Platform supports a broad range of text-to-speech products. Developers should
contact the platform administrator to confirm which TTS products are available. Administrators should
refer to the TTS Configuration information in the Holly Voice Platform Operations Guide.
A single installation may be configured to support many text-to-speech engines. Holly allows for each
Application to select its preferred text-to-speech engine or it may use the platform’s default.
The default text-to-speech engine for an Application is determined by the value of the ‘ttsengine’
property (a VoiceXML property extension). The Application default may be set as an Application
Parameter through HMS or can be set using a VoiceXML <property> element with appropriate scope.
For example:
<property name="ttsengine" value="realspeak45-mrcp1"/>
The table shows the list of text-to-speech products and the corresponding ‘ttsengine’ value.
Vendor     TTS Engine                                                          Value
Acapela    Acapela (MRCP v1)                                                   acapela-mrcp1
IBM        IBM Websphere Voice Server 5.1.3 (MRCP v1)                          wvs513-mrcp1
Nuance     Speechify 3.0 (Direct API)                                          scansoft
Nuance     SpeechWorks Media Server 4.0 (RealSpeak TTS engine) (MRCP v1)       swms40-mrcp1
Nuance     RealSpeak 4.5 (MRCP v1)                                             realspeak45-mrcp1
Nuance     Recognizer to RealSpeak 4.0 (Direct API)                            realspeak
Loquendo   Loquendo (MRCP v1)                                                  loquendo-mrcp1
Note: The ‘ttsengine’ values may be customized at a platform installation. The platform
administrator can advise if any values are different from those shown in the table.
4.1.1 TTS Switching
The Holly Voice Platform performs all text-to-speech using the default TTS engine. The Holly Voice
Platform permits switching between TTS engines within a call and within a single VoiceXML document.
To switch engines set the ‘ttsengine’ in a <property> element with appropriate scope (e.g. field, form
or document scope). It is, however, not possible to have prompts in the same queue using a different
TTS setting. A switch will only take place when the queue is flushed (usually by performing a
recognition).
Note: The ‘ttsengine’ property is a proprietary extension to VoiceXML.
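The sketch below switches TTS engine between two fields. Because each field performs a recognition, the prompt queue is flushed before the second engine is needed. The ‘ttsengine’ values come from the table in section 4.1 and may differ on a customized installation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-AU">
  <form id="survey">
    <field name="satisfied" type="boolean">
      <property name="ttsengine" value="realspeak45-mrcp1"/>
      <prompt>Were you satisfied with our service today?</prompt>
    </field>
    <!-- The recognition above flushes the queue, so the switch can take place -->
    <field name="recommend" type="boolean">
      <property name="ttsengine" value="loquendo-mrcp1"/>
      <prompt>Would you recommend us to a friend?</prompt>
    </field>
  </form>
</vxml>
```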
4.1.2 Using a Non-default TTS Voice
The ‘ttsvoice’ property is used to set a specific TTS voice for a VoiceXML document. The available
values for this parameter are dependent on the TTS voices installed on the platform.
Note: The ‘ttsvoice’ property is a proprietary extension to VoiceXML.
4.1.3 SSML
The Holly Voice Platform supports SSML for TTS and audio output.
Prompts consisting of unmarked-up text are wrapped in SSML tags before being sent to the TTS engine.
SSML tag support depends on the capabilities of the TTS engine. The IBM WVS proprietary alphabet and
other proprietary alphabets not starting with the “x-” prefix are supported.
The Holly Voice Platform does not support the VoiceXML 2.0 extension of the ‘<say-as>’ SSML element
described in Appendix P of [VoiceXML 2.0]. That is, the results from a recognition using one of the
builtin grammar types cannot be passed directly to a ‘<say-as>’ element to be read as a valid value of
that type for the current language.
4.2 Audio Files
The Holly Voice Platform supports the required audio formats specified in VoiceXML (see VoiceXML 2.0,
Appendix E).
All audio files must have an 8kHz sample rate and must contain single channel “mono” recordings.
Each audio file must have a file extension matching the file format as shown in the table.
Audio is fetched by the Holly Voice Platform and played by the Holly Voice Gateway. The TTS server is
not used to fetch audio data or generate an audio stream.
Audio Format and Supported Content / Media Type / File Extension
.WAV (RIFF header)
– 8kHz 8-bit mono mu-law single channel
– 8kHz 8-bit mono A-law single channel
– 8kHz 16-bit mono linear [PCM] single channel
Media Type: audio/x-wav
File Extension: .wav
Raw (headerless) mu-law
– 8kHz 8-bit mono mu-law single channel (G.711)
Media Type: audio/basic [RFC1521]
File Extension: .ulaw
Raw (headerless) A-law
– 8kHz 8-bit mono A-law single channel (G.711)
Media Type: audio/x-alaw-basic
File Extension: .alaw
Raw (headerless) mu-law
– 8kHz 8-bit mono mu-law single channel (G.711)
Media Type: audio/basic [RFC1521]
File Extension: .au
Note: There is no support for audio/basic .au with au header format.
4.2.1 Throwing Errors on Audio Fetch Failures
VoiceXML 2.0 specifies that failure to fetch an audio file does not result in an ‘error.badfetch’ event
being thrown, even when there is no alternative content. The Holly Voice Platform recognizes a
property ‘com.holly.audiobadfetch’ which, when set to “true”, results in an ‘error.badfetch’ event
being thrown if an audio file cannot be fetched.
Note: Setting this property to “true” results in behavior that does not conform to VoiceXML 2.0, but
can be very useful for development.
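As an illustration, a development build might enable the property at document scope and catch the resulting event. This is a sketch only: the file name welcome.wav is a placeholder, and the handler simply logs and exits.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Proprietary property: throw error.badfetch when an audio file cannot be fetched. -->
  <property name="com.holly.audiobadfetch" value="true"/>
  <catch event="error.badfetch">
    <!-- _message is the standard VoiceXML catch variable; it may be undefined. -->
    <log>Fetch failed: <value expr="_message"/></log>
    <exit/>
  </catch>
  <form id="welcome">
    <block>
      <!-- welcome.wav is a hypothetical file name with no alternative content. -->
      <audio src="welcome.wav"/>
    </block>
  </form>
</vxml>
```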
Holly also recognizes the VoiceXML property ‘com.holly.audiofetchalarm’, which enables SNMP and email
alarming for missing prompts. It can be turned on or off as required.
The VoiceXML property can be set either at the application level (in the Holly Management System) or
for individual dialogs within an application.
Setting the property to “off” suppresses the alarm (SNMP, email, etc.) and also suppresses the error events
in the Holly Management System Call_Events for audio file fetch errors. The fetch failure is still
logged as "outcome=error" within the fetch parameter details.
Note: Some lower level socket fetch errors may still be written to the Holly Management System log.
These are outside of the browser control, but have no effect on processing and do not raise
alarms.
Setting the property to “on” returns behavior to normal, raising both an alarm and writing the severe
errors to the Holly Management System log. By default the property is “off”.
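For setting the property on an individual dialog, a sketch along the following lines may help. The form and file names are placeholders, and the values “on” and “off” are those described above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="optional_greeting">
    <!-- Suppress missing-prompt alarms for this dialog only. -->
    <property name="com.holly.audiofetchalarm" value="off"/>
    <block>
      <!-- seasonal_greeting.wav is a hypothetical file that may not always exist. -->
      <audio src="seasonal_greeting.wav">Welcome.</audio>
    </block>
  </form>
</vxml>
```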
4.3 <mark> Element
The standard <mark> tag is supported as in the VoiceXML 2.0 standard.
From HVP 5.1, <mark> may be used without a TTS engine being invoked so long as the prompt content
comprises only <audio> and <mark> elements.
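A prompt of the kind described here, containing only <audio> and <mark> elements, could look like the following sketch. The file names and mark names are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="menu">
    <block>
      <prompt>
        <!-- Only <audio> and <mark> children, so no text is left for a TTS engine. -->
        <audio src="intro.wav"/>
        <mark name="intro_done"/>
        <audio src="options.wav"/>
        <mark name="options_done"/>
      </prompt>
    </block>
  </form>
</vxml>
```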
5. Telephony: Session & Transfers
5.1 Session Variables
The following are the VoiceXML variables set by the Holly Voice Browser. Note that for outbound calls
there are some differences as documented in the following section.
| Variable | Description |
|---|---|
| session.connection.local.uri | Set to the URI in the SIP “To” header. |
| session.connection.remote.uri | Set to the URI in the SIP “From” header (prior to any CTI Manager lookup); this will be "anonymous" if CLI is restricted. |
| session.connection.aai | Variables received from CTI. In AIN calls, this is an object whose property/value pairs are the key/value pairs returned as call data by the CTI Manager. The CTI keys become the property names, and they resolve to the corresponding CTI values (interpreted as strings). In other calls it is defined, but has no properties. |
| session.connection.originator | Always points to the same object as session.connection.remote. |
| session.connection.protocol.name | “sip” for SIP calls; an empty string for any other signaling type. |
| session.connection.protocol.sip.callid | Set to the value of the Call-ID header field from the SIP INVITE that initiated the call (inbound calls only). |
| session.connection.protocol.version | “2.0” for SIP calls; an empty string for any other signaling type. |
| session.connection.redirect | Always an empty string. |
| session.telephone.ani | ANI if CLIR is not set or the passCLI application parameter is set; a masked ANI otherwise. |
| session.telephone.clearani | Only defined (as the ANI) if the passCLI application parameter is set. |
| session.telephone.moli | MOLI; only defined if the passMOLI application parameter is set. |
| session.telephone.dnis | DNIS after any CTI Manager lookup. |
| session.telephone.iidigits | Always an empty string. |
| session.telephone.uui | Always an empty string. |
| session.telephone.rdnis | Always an empty string. |
| session.telephone.redirect_reason | Always an empty string. |
| session.telephone.follow_on | Only defined if this is an AIN follow-on call; set to one of “busy”, “noanswer”, “invalid”, “congestion”, “hangup”. |
| session.com.holly.callid | The call ID assigned to this call. |
| session.com.holly.switchmessageid, session.com.holly.trunkgroupid, session.com.holly.trunkid | These three variables come from the CTI Manager. The message ID is used to determine the follow-on status. |
| session.com.holly.applicationid, session.com.holly.affiliateid, session.com.holly.servprovid, session.com.holly.initialuri | These four variables come from the Licence Manager. |
| session.com.holly.channelid | This variable comes from the HVG. |
| session.com.holly.browserid | The IP address of the host this browser is running on. |
| session.com.holly.correlationid, session.com.holly.scfid, session.com.holly.callerlocation | These variables are only set if the CTI Manager returns them to the browser in the call data. |
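As a sketch of how an application might read these variables, the following document logs a few of them at the start of a call. Because session.telephone.clearani is only defined when passCLI is set, the example guards that read with typeof; the fallback string 'unavailable' is an arbitrary choice for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="show_session">
    <!-- Guarded read: clearani may be undefined, so substitute a fallback string. -->
    <var name="clearani"
         expr="typeof(session.telephone.clearani) != 'undefined' ? session.telephone.clearani : 'unavailable'"/>
    <block>
      <log>ANI=<value expr="session.telephone.ani"/>,
        DNIS=<value expr="session.telephone.dnis"/>,
        clear ANI=<value expr="clearani"/>,
        call ID=<value expr="session.com.holly.callid"/></log>
    </block>
  </form>
</vxml>
```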
5.1.1 Session Variables for Outbound
After a call is answered by a remote party the Holly Voice Browser commences execution of the
VoiceXML session. The session is started with the outbound parameters of the original call request
placed into ECMAScript variables in the session scope of the VoiceXML context. The mapping of
parameters is shown in the table below.
For outbound calls the CLID/ANI corresponds to the Remote party and the DNIS to the platform service
calling the Remote party. The Holly License Manager uses the DNIS to retrieve the virtual IVR data for
the call handling.
| VoiceXML Variable | PLACECALL/PLACECALLRESULT Field |
|---|---|
| session.connection.local.uri | Local (number of the calling application) |
| session.connection.remote.uri | Remote (number of the called party) |
| session.connection.originator | Local (the party that initiated the call; this is a reference to either session.connection.local or session.connection.remote) |
| session.telephone.ani | Remote |
| session.telephone.dnis | Local |
| session.com.holly.callid | Holly Call ID (returned in PLACECALLRESULT) |
| session.com.holly.userdata | User Data |
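The mapping above can be exercised with a short sketch like the one below, which logs the outbound parameters when the session starts. It assumes, for illustration only, that the user data may be absent from some call requests, so that read is guarded.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="outbound_start">
    <block>
      <!-- For outbound calls, ani is the called (Remote) party and dnis is the Local number. -->
      <log>Outbound call <value expr="session.com.holly.callid"/>
        from <value expr="session.telephone.dnis"/>
        to <value expr="session.telephone.ani"/></log>
      <!-- Assumption: userdata might not be supplied, so test before reading it. -->
      <if cond="typeof(session.com.holly.userdata) != 'undefined'">
        <log>User data: <value expr="session.com.holly.userdata"/></log>
      </if>
    </block>
  </form>
</vxml>
```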
5.2 Session.connection.aai Example
In certain telephony configurations the Holly Voice Platform populates the VoiceXML session with CTI or
other connection-related data. This section documents how VoiceXML applications can access this call
data.
For the following examples, suppose the call data returned by the CTI Manager for a call is:
fruit = apple
beer = lager
If the name of a property is known in advance, it can be accessed directly:
<var name="fruit" expr="session.connection.aai.fruit"/>
With error-checking:
<var name="fruit"/>
<script>
<![CDATA[
  // Fall back to 'unknown' when the call data has no "fruit" key
  if (typeof(session.connection.aai.fruit) != "undefined")
    fruit = session.connection.aai.fruit;
  else
    fruit = 'unknown';
]]>
</script>
If the property names are not known in advance, the ECMAScript ‘for/in’ statement can be used to
iterate through the properties defined on the session.connection.aai object. The following example
iterates through the session.connection.aai object and reads out the name/value pairs:
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="example">
    <var name="text" expr="''"/>
    <script>
    <![CDATA[
      // Append "name is value, " for every property in the call data
      for (var prop in session.connection.aai)
        text += prop + " is " + session.connection.aai[prop] + ", ";
    ]]>
    </script>
    <block>
      <value expr="text"/>
      <disconnect/>
      <exit/>
    </block>
  </form>
</vxml>
With the call data above this would result in the prompt “fruit is apple, beer is lager”.
5.3 Passing Data Between Sessions
HVP 5.1 introduces a mechanism for an application to store data to the Holly CTI Manager so that it is
available for a subsequent VoiceXML session in a follow-on call (where the same call results in more
than one VoiceXML session). Contact Holly Support for details.
5.4 Transfers
5.4.1 Transfer Types
Support for transfers varies for each installation, depending upon the telephony environment within
which the Holly Voice Platform is deployed. The VoiceXML interpreter will attempt both blind and
bridge transfer types, according to the rules in VoiceXML 2.0; however, if a particular type is not
supported in a given environment, an ‘error.unsupported.transfer.<type>’ event is thrown. Note that
<transfer> does not set extra shadow variables as described in VoiceXML 2.1, section 7.
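An application that must run in more than one environment might catch the unsupported-transfer event explicitly. The following is a sketch only; the destination number is a placeholder, and the event name follows the ‘error.unsupported.transfer.<type>’ pattern for a bridge transfer.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="xfer">
    <!-- tel:+15555550100 is a hypothetical destination. -->
    <transfer name="agent" dest="tel:+15555550100" type="bridge" connecttimeout="20s">
      <prompt>Connecting you now.</prompt>
      <catch event="error.unsupported.transfer.bridge">
        <log>Bridge transfer is not supported in this environment.</log>
        <prompt>Sorry, we cannot connect your call.</prompt>
        <exit/>
      </catch>
      <filled>
        <log>Transfer result: <value expr="agent"/></log>
      </filled>
    </transfer>
  </form>
</vxml>
```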
5.4.2 Destination URIs
The transfer destination URIs allowed in VoiceXML applications depend on the deployment
environment. The VoiceXML interpreter recognizes ‘tel’ URIs, but the actual digits needed to establish
a transfer depend on carrier business rules. The processing of ‘tel’ URIs is therefore
installation-dependent.
The VoiceXML interpreter also recognizes ‘sip’ URIs. Contact the system administrator to determine
whether they are supported in the deployment environment and for information on their content.
As a special case the Holly Voice Platform treats “relative” URIs, understood here as those with no
scheme component, as belonging to an unnamed proprietary scheme in which the string is passed
without modification to the telephony integration.
For example,
<transfer dest="CODE1">
...
results in the string “CODE1” being passed to the telephony interface as the destination for the
transfer. Contact the system administrator to determine whether such destination strings are supported
in the deployment environment and for information on their content.
5.4.3 Transfer CLID
If the VoiceXML property ‘com.holly.transferclid’ is set in the scope of the transfer element (typically
within the element), its value is supplied as the CLID in the transfer-initiating SIP message.
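As a sketch, the property can be placed directly inside the <transfer> element so that it applies to that transfer only. The destination and CLID values below are placeholders; the CLID format accepted in practice depends on the deployment.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="xfer_with_clid">
    <transfer name="agent" dest="tel:+15555550100" type="blind">
      <!-- Hypothetical CLID value, scoped to this transfer element. -->
      <property name="com.holly.transferclid" value="+15555550123"/>
    </transfer>
  </form>
</vxml>
```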
5.4.4 Recognition During Transfer
The Holly Voice Platform does not support speech or DTMF recognition during transfer.
5.4.5 Whisper Transfer
Holly supports whisper transfer as an extension to the set of standard VoiceXML transfer types. In a
whisper transfer the VoiceXML application is able to interact fully with the C-party (recipient of the
transfer) prior to either completing or rejecting the transfer. Examples of its use:
• Perform a transfer with the application passing information about the call to the C-party (transfer
recipient) so that they can continue the call smoothly;
• Provide the C-party with the option to accept or reject the transfer request (e.g. “I have Joe
Smith on the line. Would you like to take the call?”).
To achieve both a transfer and a separate dialog with the C-party, the VoiceXML code for a whisper
transfer combines the standard <transfer> element (used like a bridge transfer) with the <subdialog>
element. For reference, the <subdialog> element is described in Section 2.3.4 of VoiceXML 2.0 and bridge
transfer is described in Section 2.3.7.2 of VoiceXML 2.0.
Executing whisper transfer causes the following sequence of events:
• Any queued prompts are played to completion to the A-party (caller) before they are put on hold.
• A-party is put on hold and may be played transferaudio (as for standard bridge transfer).
• Platform initiates a new call to the C-party indicated by the dest or destexpr attribute of the
<transfer> element (as for any transfer).
• Once a connection to the C-party is established, a subdialog is invoked to interact with the
answerer. As with standard subdialogs, this runs in its own execution context, and
variables may be passed and returned only through the <subdialog> mechanism.
Note: recognition on transfer (or transfer barge-in) is not supported.
To combine <transfer> and <subdialog> capabilities, the <transfer> element is extended to allow the
<param> element (which is standard for the <subdialog> element). Four <param> names have special
interpretations when used within a <transfer> element. Their values may
be provided by either the “value” or the “expr” attribute.
| Parameter Name | Description | Example |
|---|---|---|
| com_holly_uri | URI reference to the subdialog to execute with the C-party. The URI may be a fragment referencing the current document. | ./whisper_dialog.vxml#form or #whisper_dialog |
| com_holly_namelist | The list of ECMAScript variables to submit as URI query parameters. Equivalent to the namelist attribute on <subdialog>. | var1 var2 status |
| com_holly_method | Method of fetch. | get (default) or post |
| com_holly_enctype | Encoding type of the fetch. | application/x-www-form-urlencoded (default) or multipart/form-data |
| <any other name> | Parameter is passed as a <param> to the subdialog. There must be a corresponding <var> element in the subdialog. | |
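The parameters in the table might be combined as in the sketch below. The variable names, the destination and the subdialog file name are placeholders; the transfer type com.holly.whisper is the one used in the examples later in this section.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <var name="account" expr="'12345'"/>
    <var name="callername" expr="'Joe Smith'"/>
    <!-- tel:+15555550100 and whisper_dialog.vxml are hypothetical. -->
    <transfer name="agent" dest="tel:+15555550100" type="com.holly.whisper">
      <!-- Special parameters controlling how the subdialog is fetched. -->
      <param name="com_holly_uri" value="./whisper_dialog.vxml"/>
      <param name="com_holly_namelist" value="account"/>
      <param name="com_holly_method" value="post"/>
      <!-- Any other name is passed to the subdialog as a normal <param>. -->
      <param name="callername" expr="callername"/>
    </transfer>
  </form>
</vxml>
```

In this sketch the subdialog would need a matching <var name="callername"/> declaration to receive the last parameter.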
Attributes
The <transfer> maxtime attribute can be used to limit the duration of the transfer. This limits the
combined duration of both the interaction within the subdialog and the subsequent bridge connection.
During the transfer operation and while the subdialog executes the current interpreter session is
suspended. If a transferaudio attribute is provided the audio resource will be played to the caller until
the connections are bridged or the subdialog ends.
Properties
Four standard VoiceXML properties apply to the fetch behavior:
• fetchtimeout
• fetchhint
• maxage
• maxstale
The fetchaudio property does not apply to this fetch, because the transferaudio attribute of
the <transfer> extension applies instead.
Error Handling
The following events might be thrown during the attempt to establish the new connection. They will
all be thrown in the context of the calling dialog.
error.semantic: The com_holly_uri parameter is not supplied.
error.badfetch: The URI referenced by the “com_holly_uri” parameter is invalid.
error.connection.baddestination: The URI reference com_holly_uri is malformed.
error.unsupported.uri: The platform does not support the URI scheme in the URI reference.
error.connection.noroute: The platform is not able to make the connection.
error.connection.noresource: The platform cannot allocate resources to create the new connection.
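Since these events are thrown in the calling dialog, handlers can be attached to the whisper <transfer> itself. The following sketch groups them into two handlers using the space-separated event list that VoiceXML 2.0 allows on <catch>; the destination, file name and prompt wording are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <transfer name="agent" dest="tel:+15555550100" type="com.holly.whisper">
      <param name="com_holly_uri" value="whisper_dialog.vxml"/>
      <!-- Problems with the subdialog reference itself. -->
      <catch event="error.semantic error.badfetch error.unsupported.uri">
        <log>Whisper subdialog could not be started: <value expr="_event"/></log>
        <exit/>
      </catch>
      <!-- Problems establishing the new connection. -->
      <catch event="error.connection.baddestination error.connection.noroute error.connection.noresource">
        <log>Whisper call could not be placed: <value expr="_event"/></log>
        <prompt>Sorry, we could not reach that party.</prompt>
      </catch>
    </transfer>
  </form>
</vxml>
```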
Connecting A-Party to C-Party
A <transfer> element with the type of com.holly.join must be executed in the subdialog to connect the
A-party to the C-party (i.e. to join the whisper transfer). (Alternatively the subdialog can <return> or
<exit> as described in the following section.)
The dest, destexpr, connecttimeout, maxtime and transferaudio attributes of the <transfer> element
are ignored for this type of transfer.
If the application executes a <transfer> of type com.holly.join then the platform will join the A-party
and C-party legs as a bridge transfer which completes the whisper transfer. The A-party and C-party
remain connected until one of the following:
• A-party hangs up
• C-party hangs up
• the maxtime limit is reached
There can be no failures before the bridge is established with this type of transfer (because connection
to C-party has already been established). The only conditions are thus post-connection conditions, as
listed in Table 2 (but excluding the subdialog disconnect condition). The results of a join transfer are
automatically copied to the corresponding whisper transfer element variable, and once the bridge is
complete control continues with any <filled> element in the whisper <transfer> element in the calling
dialog. If the A-party hangs up, a connection.disconnect.hangup is thrown in the calling dialog. If the
C-party hangs up, the calling dialog <transfer> item variable is populated accordingly.
Once the connections are bridged, no further interaction in the subdialog will take place. The
VoiceXML execution next returns to the VoiceXML context in which the whisper transfer was initiated
and the behavior follows the specification for returning from a bridge transfer.
If the join <transfer> element specifies a com.holly.join.namelist <param> element, the whisper
<transfer> element result shadow variable will be populated with the variable values returned, as if
the subdialog had terminated with a <return> with a namelist attribute.
Whisper Subdialog <return> or <exit>
If the application does not complete the transfer with a join, the subdialog execution
completes either when the application executes a <return> element or when there is an explicit or
implicit <exit>. With a <return>, control and data are returned to the calling whisper transfer, with
behavior following the specification for returning from a subdialog. This also results in the connection
to the C-party being closed and the subdialog execution context being deleted. (Note that because the
application has not executed a <transfer> of type com.holly.join, there will be no connection between the
A-party and C-party.)
The <return> element may specify a list of variable values to return (namelist attribute) and these
populate the shadow variable of the calling <transfer> element. Alternatively the <return> element
may specify an event to throw in the calling dialog (event or eventexpr attribute) with an associated
message (message or messageexpr attribute). The standard <transfer> shadow variables are set as in
VoiceXML 2.0.
For subdialogs that end with either a join or a <return> the duration shadow variable is set to the
duration of the whisper transfer which is the sum of the time the caller is on hold and the time the two
connections are bridged. Also, an extension shadow variable, holdduration, is set to the time the
caller spent on hold.
If the subdialog ends, explicitly or implicitly, with an <exit>, interpretation terminates and any
remaining connections are released, following the VoiceXML <subdialog> behavior.
After Completion of Whisper Transfer
When execution returns to the initiating <transfer> element (of type whisper) the item variable
contains an indication of the final condition of the transfer. Following the specification for bridge
transfers in VoiceXML 2.0 the return behavior is different according to whether connection is
established to the C-party. For whisper transfer there is the additional distinction that the platform
can connect successfully to the C-party but the application does not join the A- and C-parties.
The possible values of the transfer item variable depend on the stage reached. Table 1 shows the
possible values assigned to the item variable before the connection to C-party is established.
| Condition | Value |
|---|---|
| target busy | 'busy' |
| no answer before connect timeout | 'noanswer' |
| other | 'unknown' |

Table 1: Values of the transfer item variable before the new connection is established (prior to
execution of subdialog)
Table 2 shows the possible values after the connection is established. The possible values after the
connections are joined are a subset of these; the two cases are distinguished by a non-standard boolean
shadow variable, name$.joined, that indicates whether the connections were joined. This shadow
variable is not defined if the new connection fails to be established.
| Condition | Value |
|---|---|
| C-party disconnects | 'far_end_disconnect' |
| maxtime reached | 'maxtime_disconnect' |
| Subdialog disconnects | 'application_disconnect' |
| other | 'unknown' |

Table 2: Values of the transfer item variable after the new connection is established but before
joining (i.e. during execution of subdialog)
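The three stages can be told apart in the <filled> handler of the whisper transfer. The sketch below assumes the joined flag is exposed as a shadow variable on the transfer item (here agent$.joined) and that it is undefined when no connection was established; the destination and file name are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <transfer name="agent" dest="tel:+15555550100" type="com.holly.whisper">
      <param name="com_holly_uri" value="whisper_dialog.vxml"/>
      <filled>
        <if cond="typeof(agent$.joined) == 'undefined'">
          <!-- Table 1 values apply: busy, noanswer or unknown. -->
          <log>No connection to the C-party: <value expr="agent"/></log>
        <elseif cond="agent$.joined"/>
          <!-- Parties were bridged; the value says how the bridge ended. -->
          <log>Joined, ended with: <value expr="agent"/></log>
        <else/>
          <!-- C-party answered but the subdialog never joined the legs. -->
          <log>Connected but not joined: <value expr="agent"/></log>
        </if>
      </filled>
    </transfer>
  </form>
</vxml>
```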
At any stage, if the A-party disconnects a connection.disconnect.hangup is thrown. If this occurs
during the establishing of the new connection or after the connections are joined, it will be thrown in
the context of the calling dialog. If it occurs while the caller is on hold, it will be thrown in the
context of the subdialog as a connection.disconnect.hangup.a.
Example: Whisper Subdialog (whisper-call.vxml)
<?xml version="1.0"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <catch event="connection.disconnect.hangup.c">
    <log>The C party hung up: caught in whisper-call</log>
    <script>result.condition = "disconnect"</script>
    <return namelist="result"/>
  </catch>
  <catch event="connection.disconnect.hangup">
    <log>The A party hung up: caught in whisper-call</log>
    <exit/>
  </catch>
  <form>
    <!-- At this point we have a connection. -->
    <var name="com_holly_uri"/>
    <var name="com_holly_namelist"/>
    <var name="karma"/>
    <var name="result" expr="{}"/>
    <field name="proceed" type="boolean">
      <prompt>Are you willing to help improve the karma of
        <value expr="karma"/>?</prompt>
      <filled>
        <if cond="proceed">
          <script>result.condition = "accepted"</script>
          <goto nextitem="join"/>
        <else/>
          <script>result.condition = "rejected"</script>
          <return namelist="result"/>
        </if>
      </filled>
    </field>
    <!-- Joins the A-party and C-party; dest is ignored for this transfer type. -->
    <transfer name="join" cond="false" dest="ignored" type="com.holly.join">
      <param name="com_holly_namelist" value="result"/>
    </transfer>
  </form>
</vxml>
Example: Initiating Whisper Transfer (calling document)
<?xml version="1.0"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <catch event="connection.disconnect.hangup">
    <log>The A party hung up: caught in whisper-main</log>
    <exit/>
  </catch>
  <catch event="error">
    <exit/>
  </catch>
  <form>
    <field name="karma" type="digits?minlength=3;maxlength=10">
      <prompt>What's your karma?</prompt>
    </field>
    <transfer name="guru" destexpr="karma" type="com.holly.whisper"
              connecttimeout="15s" maxtime="60s" transferaudio="chicken.wav">
      <param name="com_holly_uri" value="whisper-call.vxml"/>
      <param name="com_holly_namelist" value="karma"/>
      <param name="karma" expr="karma"/>
      <prompt>
        Please wait while your karma works for you or against you.
      </prompt>
      <filled>
        <log>whisper transfer returned: guru$=<value expr="guru$"/>,
          duration=<value expr="guru$.duration"/>, holdduration=<value
          expr="guru$.holdduration"/>, result=<value expr="guru$.result"/></log>
        <if cond="guru$.result">
          <goto nextitem="subdialogresult"/>
        <else/>
          <goto nextitem="transferresult"/>
        </if>
      </filled>
    </transfer>
    <block name="subdialogresult" cond="false">
      <!-- Post-connection conditions -->
      <if cond="guru == 'application_disconnect'">
        <prompt>application disconnect</prompt>
      <elseif cond="guru == 'unknown'"/>
        <prompt>outcome is unknown</prompt>
      <elseif cond="guru == 'far_end_disconnect'"/>
        <prompt>We hope your conversation with the guru improved
          your already excellent karma.</prompt>
      <elseif cond="guru == 'maxtime_disconnect'"/>
        <prompt>Sorry, the guru is a busy, important person, and
          he has other karmae to attend to. We hope the abrupt
          termination has not karmed your harma too
          much.</prompt>
      <elseif cond="guru$.result.condition == 'rejected'"/>
        <prompt>sorry, the guru does not wish to address your
          karma. Try again in your next life.</prompt>
      <else/>
        <prompt>Well, look at that. A guru meditation error.
          We hope your karma hasn't been bruised.</prompt>
      </if>
    </block>
    <block name="transferresult" cond="false">
      <!-- Pre-connection conditions -->
      <if cond="guru == 'busy'">
        <prompt>Sorry, with your karma you're condemned to be a
          silent follower, never even observed. Try again in your
          next life.</prompt>
      <elseif cond="guru == 'noanswer'"/>
        <prompt>Sorry, with your karma you are not important enough
          to talk to the guru. Try again in your next
          life.</prompt>
      <elseif cond="guru == 'unknown'"/>
        <prompt>Sorry, it seems the guru's karma is worse than
          yours, so you probably don't want to talk to him
          anyway.</prompt>
      </if>
    </block>
    <block>
      all over, disconnect call
      <log>All over.</log>
      <exit/>
    </block>
  </form>
</vxml>
6. Logging
The Holly Voice Platform provides in-depth logging, reporting and analytics to facilitate development,
debugging and support for production applications. This section documents the logging as used by
application developers. The Holly Management System Guide provides information on using HMS for
reporting and analytics. The Holly Reference Manual and Holly Operations Guide provide platform
administrators with information on configuration of a platform to enable logging and reporting.
The Holly Voice Platform has two key forms of logging:
• Call Detail Record: stored as LOG_CALLS in Holly, this is a summary table containing around 90
properties. Core information in the CDR includes call start/end times, call duration, ANI, DNIS, use
of ASR and TTS, use of logging and much more.
• Call Event Record: stored as LOG_EVENTS in Holly, this is an event-by-event record that documents
the progress of a call from initiation to completion.
6.1 Events
The Holly Voice Platform can be configured to record over 30 different types of event during execution
of each call. These events are recorded in real-time during the call by the Holly Voice Browser. The
events are submitted to the platform database via the Holly Log Manager. The events can be viewed
and analyzed through the Holly Management System.
The <log> element of VoiceXML is used by application developers to insert a “Log” type event. All
other event types are generated by the Holly Voice Platform. (The set of events that are
recorded can be configured at the platform and application levels, as described in the following
section.)
For optimal platform and database efficiency it is recommended to remove any logging options that are
not required.
EVENTID / PARAM / Description

EVENTID: Answer
PARAM: result=%s
Description: Indicates the call has been answered or not. The parameter “result” may be success or failure.

EVENTID: ASR Session
PARAM: asrengine=%S| sessionid=%S| address=<host:port>| server=<server>| endpoint=<address>
Description: Logs the ASR session event. The endpoint address of the recognizer is also included.

EVENTID: Call end
PARAM: cpu=%.3f| normalcpu=%.3f| callduration=%.3f| reccount=%d| ttscount=%d| ttsduration=%.3f| logcount=%d| logbytes=%d
Description: Summary call statistics recorded irrespective of the level of logging.

EVENTID: Call start
PARAM: ANI=%s| DNIS=%s| VURL=%s| follow on,reason=%s
Description: At the start of a call logs the Caller Line Identifier, DNIS, and the initial applications document identifier.

EVENTID: Disconnect
PARAM: List of variables and corresponding values specified with VXML attribute ‘namelist’.
Description: Records a hang-up event.

EVENTID: Document dump
PARAM: <a reference to the VoiceXML document>
Description: Logs the whole VoiceXML document as it is fetched.
EVENTID: Document transition
PARAM: uri=%s| cpu=%.3f| normalcpu=%.3f
Description: Written when document scope changes.

EVENTID: Error Critical
PARAM: error=%d| msg=%s
Description: Logged when a critical error occurs.

EVENTID: Error Severe
PARAM: error=%d| msg=%s
Description: Logged when a severe error occurs.

EVENTID: Error Warning
PARAM: error=%d| msg=%s
Description: Logged when a warning error occurs.

EVENTID: Exit
PARAM: result=%s
Description: Logs exit event

EVENTID: Fetch
PARAM: uri=%s| fetchtype=(VXML||audio||grammar||other)| incache=(true||false)| latency=%.3f| documentsize=%d| outcome=(success||no response||error||timeout)| failover=(true||false)| localport=%s| hostname=%s
Description: Logged for each document fetch. When external scripts are fetched they are logged with a fetch type of 'vxml'.

EVENTID: Grammar activation
PARAM: URI=%s
Description: Fetch and/or activation of a new grammar.

EVENTID: Grammar deactivation
PARAM: URI=%s
Description: Logged when a grammar is deactivated.

EVENTID: License
PARAM: mode=(acquire||resolve)| key=nnn| outcome=(success||maximum licence number reached or exceeded||license key not found||socket error||message encode error||message decode error||application not found||affiliate not found||service provider not found||licences do not reconcile||licence manager not seeded||connect failure)| service=<service provider ID>:<affiliate ID>:<application ID>
Description: The first event for a session, preceding the “call start” event representing the license lookup. Values are:
mode=acquire or resolve. The value 'acquire' means that this license is being requested for the duration of the call; this is the license that authorizes all sessions within the call. The value 'resolve' means that this license is being requested solely to obtain application data for a new session; the Holly License Manager will not increase the license allocation for the application.
key=The key for the license lookup.
outcome=outcome of the request
If the key lookup is successful, a fourth field will be present: service=<service provider ID>:<affiliate ID>:<application ID>
Note: by default this event is not included in the Holly Voice Browser callevents parameter which determines which events are logged.
EVENTID: Log Event
PARAM: EVNT=<event id>| Label=<label in log tag>| expr=<expression in log tag>| content=<user defined parameter>
Description: Logged as a result of a <log> tag included in the VoiceXML document. The VoiceXML log tag format includes the attributes label and expr in addition to the log tag content. The format and contents of the PARAM field are under the control of the application builder.
Note: The browser treats <log> content differently if it is of the form 'EVNT=<txt>|...'. This content will be logged to the ASR engine with the event name <txt> and, as its content, all the text after the '|'. If <txt> begins with 'SWI' or 'calllog:?' the event will NOT be logged to HMS; otherwise the event will also be logged to HMS with the event name <txt>.
This format is supported by the SpeechWorks/ScanSoft/Nuance OSDM (OpenSpeech DialogModule) products, which log a lot of useful dialog state information with events that start with EVNT=SWI. This information can then be used by the OpenSpeech Insight reporting tool to do some powerful ASR analysis.

EVENTID: Placecall start
PARAM: remote=<SIP URI>| local=<app number>
Description: Logs the SIP URI and application number.

EVENTID: Placecall end
PARAM: result=(no reason supplied||user disconnect||silent||maximum duration||special information tones||fax||busy||cti||no answer||error||bad destination||bad format||answering machine)
Description: Logs the result of the outbound call.

EVENTID: Prompt (external)
PARAM: type=external| URI=%s
Description: Logged when an external prompt is played. URI is the URI of the audio file.

EVENTID: Prompt (SSML)
PARAM: type=SSML| content=%s
Description: Logged when a TTS prompt is played. Content is the TTS string.

EVENTID: Prompt (disconnect)
PARAM: status=disconnect
Description: Logged when the Holly Voice Gateway attempts to play a prompt but the call has already disconnected. This is common in normal operation for most applications.

EVENTID: Recognition start
PARAM: inputmodes=(dtmf||voice||dtmf,voice)| threshold=%d| timeout=%.3f| bargeintype=(speech||hotword)
Description: Start of recognition. This event contains information such as inputmodes, timeout, threshold and bargein type.

EVENTID: Recognition end (fail)
PARAM: result=(no input||disconnect||no match||error)| bargein=(true||false)| inputmode=unknown| dtmfinput=%s
Description: End of recognition when the recognition fails. The PARAM value indicates the failure reason. Failed DTMF input (if applicable) is also logged.

EVENTID: Recognition end (success)
PARAM: result=success| utterance=%s| confidence=%d| bargein=(true||false)| inputmode=(dtmf||speech)| utterance=%s| confidence=%d
Description: End of a successful recognition.
Recording start
maxtime=%.3f| dtmfterm=(true|
|false)| type=%5
Start of voice recording.
Recording end
(fail)
result=(maxtime exceeded||no
input||disconnect||max speech
timeout||error)
End of voice recording when the recording has failed.
The PARAM value indicates the failure reason.
Recording end
(success)
result=success| duration=%d
End of a voice recording when successful.
SIP session
callid=<SIP called>| remote-
rtp=%s| local-rtp=%s
SIP Session Call ID.
System response
latency=%.3f
Logged whenever a recognition event occurs, in
seconds to millisecond resolution.
Transfer start
mode=(network| |blind| |bridge|
|conditional)| {URI=%s|
|destination=%s}
Start of a call transfer. Given either the URI or the
destination.
Transfer end(fail)
result=(bad destination|
|disconnect| |error| |remote
busy| |timeout| |network busy|
|maxtime exceeded)
End of a call transfer. PARAM value indicates the
failure reason.
Transfer
end(success)
result=success| duration=%.3f
End of a call transfer when successful. Duration is only
present for a successful bridge-transfer, value in
seconds to millisecond resolution.
VXML Event
event=%s
or event=%s| message=%s
Logs events thrown by the VoiceXML application.
6.1.1 Configuring Event Logging
Each event has a default configuration (on or off) at the platform level. This is determined by the
platform administrator and may vary from the factory default settings provided by Holly.
Through the Application Parameters in HMS it is possible to enable or disable individual events
separately for each Application. The parameter name is “client.log.<event name>” and the parameter
value is “true” or “false”. The event name must be in lower case, with each space replaced by an
underscore.
For example:
client.log.system_response = true
For optimal platform and database efficiency it is recommended to remove any logging options that are
not required.
6.2 <log> Element
VoiceXML provides the standard <log> element for use in applications. The standard states that the
behavior of <log> is platform-specific.
The Holly Voice Platform logs each <log> event as an event record. All events for a call - including
<log> and many other events - can be displayed and analyzed through the Holly Management System.
This section defines Holly's specific behaviors relating to the <log> element. These capabilities have
been implemented to simplify application development and debugging, as well as platform operation.
The format of the ‘log’ event is:
ID
Param ID
log
label=<label in log tag>|expr=<expr in log tag>|content=<user defined parameter>
6.2.1 Label on <log>
The VoiceXML <log> element allows a “label” attribute. If this attribute is provided by the application
then the label is inserted into the text of the logged event.
For example, the following code:
<log label="myLabel">
dialog state info
</log>
would result in the following event being recorded in the LOG_EVENTS table.
ID
Param ID
log
label=myLabel|content=dialog state info
6.2.2 Changing the Event Type
Holly can log over 30 different event types. Holly also allows applications to create custom event
types. This is often used when generating custom reports from the platform database.
If the content of the <log> tag is of the form “EVNT={name}|{description}” then the event will be
logged as:
ID
Param ID
name
description
For example the code:
<log>EVNT=myevent|This is the text of myevent.</log>
would result in the following record in the LOG_EVENTS table:
ID
Param ID
myevent
This is the text of myevent
Any spaces in the event ID field are stripped, and any consecutive white space in the Param ID field is
converted to a single space.
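As an illustration, a custom event can carry values computed by the application. The following is a minimal sketch, assuming that <value> elements inside <log> are evaluated before the platform checks for the “EVNT=” prefix; the event name ‘payment’ and the variable ‘amount’ are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <form id="logPayment">
    <!-- Hypothetical application variable -->
    <var name="amount" expr="'25.00'"/>
    <block>
      <!-- Intended to be logged with event ID "payment"; the text after "|" becomes the parameter -->
      <log>EVNT=payment|amount=<value expr="amount"/></log>
    </block>
  </form>
</vxml>
```

Keeping the event name free of spaces avoids relying on the space-stripping rule described above.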
6.2.3 Objects and Arrays
For convenience, objects and arrays can be inserted into <log> elements using <value> elements. HVP
5.0 reports objects and arrays as shown in the following examples:
The array:
values[0] = zero
values[1].name = one
values[1].value = 1
values[2] = two
values[3][0] = cero
values[3][1] = uno
values[3][2] = dos
will be logged as:
values=[ zero, [object], two, [array] ]
The object:
result.fruit = apple
result.pizza.base = thin
result.pizza.topping = hawaiian
result.drink = juice
result.lotto[0] = 21
result.lotto[1] = 17
result.lotto[2] = 31
will be logged as:
result={ fruit:apple; pizza:[object]; drink:juice; lotto:[array] }
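Because nested objects and arrays are summarised as [object] and [array], a nested member has to be logged explicitly if its value is needed. The following is a minimal sketch that reuses the ‘result’ object above; the label names are hypothetical and the exact text recorded for each <log> is not shown because it depends on the platform.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <form id="logObject">
    <script><![CDATA[
      var result = {
        fruit: 'apple',
        pizza: { base: 'thin', topping: 'hawaiian' },
        drink: 'juice',
        lotto: [21, 17, 31]
      };
    ]]></script>
    <block>
      <!-- Top-level view: nested members are summarised -->
      <log label="order"><value expr="result"/></log>
      <!-- Nested members logged individually -->
      <log label="pizza"><value expr="result.pizza"/></log>
      <log label="lotto"><value expr="result.lotto.join(',')"/></log>
    </block>
  </form>
</vxml>
```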
6.2.4 ECMAScript Log Function
Holly has extended VoiceXML with the built-in function ‘session.logEvent()’ to enable logging from
ECMAScript scripts within VoiceXML applications. The effect of the function is as though the argument
had been included in a <log> tag in VoiceXML. The log message will appear with a “script=” prefix.
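The following is a minimal sketch of a call to the function, assuming it accepts a single string argument (this guide does not give the full signature). The helper name ‘noteBalance’ and the message text are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <script><![CDATA[
    // Hypothetical helper that logs from ECMAScript
    function noteBalance(balance) {
      session.logEvent('balance retrieved: ' + balance);
    }
  ]]></script>
  <form id="main">
    <block>
      <script>noteBalance(120);</script>
    </block>
  </form>
</vxml>
```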
6.3 Call Record: LOG_CALLS
The Call Detail Record (or LOG_CALLS in the Holly database) contains three attributes that can be set
by an application to indicate the characteristics or outcome of a call. These application-specific fields
can be used in HMS to enhance reporting and analysis (e.g. what is the average duration of calls from
“gold card members”).
The values of the fields are set by a VoiceXML application at any time during the call, from
start to end of VoiceXML execution.
The fields are:
• outcome: If set, it must be one of SUCCESS, FAIL or UNKNOWN. No other values are allowed.
Lower case values will be converted automatically to uppercase. If outcome is not defined then
the record will be “null”.
• calldesc1 and calldesc2: Call Descriptions 1 & 2 are application-defined values and can be any
string up to 100 characters. Strings over 100 characters are truncated. If the values are not
defined then the value will be “null”.
Use of this capability is optional. If not set then “null” values are recorded.
The values are written to the call record through the ‘namelist’ attribute of the <disconnect> and <exit> tags.
The VoiceXML/ECMAScript variable of the same name (i.e. outcome, calldesc1 or calldesc2) must be
declared and assigned in advance. Because ‘expr’ is evaluated as ECMAScript, a literal string must be
enclosed in single quotes. The variable (or variables) can then be referenced as below:
<var name="calldesc1" expr="'Payment Completed by Credit Card'"/>
<disconnect namelist="calldesc1"/>
Or
<var name="calldesc1" expr="'Payment Completed by Credit Card'"/>
<exit namelist="calldesc1"/>
More than one attribute can be set using the <disconnect> or <exit> tag by including the variable
names in the namelist expression separated by a space. For example:
<var name="calldesc1" expr="'Payment Completed by Credit Card'"/>
<var name="calldesc2" expr="'Gold Status Account Holder'"/>
<var name="outcome" expr="'SUCCESS'"/>
<disconnect namelist="calldesc1 calldesc2 outcome"/>
The attributes can also be set using the <log> tag. This mechanism differs in that the label
of the <log> tag names the attribute (rather than a variable name) and the expr supplies the value
to be recorded. For example:
<log label="calldesc1" expr="'Payment Completed by Credit Card'"/>
This mechanism does not require a variable to be declared (but a variable can be referenced in the
expr section in the normal way if desired.) Only one attribute can be set in a single <log> expression,
so multiple <log> expressions must be used to set all the attributes:
<log label="outcome" expr="'SUCCESS'"/>
<log label="calldesc1" expr="'Payment Completed by Credit Card'"/>
<log label="calldesc2" expr="'Gold Status Account Holder'"/>
If any attribute is set more than once during the execution of a call then previous values are
overwritten.
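Because later values overwrite earlier ones, an application can record a provisional outcome and replace it if the call fails. The following is a minimal sketch using the <log> mechanism; the form name and description strings are hypothetical, and ‘_event’ is the standard VoiceXML variable holding the name of the caught event.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <!-- Record a failed outcome if any error event reaches document level -->
  <catch event="error">
    <log label="outcome" expr="'FAIL'"/>
    <log label="calldesc1" expr="'Unhandled event: ' + _event"/>
    <exit/>
  </catch>
  <form id="payment">
    <block>
      <!-- Application logic would go here -->
      <log label="outcome" expr="'SUCCESS'"/>
      <log label="calldesc1" expr="'Payment Completed by Credit Card'"/>
    </block>
  </form>
</vxml>
```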
6.4 Log Suppression
Some deployments of DTMF and speech applications involve the collection and/or presentation of
sensitive information. Examples of sensitive information might include security identifiers (e.g. PIN),
or private caller data (e.g. phone number, credit card number, name, address and bank account
balance).
To protect this information, log suppression can be enabled and disabled around sensitive information
within an application through the VoiceXML extension property ‘suppresslogs’.
When the property is set to “true”, the Holly Voice Platform suppresses logging of events within the
property's scope, apart from the exceptions listed in Section 6.4.1. Additionally, Holly suppresses
logging by the following speech recognizers if they are currently in use:
• Nuance 8.5 MRCP v1
• Nuance 9.0 MRCP v1
• Nuance ASR 8.5-20050930
• ScanSoft OSR 3.0.9
• Holly DTMF.
To enable log suppression the following should be included in the VoiceXML document. It affects only
the current scope (field, form, document etc).
<property name="com.holly.suppresslogs" value="true"/>
Note: VoiceXML <property> scoping rules apply. The allowable scopes are application, document,
form, menu and form item (see VoiceXML 2.0 Section 6.3).
The suppression applies to trace logging by all Holly components. Suppression also blocks audio
recording of utterances and data logging by the recognizers stated above.
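To illustrate the scoping rule, the following minimal sketch places the property inside a single <field> so that only the PIN collection is suppressed while the account number field is logged as usual. The form and field names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <form id="login">
    <!-- Logged normally -->
    <field name="account" type="digits">
      <prompt>Please enter your account number.</prompt>
    </field>
    <!-- Suppression applies to this form item only -->
    <field name="pin" type="digits?length=4">
      <property name="com.holly.suppresslogs" value="true"/>
      <prompt>Please enter your four digit PIN.</prompt>
    </field>
  </form>
</vxml>
```

Placing the property at the narrowest scope that covers the sensitive input keeps the rest of the call available for troubleshooting.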
6.4.1 Exceptions to Suppression
The suppression does not affect the following:
• Explicit requests for logging by an application using the <log> tag or the session.logEvent() function
• Logging of warnings or errors
• Full call recording
• "call start" event
• An initial ASR Session event
• "exit" and "call end" events
• Call detail record (LOG_CALLS record)
6.4.2 Record of Suppression
An event with the event ID “note” is logged when log suppression is applied and again when
logging is re-enabled. Both events can be viewed in the Customer Usage report Call Event log and are
controlled by the callevents.browser configuration parameter. The “note” event is enabled by default.
Figure 1 Notification of log suppression
© 2009 Holly Connects
6.4.3 Logging Masked Data
Standard programming techniques can be used by application developers to mask logging of sensitive
data. The following VoiceXML snippet shows the collection of a 16-digit credit card number with the logging
of only the last four digits; e.g. a credit card number might be logged as "XXXXXXXXXXXX1234":
<property name="com.holly.suppresslogs" value="true"/>
<form>
  <field name="credit_card" type="digits?length=16">
    <prompt>
      Please enter your credit card number followed by the pound sign.
    </prompt>
    <filled>
      <!-- Log only the last four digits; substring(12) returns characters 12 to 15 -->
      <log>auth_number: <value expr="'XXXXXXXXXXXX' + credit_card.substring(12)"/></log>
    </filled>
  </field>
</form>
6.5 Raising Alarms
The Holly Voice Platform generates alarms that may be monitored by the system administrator. These
alarms, which are described in detail in the Operations Manual and Reference Manual, can result in
SNMP traps, syslog events, file logging or email messages according to the configuration of the platform.
The platform can raise alarms for a range of failure scenarios during VoiceXML application execution
(HTTP issues, VoiceXML parsing errors, missing documents and many more as documented in the
Reference Manual).
It is sometimes required that an application explicitly raise an alarm. This is implemented by a
subdialog call through the Holly VoiceXML Subdialog Server (HVSS) which is a Holly Voice Platform
component that may be activated by the system administrator. The following example code shows how
to raise and then clear an alarm through HVSS.
<form id="raiseAndClearAlarm">
  <var name="hollyAlarmType" expr="'appServerConnectErrorAlarm'"/>
  <var name="hollyAlarmDescription" expr="'testing HVSS alarm interface'"/>
  <var name="hollyAlarmID"/>
  <!-- raise alarm via HVSS -->
  <subdialog name="raise" src="http://localhost:8030/holly/raise"
             namelist="hollyAlarmType hollyAlarmDescription">
    <filled>
      <assign name="hollyAlarmID" expr="raise.alarmID"/>
    </filled>
  </subdialog>
  <block>
    alarm id is <value expr="hollyAlarmID"/>
  </block>
  <!-- clear alarm via HVSS -->
  <subdialog name="clear" src="http://localhost:8030/holly/clear"
             namelist="hollyAlarmType hollyAlarmID">
    <filled>
      <assign name="hollyAlarmID" expr="''"/>
    </filled>
  </subdialog>
</form>
The alarm type must be one of the alarm types defined in the Reference Manual. The description string
can be customized to pass meaningful information to the system monitor.
The variable names passed in the namelist must be exactly as provided in the example above.
The alarm ID is created by the Holly Foreman. The ID returned from the ‘raise’ subdialog must be
passed to the ‘clear’ subdialog to clear the correct alarm.
Developers should coordinate with the platform administrator so that application-raised alarms are
appropriately monitored in platform operations.
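Since HVSS is an optional component, an application may want to continue if the subdialog cannot be fetched. The following is a minimal sketch, assuming that an unreachable HVSS surfaces as the standard VoiceXML error.badfetch event (this guide does not state which event is thrown). The form name, description string and log labels are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <form id="raiseAlarmSafely">
    <var name="hollyAlarmType" expr="'appServerConnectErrorAlarm'"/>
    <var name="hollyAlarmDescription" expr="'back-end lookup failed'"/>
    <subdialog name="raise" src="http://localhost:8030/holly/raise"
               namelist="hollyAlarmType hollyAlarmDescription">
      <!-- Assumed failure mode: the HVSS document could not be fetched -->
      <catch event="error.badfetch">
        <log label="alarm">HVSS raise request failed</log>
        <!-- Fill the item variable so the form does not retry the subdialog -->
        <assign name="raise" expr="false"/>
      </catch>
      <filled>
        <log label="alarm">raised alarm <value expr="raise.alarmID"/></log>
      </filled>
    </subdialog>
  </form>
</vxml>
```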
A. Appendix: Application Parameters
This section covers the following categories of application parameters:
• VoiceXML
• Speech Recognition
• DTMF
• Text to Speech
• Logging
• Telephony.
A.1 VoiceXML
Key Permitted Values
Default Value
Description
com.holly.audiobadfetch
true
false
false
When set to “true” the interpreter will throw an
error.badfetch event if an audio file fetch fails. Refer to
section on Audio Fetch Failures.
Note that the behavior of the VoiceXML interpreter with
this parameter set to “true” violates the W3C VoiceXML 2.0
Recommendation.
This parameter can also be set by an administrator on the
HMS Applications page.
com.holly.audiofetchalarm
true
false
false
This property enables SNMP and email alarming for missing
prompts; it can be turned on or off as required. Refer to
section on Audio Fetch Alarms.
Note: In the case of TTS fallback, if an audio fetch fails and
there is no TTS then an alarm of severity = WARNING is
raised.
This parameter can also be set by an administrator on the
HMS Applications page.
com.holly.dtmfbufferclear
true
This property allows a developer to manually clear buffered
digits. It is processed on recognize and record start.
com.holly.xmlspace
normalize
ignore
ignore
This parameter affects how whitespace is handled in
<prompt> and <log> elements. If the value is ‘normalize’,
meaningful whitespace in prompts is preserved; if the value
is “ignore”, some meaningful whitespace may be lost.
Changing this value to “normalize” will impact platform
performance. This parameter can also be set by an
administrator on the HMS Applications page.
audiomaxage,
documentmaxage,
grammarmaxage,
scriptmaxage
[integer]
Set these properties to 0 to disable caching. Refer to
section on Caching.
This parameter can also be set by an administrator on the
HMS Applications page.
singlecookieheader
true
false
false
If set, and there is more than one cookie for an HTTP
request, the browser sends the cookies folded into a single
HTTP Cookie header, as described in RFC 2965, section
3.3.4.
To disable this parameter, delete it from the list.
This parameter can also be set by an administrator on the
HMS Applications page.
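As an illustration of the maxage properties listed above, the following minimal sketch disables caching of documents and grammars for a whole document and overrides caching for one audio file through the standard ‘maxage’ fetch attribute. The file name ‘welcome.wav’ is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <!-- Always re-fetch documents and grammars, e.g. while testing -->
  <property name="documentmaxage" value="0"/>
  <property name="grammarmaxage" value="0"/>
  <form id="main">
    <block>
      <!-- Per-element override; the inline text is played if the audio cannot be fetched -->
      <prompt><audio src="welcome.wav" maxage="0">Welcome</audio></prompt>
    </block>
  </form>
</vxml>
```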
A.2 Speech Recognition
Key Permitted Values
Default Value
Description
asrengine
[string]
Use this property to switch between ASR engines within a
single VoiceXML document.
Refer to the ASR section for a list of possible values.
This parameter can also be set by an administrator on the
HMS Applications page.
com.holly.collapsesingleslot
true
false
false
Some older voice browsers collapse a structured recognition
result to a simple string value if the structure contained a
single element. Setting this property to “true” will cause
the Holly Voice Browser to do this.
Note that the behavior of the Holly VoiceXML interpreter
with this property set to “true” violates the W3C VoiceXML
2.0 Recommendation.
This parameter can also be set by an administrator on the
HMS Applications page.
com.holly.distincttimeout
true
false
false
For recognizers that enable the platform to distinguish the
two timeouts completetimeout and incompletetimeout, the
‘com.holly.distincttimeout’ property can be set to “true”
to permit the timeouts to be treated differently.
This parameter can also be set by an administrator on the
HMS Applications page.
com.holly.grammarfetchstyle
default
absolute
relative
default
By default the Holly Voice Browser fetches all grammars
referenced by URI in VoiceXML documents. This behavior is
often not desirable and can be changed using this property.
This parameter can also be set by an administrator on the
HMS Applications page.
com.holly.grammarlabel
[string]
This property is useful only if the Nuance 8.5 plug-in
recognizer is being used. The supplied string is passed to
the ASR, and included in the Nuance logs. This can be
useful for grammar tuning.
This parameter can also be set by an administrator on the
HMS Applications page. It is expected that this property will
be set in VoiceXML documents rather than as an application
parameter in the Holly Management System.
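As an illustration of ‘com.holly.distincttimeout’, the following minimal sketch sets different values for completetimeout and incompletetimeout on a single field. It assumes the recognizer in use can distinguish the two timeouts; the grammar file ‘city.grxml’ and the timeout values are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <form id="askCity">
    <field name="city">
      <!-- Ask the platform to treat the two timeouts separately -->
      <property name="com.holly.distincttimeout" value="true"/>
      <!-- Short wait once a complete match has been heard -->
      <property name="completetimeout" value="0.5s"/>
      <!-- Longer wait while the utterance is still a partial match -->
      <property name="incompletetimeout" value="1.5s"/>
      <grammar src="city.grxml" type="application/srgs+xml"/>
      <prompt>Which city are you calling from?</prompt>
    </field>
  </form>
</vxml>
```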
A.3 DTMF
Key Permitted Values
Default Value
Description
com.holly.fetchaudiodtmf
true
false
false
If set to “true”, the DTMF buffer will be cleared when the
Holly Voice Browser plays fetch audio.
This parameter can also be set by an administrator on the
HMS Applications page.
interdigittimeout
[integer]
Use interdigittimeout to control DTMF recognition timing.
This parameter can also be set by an administrator on the
HMS Applications page.
termtimeout
[integer]
0
When the Nuance 8.5 ASR engine is used, set this application
parameter to 0 to disable the termtimeout (i.e.
“termtimeout=0”).
This parameter can also be set by an administrator on the
HMS Applications page.
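The DTMF timing parameters above correspond to standard VoiceXML properties, so they can also be scoped to a single field. The following is a minimal sketch; the values are hypothetical and are written as VoiceXML time designations, which may differ from the format expected for application parameters in HMS.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <form id="askAccount">
    <field name="account" type="digits?length=8">
      <!-- Allow up to three seconds between key presses -->
      <property name="interdigittimeout" value="3s"/>
      <!-- Do not wait for a terminating key once the grammar is satisfied -->
      <property name="termtimeout" value="0s"/>
      <prompt>Please enter your eight digit account number.</prompt>
    </field>
  </form>
</vxml>
```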
A.4 Text to Speech
Key Permitted Values
Default Value
Description
ttsengine
Use this property to switch between TTS engines within a
single VoiceXML document. Refer to TTS and Prompting
section for a list of possible values.
Note: it is not possible to have prompts in the same queue
using a different TTS setting. A switch will only take place
when the queue is flushed (usually by performing
recognition).
This parameter can also be set by an administrator on the
HMS Applications page.
ttsvoice
This property is used to set a specific TTS voice for an
application. The available values for this parameter are
dependent on the TTS voices installed on the platform.
This parameter can also be set by an administrator on the
HMS Applications page.
A.5 Logging
Key Permitted Values
Default Value
Description
com.holly.suppresslogs
true
false
false
If this property is set to “true”, the diagnostic logging in
the Holly Voice Browser for the channel running the
application will be turned off; so will the diagnostic logging
for the Holly Voice Gateway for the duration of any
recognitions in the scope of the VoiceXML property.
recordutterance
true
false
0
This property uses ‘sr.recordutterances’ to implement the
recording of utterances in a scoped manner.
A.6 Telephony
Key Permitted Values
Default Value
Description
com.holly.transferclid
[string]
The supplied string is used as the user part of the SIP From
header in the transfer INVITE (for bridge transfer) or REFER
(for blind transfer).
This parameter can also be set by an administrator on the
HMS Applications page.
B. Appendix: Re-Recognition from Recorded Utterance
In normal VoiceXML execution speech recognition and DTMF input processing are performed using live
input from the caller. Specifically, the application declares a set of prompts and grammars and then
reaches a “wait state” at which point the application suspends while the prompts are played and input
from the caller is matched against the grammars.
With re-recognition from an utterance this normal behaviour is modified so that a previously recorded
input from the caller, which has been stored as a wavefile, is used during the wait state as input to the
speech/DTMF recognition process (instead of live audio).
The following is an abstraction of the typical use case for re-recognition capability:
1. Prompt the caller to say some information (e.g. their name)
2. Keep a recording of the caller’s response to the prompt (e.g. a recording of the name)
3. Optionally, attempt to recognize this input using a broad grammar (e.g. a list of 100,000 common
names)
4. Further interaction with the caller to gather further information that would help in determining
what the caller said at step 1 (e.g. zipcode/postcode, suburb name, street name)
5. Create a targeted grammar using the further information (e.g. list of names in a specified postcode
+ suburb + address based on a postal database)
6. The VoiceXML application performs a re-recognition using the utterance recorded in step 1/2 and
the grammar created in step 5.
The key features of Holly’s implementation of re-recognition are:
• Standard recognition: All the standard VoiceXML capabilities for speech recognition are available
for re-recognition including parallel grammars, configuration properties, form filling, N-best and
confidence scores. Except for setting the re-recognition property (using a standard VoiceXML tag),
re-recognition is like any recognition ensuring familiarity to developers and enabling the use of
standard development tools and frameworks.
• Re-recognition of speech input: Re-recognition supports speech playback only – DTMF input is not
currently supported.
• Supports any MRCP speech recognizer: Re-recognition supports all MRCP ASR integrations including
Nuance (multiple products), IBM, Loquendo, LumenVox, Siemens and Telisma. (The Holly DTMF
Recognizer and vLingo do not support re-recognition.)
• Real-time or optimised playback: All MRCP integrations will support real-time playback of
recorded audio (i.e. re-recognition at normal speed). Where an MRCP ASR product supports faster
than real-time playback Holly will stream the audio faster so that the re-recognition delay is
reduced for faster response to a caller.
• VoiceXML Extension: This capability is an extension because the VoiceXML standard defines no
native means of implementing the capability. Applications that use the capability are not
portable to other platforms.
B.1 Re-recognition in VoiceXML Applications
This section documents the way in which VoiceXML applications are written to use re-recognition. Since
re-recognition is nearly identical to normal speech recognition the model will be familiar to VoiceXML
developers.
B.1.1 Using Re-Recognition
The only required difference between a normal recognition and a re-recognition is the declaration of
the variable name for the recorded utterance to be used in re-recognition:
<property name="com.holly.rerecognition" value="nextUtterance"/>
This declaration must be placed in the scope of the re-recognition, typically the <field> to which
re-recognition applies.
The value is the name of an ECMAScript variable that contains a recording. This variable must be
assigned the value of a previous recording that is either:
• A <record> item variable (see VoiceXML 2.0 Section 2.3.6)
• application.lastresult$.recording (see VoiceXML 2.1 Section 7)
Table 3 presents a sample re-recognition application.
1. <?xml version="1.0" encoding="UTF-8"?>
2.
3. <vxml xmlns="http://www.w3.org/2001/vxml"
4. xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
5. xsi:schemaLocation="http://www.w3.org/2001/vxml
6. http://www.w3.org/TR/voicexml20/vxml.xsd"
7. xml:lang="en-US"
8. version="2.1">
9.
10. <form id="form">
11. <var name="nextUtterance"/>
12. <field name="yesno" type="boolean">
13. <!-- Record yes or no from the caller -->
14. <!-- Use the builtin boolean grammar -->
15. <property name="recordutterance" value="true"/>
16. Say yes or no
17. <filled>
18. <assign name="nextUtterance"
19. expr="application.lastresult$.recording"/>
20. </filled>
21. </field>
22.
23. <field name="yesno2" modal="true">
24. <!-- Perform re-recognition of the utterance -->
25. <!-- Use a custom grammar this time -->
26. <property name="com.holly.rerecognition"
27. value="nextUtterance"/>
28. <grammar mode="voice" version="1.0" root="YN">
29. <rule id="YN" scope="public">
30. <one-of>
31. <item> yes </item>
32. <item> no </item>
33. </one-of>
34. </rule>
35. </grammar>
36. <prompt bargein="false">Let me check that</prompt>
37. </field>
38.
39. <block>
40. first recognition was <value expr="yesno"/>
41. re-recognition was <value expr="yesno2"/>
42. </block>
43. </form>
44. </vxml>
Table 3: Example of re-recognition
Commentary on the example:
• Line 8: declare version=“2.1” on the <vxml> tag to enable the recording during recognition feature
of VoiceXML 2.1.
• Lines 12-21: this is a standard VoiceXML recognition <field>. The recordutterance property is set
to true in line 15 so that the application can take a copy of application.lastresult$.recording in
lines 18-19.
• Lines 23-37: this <field> is nearly identical except for (a) the declaration of modal, which will often
be used in re-recognition to disable global grammars, (b) the use of an inline grammar, which
illustrates the use of a different grammar for re-recognition, and (c) the declaration of
com.holly.rerecognition in lines 26-27, which requests that Holly use the recording stored in the
“nextUtterance” variable rather than live input.
• Line 36: the prompt played prior to re-recognition is set with barge-in off to ensure that the
complete prompt is played to the caller without interruption.
• Lines 39-42: the processing of normal recognition and re-recognition results is identical.
B.1.2 <nomatch>, <noinput>, disconnect and other re-recognition outcomes
The VoiceXML application should handle all possible outcomes from re-recognition.
• <nomatch>: the re-recognition did not successfully match any active grammar. The application
may wish to try re-recognition against a different grammar or request that the caller provide the
input again.
• <noinput>: the re-recognition did not detect spoken input in the recorded utterance. No-input
will not normally occur if the original collection of the recording checked for noinput and speech
mode. Nevertheless, applications should handle this case because, for example, the recording
may be quiet and not trigger speech detection on the re-recognition. The application developer
may find that adjusting the sensitivity setting or changing the timeout settings will affect the
detection of speech.
• Hang-up: the caller may hang-up whilst a re-recognition is in progress. A connection.disconnect
event will be thrown as in normal VoiceXML execution.
• maxspeechtimeout: the maxspeechtimeout event will be thrown if the recording exceeds the
duration configured in the current scope.
• <help>, <cancel>, <exit>: if the universal grammars are active then these events will be thrown
if the universal grammar is matched. Deactivating universals or using a modal field will prevent
this from occurring.
• Grammar exceptions: an exception will be thrown for any of the normal speech recognition errors
such as illegal grammars and unavailable languages.
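The outcomes listed above can be handled in the re-recognition field itself. The following is a minimal sketch in which a failed re-recognition falls back to the result of the first recognition. The grammar file ‘yesno.grxml’ and the log label are hypothetical, and the fallback policy is only one possible choice.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <form id="recheck">
    <var name="nextUtterance"/>
    <field name="yesno" type="boolean">
      <property name="recordutterance" value="true"/>
      <prompt>Say yes or no</prompt>
      <filled>
        <assign name="nextUtterance" expr="application.lastresult$.recording"/>
      </filled>
    </field>
    <field name="yesno2" modal="true">
      <property name="com.holly.rerecognition" value="nextUtterance"/>
      <grammar src="yesno.grxml" type="application/srgs+xml"/>
      <prompt bargein="false">Let me check that</prompt>
      <!-- One handler for the three recognition outcomes that return no result -->
      <catch event="nomatch noinput maxspeechtimeout">
        <log label="rerecognition">failed with <value expr="_event"/></log>
        <!-- Filling the item variable stops the field from being revisited -->
        <assign name="yesno2" expr="yesno"/>
      </catch>
    </field>
  </form>
</vxml>
```

Without the assignment in the handler, the form would select the field again and repeat the re-recognition with the same recording.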
B.1.3 Re-recognition Variable Values & Scoping
The example above uses a form-scope variable to pass the utterance to re-recognition. The following
are the full requirements for the ECMAScript variable name passed as the value for the
“com.holly.rerecognition” property.
1. The value must be a legal ECMAScript variable identifier;
2. The variable (e.g. “nextUtterance”) must be accessible from the scope in which the VoiceXML
application enters the wait state to perform re-recognition;
3. The variable may be explicitly scoped (e.g. “application.nextUtterance”, “myform.varName”);
4. The variable must be a reference to a previous recording collected by either a <record> item
variable or from application.lastresult$.recording.
If the ECMAScript variable name does not meet these conditions then a VoiceXML error is thrown. The
following are error conditions that developers should avoid.
• The property value is not a legal ECMAScript variable name (e.g. “9”);
• The variable is undefined;
• The variable is a Number, String or any other non-waveform reference.
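As an illustration of the requirements above, the following minimal sketch passes a <record> item variable directly as the value of the property, so no separate variable or assignment is needed. The grammar file ‘names.grxml’, the record settings and the names used are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <form id="recordThenRecognize">
    <!-- The item variable spokenName holds a reference to the recording -->
    <record name="spokenName" beep="true" maxtime="5s" finalsilence="2s">
      <prompt>Please say your name after the tone.</prompt>
    </record>
    <field name="surname" modal="true">
      <!-- spokenName is in dialog scope, so it is accessible from this field -->
      <property name="com.holly.rerecognition" value="spokenName"/>
      <grammar src="names.grxml" type="application/srgs+xml"/>
      <prompt bargein="false">One moment please.</prompt>
    </field>
  </form>
</vxml>
```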
B.1.4 Form Filling
The normal form-filling behaviours of VoiceXML apply to re-recognition including the mapping of
semantic results to VoiceXML forms.
B.1.5 Grammar Scope
The normal VoiceXML grammar scoping rules apply to re-recognition.
Both external and inline grammars may be used.
It is expected that <field> grammars in modal fields will be the most common usage of re-recognition.
However, form grammars may be declared as well as mixed-initiative, <link> and <menu> grammars.
B.1.6 Grammar Modes
The re-recognition capability currently enables playback of audio input only and does not play back
DTMF input. DTMF grammars, if declared, will be loaded and enabled as normal but will not be
matched by the playback of the recorded waveform.
B.1.7 Utterance Recording
The application.lastresult$.recording variable is filled for a re-recognition as in a normal recognition
(see VoiceXML 2.1 Section 7). As is normal, the recordutterance property must be set to true for the
recording to be provided.
The utterance recording will be similar or identical to the recording provided as input for re-
recognition. There may be slight variation due to different end-pointing of the speech input.
B.1.8 Result Processing
The “application.lastresult$” array is filled following the normal VoiceXML behaviour including
completion of the n-best results, confidence score, input mode and interpretation.
B.2 Prompts and Barge-in
Recommendation: bargein=false
An application may play one or many prompts prior to re-recognition. Applications will normally set
barge-in off (i.e. false) for prompts prior to re-recognition so that they are played in their entirety.
This is because (a) barge-in for re-recognition is typically much sooner than for normal spoken input
and (b) unlike a normal recognition the caller cannot choose whether to wait for the prompt to
complete before input is provided.
The sample presented in Table 3 in Section B.1.1 sets barge-in off to ensure playback of a prompt.
The period after the end-of-prompt until the availability of the re-recognition result is typically brief
because the original recorded audio will have the leading and trailing silence removed. Furthermore,
most MRCP recognizers (currently all but Nuance 9) allow faster-than-realtime presentation of
recorded audio for re-recognition.
Use bargeintype=speech (default)
It is recommended that applications leave the bargeintype as “speech”. This ensures return to the
application immediately following re-recognition irrespective of whether there is a match, nomatch or
noinput. The use of the alternate “hotword” mode is not recommended because, in the event of a
nomatch, hotword recognition would continue to retry, which is not sensible with re-recognition.
B.3 ASR Configuration
The standard speech recognition properties defined by VoiceXML 2.0 apply to re-recognition (see
VXML2.0 section 6.3.2 and 6.3.6). This includes confidencelevel, sensitivity, speedvsaccuracy (not
supported by all ASR engines), completetimeout, incompletetimeout, maxspeechtimeout and maxnbest.
Holly’s “asrengine” property and engine-specific configurations (e.g. “swirec” and “swiep” for Nuance
9) are supported as usual.
C. Appendix: Holly DTMF Recognizer v2
Holly’s DTMF recognizer recognizes DTMF input in a caller’s audio stream and enables users to specify
grammars using the standard XML form of the Speech Recognition Grammar Specification (SRGS), as
specified by W3C and VoiceXML 2.0. Many applications already have DTMF grammars specified using
SRGS+XML for Nuance or OSR; these are supported with minimal alteration.
The Holly DTMF Recognizer v2 is part of a shared-object plug-in to the Holly Voice Gateway (HVG). It
can be configured to be available or unavailable per HVG instance. Recognition takes place within the
plug-in on the same machine as the HVG; it does not act as a proxy for a remote recognizer as the
Nuance plug-ins do.
The Holly DTMF Recognizer v2 allows multiple grammars to be activated simultaneously. Grammars
may be activated by URI in order to support the preferred absolute mode of grammar fetching, or
may be supplied as an inline VoiceXML grammar.
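A sketch of both activation styles in a single field is shown below, with the two grammars active at the same time. The host example.com, the file name menu.grxml, the field name and the rule name are all placeholders; the elements and attributes are standard VoiceXML 2.0 and SRGS.

```xml
<!-- Fragment: field name, URI and rule name are placeholders. -->
<field name="choice">
  <!-- Grammar activated by absolute URI. -->
  <grammar src="http://example.com/grammars/menu.grxml"
           type="application/srgs+xml"/>
  <!-- Inline grammar, active alongside the one above. -->
  <grammar version="1.0" root="help" mode="dtmf">
    <rule id="help" scope="public">
      <item>9</item>
    </rule>
  </grammar>
</field>
```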
The Holly DTMF Recognizer is enabled by setting the parameter ‘asrengine’ to “dtmf”. This property
can be set via the following methods:
• Through the Holly Management System’s Applications page on a per-application basis, or
• As an explicit <property> element within the code of the VoiceXML application.
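The second method amounts to a single element, sketched below. The property name and value come from the text above; where the element is placed (document, form or field) follows normal VoiceXML property scoping, and it is an assumption here that the platform honours the property at each of those scopes.

```xml
<!-- Selects the Holly DTMF Recognizer for the enclosing scope. -->
<property name="asrengine" value="dtmf"/>
```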
The Holly DTMF Recognizer v2 supports the VXML <record> element.
The Holly DTMF Recognizer v2 implements the “literals” syntax of SISR 1.0. Support for the
“semantics/1.0” syntax is planned for a future release.
C.1 SRGS+XML
The Holly DTMF Recognizer v2 is a conforming XML form grammar processor, as specified in SRGS
section 5.4, except that it is not required to support references to rules defined in external grammars.
In particular, the recognizer:
• Parses and processes all XML and XML Namespaces constructs.
• Ignores xml:lang attributes in grammar documents because they are not relevant to DTMF.
• Logs DTMF results in the HMS as with any speech recognition result.
• Ignores grammars whose mode attribute is "voice". Note that the default value for the mode
attribute is "voice", so the Holly DTMF Recognizer v2 will only process grammar documents that
explicitly set the mode to "dtmf".
C.2 Sample grammars
• Basic Menu
• Boolean
• Digits
• Phone
C.2.1 Basic Menu
This sample grammar supports a simple menu collection that enables a caller to enter options 1, 2, 3, 4
or 0. The VoiceXML application maps each digit to a behaviour, for example “sales”, “directory” or
“operator”. The value returned to the application is the DTMF key: “1”, “2”, “3”, “4” or “0”. If
the caller enters any other DTMF key (e.g. 5 – 9, * or #) then VoiceXML will present a “nomatch”.
<?xml version="1.0" encoding="iso-8859-1"?>
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         root="menu" mode="dtmf">
  <rule id="menu" scope="public">
    <one-of>
      <item>1</item>
      <item>2</item>
      <item>3</item>
      <item>4</item>
      <item>0</item>
    </one-of>
  </rule>
</grammar>
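One way an application might apply behaviours to the returned digit is sketched below. The grammar URI, the field name and the target form names (#sales, #directory, #operator, #other) are hypothetical and the target forms are not shown; the control elements are standard VoiceXML 2.0.

```xml
<!-- Fragment: URI, field name and goto targets are placeholders. -->
<field name="choice">
  <grammar src="http://example.com/grammars/menu.grxml"
           type="application/srgs+xml"/>
  <prompt>
    For sales press 1. For the directory press 2. For an operator press 0.
  </prompt>
  <!-- Keys outside the grammar (5 to 9, star, hash) arrive here. -->
  <nomatch>
    <prompt>Sorry, that is not a valid option.</prompt>
    <reprompt/>
  </nomatch>
  <filled>
    <!-- The field value is the DTMF key as a string. -->
    <if cond="choice == '1'">
      <goto next="#sales"/>
    <elseif cond="choice == '2'"/>
      <goto next="#directory"/>
    <elseif cond="choice == '0'"/>
      <goto next="#operator"/>
    <else/>
      <!-- Options 3 and 4. -->
      <goto next="#other"/>
    </if>
  </filled>
</field>
```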
C.2.2 Boolean
This grammar collects a Boolean input. It demonstrates the use of literal tags to return an application-
defined string rather than the DTMF sequence. If the caller enters “1” then the returned result is
“true”. Similarly, for entry of “2” the result is “false”.
<?xml version="1.0" encoding="iso-8859-1"?>
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         root="boolean" mode="dtmf" tag-format="semantics/1.0-literals">
  <rule id="boolean" scope="public">
    <one-of>
      <item>1<tag>true</tag></item>
      <item>2<tag>false</tag></item>
    </one-of>
  </rule>
</grammar>
Note: The return values of the sample boolean grammar are returned as strings.
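Because the results are strings, an application would typically compare against the literal text rather than test the value as a boolean. The sketch below assumes a hypothetical field name and grammar URI; the comparison itself is ordinary ECMAScript as used in VoiceXML cond attributes.

```xml
<!-- Fragment: URI and field name are placeholders. -->
<field name="confirm">
  <grammar src="http://example.com/grammars/boolean.grxml"
           type="application/srgs+xml"/>
  <prompt>Press 1 for yes or 2 for no.</prompt>
  <filled>
    <!-- The result is the string "true" or "false". A bare cond="confirm"
         would be truthy for both, so compare against the literal text. -->
    <if cond="confirm == 'true'">
      <prompt>Confirmed.</prompt>
    <else/>
      <prompt>Cancelled.</prompt>
    </if>
  </filled>
</field>
```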
C.2.3 Digits
This sample grammar supports a digit sequence with 1 or more digits (with no imposed limit). For
example, the following sequences are legal: “1234”, “123456789”. The return value is the entered
DTMF sequence as shown (i.e. without any whitespace).
Since there is no limit on the number of digits, the recognizer will keep waiting for digits until either:
• A DTMF “termchar” is received (termchar is a VoiceXML property that can be set by the
application; its default value is “#”). The return value does not include the termchar.
• A DTMF interdigit timeout is reached (default value is 3 seconds).
<?xml version="1.0" encoding="iso-8859-1"?>
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         root="digits" mode="dtmf">
  <rule id="digits" scope="public">
    <item repeat="1-">
      <one-of>
        <item>0</item>
        <item>1</item>
        <item>2</item>
        <item>3</item>
        <item>4</item>
        <item>5</item>
        <item>6</item>
        <item>7</item>
        <item>8</item>
        <item>9</item>
      </one-of>
    </item>
  </rule>
</grammar>
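The two termination conditions described for this grammar correspond to the standard VoiceXML termchar and interdigittimeout properties. A sketch of setting them explicitly on a field follows; the values match the defaults stated in this section, and the field name, grammar URI and prompt text are placeholders.

```xml
<!-- Fragment: URI, field name and prompt text are placeholders. -->
<field name="account">
  <!-- Key that ends input; it is not included in the returned value. -->
  <property name="termchar" value="#"/>
  <!-- Maximum pause allowed between successive DTMF keys. -->
  <property name="interdigittimeout" value="3s"/>
  <grammar src="http://example.com/grammars/digits.grxml"
           type="application/srgs+xml"/>
  <prompt>Enter your account number followed by the hash key.</prompt>
</field>
```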
C.2.4 Phone
This sample grammar re-uses the digits grammar above to recognize telephone numbers with an
optional extension. The “*” key is used to mark the extension.
There are no constraints on the length of the phone number. Modifying the repeat values allows
support for specific national phone patterns.
<?xml version="1.0" encoding="iso-8859-1"?>
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         root="phone" mode="dtmf" tag-format="semantics/1.0-literals">
  <rule id="phone" scope="public">
    <ruleref uri="#digits"/>
    <item repeat="0-1"> dtmf-star
      <ruleref uri="#digits"/>
    </item>
  </rule>
  <rule id="digits">
    <item repeat="1-">
      <one-of>
        <item>0</item>
        <item>1</item>
        <item>2</item>
        <item>3</item>
        <item>4</item>
        <item>5</item>
        <item>6</item>
        <item>7</item>
        <item>8</item>
        <item>9</item>
      </one-of>
    </item>
  </rule>
</grammar>
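As an illustration of constraining the length through the repeat value, the sketch below accepts exactly ten digits with no extension. The ten-digit length and the rule names are assumptions chosen for the example, not a pattern taken from this guide; the repeat syntax is that of SRGS 1.0.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         root="phone10" mode="dtmf">
  <rule id="phone10" scope="public">
    <!-- Exactly 10 digits; repeat="8-10" would instead allow 8 to 10. -->
    <item repeat="10">
      <ruleref uri="#digit"/>
    </item>
  </rule>
  <!-- A single DTMF digit. -->
  <rule id="digit">
    <one-of>
      <item>0</item>
      <item>1</item>
      <item>2</item>
      <item>3</item>
      <item>4</item>
      <item>5</item>
      <item>6</item>
      <item>7</item>
      <item>8</item>
      <item>9</item>
    </one-of>
  </rule>
</grammar>
```

With a fixed length the recognizer can return as soon as the final digit is entered, rather than waiting for a termchar or the interdigit timeout.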
*** End of Document ***