Page 1: XML Output for Sphinx

XML Output for Sphinx

• Motivation: applications may be able to use richer information from Sphinx, including n-best lists, the word lattice, and other features. An XML format defined by a DTD would be standard and easy to parse, express, and modify.

Page 2: XML Output for Sphinx

Proposed DTD

– http://www.cs.cmu.edu/~tkharris/usi/utterance-0.1.dtd

– Sphinx produces utterances; each utterance is an XML document that conforms to the DTD

– An utterance is an n-best list, a word lattice, or both

– An n-best list is a list of lists of words

– Each list, and each word in it, may have features

– The DTD desperately needs review
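
The structure described above (an utterance holding an n-best list, which is a list of word lists, with optional features on each list and word) could be serialized roughly as follows. This is only a sketch: the element and attribute names (`utterance`, `nbest`, `hypothesis`, `word`, `score`, `start`) are hypothetical and are not taken from utterance-0.1.dtd, and the scores are made-up illustration values.

```python
import xml.etree.ElementTree as ET


def build_utterance(hypotheses):
    """Build an utterance element from an n-best list.

    hypotheses: list of (list_features, words), where words is a list of
    (text, word_features). All tag and attribute names are hypothetical;
    the real DTD may name and nest things differently.
    """
    utt = ET.Element("utterance")
    nbest = ET.SubElement(utt, "nbest")
    for list_features, words in hypotheses:
        # Features of a whole word list become attributes of the list element.
        attrs = {k: str(v) for k, v in list_features.items()}
        hyp = ET.SubElement(nbest, "hypothesis", attrs)
        for text, word_features in words:
            # Features of a single word become attributes of the word element.
            w_attrs = {k: str(v) for k, v in word_features.items()}
            word = ET.SubElement(hyp, "word", w_attrs)
            word.text = text
    return utt


# Two competing hypotheses with illustrative (invented) feature values.
example = build_utterance([
    ({"score": -1200}, [("go", {"start": 0}), ("home", {"start": 35})]),
    ({"score": -1350}, [("go", {"start": 0}), ("hole", {"start": 35})]),
])
xml_text = ET.tostring(example, encoding="unicode")
```

Putting features in attributes keeps the word text as plain element content; whether the actual DTD does this is one of the points a review would need to settle.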

Page 3: XML Output for Sphinx

Issues

• Is the motivation justified?

• Is the computational/network impact too high?

• APIs are needed to parse XML

• Need to get requirements/observations from Sphinx customers
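
On the parsing issue, a standard XML library may already cover much of what a client application needs. The sketch below reads an n-best list with Python's `xml.etree.ElementTree`. The sample document, its tag names, and the `score` attribute are assumptions made for illustration, not the format defined by the proposed DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical utterance document; tag and attribute names are assumed,
# and the scores are invented for the example.
SAMPLE = """<utterance>
  <nbest>
    <hypothesis score="-1200"><word>go</word><word>home</word></hypothesis>
    <hypothesis score="-1350"><word>go</word><word>hole</word></hypothesis>
  </nbest>
</utterance>"""


def read_nbest(xml_text):
    """Return a list of (score, [words]) pairs from an utterance document."""
    root = ET.fromstring(xml_text)
    results = []
    for hyp in root.findall("./nbest/hypothesis"):
        words = [w.text for w in hyp.findall("word")]
        results.append((float(hyp.get("score")), words))
    return results


nbest = read_nbest(SAMPLE)
# Pick the hypothesis with the highest score (assuming higher is better).
best_score, best_words = max(nbest, key=lambda item: item[0])
```

Note that `ElementTree` does not validate against a DTD, so a client that needs conformance checking would have to use a validating parser instead.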

