Vocabulary partitioned speech recognition apparatus

5,075,896

43.72.Ne CHARACTER AND PHONEME RECOGNITION BASED ON PROBABILITY CLUSTERING

Lynn D. Wilcox and A. Lawrence Spitz, assignors to Xerox Corporation

24 December 1991 (Class 382/39); filed 25 October 1989

This recognition technique, applicable to either written characters or spoken phonemes, keeps the IDs of clusters in the phonetic candidate space instead of making specific phoneme choices immediately. In this way, more classification information about the best fitting candidate is retained while maintaining the low bandwidth between recognition components needed for modularity and flexibility.--DLR

5,146,503

43.72.Ne SPEECH RECOGNITION

Ian R. Cameron and Paul C. Millar, assignors to British Telecommunications public limited company

8 September 1992 (Class 381/43); filed in the United Kingdom 28 August 1987

This method combines the time/spectral analysis matrices from multiple repetitions of a given utterance by one or more speakers to select a representative feature matrix for the word. Representative patterns are selected by computing the distances from each pattern to each other pattern of the same word, and to all patterns of words which compete within the same syntax. A variety of token pronunciations is assured by combining "list" style readings with utterances prompted as answers to questions.--DLR
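The selection step can be sketched as follows. This is only an illustration: the review does not say how the within-word and competing-word distances are combined, so the Euclidean metric, the scoring rule (mean within-word distance minus mean competitor distance), and the names `distance` and `pick_representative` are all assumptions.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pick_representative(tokens, competitors):
    """Return the index of the token that lies close to the other tokens of
    its own word and far from tokens of competing words.
    The score used here is a hypothetical rule, not taken from the patent."""
    best_index, best_score = None, math.inf
    for i, tok in enumerate(tokens):
        others = [t for j, t in enumerate(tokens) if j != i]
        within = sum(distance(tok, t) for t in others) / max(len(others), 1)
        between = sum(distance(tok, c) for c in competitors) / max(len(competitors), 1)
        score = within - between  # low = central to its word, far from rivals
        if score < best_score:
            best_index, best_score = i, score
    return best_index

# Three repetitions of one word and a single token of a competing word.
tokens = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]]
competitors = [[10.0, 0.0]]
choice = pick_representative(tokens, competitors)
```

With no competitors the rule reduces to picking the most central token of the word.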

5,136,653

43.72.Ne ACOUSTIC RECOGNITION SYSTEM USING ACCUMULATE POWER SERIES

Ryohei Kumagai et al., assignors to Ezel, Incorporated

4 August 1992 (Class 381/43); filed 11 January 1989

The patent describes the application of certain high-speed, two-dimensional image processing hardware to phonetic recognition. The system relies on local (small domain) logic elements to form a binary quantization of the log power levels of the input. Median filtering is used to smooth the log power data and a differentiation produces a two-dimensional pattern of power dips. A two-dimensional associative access system allows retrieval of written characters or other patterns based on the power dip patterns.--DLR
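A one-dimensional sketch of the smoothing and differentiation steps may help. It follows a single log power track over time, whereas the patent works on a two-dimensional pattern in image processing hardware; the window width, the dip rule (a descent followed by a rise), and the names `median_smooth` and `find_dips` are assumptions made for illustration.

```python
from statistics import median

def median_smooth(values, width=3):
    """Median filter; the window is simply truncated at the edges."""
    half = width // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

def find_dips(log_power, width=3):
    """Return frame indices where the smoothed track stops falling and
    later rises again (assumed definition of a power dip)."""
    smooth = median_smooth(log_power, width)
    dips = []
    bottom = None  # last frame reached by the current descent
    for i in range(1, len(smooth)):
        slope = smooth[i] - smooth[i - 1]  # first difference
        if slope < 0:
            bottom = i
        elif slope > 0:
            if bottom is not None:
                dips.append(bottom)
            bottom = None
    return dips

# One isolated spike (frame 3) and one genuine dip (frames 6-7).
track = [6, 6, 6, 9, 6, 5, 3, 3, 5, 6, 6]
dips = find_dips(track)
```

The median filter removes the single-frame spike, so only the genuine valley is reported.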

5,136,654

43.72.Ne VOCABULARY PARTITIONED SPEECH RECOGNITION APPARATUS

William F. Ganong, III, et al., assignors to Kurzweil Applied Intelligence, Incorporated

4 August 1992 (Class 381/41); filed 19 October 1989

The search time needed to locate an isolated-word, vector-quantized spectral feature pattern in a large-vocabulary reference space is reduced by partitioning the reference space. A lookup table of interframe distances between all entries in a frame codebook simplifies the calculation of interpattern distances. The partitioning method iteratively adjusts partition boundaries so as to roughly balance partition sizes while minimizing the distances within each partition to a selected representative pattern. Recognition proceeds by finding distances from a new input to all of the partition representatives, ordering the partitions by this distance, then searching partitions in order until a match criterion is met.--DLR
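The recognition pass described above can be sketched roughly as follows. Patterns are taken to be equal-length sequences of codebook indices with no time alignment, and the match criterion is a plain distance threshold; both are simplifying assumptions, as are the data layout and the names `pattern_distance` and `recognize`.

```python
def pattern_distance(a, b, table):
    """Sum of precomputed inter-frame distances.
    Assumes both patterns have the same number of frames."""
    return sum(table[x][y] for x, y in zip(a, b))

def recognize(pattern, partitions, table, threshold):
    """partitions: list of (representative, {word: pattern}) pairs
    (hypothetical layout). Returns (word, distance, partitions_searched)."""
    # Order partitions by distance from the input to each representative.
    ordered = sorted(partitions,
                     key=lambda p: pattern_distance(pattern, p[0], table))
    best_word, best_dist, searched = None, float("inf"), 0
    for _representative, words in ordered:
        searched += 1
        for word, ref in words.items():
            d = pattern_distance(pattern, ref, table)
            if d < best_dist:
                best_word, best_dist = word, d
        if best_dist <= threshold:  # assumed match criterion
            break
    return best_word, best_dist, searched

# Toy 4-entry codebook whose inter-frame distance is |i - j|.
table = [[abs(i - j) for j in range(4)] for i in range(4)]
partitions = [
    ((0, 0, 1), {"yes": (0, 0, 1), "yet": (0, 1, 1)}),
    ((3, 3, 2), {"no": (3, 3, 2), "know": (3, 2, 2)}),
]
result = recognize((3, 3, 3), partitions, table, threshold=1)
```

In this toy case the nearest partition already satisfies the threshold, so the second partition is never searched.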

5,144,672

43.72.Ne SPEECH RECOGNITION APPARATUS INCLUDING SPEAKER-INDEPENDENT DICTIONARY AND SPEAKER-DEPENDENT

Shoji Kuriki, assignor to Ricoh Company, Limited

1 September 1992 (Class 381/41); filed in Japan 5 October 1989

As I understand it, this isolated-word speech recognizer includes the usual speaker-dependent set of reference vocabulary patterns, but also maintains a set of speaker-independent patterns formed by averaging multiple dependent patterns. Recognition matches are performed on the independent patterns after amplitude normalizing to the maximum value in the time/spectral matrix making up each pattern. How can this be an improvement over a traditional system with separate pattern sets for each speaker?--DLR

5,151,940

43.72.Ne METHOD AND APPARATUS FOR EXTRACTING ISOLATED SPEECH WORD

Makoto Okazaki and Koji Eto, assignors to Fujitsu Limited

29 September 1992 (Class 381/43); filed in Japan 24 December 1987

This method detects the beginning and end of an utterance in continuous input, based on power levels in each of two overlapping frequency bands. The patent refers to power in a low band below 3 KHz as "vowel power", and power in a high band above 1 KHz as "consonant power." Each band power value is compared to a distinct threshold. A simple decision tree looks at the durations of the intervals in which the power levels were above or below threshold.--DLR
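A duration-based endpoint detector of this general kind might look like the sketch below. The review does not spell out the patent's decision tree, so the rule used here (a word starts after `min_speech` consecutive active frames and ends after `min_silence` consecutive inactive ones) is a stand-in, and the function name, parameters, and threshold values are hypothetical.

```python
def find_word(low_power, high_power, low_thresh, high_thresh,
              min_speech=3, min_silence=4):
    """Return (start, end) frame indices of the first detected word, or None.
    A frame is active when either band exceeds its own threshold."""
    active = [lo > low_thresh or hi > high_thresh
              for lo, hi in zip(low_power, high_power)]
    start = None        # confirmed start of the word
    run_start = None    # first frame of the current active run
    last_active = None  # most recent active frame
    silence = 0         # length of the current inactive run
    for i, is_active in enumerate(active):
        if is_active:
            if run_start is None:
                run_start = i
            silence = 0
            last_active = i
            if start is None and i - run_start + 1 >= min_speech:
                start = run_start
        else:
            run_start = None
            silence += 1
            if start is not None and silence >= min_silence:
                return start, last_active
    return (start, last_active) if start is not None else None

# Frame 2 is a one-frame click; frame 7 is a consonant onset seen only
# in the high band; frames 11-12 are a short gap inside the word.
low = [0, 0, 5, 0, 0, 0, 0, 0, 6, 7, 8, 1, 1, 7, 6, 0, 0, 0, 0, 0]
high = [0] * 20
high[7] = 4
span = find_word(low, high, low_thresh=4, high_thresh=3)
```

The click is rejected because it is too short, while the brief gap inside the word is bridged because it is shorter than `min_silence`.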

5,179,624

43.72.Ne SPEECH RECOGNITION APPARATUS USING NEURAL NETWORK AND FUZZY LOGIC

Akio Amano et al., assignors to Hitachi, Limited

12 January 1993 (Class 395/2); filed in Japan 7 September 1988

This phonetics-learning speech recognizer uses fuzzy logic to select a preliminary group of candidate phonetic categories for each input phonetic segment. Another fuzzy logic system makes a final choice from among the candidates. Whenever correct recognition results are made known to the system, a neural network sensing the fuzzy weightings is trained to the new settings. The result is to reshape the fuzzy logic components.--DLR

1185 J. Acoust. Soc. Am., Vol. 95, No. 2, February 1994 Review of Acoustical Patents 1185

