5,075,896
43.72.Ne CHARACTER AND PHONEME RECOGNITION BASED ON PROBABILITY CLUSTERING
Lynn D. Wilcox and A. Lawrence Spitz, assignors to Xerox Corporation
24 December 1991 (Class 382/39); filed 25 October 1989
This recognition technique, applicable to either written characters or spoken phonemes, keeps the IDs of clusters in the phonetic candidate space instead of making specific phoneme choices immediately. In this way, more classification information about the best-fitting candidate is retained while maintaining the low bandwidth between recognition components needed for modularity and flexibility.--DLR
5,146,503
43.72.Ne SPEECH RECOGNITION
Ian R. Cameron and Paul C. Millar, assignors to British Telecommunications public limited company
8 September 1992 (Class 381/43); filed in the United Kingdom 28 August 1987
This method combines the time/spectral analysis matrices from multiple repetitions of a given utterance by one or more speakers to select a representative feature matrix for the word. Representative patterns are selected by computing the distances from each pattern to each other pattern of the same word, and to all patterns of words which compete within the same syntax. A variety of token pronunciations is assured by combining "list" style readings with utterances prompted as answers to questions.--DLR
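The selection step can be illustrated with a minimal sketch. The review does not say how the within-word and competing-word distances are combined, so the scoring rule below (mean distance to same-word tokens minus mean distance to competing tokens) is an assumption, as are the Euclidean distance on flat feature vectors and the names `euclidean` and `select_representative`.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_representative(tokens, competitors, dist=euclidean):
    """Return the index of the token chosen to represent a word.

    tokens:      feature vectors for repetitions of the same word
    competitors: feature vectors for words competing in the same syntax
    Assumed rule (not from the patent): prefer the token that is close to
    its own word's tokens and far from the competing patterns.
    """
    def score(i):
        own = [dist(tokens[i], t) for j, t in enumerate(tokens) if j != i]
        other = [dist(tokens[i], c) for c in competitors]
        within = sum(own) / len(own) if own else 0.0
        between = sum(other) / len(other) if other else 0.0
        return within - between

    return min(range(len(tokens)), key=score)


tokens = [(0, 0), (1, 0), (4, 0)]
print(select_representative(tokens, []))          # no competing words
print(select_representative(tokens, [(5, 0)]))    # one competing pattern
```

With no competing patterns the rule reduces to picking the token with the smallest mean distance to the others; adding a competitor shifts the choice away from it.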
5,136,653
43.72.Ne ACOUSTIC RECOGNITION SYSTEM USING ACCUMULATE POWER SERIES
Ryohei Kumagai et al., assignors to Ezel, Incorporated
4 August 1992 (Class 381/43); filed 11 January 1989
The patent describes the application of certain high-speed, two-dimensional image processing hardware to phonetic recognition. The system relies on local (small domain) logic elements to form a binary quantization of the log power levels of the input. Median filtering is used to smooth the log power data and a differentiation produces a two-dimensional pattern of power dips. A two-dimensional associative access system allows retrieval of written characters or other patterns based on the power dip patterns.--DLR
5,136,654
43.72.Ne VOCABULARY PARTITIONED SPEECH RECOGNITION APPARATUS
William F. Ganong, III, et al., assignors to Kurzweil Applied Intelligence, Incorporated
4 August 1992 (Class 381/41); filed 19 October 1989
The search time needed to locate an isolated-word, vector-quantized spectral feature pattern in a large-vocabulary reference space is reduced by partitioning the reference space. A lookup table of interframe distances between all entries in a frame codebook simplifies the calculation of interpattern distances. The partitioning method iteratively adjusts partition boundaries so as to roughly balance partition sizes while minimizing the distances within each partition to a selected representative pattern. Recognition proceeds by finding distances from a new input to all of the partition representatives, ordering the partitions by this distance, then searching partitions in order until a match criterion is met.--DLR
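The recognition-time search order can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: patterns are taken to be equal-length sequences of codebook indices, the interpattern distance is a plain sum of table lookups with no time alignment, the match criterion is a simple distance threshold, and the names `pattern_distance`, `partitioned_search`, and `threshold` are hypothetical.

```python
def pattern_distance(a, b, frame_dist):
    """Sum of precomputed inter-frame distances between two equal-length
    sequences of codebook indices (assumption: no time alignment)."""
    return sum(frame_dist[i][j] for i, j in zip(a, b))


def partitioned_search(query, partitions, frame_dist, threshold):
    """Search partitions in order of representative distance.

    partitions: list of (representative, members), where members is a
                list of (word, pattern) pairs.
    Returns (best_word, best_distance, partitions_searched). The search
    stops after the first partition that yields a distance at or below
    `threshold` (an assumed match criterion).
    """
    ordered = sorted(
        partitions,
        key=lambda p: pattern_distance(query, p[0], frame_dist),
    )
    best_word, best_dist = None, float("inf")
    for searched, (_rep, members) in enumerate(ordered, start=1):
        for word, pattern in members:
            d = pattern_distance(query, pattern, frame_dist)
            if d < best_dist:
                best_word, best_dist = word, d
        if best_dist <= threshold:
            return best_word, best_dist, searched
    return best_word, best_dist, len(ordered)


# Toy 3-entry codebook: distance is the squared index difference.
frame_dist = [[0, 1, 4], [1, 0, 1], [4, 1, 0]]
partitions = [
    ((2, 2, 1), [("no", (2, 2, 1)), ("know", (2, 2, 2))]),
    ((0, 0, 1), [("yes", (0, 0, 1)), ("yet", (0, 1, 1))]),
]
print(partitioned_search((0, 0, 0), partitions, frame_dist, threshold=1))
```

In the toy data only the nearer partition is searched when the threshold is met there; a tighter threshold forces the search to continue into the second partition.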
5,144,672
43.72.Ne SPEECH RECOGNITION APPARATUS INCLUDING SPEAKER-INDEPENDENT DICTIONARY AND SPEAKER-DEPENDENT
Shoji Kuriki, assignor to Ricoh Company, Limited
1 September 1992 (Class 381/41); filed in Japan 5 October 1989
As I understand it, this isolated-word speech recognizer includes the usual speaker-dependent set of reference vocabulary patterns, but also maintains a set of speaker-independent patterns formed by averaging multiple dependent patterns. Recognition matches are performed on the independent patterns after amplitude normalizing to the maximum value in the time/spectral matrix making up each pattern. How can this be an improvement over a traditional system with separate pattern sets for each speaker?--DLR
5,151,940
43.72.Ne METHOD AND APPARATUS FOR EXTRACTING ISOLATED SPEECH WORD
Makoto Okazaki and Koji Eto, assignors to Fujitsu Limited
29 September 1992 (Class 381/43); filed in Japan 24 December 1987
This method detects the beginning and end of an utterance in continuous input, based on power levels in each of two overlapping frequency bands. The patent refers to power in a low band below 3 kHz as "vowel power", and power in a high band above 1 kHz as "consonant power." Each band power value is compared to a distinct threshold. A simple decision tree looks at the durations of the intervals in which the power levels were above or below threshold.--DLR
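A rough sketch of this kind of two-band endpoint detection follows. The review does not give the patent's decision tree, so the duration rules here are assumptions: a frame counts as active if either band exceeds its threshold, an utterance begins with a run of at least `min_speech` active frames, and it ends once `min_silence` consecutive inactive frames follow. The function name `find_utterance` and all parameter names are hypothetical.

```python
def find_utterance(low_power, high_power, low_thresh, high_thresh,
                   min_speech=3, min_silence=4):
    """Return (start_frame, end_frame) of the first utterance, or None.

    low_power / high_power: per-frame power in the low ("vowel") and
    high ("consonant") bands. Each band has its own threshold.
    The duration rules are assumed, not taken from the patent.
    """
    active = [lo > low_thresh or hi > high_thresh
              for lo, hi in zip(low_power, high_power)]

    start = None        # first frame of the accepted utterance
    run_start = None    # first frame of the current active run
    last_active = None  # most recent active frame inside the utterance
    silence = 0         # consecutive inactive frames after the start

    for i, is_active in enumerate(active):
        if is_active:
            if run_start is None:
                run_start = i
            # Accept the utterance once the run is long enough.
            if start is None and i - run_start + 1 >= min_speech:
                start = run_start
            if start is not None:
                last_active = i
            silence = 0
        else:
            run_start = None
            if start is not None:
                silence += 1
                if silence >= min_silence:
                    return start, last_active

    return None if start is None else (start, last_active)


low = [0, 0, 5, 6, 7, 0, 0, 1, 0, 0, 0, 0, 0]
high = [0, 0, 0, 0, 0, 0, 4, 4, 0, 0, 0, 0, 0]
print(find_utterance(low, high, low_thresh=2, high_thresh=3))
```

In the toy input the one-frame gap between the low-band and high-band activity is shorter than `min_silence`, so both stretches are kept in a single utterance, while an isolated one-frame burst would be rejected as too short.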
5,179,624
43.72.Ne SPEECH RECOGNITION APPARATUS USING NEURAL NETWORK AND FUZZY LOGIC
Akio Amano et al., assignors to Hitachi, Limited
12 January 1993 (Class 395/2); filed in Japan 7 September 1988
This phonetics-learning speech recognizer uses fuzzy logic to select a preliminary group of candidate phonetic categories for each input phonetic segment. Another fuzzy logic system makes a final choice from among the candidates. Whenever correct recognition results are made known to the system, a neural network sensing the fuzzy weightings is trained to the new settings. The result is to reshape the fuzzy logic components.--DLR
1185 J. Acoust. Soc. Am., Vol. 95, No. 2, February 1994 Review of Acoustical Patents 1185
Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms.