CN116186201A - Government affair item searching method and device based on voice recognition, medium and equipment - Google Patents

Government affair item searching method and device based on voice recognition, medium and equipment

Info

Publication number
CN116186201A
CN116186201A (application CN202310148316.4A)
Authority
CN
China
Prior art keywords
text
government affair
voice
sequence
representation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310148316.4A
Other languages
Chinese (zh)
Inventor
邢亮
张兆勇
潘震
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Software Co Ltd
Original Assignee
Inspur Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Software Co Ltd filed Critical Inspur Software Co Ltd
Priority to CN202310148316.4A priority Critical patent/CN116186201A/en
Publication of CN116186201A publication Critical patent/CN116186201A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/3331 - Query processing
    • G06F16/334 - Query execution
    • G06F16/3343 - Query execution using phonetics
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/338 - Presentation of query results
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 - Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 - Services
    • G06Q50/26 - Government or public services
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/26 - Speech to text systems
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a government affair item searching method and device based on voice recognition, as well as a medium and equipment. The method comprises the following steps: acquiring voice for searching government affair item data; performing recognition processing on the voice to obtain a corresponding representation text; performing a full-text search according to the representation text to obtain a plurality of pieces of government affair item data; calculating the similarity between the representation text and each piece of government affair item data; and outputting the government affair item data corresponding to the highest similarity as the search result. The invention improves the convenience of the query operation, particularly for elderly users querying data. Because the full-text search is followed by text similarity calculation on the input representation text, query efficiency and accuracy are also improved.

Description

Government affair item searching method and device based on voice recognition, medium and equipment
Technical Field
The invention relates to the technical field of voice recognition, and in particular to a government affair item searching method and device, a medium and equipment based on voice recognition.
Background
In recent years, initiatives such as "Internet + government services" have greatly promoted the improvement of government services. The government affairs field is broad, and its body of knowledge and regulations is huge, so not every government affair item can be displayed in a suitable manner; the ability to locate government affair items accurately urgently needs to be improved. The current mainstream approach is to query the implementation list of government affair items by manual input, which is inefficient and time-consuming, and is especially inconvenient for elderly users to operate.
Disclosure of Invention
Aiming at at least one of the above technical problems, embodiments of the present invention provide a government affair item searching method and device, a medium and equipment based on voice recognition.
According to a first aspect, a government affair item searching method based on voice recognition provided by an embodiment of the present invention includes:
acquiring voice for searching government affair data;
performing recognition processing on the voice to obtain a corresponding representation text;
performing full-text search according to the representation text to obtain a plurality of government affair data;
calculating the similarity between the representation text and each piece of government affair data;
and outputting government affair item data corresponding to the highest similarity as a search result.
According to a second aspect, a government affair item searching device based on voice recognition provided by an embodiment of the present invention includes:
the voice acquisition module is used for acquiring voice for searching government affair data;
the text representation module is used for carrying out recognition processing on the voice to obtain a corresponding representation text;
the item searching module is used for carrying out full-text searching according to the representation text to obtain a plurality of government affair item data;
the similarity calculation module is used for calculating the similarity between the representation text and each piece of government affair data;
and the result output module is used for outputting government affair item data corresponding to the highest similarity as a search result.
According to a third aspect, embodiments of the present invention provide a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method provided by the first aspect.
According to a fourth aspect, a computing device provided by an embodiment of the present invention includes a memory and a processor, where the memory stores executable code, and the processor implements the method provided by the first aspect when executing the executable code.
According to the government affair item searching method, device, medium and equipment based on voice recognition, voice for searching government affair item data is first acquired, and recognition processing is performed on the voice to obtain a corresponding representation text. A full-text search is then performed according to the representation text to obtain a plurality of pieces of government affair item data, the similarity between the representation text and each piece of government affair item data is calculated, and the government affair item data corresponding to the highest similarity is output as the search result. The embodiments of the present invention optimize the human-computer interaction process through voice recognition technology, improving the convenience of the query operation, particularly for elderly users querying data. Because the full-text search is followed by text similarity calculation on the input representation text, query efficiency and accuracy are also improved.
Drawings
Fig. 1 is a flow chart of a government affair searching method based on voice recognition according to an embodiment of the invention.
Detailed Description
In a first aspect, an embodiment of the present invention provides a government affair item searching method based on voice recognition, referring to fig. 1, the method includes steps S110 to S150 as follows:
s110, acquiring voice for searching government affair data;
It can be understood that the voice is input by the user and is used for searching for the desired government affair item data.
S120, carrying out recognition processing on the voice to obtain a corresponding representation text;
It can be understood that recognition processing is performed on the voice to obtain text; that is, information in speech form is converted into information in text form.
In one embodiment, S120 may specifically include the following steps S121 to S125:
s121, preprocessing the voice;
in one embodiment, S121 may specifically include: the speech is converted into a spectrogram. That is, the preprocessing is to convert the voice signal in the time domain into a frequency domain map, i.e., a frequency domain signal. For example, the conversion from time domain to frequency domain is performed by preprocessing such as framing, windowing, pre-emphasis, etc.
S122, extracting features of the preprocessed voice to obtain a feature vector sequence;
It can be understood that feature extraction is performed on the preprocessed voice, such as the aforementioned spectrogram, to obtain a feature vector sequence characterizing the voice; the feature vector sequence includes at least one feature vector.
In one embodiment, S122 may specifically include: and carrying out feature extraction on the spectrogram according to the Mel cepstrum coefficient to obtain the feature vector sequence.
It can be understood that mel-frequency cepstral coefficients (Mel-Frequency Cepstral Coefficients, MFCCs) are obtained by a linear transform of the logarithmic energy spectrum based on the nonlinear mel scale of sound frequency. The mel-frequency cepstral coefficients are the coefficients that constitute the mel-frequency cepstrum, which is derived from the cepstrum of an audio segment. The band division of the mel-frequency cepstrum is equally spaced on the mel scale, which approximates the human auditory system more closely than the linearly spaced bands used in the normal cepstrum. Such a nonlinear representation can provide a better representation of the sound signal in many fields.
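The MFCC computation just described — triangular filters equally spaced on the mel scale, followed by a linear transform (a DCT) of the log energy spectrum — can be sketched as follows. The filter count and coefficient count are illustrative assumptions, not values from the patent.

```python
import numpy as np

def hz_to_mel(f):
    return 2595 * np.log10(1 + f / 700)

def mel_to_hz(m):
    return 700 * (10 ** (m / 2595) - 1)

def mfcc(power_spectrum, sample_rate=16000, n_filters=26, n_coeffs=13):
    """MFCCs: mel filterbank energies -> log -> DCT-II, keeping the first n_coeffs."""
    n_fft_bins = power_spectrum.shape[-1]
    # Filter center frequencies equally spaced on the mel scale, mapped back to FFT bins
    mel_points = np.linspace(hz_to_mel(0), hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((n_fft_bins - 1) * mel_to_hz(mel_points) / (sample_rate / 2)).astype(int)
    fbank = np.zeros((n_filters, n_fft_bins))
    for i in range(n_filters):  # triangular filters: rise to the center bin, then fall
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_energies = np.log(power_spectrum @ fbank.T + 1e-10)
    # DCT-II is the linear transform of the log energy spectrum; it decorrelates the bands
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_coeffs), n + 0.5) / n_filters)
    return log_energies @ dct.T

spec = np.random.rand(98, 257)     # stand-in spectrogram (frames x FFT bins)
features = mfcc(spec ** 2)         # power spectrum in, feature vector sequence out
print(features.shape)              # one 13-dimensional feature vector per frame
```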
S123, converting the feature vector sequence into a pinyin sequence;
it is understood that the sequence of feature vectors is converted into a sequence of pinyin forms.
In one embodiment, S123 may specifically include: converting the feature vector sequence into the pinyin sequence through an acoustic model; wherein the acoustic model comprises a hidden Markov model.
Here, the hidden Markov model is a deep neural network-hidden Markov model (Deep Neural Networks-Hidden Markov Model, DNN-HMM).
The hidden Markov model is trained in advance; its input information is the feature vector sequence of the voice, and its output information is the pinyin sequence of the voice. The hidden Markov model converts the feature vector sequence into at least one candidate pinyin sequence, then selects the pinyin sequence with the highest score among the candidates and outputs it as the final pinyin sequence.
Because of different dialects and speaking habits, a segment of voice may correspond to multiple pinyin sequences. The hidden Markov model outputs the pinyin sequence with the greatest likelihood.
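How an HMM decoder picks the single most likely sequence among candidates can be illustrated with a toy Viterbi search. This is a didactic sketch, not the patent's DNN-HMM: the two states, transition matrix, and emission probabilities are invented for illustration.

```python
import numpy as np

# Toy HMM: hidden states are pinyin syllables, observations are quantized feature indices.
states = ["shui", "jiao"]
start = np.array([0.6, 0.4])            # P(first state)
trans = np.array([[0.3, 0.7],           # P(next state | current state)
                  [0.6, 0.4]])
emit = np.array([[0.7, 0.2, 0.1],       # P(observation | "shui")
                 [0.1, 0.3, 0.6]])      # P(observation | "jiao")

def viterbi(obs):
    """Return the highest-probability state sequence for an observation sequence."""
    v = start * emit[:, obs[0]]         # best score of each state after the first observation
    back = []                           # backpointers for path recovery
    for o in obs[1:]:
        scores = v[:, None] * trans * emit[None, :, o]
        back.append(scores.argmax(axis=0))
        v = scores.max(axis=0)
    path = [int(v.argmax())]            # best final state, then follow backpointers
    for ptr in reversed(back):
        path.append(int(ptr[path[-1]]))
    return [states[s] for s in reversed(path)]

print(viterbi([0, 2]))  # most likely syllable sequence for these observations
```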
S124, converting the pinyin sequence into a phrase sequence;
It can be understood that one pinyin may correspond to multiple characters or phrases. For example, the pinyin "shuijiao" may correspond to 睡觉 (sleep), 水饺 (dumpling), 水窖 (cistern), etc., and the resulting phrase sequences may combine differently depending on where the pinyin sequence is paused or split.
In one embodiment, S124 may specifically include: converting the pinyin sequence into the phrase sequence through a language model; wherein the language model comprises an N-gram model.
Here, the N-gram model is an N-gram statistical language model.
It can be understood that when the N-gram model converts a pinyin sequence without spaces into Chinese character strings, multiple candidate strings are possible. The N-gram model therefore scores each candidate string and selects the one with the highest score as the phrase sequence. This realizes the conversion from pinyin to Chinese characters without manual selection by the user and avoids the problem of multiple Chinese characters sharing the same pinyin.
The N-gram model is trained in advance; its input information is a pinyin sequence, and its output information is a phrase sequence.
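The scoring step can be sketched with a toy bigram model. The probabilities below are invented stand-ins for corpus statistics; a real model would estimate them from training data.

```python
# Assumed bigram probabilities P(next character | previous character); "<s>" marks sentence start.
bigram = {
    ("<s>", "水"): 0.3, ("水", "饺"): 0.5,   # 水饺 "dumpling"
    ("<s>", "睡"): 0.4, ("睡", "觉"): 0.8,   # 睡觉 "sleep"
}

def score(chars, floor=1e-6):
    """Chain-rule probability of a candidate character string under the bigram model."""
    prob, prev = 1.0, "<s>"
    for ch in chars:
        prob *= bigram.get((prev, ch), floor)  # unseen bigrams get a small floor probability
        prev = ch
    return prob

# Both candidates are valid readings of the pinyin "shuijiao"; the model ranks them,
# and the highest-scoring string is chosen with no manual selection by the user.
candidates = ["水饺", "睡觉"]
best = max(candidates, key=score)
print(best)  # 睡觉 (score 0.4 * 0.8 = 0.32, beating 0.3 * 0.5 = 0.15)
```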
S125, decoding the phrase sequence to obtain the representation text.
That is, the phrase sequence is decoded into text, which is used as the representation text of the speech as the input text for the subsequent full-text search query.
S130, performing full-text search according to the representation text to obtain a plurality of government affair data;
that is, the search is performed in the database of government affair data using the presentation text as a search term, and a plurality of pieces of related government affair data are obtained.
S140, calculating the similarity between the representation text and each piece of government affair data;
it can be understood that, in order to output a piece of government affair data with the highest correlation, it is necessary to select from the pieces of government affair data searched in S130, specifically, select a piece of government affair data with the highest correlation based on the similarity.
In one embodiment, S140 may specifically include S141 to S143:
s141, preprocessing the representation text;
the preprocessing may include word segmentation, error correction, normalization, and the like.
S142, identifying the named entities in the preprocessed representation text, and forming a keyword array by the identified named entities;
where named entities refer to core words, phrases, etc. that represent text.
That is, after keywords such as core words and phrases are identified from the preprocessed representation text, the keywords are formed into a keyword array.
In one embodiment, S142 may specifically include:
and identifying named entities in the preprocessed representation text by adopting a two-way long-short term memory neural network model and/or a conditional random field deep learning model.
The bidirectional long short-term memory neural network model is based on the Long Short-Term Memory (LSTM) architecture and is referred to below as the LSTM neural network model for short.
The conditional random field (Conditional Random Field, CRF) deep learning model is a basic model in natural language processing.
The LSTM neural network model and the CRF deep learning model are trained in advance; their input information is the representation text, and their output information is the keywords in the representation text.
S143, calculating the text similarity between the keyword array and each piece of government affair data, and taking the text similarity as the similarity between the representation text and the piece of government affair data.
After the keyword array is obtained in S142, the similarity between the keyword array and each piece of government affair item data searched in S130 is calculated, for example by means of cosine similarity, Euclidean distance, or the like. The larger the cosine similarity, the larger the text similarity between the keyword array and the government affair item data; conversely, the larger the Euclidean distance, the smaller the text similarity.
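The cosine-similarity ranking can be sketched as follows; the keyword vocabulary and bag-of-words vectors are assumed toy values, not data from the patent.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; for non-negative term-count vectors it lies in [0, 1], larger meaning more similar."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Assumed shared keyword vocabulary giving the order of vector components.
vocab = ["户口", "迁移", "办理", "护照"]
query_vec = np.array([1.0, 0.0, 1.0, 0.0])           # keywords extracted from the representation text
item_vecs = {                                         # bag-of-words vectors of candidate items
    "户口迁移办理流程": np.array([1.0, 1.0, 1.0, 0.0]),
    "护照办理预约":     np.array([0.0, 0.0, 1.0, 1.0]),
}

# Rank candidates by similarity in descending order; the top item becomes the search result.
ranked = sorted(item_vecs, key=lambda k: cosine_similarity(query_vec, item_vecs[k]), reverse=True)
print(ranked[0])  # the government affair item with the highest similarity
```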
And S150, outputting government affair item data corresponding to the highest similarity as a search result.
For example, after the similarity between the representation text and each piece of government affair item data is calculated, the pieces of government affair item data are sorted in descending order of similarity, so that the piece corresponding to the highest similarity is ranked first. That piece is output as the search result, which the user can then see.
It can be understood that S120 is in fact automatic speech recognition (Automatic Speech Recognition, ASR), a technology for converting human speech into text: feature extraction is performed on the audio signal to be analyzed to provide a suitable feature vector sequence for the acoustic model; the feature vector sequence is converted into a pinyin sequence according to the acoustic characteristics in the acoustic model; the pinyin sequence is converted into a phrase sequence by the language model; and finally the phrase sequence is decoded against an existing dictionary to obtain the representation text.
S140-S150 employ natural language processing (Natural Language Processing), i.e., communicating with the computer in natural language. The input representation text is preprocessed, named entities in the preprocessed representation text are identified to determine its core words or phrases, the similarity is calculated, and finally the government affair item data corresponding to the highest similarity is selected by sorting and output.
It can be understood that, for a query function over massive government affair items, the government affair item to be queried and its implementation list need to be located precisely, which also makes querying data convenient for elderly users. Therefore, the embodiment of the present invention provides a government affair item searching method based on voice recognition, which optimizes the human-computer interaction process through voice recognition technology and improves the convenience of the query operation; full-text retrieval followed by text similarity calculation on the input representation text then improves query efficiency and accuracy.
In a second aspect, an embodiment of the present invention provides a government affair item searching device based on voice recognition, including:
the voice acquisition module is used for acquiring voice for searching government affair data;
the text representation module is used for carrying out recognition processing on the voice to obtain a corresponding representation text;
the item searching module is used for carrying out full-text searching according to the representation text to obtain a plurality of government affair item data;
the similarity calculation module is used for calculating the similarity between the representation text and each piece of government affair data;
and the result output module is used for outputting government affair item data corresponding to the highest similarity as a search result.
In one embodiment, the text representation module includes:
the first preprocessing unit is used for preprocessing the voice;
the feature extraction unit is used for extracting features of the preprocessed voice to obtain a feature vector sequence;
the first conversion unit is used for converting the characteristic vector sequence into a pinyin sequence;
the second conversion unit is used for converting the Pinyin sequence into a phrase sequence;
and the sequence decoding unit is used for decoding the phrase sequence to obtain the representation text.
Further, the first preprocessing unit is specifically configured to: converting the voice into a spectrogram; correspondingly, the feature extraction unit is specifically configured to: and carrying out feature extraction on the spectrogram according to the Mel cepstrum coefficient to obtain the feature vector sequence.
In one embodiment, the first conversion unit is specifically configured to: converting the feature vector sequence into the pinyin sequence through an acoustic model; wherein the acoustic model comprises a hidden Markov model.
In one embodiment, the second conversion unit is specifically configured to: convert the pinyin sequence into the phrase sequence through a language model; wherein the language model comprises an N-gram model.
In one embodiment, the similarity calculation module includes:
a second preprocessing unit, configured to preprocess the representation text;
the entity identification unit is used for identifying the named entities in the preprocessed representation text and forming a keyword array from the identified named entities;
and the similarity calculation unit is used for calculating the text similarity between the keyword array and each piece of government affair data, and taking the text similarity as the similarity between the representation text and the piece of government affair data.
Further, the entity identification unit is specifically configured to: and identifying named entities in the preprocessed representation text by adopting a two-way long-short term memory neural network model and/or a conditional random field deep learning model.
It may be understood that, for explanation, specific implementation, beneficial effects, examples, etc. of the content in the apparatus provided by the embodiment of the present invention, reference may be made to corresponding parts in the method provided in the first aspect, which are not repeated herein.
In a third aspect, embodiments of the present invention provide a computer readable medium having stored thereon computer instructions which, when executed by a processor, cause the processor to perform the method provided in the first aspect.
Specifically, a system or apparatus provided with a storage medium on which a software program code realizing the functions of any of the above embodiments is stored, and a computer (or CPU or MPU) of the system or apparatus may be caused to read out and execute the program code stored in the storage medium.
In this case, the program code itself read from the storage medium may realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code form part of the present invention.
Examples of the storage medium for providing the program code include a floppy disk, a hard disk, a magneto-optical disk, an optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), a magnetic tape, a nonvolatile memory card, and a ROM. Alternatively, the program code may be downloaded from a server computer by a communication network.
Further, it should be apparent that the functions of any of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like operating on the computer to perform part or all of the actual operations based on the instructions of the program code.
Further, it can be understood that the program code read from the storage medium may be written into a memory provided in an expansion board inserted into the computer or in an expansion module connected to the computer, and a CPU or the like mounted on the expansion board or expansion module may then perform part or all of the actual operations based on the instructions of the program code, thereby realizing the functions of any of the above embodiments.
It may be appreciated that, for explanation, specific implementation, beneficial effects, examples, etc. of the content in the computer readable medium provided by the embodiment of the present invention, reference may be made to corresponding parts in the method provided in the first aspect, and details are not repeated herein.
In a fourth aspect, an embodiment of the present invention provides a computing device comprising a memory and a processor, the memory storing executable code; when the processor executes the executable code, the method provided by the first aspect is implemented.
It may be appreciated that, for explanation, specific implementation, beneficial effects, examples, etc. of the content in the computing device provided by the embodiment of the present invention, reference may be made to corresponding parts in the method provided in the first aspect, which are not repeated herein.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments in part.
Those skilled in the art will appreciate that, in one or more of the examples described above, the functions described in the present invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on, or transmitted as one or more instructions or code over, a computer-readable medium.
The foregoing embodiments further describe the principles of the present invention in detail and are not to be construed as limiting its scope; any modification, equivalent replacement, improvement, etc. made on the basis of the teachings of the invention falls within its scope of protection.

Claims (10)

1. A government affair item searching method based on voice recognition is characterized by comprising the following steps:
acquiring voice for searching government affair data;
performing recognition processing on the voice to obtain a corresponding representation text;
performing full-text search according to the representation text to obtain a plurality of government affair data;
calculating the similarity between the representation text and each piece of government affair data;
and outputting government affair item data corresponding to the highest similarity as a search result.
2. The method of claim 1, wherein the performing recognition processing on the speech to obtain corresponding representation text includes:
preprocessing the voice;
extracting features of the preprocessed voice to obtain a feature vector sequence;
converting the characteristic vector sequence into a pinyin sequence;
converting the pinyin sequence into a phrase sequence;
and decoding the phrase sequence to obtain the representation text.
3. The method of claim 2, wherein the preprocessing the speech comprises: converting the voice into a spectrogram;
correspondingly, the feature extraction is performed on the preprocessed voice to obtain a feature vector sequence, which comprises the following steps: and carrying out feature extraction on the spectrogram according to the Mel cepstrum coefficient to obtain the feature vector sequence.
4. The method of claim 2, wherein said converting the sequence of feature vectors into a sequence of pinyin comprises:
converting the feature vector sequence into the pinyin sequence through an acoustic model; wherein the acoustic model comprises a hidden Markov model.
5. The method of claim 2, wherein said converting the pinyin sequence to a phrase sequence comprises:
converting the pinyin sequence into the phrase sequence through a language model; wherein the language model comprises an N-gram model.
6. The method of claim 1, wherein said calculating the similarity between the representative text and each piece of government matter data comprises:
preprocessing the representation text;
identifying the named entities in the preprocessed representation text, and forming a keyword array from the identified named entities;
and calculating the text similarity between the keyword array and each piece of government affair data, and taking the text similarity as the similarity between the representation text and the piece of government affair data.
7. The method of claim 6, wherein the identifying named entities in the preprocessed representation text comprises:
and identifying named entities in the preprocessed representation text by adopting a two-way long-short term memory neural network model and/or a conditional random field deep learning model.
8. A government affair item searching device based on voice recognition, comprising:
the voice acquisition module is used for acquiring voice for searching government affair data;
the text representation module is used for carrying out recognition processing on the voice to obtain a corresponding representation text;
the item searching module is used for carrying out full-text searching according to the representation text to obtain a plurality of government affair item data;
the similarity calculation module is used for calculating the similarity between the representation text and each piece of government affair data;
and the result output module is used for outputting government affair item data corresponding to the highest similarity as a search result.
9. A computer readable storage medium, having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any one of claims 1 to 7.
10. A computing device comprising a memory and a processor, the memory having executable code stored therein, the processor, when executing the executable code, implementing the method of any one of claims 1-7.
CN202310148316.4A 2023-02-21 2023-02-21 Government affair item searching method and device based on voice recognition, medium and equipment Pending CN116186201A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310148316.4A CN116186201A (en) 2023-02-21 2023-02-21 Government affair item searching method and device based on voice recognition, medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310148316.4A CN116186201A (en) 2023-02-21 2023-02-21 Government affair item searching method and device based on voice recognition, medium and equipment

Publications (1)

Publication Number Publication Date
CN116186201A (en) 2023-05-30

Family

ID=86447271

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310148316.4A Pending CN116186201A (en) 2023-02-21 2023-02-21 Government affair item searching method and device based on voice recognition, medium and equipment

Country Status (1)

Country Link
CN (1) CN116186201A (en)

Similar Documents

Publication Publication Date Title
CN107590135B (en) Automatic translation method, device and system
CN111710333B (en) Method and system for generating speech transcription
US10210862B1 (en) Lattice decoding and result confirmation using recurrent neural networks
KR101143030B1 (en) Discriminative training of language models for text and speech classification
CN106782560B (en) Method and device for determining target recognition text
US6934683B2 (en) Disambiguation language model
Schuster et al. Japanese and korean voice search
US6910012B2 (en) Method and system for speech recognition using phonetically similar word alternatives
KR20080069990A (en) Speech index pruning
US20220262352A1 (en) Improving custom keyword spotting system accuracy with text-to-speech-based data augmentation
US10482876B2 (en) Hierarchical speech recognition decoder
JP2004005600A (en) Method and system for indexing and retrieving document stored in database
JP2005165272A (en) Speech recognition utilizing multitude of speech features
JPH08328585A (en) Method and device for natural language processing and method and device for voice recognition
JP2004133880A (en) Method for constructing dynamic vocabulary for speech recognizer used in database for indexed document
CN110019741B (en) Question-answering system answer matching method, device, equipment and readable storage medium
US11030999B1 (en) Word embeddings for natural language processing
US9135911B2 (en) Automated generation of phonemic lexicon for voice activated cockpit management systems
Garg et al. Hierarchical Multi-Stage Word-to-Grapheme Named Entity Corrector for Automatic Speech Recognition.
KR20130126570A (en) Apparatus for discriminative training acoustic model considering error of phonemes in keyword and computer recordable medium storing the method thereof
KR100480790B1 (en) Method and apparatus for continous speech recognition using bi-directional n-gram language model
Imperl et al. Clustering of triphones using phoneme similarity estimation for the definition of a multilingual set of triphones
CN116186201A (en) Government affair item searching method and device based on voice recognition, medium and equipment
KR20050101695A (en) A system for statistical speech recognition using recognition results, and method thereof
KR20050101694A (en) A system for statistical speech recognition with grammatical constraints, and method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination