CN109065020B

CN109065020B - Multi-language category recognition library matching method and system

Info

Publication number: CN109065020B
Application number: CN201810849884.6A
Authority: CN
Inventors: 潘晓明
Original assignee: Chongqing Youbanhome Technology Co ltd
Current assignee: Chongqing Youbanhome Technology Co ltd
Priority date: 2018-07-28
Filing date: 2018-07-28
Publication date: 2020-11-20
Anticipated expiration: 2038-07-28
Also published as: CN109065020A

Abstract

The invention relates to the technical field of voice recognition, and in particular to a method and a system for matching recognition libraries across multiple language categories. The method comprises: a voice collection step of collecting the user's voice; a voice recognition step of recognizing the user's voice with each recognition library in turn, within a preset recognition library range, to generate recognition results; a scoring step of analyzing and evaluating the recognition result of each recognition library to generate a recognition library score; and a screening step of selecting the recognition library with the highest score as the recognition library used for the current user's voice recognition. The method and system can dynamically adjust the recognition library for multi-language input and automatically select a suitable recognition library from the user's voice, which improves recognition accuracy and addresses the poor recognition performance of prior voice recognition technology under multi-language input.

Description

Multi-language category recognition library matching method and system

Technical Field

The invention relates to the technical field of voice recognition, in particular to a method and a system for matching a recognition library of multiple language categories.

Background

With the development of multimedia technology, the services offered by multimedia systems have also expanded to include music, video, pictures, real-time traffic signals, destination map navigation, voice navigation and the like. The widespread use of intelligent terminals gives these services ample room to develop.

Whether a terminal has physical keys or a touch screen, these services can only be used through manual operation, which is cumbersome and can be dangerous: for example, a driver who manually operates in-vehicle equipment while driving creates a hazard.

The development of speech recognition technology offers a new direction for such operations. Existing speech recognition, however, can recognize only one language, whereas in real life not everyone speaks Mandarin: besides Mandarin there are Cantonese, the Northeastern dialect and others. Where a choice of language is offered, the user has to make it manually, which is extremely inconvenient. Moreover, a language is often influenced by other languages as it develops, so a dialect often differs slightly between parts of the region where it is spoken. For example, although the Chongqing and Sichuan dialects both belong broadly to Southwestern Mandarin, they differ from each other, and the Sichuan dialect can itself be divided into Quanjie, self-supporting and other varieties. Users cannot perceive the differences between these varieties, which leads to inaccurate selection. The problem is more pronounced for users who live where several regions meet, who live away from home all year round, or who have moved between several cities. Such users are influenced by several regions, and their everyday speech may blend several dialects; they do not know which parts of their own speech belong to dialect A and which to dialect B. If only one language is manually selected for matching and recognition, the recognition accuracy for some of that speech inevitably drops and the user experience suffers.

Disclosure of Invention

The invention aims to provide a method and a system for matching recognition libraries across multiple language categories, which can dynamically adjust the recognition library for multi-language input, automatically select a suitable recognition library from the user's voice, and improve recognition accuracy.

In order to solve the technical problem, the patent provides the following technical scheme:

the multilingual-category recognition library matching method comprises the following steps of:

a voice collection step of collecting user voice;

a voice recognition step, namely sequentially using each recognition library to recognize the voice of the user within a preset recognition library range to generate a recognition result;

a scoring step, namely analyzing and evaluating the recognition result of each recognition library to generate recognition library scores;

and a screening step, namely selecting the recognition library with the highest recognition library score as the recognition library adopted by the current user voice recognition.

In the technical scheme of the invention, each recognition library is used in turn to recognize the user's voice, and the recognition results are then analyzed and evaluated. How well a given recognition library recognizes the user's voice is judged from the rationality of its recognition result, that rationality is quantified as a score, and the most suitable recognition library is finally selected for the user's voice recognition. This effectively improves multi-language recognition; because the recognition library in use is adjusted dynamically through scoring, a high recognition rate is maintained even when the user switches between languages.
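The recognize, score and select flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a recognition library is modelled as a plain function from audio bytes to text, and `select_library`, the toy libraries and `toy_score` are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict, Tuple

# Assumption: a "recognition library" is modelled as a function that turns
# raw audio into text, and a scorer turns that text into a number.
Recognizer = Callable[[bytes], str]
Scorer = Callable[[str], float]


def select_library(audio: bytes,
                   libraries: Dict[str, Recognizer],
                   score: Scorer) -> Tuple[str, Dict[str, float]]:
    """Run every library on the same audio, score each result, pick the best."""
    if not libraries:
        raise ValueError("the preset recognition library range is empty")
    scores: Dict[str, float] = {}
    for name, recognize in libraries.items():  # voice recognition step
        result = recognize(audio)
        scores[name] = score(result)           # scoring step
    best = max(scores, key=scores.get)         # screening step
    return best, scores


# Toy stand-ins (hypothetical, for illustration only).
libraries = {
    "mandarin": lambda audio: "turn on the ???",
    "sichuan": lambda audio: "turn on the radio",
}
known_words = {"turn", "on", "the", "radio"}


def toy_score(text: str) -> int:
    # Counts how many recognized tokens are known words.
    return sum(word in known_words for word in text.split())


best, scores = select_library(b"\x00\x01", libraries, toy_score)
print(best, scores)
```

Because the selection is recomputed from the scores each time, a different library can win on the next utterance, which is what lets the choice follow a user who switches between languages.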

Further, in the screening step, the recognition libraries are sorted from high to low according to the scores of the recognition libraries, and the top N recognition libraries are used as a preselected range according to the sorting result;

the method further comprises a voiceprint recognition step and a storage step, wherein the voiceprint recognition step comprises the following steps:

a voiceprint recognition step, detecting voiceprint information in user voice;

a range setting step of judging, according to the user's voiceprint information, whether a preselected range corresponding to the voiceprint is stored in the system; if so, the preselected range is adopted as the preset recognition library range, and if not, the preset recognition library range is set to the entire set of recognition libraries;

the storage step is used for storing the screening results of the screening step and the generated preselected range.

Different users are distinguished by their voiceprint information, and the storage step records each user's preselected range, so that the preset recognition library range can be narrowed in the next evaluation and recognition library matching is accelerated.
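The range setting logic can be sketched as below. The sketch assumes that voiceprint detection yields a stable identifier string per user; `PreselectionStore`, `preset_range` and the library names in `ALL_LIBRARIES` are hypothetical and stand in for whatever storage and library set a real system uses.

```python
from typing import Dict, List, Optional

# Hypothetical full set of recognition libraries.
ALL_LIBRARIES = ["mandarin", "cantonese", "sichuan", "chongqing", "northeastern"]


class PreselectionStore:
    """In-memory stand-in for the storage step (an assumption)."""

    def __init__(self) -> None:
        self._ranges: Dict[str, List[str]] = {}

    def save(self, voiceprint_id: str, preselected: List[str]) -> None:
        self._ranges[voiceprint_id] = list(preselected)

    def load(self, voiceprint_id: str) -> Optional[List[str]]:
        return self._ranges.get(voiceprint_id)


def preset_range(voiceprint_id: str, store: PreselectionStore) -> List[str]:
    """Use the stored preselected range if one exists, else every library."""
    stored = store.load(voiceprint_id)
    return stored if stored is not None else list(ALL_LIBRARIES)


store = PreselectionStore()
first = preset_range("user-a", store)           # unknown voiceprint
store.save("user-a", ["sichuan", "chongqing"])  # result of an earlier screening
second = preset_range("user-a", store)          # known voiceprint
print(len(first), second)
```

On the second call only two libraries need to be tried instead of five, which is where the speed-up comes from.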

Further, the scoring step includes:

a vocabulary matching scoring step, namely matching vocabularies in the recognition result and obtaining vocabulary matching scores according to the matching number of the vocabularies;

a sentence rationality scoring step, namely calculating, from the matched words, the probability that those words appear in the same sentence, and obtaining a sentence rationality score from that probability;

an association rationality scoring step, namely extracting the semantics of each sentence from the words matched in that sentence, and calculating the association rationality of two adjacent sentences of voice from their semantics to obtain an association rationality score;

and a score calculating step, namely calculating a recognition library score according to the vocabulary matching score, the sentence rationality score and the association rationality score. The scoring thus checks whether the individual matched words form reasonable vocabulary, whether the words within a single sentence are coherent, and whether the semantics of adjacent sentences are consistent, so that the different recognition libraries are scored more comprehensively and accurately.

Further, the storage step is also used for storing the historical scores of each recognition library; the scoring step further comprises an average score calculating step of calculating the average of the historical scores of each recognition library; and the score calculating step also calculates the recognition library score according to this average score. The average score reflects the accumulated scores of a recognition library, which prevents the chance outcome of a single scoring from distorting the accuracy of the score.

Further, the invention also discloses a multi-language category recognition library matching system, which comprises:

the voice acquisition module is used for acquiring user voice;

the voice recognition module is used for sequentially using each recognition library in a preset recognition library range to recognize the voice of the user and generate a recognition result;

the scoring module is used for analyzing and evaluating the recognition result of each recognition library and generating a recognition library score;

and the screening module is used for selecting the recognition library with the highest recognition library score as the recognition library adopted by the current user voice recognition.

Further, the screening module comprises a sorting module, a selecting module and a preselection range setting module, wherein the sorting module is used for sorting the recognition libraries from high to low according to the recognition library scores, and the preselection range setting module is used for taking the top N recognition libraries in the sorted result as the preselection range; the system further comprises a voiceprint recognition module, a preset recognition library range setting module and a storage module, wherein the voiceprint recognition module is used for recognizing the voiceprint information of the user, and the storage module is used for storing the voiceprint information, the screening result of the recognition libraries corresponding to the voiceprint, and the preselection range;

the preset recognition library range setting module is used for judging, according to the voiceprint information of the user, whether a preselection range corresponding to the voiceprint is stored in the system; when such a preselection range is found in the system, it is set as the preset recognition library range, and when no such preselection range is found, the preset recognition library range is set to the entire set of recognition libraries.

Further, the scoring module comprises:

the vocabulary matching scoring module is used for matching vocabularies in the recognition result and obtaining vocabulary matching scores according to the matching number of the vocabularies;

the statement rationality scoring module is used for calculating the probability of the words appearing in the same sentence according to the matched words and obtaining statement rationality scoring according to the probability;

the association rationality scoring module is used for extracting the semantics of each sentence content according to the vocabulary matched with each sentence, calculating the association rationality of two adjacent sentences according to the semantics of two adjacent sentences of voice and calculating the association rationality scoring;

and the score calculating module is used for calculating the score of the recognition base according to the vocabulary matching score, the statement rationality score and the association rationality score.

Further, the storage module is also used for storing the historical scores of each recognition library; the scoring module further comprises an average score calculation module, which is used for calculating the average of the historical scores of each recognition library; and the score calculation module also calculates the recognition library score according to this average score.

Drawings

FIG. 1 is a logic diagram of an embodiment of the multiple language category recognition library matching system of the present invention.

Detailed Description

The following is further detailed by way of specific embodiments:

the embodiment discloses a multilingual-category recognition library matching method and a multilingual-category recognition library matching system using the same, wherein the logical structure of the system is shown in fig. 1, and the system comprises a voice acquisition module, a voice recognition module, a voiceprint recognition module, a scoring module, a screening module, a preset recognition library range setting module and a storage module. Wherein:

The voice acquisition module is used for acquiring the user's voice. The voice recognition module is used for recognizing the user's voice with each recognition library in turn, within a preset recognition library range, and generating recognition results. The voice recognition module comprises a recognition library loading module, which loads the recognition libraries used for voice recognition. The recognition libraries store voice recognition rules; examples are a Sichuan dialect recognition library, a Mandarin recognition library and a Cantonese recognition library, and different recognition libraries have different voice recognition rules. According to the preset recognition library range, the voice recognition module can thus use each recognition library in that range in turn to recognize the voice input by the user. The voiceprint recognition module is used for recognizing the voiceprint information of the user.

The scoring module is used for analyzing and evaluating the recognition result of each recognition library and generating a recognition library score. It comprises a vocabulary matching scoring module, a sentence rationality scoring module, an association rationality scoring module, an average score calculating module and a score calculating module.

The vocabulary matching scoring module is used for matching words in the recognition result and obtaining a vocabulary matching score from the number of matched words. Specifically, a dictionary library is stored in the storage module, and each word in the dictionary library has its own weight, which reflects the importance of the word: words that carry no practical meaning have smaller weights, and words that carry practical meaning have larger weights. The vocabulary matching scoring module matches the content of the recognition result against the words in the dictionary library, records the matched words and their number, and computes a weighted sum using the preset weights to obtain the vocabulary matching score.

The sentence rationality scoring module is used for calculating, from the matched words, the probability that those words appear in the same sentence, and obtaining a sentence rationality score from that probability. It is implemented mainly through big data analysis: the storage module of the system stores a sufficiently large sentence database, the sentence rationality scoring module searches the sentence database for sentences in which all the matched words appear together and counts them, and the ratio of that count to the total amount of data is used as the sentence rationality score.
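The two scores described above can be sketched as follows. This is an illustrative sketch only: it assumes the recognition result has already been split into words, and the function names, the dictionary weights and the tiny sentence database are all hypothetical.

```python
from typing import Dict, List, Set


def vocabulary_score(recognized: List[str], weights: Dict[str, float]) -> float:
    """Weighted sum over every recognized word found in the dictionary library."""
    return sum(weights[word] for word in recognized if word in weights)


def matched_words(recognized: List[str], weights: Dict[str, float]) -> Set[str]:
    """The distinct recognized words that exist in the dictionary library."""
    return {word for word in recognized if word in weights}


def sentence_rationality(matched: Set[str], corpus: List[List[str]]) -> float:
    """Share of stored sentences that contain all matched words together."""
    if not corpus or not matched:
        return 0.0
    hits = sum(1 for sentence in corpus if matched.issubset(sentence))
    return hits / len(corpus)


# Hypothetical dictionary: the low-meaning word "the" gets a small weight.
weights = {"open": 2.0, "the": 0.5, "window": 2.0, "music": 2.0}
# Hypothetical sentence database, already tokenized.
corpus = [
    ["please", "open", "the", "window"],
    ["open", "the", "door"],
    ["play", "music"],
    ["close", "the", "window"],
]

recognized = ["open", "the", "window", "xx"]  # "xx" is not in the dictionary
vocab = vocabulary_score(recognized, weights)
rationality = sentence_rationality(matched_words(recognized, weights), corpus)
print(vocab, rationality)  # prints 4.5 0.25
```

A recognition library that produces many unknown tokens scores low on the first measure, and one that produces known words in combinations never seen together scores low on the second.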

The association rationality scoring module is used for extracting the semantics of each sentence from the words matched in that sentence, calculating the association rationality of two adjacent sentences of voice from their semantics, and deriving the association rationality score. The average score calculation module is used for calculating, for each recognition library, the average of its historical scores.

The score calculation module is used for calculating the recognition library score from the vocabulary matching score, the sentence rationality score, the average score and the association rationality score. In this embodiment, it computes a weighted sum of these four scores using preset weights, and the resulting sum is taken as the recognition library score.
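The weighted sum of the four component scores can be sketched as below. The source does not give the preset weights, so the values in `WEIGHTS` are assumptions chosen only to make the arithmetic easy to follow; `average_history` and `library_score` are likewise hypothetical names.

```python
from statistics import mean
from typing import Dict, List

# Assumed preset weights (not specified in the source).
WEIGHTS = {
    "vocabulary": 0.5,
    "sentence": 0.25,
    "association": 0.125,
    "average": 0.125,
}


def average_history(history: List[float]) -> float:
    """Average of a library's historical scores; 0.0 if there is no history."""
    return mean(history) if history else 0.0


def library_score(components: Dict[str, float],
                  weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the four component scores."""
    return sum(weights[name] * components[name] for name in weights)


components = {
    "vocabulary": 4.5,
    "sentence": 0.25,
    "association": 0.5,
    "average": average_history([6.0, 8.0, 7.0]),  # 7.0
}
total = library_score(components)
print(total)  # prints 3.25
```

The components here sit on different scales; how they are normalized before weighting is left open by the source, so a real system would have to settle that when choosing the weights.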

The screening module is used for selecting the recognition library with the highest recognition library score as the recognition library used for the current user's voice recognition. The screening module comprises a sorting module and a preselection range setting module: the sorting module sorts the recognition libraries from high to low by recognition library score, and the preselection range setting module takes the first 5 recognition libraries in the sorted result as the preselection range.

The storage module is used for storing the voiceprint information, the screening result of the recognition libraries corresponding to each voiceprint, the preselection range and the historical scores of the recognition libraries.
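The sorting and preselection performed by the screening module can be sketched as follows. The function name `screen` and the library names and scores are hypothetical; only the top-5 cutoff comes from this embodiment.

```python
from typing import Dict, List, Tuple


def screen(scores: Dict[str, float], n: int = 5) -> Tuple[str, List[str]]:
    """Return the best-scoring library and the top-n preselection range."""
    if not scores:
        raise ValueError("no recognition library scores to screen")
    # Sort library names from highest to lowest score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[0], ranked[:n]


# Hypothetical scores for seven recognition libraries.
scores = {
    "mandarin": 2.0,
    "cantonese": 0.5,
    "sichuan": 3.25,
    "chongqing": 3.0,
    "northeastern": 1.0,
    "shanghainese": 0.25,
    "hakka": 0.75,
}
best, preselected = screen(scores)
print(best, preselected)
```

The returned `preselected` list is what the storage module would keep alongside the user's voiceprint for the next evaluation.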

a voice collection step of collecting user voice;

In the screening step, the recognition libraries are sorted from high to low according to the scores of the recognition libraries, and the top 5 recognition libraries are used as a preselection range according to a sorting result;

the method also comprises a voiceprint recognition step and a storage step, wherein the voiceprint recognition step comprises the following steps:

a voiceprint recognition step, detecting voiceprint information in user voice;

in the storage step, the screening results of the screening step, the generated preselection range and the history scores of the identification libraries are stored.

The scoring step comprises the following steps:

an average score calculation step of calculating the average of the historical scores of each recognition library;

and a score calculating step of calculating a recognition library score according to the vocabulary matching score, the sentence rationality score, the association rationality score and the average score.

In this method, each recognition library is used in turn to recognize the user's voice, and the recognition results are analyzed and evaluated. How well a given recognition library recognizes the user's voice is judged from the rationality of its recognition result, that rationality is quantified as a score, and the most suitable recognition library is finally selected for the user's voice recognition. This effectively improves multi-language recognition; because the recognition library in use is adjusted dynamically through scoring, a high recognition rate is maintained even when the user switches between languages. Different users are distinguished by their voiceprint information, and the storage step records each user's preselection range, so that the preset recognition library range can be narrowed in the next evaluation and recognition library matching is accelerated. The scoring checks whether the individual matched words form reasonable vocabulary, whether the words within a single sentence are coherent, and whether the semantics of adjacent sentences are consistent, so that the different recognition libraries are scored more comprehensively and accurately. The average score reflects the accumulated scores of a recognition library, which prevents the chance outcome of a single scoring from distorting the accuracy of the score.

The foregoing are merely exemplary embodiments of the present invention. No attempt is made to show structural details in more detail than is necessary for a fundamental understanding; the description, taken with the drawings, makes apparent to those skilled in the art how the several forms of the invention may be embodied in practice. It should be noted that those skilled in the art can make several changes and modifications without departing from the structure of the present invention; these also fall within the protection scope of the invention and do not affect the effect of its implementation or the practicability of the patent. The scope of protection claimed by this application is determined by the claims, and the embodiments and other descriptions in the specification may be used to interpret the claims.

Claims

1. A multi-language category recognition library matching method, characterized in that the method comprises the following steps:

a voice collection step of collecting user voice;

a scoring step, namely analyzing and evaluating the recognition result of each recognition library to generate recognition library scores; the scoring step comprises a vocabulary matching scoring step, a sentence rationality scoring step, an association rationality scoring step and a score calculation step:

a score calculation step, calculating a recognition library score according to the vocabulary matching score, the sentence rationality score and the association rationality score;

in the screening step, the recognition libraries are sorted from high to low according to the recognition library scores, and the first N recognition libraries in the sorted result are taken as a preselected range; the method also comprises a voiceprint recognition step and a storage step, wherein the voiceprint recognition step comprises the following steps: a voiceprint recognition step, detecting voiceprint information in the user voice; a range setting step, namely judging, according to the voiceprint information of the user, whether a preselected range corresponding to the voiceprint is stored in the system, if so, adopting the preselected range as the preset recognition library range, and if not, setting the preset recognition library range to the entire set of recognition libraries; the storage step is used for storing the screening results of the screening step and the generated preselected range.

2. The multi-language category recognition library matching method according to claim 1, wherein: the storage step is further used for storing the historical scores of each recognition library, the scoring step further comprises an average score calculation step, the average score calculation step is used for calculating the average of the historical scores of each recognition library, and the score calculation step is also used for calculating the recognition library score according to the average score.

3. A multilingual-category recognition library matching system, characterized in that the system comprises:

the voice acquisition module is used for acquiring user voice;

the scoring module is used for analyzing and evaluating the recognition result of each recognition library and generating a recognition library score; the scoring module comprises a vocabulary matching scoring module, a sentence rationality scoring module, an association rationality scoring module and a score calculation module;

the score calculation module is used for calculating a recognition library score according to the vocabulary matching score, the sentence rationality score and the association rationality score;

the screening module comprises a sorting module, a selecting module and a preselection range setting module, wherein the sorting module is used for sorting the recognition libraries from high to low according to the recognition library scores, and the preselection range setting module is used for taking the first N recognition libraries in the sorted result as the preselection range; the system further comprises a voiceprint recognition module, a preset recognition library range setting module and a storage module, wherein the voiceprint recognition module is used for recognizing the voiceprint information of the user, and the storage module is used for storing the voiceprint information, the screening result of the recognition libraries corresponding to the voiceprint, and the preselection range;

4. The multilingual-category recognition library matching system of claim 3, wherein: the storage module is further used for storing the historical scores of each recognition library, the scoring module further comprises an average score calculation module which is used for calculating the average of the historical scores of each recognition library, and the score calculation module is further used for calculating the recognition library score according to the average score.