CN110335628B - Voice test method and device of intelligent equipment and electronic equipment

Info

Publication number: CN110335628B (application CN201910580478.9A)
Authority: CN (China)
Prior art keywords: result, matching, sub, recognition result, test
Legal status: Active
Other languages: Chinese (zh)
Other versions: CN110335628A
Inventors: 余明, 刘子祥, 陈果果
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd; Shanghai Xiaodu Technology Co Ltd
Original assignee: Baidu Online Network Technology Beijing Co Ltd; Shanghai Xiaodu Technology Co Ltd
Filing and priority date: 2019-06-28
Publication of application CN110335628A: 2019-10-15
Grant and publication of CN110335628B: 2022-03-18

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/01 - Assessment or evaluation of speech recognition systems
    • G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00
    • G10L 25/78 - Detection of presence or absence of voice signals

Abstract

The embodiment of the invention provides a voice test method and device for an intelligent device, and an electronic device. The method comprises the following steps: acquiring an actual recognition result of the intelligent device for a test voice, wherein the actual recognition result comprises a plurality of first sub-results; determining a matching result between the actual recognition result and a standard recognition result according to the matching results between each first sub-result in the actual recognition result and each second sub-result in the standard recognition result corresponding to the test voice; and determining a voice test result of the intelligent device according to the matching result between the actual recognition result and the standard recognition result. The method greatly improves the efficiency of the voice recognition test and also ensures the correctness of the matching result.

Description

Voice test method and device of intelligent equipment and electronic equipment
Technical Field
The embodiments of the invention relate to intelligent voice technology, and in particular to a voice test method and device for an intelligent device, and an electronic device.
Background
With the continuous development of voice recognition technology, more and more intelligent devices supporting automatic voice recognition are emerging, such as smart speakers, mobile phones and tablet computers. Before such devices are shipped, their automatic voice recognition function needs to be tested, and word accuracy is an important index in this test. During testing, a series of voices corresponding to standard recognition results are input into the intelligent device, for example by a human tester, and the intelligent device produces a series of actual recognition results from the input voices. The standard recognition results and the actual recognition results are then matched and analyzed to obtain the word and sentence accuracy information of the intelligent device.
In the prior art, the standard recognition results and the actual recognition results are matched mainly by manual comparison: a tester compares each standard recognition result with the actual recognition results one by one and counts the word and sentence accuracy information.
However, using prior art methods for voice recognition testing is inefficient.
Disclosure of Invention
The embodiment of the invention provides a voice testing method and device of intelligent equipment and electronic equipment, and aims to solve the problem of low voice recognition testing efficiency in the prior art.
A first aspect of an embodiment of the present invention provides a voice test method for an intelligent device, including:
acquiring an actual recognition result of the intelligent equipment on the test voice, wherein the actual recognition result comprises a plurality of first sub-results;
determining a matching result of the actual recognition result and the standard recognition result according to the matching result of each first sub-result in the actual recognition result and each second sub-result in the standard recognition result corresponding to the test voice;
and determining a voice test result of the intelligent equipment according to the matching result of the actual recognition result and the standard recognition result.
Further, the determining the matching result between the actual recognition result and the standard recognition result according to the matching result between each first sub-result in the actual recognition result and each second sub-result in the standard recognition result corresponding to the test voice includes:
determining a plurality of matching sets and a matching score of each matching set according to the matching result of each first sub-result and each second sub-result, wherein each matching set comprises a plurality of matching pairs, and each matching pair consists of one first sub-result and one second sub-result;
and determining the matching result of the actual recognition result and the standard recognition result according to the matching set with the maximum matching score.
Further, the matching result of each first sub-result and each second sub-result comprises a matching score of each first sub-result and each second sub-result;
the determining a plurality of matching sets and a matching score of each matching set according to the matching result of each first sub-result and each second sub-result includes:
determining a second matching set according to the determined matching scores of the first matching set and the matching scores of the current first sub-result and the current second sub-result;
the current first sub-result and the current second sub-result are the first sub-result and the second sub-result included in the matching pair that follows the matching pair containing the most recently generated first sub-result in the first matching set.
Further, the determining the matching result of the actual recognition result and the standard recognition result according to the matching set with the largest matching score includes:
and generating a matching result of the actual recognition result and the standard recognition result according to the matching pair included in the matching set with the maximum matching score.
Further, before determining a matching result between the actual recognition result and the standard recognition result according to a matching result between each first sub-result in the actual recognition result and each second sub-result in the standard recognition result corresponding to the test voice, the method further includes:
sequencing each first sub-result in the actual recognition result according to the generation time;
and determining the matching result of each first sub-result and each second sub-result according to the longest common subsequence between each first sub-result and each second sub-result after sorting.
Further, the acquiring an actual recognition result of the test voice by the intelligent device includes:
acquiring a test log generated when the intelligent equipment identifies the test voice;
and screening out an actual recognition result of the intelligent equipment to the test voice from the test log according to preset keywords and preset regular expression information.
Further, the determining a test result of the intelligent device according to the matching result of the actual recognition result and the standard recognition result includes:
and determining the word accuracy and/or sentence accuracy of the intelligent equipment according to the matching result of the actual recognition result and the standard recognition result.
A second aspect of an embodiment of the present invention provides a voice testing apparatus for an intelligent device, including:
the intelligent device comprises an acquisition module, a processing module and a processing module, wherein the acquisition module is used for acquiring an actual recognition result of the intelligent device on the test voice, and the actual recognition result comprises a plurality of first sub-results;
the first determining module is used for determining the matching result of the actual recognition result and the standard recognition result according to the matching result of each first sub-result in the actual recognition result and each second sub-result in the standard recognition result corresponding to the test voice;
and the second determining module is used for determining the voice test result of the intelligent equipment according to the matching result of the actual recognition result and the standard recognition result.
Further, the first determining module is specifically configured to:
determining a plurality of matching sets and a matching score of each matching set according to the matching result of each first sub-result and each second sub-result, wherein each matching set comprises a plurality of matching pairs, and each matching pair consists of one first sub-result and one second sub-result;
and determining the matching result of the actual recognition result and the standard recognition result according to the matching set with the maximum matching score.
Further, the matching result of each first sub-result and each second sub-result comprises a matching score of each first sub-result and each second sub-result;
the first determining module is specifically configured to:
determining a second matching set according to the determined matching scores of the first matching set and the matching scores of the current first sub-result and the current second sub-result;
the current first sub-result and the current second sub-result are the first sub-result and the second sub-result included in the matching pair that follows the matching pair containing the most recently generated first sub-result in the first matching set.
Further, the first determining module is specifically configured to:
and generating a matching result of the actual recognition result and the standard recognition result according to the matching pair included in the matching set with the maximum matching score.
Further, the apparatus further comprises:
the sorting module is used for sorting the first sub-results in the actual recognition result according to the generation time;
and the third determining module is used for determining the matching result of each first sub-result and each second sub-result according to the longest common subsequence between each first sub-result and each second sub-result after sorting.
Further, the obtaining module is specifically configured to:
acquiring a test log generated when the intelligent equipment identifies the test voice;
and screening out an actual recognition result of the intelligent equipment to the test voice from the test log according to preset keywords and preset regular expression information.
Further, the second determining module is specifically configured to:
and determining the word accuracy and/or sentence accuracy of the intelligent equipment according to the matching result of the actual recognition result and the standard recognition result.
A third aspect of embodiments of the present invention provides an electronic device, including:
a memory for storing program instructions;
a processor for calling and executing the program instructions in the memory to perform the method steps of the first aspect.
A fourth aspect of the embodiments of the present invention provides a readable storage medium, in which a computer program is stored, the computer program being configured to execute the method according to the first aspect.
According to the voice testing method and device for the intelligent device and the electronic device, provided by the embodiment of the invention, after the actual recognition result of the intelligent device on the testing voice is obtained, the overall matching result of the actual recognition result and the standard recognition result can be determined based on the matching result of each sub-result in the actual recognition result and each sub-result in the standard recognition result, and then the testing result of the intelligent device can be determined according to the overall matching result of the actual recognition result and the standard recognition result, so that the standard recognition result and the actual recognition result can be automatically matched, and the voice recognition testing efficiency is greatly improved. Meanwhile, the method dynamically determines the overall matching result of the actual recognition result and the standard recognition result based on the matching result of each sub-result in the actual recognition result and each sub-result in the standard recognition result, so that the correctness of the matching result can be ensured.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is an exemplary system architecture diagram of a voice testing method for a smart device according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of a voice testing method for an intelligent device according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating an example of the matching result;
fig. 4 is a schematic flowchart of a voice testing method for an intelligent device according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating the determination of a matching set and the matching score of the matching set;
fig. 6 is a schematic flowchart of a voice testing method for an intelligent device according to an embodiment of the present invention;
fig. 7 is a block diagram of a voice testing apparatus of an intelligent device according to an embodiment of the present invention;
fig. 8 is a block diagram of a voice testing apparatus of an intelligent device according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of an electronic device 900 according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the prior art, the standard recognition result and the actual recognition result are matched by manual comparison, which makes the voice recognition test inefficient. Moreover, when there are many standard recognition results, matching errors can also occur, resulting in low test accuracy.
Based on the above problems, embodiments of the present invention provide a voice testing method for an intelligent device, after an actual recognition result of the intelligent device for a testing voice is obtained, based on a matching result of each sub-result in an actual recognition result and each sub-result in a standard recognition result, an overall matching result of the actual recognition result and the standard recognition result can be determined, and then a testing result of the intelligent device can be determined according to the overall matching result of the actual recognition result and the standard recognition result, so that automatic matching of the standard recognition result and the actual recognition result is achieved, and efficiency of a voice recognition test is greatly improved. Meanwhile, the method dynamically determines the overall matching result of the actual recognition result and the standard recognition result based on the matching result of each sub-result in the actual recognition result and each sub-result in the standard recognition result, so that the correctness of the matching result can be ensured.
Fig. 1 is an exemplary system architecture diagram of a voice testing method for a smart device according to an embodiment of the present invention, as shown in fig. 1, the method relates to a smart device to be tested and a testing device. The test equipment acquires the actual recognition result of the test voice from the tested intelligent equipment, performs matching processing by using the method of the embodiment of the invention, and determines the test result.
The testing device may be a PC connected to the tested intelligent device, and the intelligent device and the testing device may be connected in a wired or wireless manner, which is not specifically limited in this embodiment of the present invention.
In addition, the tested smart device may be a smart device with a voice recognition function, such as a smart speaker, a smart phone, or a smart watch, and the specific form of the tested smart device is not particularly limited in the embodiments of the present invention.
Fig. 2 is a schematic flow chart of a voice testing method for an intelligent device according to an embodiment of the present invention, where an execution subject of the method is the testing device, as shown in fig. 2, the method includes:
s201, obtaining an actual recognition result of the intelligent device on the test voice, wherein the actual recognition result comprises a plurality of first sub-results.
The smart device in this embodiment and the following embodiments refers to a tested smart device.
The actual recognition result refers to a recognition result actually obtained after the tested intelligent device performs voice recognition on the test voice.
Optionally, the test voice may be spoken to the intelligent device by a human tester in a live test mode; alternatively, human speech may be recorded in advance to generate audio, and the test device may input the audio of the test voice to the intelligent device, which then recognizes it.
Optionally, the test voice may include multiple sentences of speech separated by certain time intervals. After the intelligent device recognizes the speech, the obtained actual recognition result may include a plurality of first sub-results, each first sub-result being the text obtained after the intelligent device recognizes one sentence of speech.
Illustratively, one of the first sub-results in the actual recognition result is the text "no need to take an umbrella when going out today".
S202, determining a matching result of the actual recognition result and the standard recognition result according to the matching result of each first sub-result in the actual recognition result and each second sub-result in the standard recognition result corresponding to the test voice.
Optionally, the test speech is a preset speech, and the standard recognition result corresponding to the test speech is a correct recognition result that should be recognized for the test speech. The test speech includes a plurality of sentences of speech, and correspondingly, the standard recognition result corresponding to the test speech may include a plurality of sentences of text corresponding to the plurality of sentences of speech, each sentence of text being a second sub-result.
Illustratively, the test voice comprises 1000 sentences of speech, one of which is the spoken sentence "no need to take an umbrella when going out today"; the second sub-result corresponding to this sentence is the text "no need to take an umbrella when going out today".
The standard recognition result is generated and stored in advance according to the test voice.
The plurality of second sub-results in the standard recognition result are matched against the plurality of first sub-results in the actual recognition result, so that a matching result between every pair of sub-results is obtained. Based on these pairwise matching results, this embodiment dynamically determines the matching result between the standard recognition result and the actual recognition result, which comprises the correspondence between each sub-result in the standard recognition result and each sub-result in the actual recognition result.
Fig. 3 is an exemplary diagram of the matching result. As shown in Fig. 3, the standard recognition result contains A, B, C, D, E, F, G in chronological order, while the actual recognition result contains A, A, B, G, D, E, G, F in chronological order; that is, the actual recognition result may contain duplicates, and its order may differ from that of the standard recognition result. The matching process of this embodiment removes the duplicates, fills a null value where a sub-result of the standard recognition result was not recognized, and removes erroneous recognition results, yielding the matching result shown in Fig. 3: A matches A, B matches B, C has no match, and so on.
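To make the example in Fig. 3 concrete, the short sketch below (hypothetical Python, not taken from the patent; the letters simply stand for the recognized text of each sub-result) shows the two lists the matching step consumes and the shape of the alignment it is expected to produce.

```python
# Illustrative encoding of the Fig. 3 example; each letter stands for the text
# of one sub-result, listed in chronological order.
standard_results = ["A", "B", "C", "D", "E", "F", "G"]      # second sub-results
actual_results = ["A", "A", "B", "G", "D", "E", "G", "F"]   # first sub-results (duplicated, reordered)

# Expected shape of the matching result: duplicates in actual_results are
# dropped, and standard sub-results that were never recognized are paired
# with a null value, e.g. ("A", "A"), ("B", "B"), ("C", None), and so on.
```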
And S203, determining a voice test result of the intelligent equipment according to the matching result of the actual recognition result and the standard recognition result.
Optionally, the voice test result of the intelligent device may include indexes such as word accuracy, sentence accuracy, and the like.
As an optional implementation manner, when determining the voice test result of the smart device according to the matching result between the actual recognition result and the standard recognition result, the word accuracy and/or sentence accuracy of the smart device may be determined according to the matching result between the actual recognition result and the standard recognition result.
Taking a smart speaker as an example of the intelligent device, after the test device obtains the matching result between the actual recognition result of the smart speaker and the standard recognition result, it can perform statistical processing on the correspondence, contained in that matching result, between each sub-result in the standard recognition result and each sub-result in the actual recognition result, so as to obtain the word accuracy and/or sentence accuracy of the smart speaker.
Referring to the example shown in Fig. 3, this embodiment determines that A in the actual recognition result matches A in the standard recognition result, B matches B, C has no match, and so on. The word accuracy of A in the actual recognition result relative to A in the standard recognition result can then be calculated, likewise for B relative to B, and so on; the per-sub-result word accuracies are then averaged with equal weight to obtain the word accuracy of the actual recognition result.
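A minimal sketch of this statistics step is shown below, assuming the matching result is a list of (standard text, actual text or None) pairs. The per-pair word accuracy is approximated here with a character-level similarity ratio, which is an assumption; the patent only specifies that the per-pair accuracies are averaged with equal weight.

```python
from difflib import SequenceMatcher


def accuracy(matching_result):
    """Compute word accuracy and sentence accuracy from a matching result.

    matching_result: list of (standard_text, actual_text or None) pairs.
    Word accuracy per pair is approximated by a character-level similarity
    ratio (an assumption); sentence accuracy is the fraction of standard
    sentences that were recognized exactly.
    """
    if not matching_result:
        return 0.0, 0.0
    word_accuracies, exact_count = [], 0
    for standard_text, actual_text in matching_result:
        actual_text = actual_text or ""
        word_accuracies.append(SequenceMatcher(None, standard_text, actual_text).ratio())
        exact_count += int(standard_text == actual_text)
    return sum(word_accuracies) / len(word_accuracies), exact_count / len(matching_result)
```

For the Fig. 3 example, a pair such as ("C", None) contributes zero to both metrics.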
In this embodiment, after the actual recognition result of the test voice by the intelligent device is obtained, based on the matching result of each sub-result in the actual recognition result and each sub-result in the standard recognition result, the overall matching result of the actual recognition result and the standard recognition result can be determined, and then the test result of the intelligent device can be determined according to the overall matching result of the actual recognition result and the standard recognition result, so that the automatic matching of the standard recognition result and the actual recognition result is realized, and the efficiency of the voice recognition test is greatly improved. Meanwhile, the method dynamically determines the overall matching result of the actual recognition result and the standard recognition result based on the matching result of each sub-result in the actual recognition result and each sub-result in the standard recognition result, so that the correctness of the matching result can be ensured.
Fig. 4 is a schematic flowchart of a voice testing method for an intelligent device according to an embodiment of the present invention, and as shown in fig. 4, an optional implementation manner of step S202 includes:
s401, according to the matching results of the first sub-results and the second sub-results, determining a plurality of matching sets and the matching score of each matching set, wherein each matching set comprises a plurality of matching pairs, and each matching pair consists of one first sub-result and one second sub-result.
Each matching set represents one possible combination of correspondences between the sub-results of the standard recognition result and the sub-results of the actual recognition result.
Illustratively, if the standard recognition result includes three second sub-results A, B and C, and the actual recognition result includes three first sub-results 1, 2 and 3, the following two matching sets, among others, can be formed:
{ A-1, B-2, C-3} is a matching set, i.e., A matches 1 and is a matching pair, B matches 2 and is a matching pair, and C matches 3 and is a matching pair.
{ A-2, B-3, C-1} is another set of matches, i.e., A matches 2 for a matching pair, B matches 3 for a matching pair, and C matches 1 for a matching pair.
Each matching pair contained in each matching set has a matching result, and the matching score of each matching set can be obtained based on the matching result of each matching pair. The higher the matching score of a certain matching set is, the better the matching relationship of the matching pairs contained in the matching set is.
As an alternative implementation, if the matching result of the first sub-results and the second sub-results includes the matching score of the first sub-results and the second sub-results, a plurality of matching sets and the matching score of each matching set may be determined according to the following process:
and determining a second matching set according to the determined matching scores of the first matching set and the matching scores of the current first sub-result and the current second sub-result.
The current first sub-result and the current second sub-result are a first sub-result and a second sub-result included in a matching pair subsequent to a matching pair in which the first sub-result with the latest time is generated in the first matching set.
In this manner, each matching set is determined by the matching score of the previous matching set that has been determined and the most recent matching pair based on the previous matching set.
S402, determining a matching result of the actual recognition result and the standard recognition result according to the matching set with the maximum matching score.
Each matching set determined in S401 has a matching score, and a matching set with the largest matching score can be selected from the matching sets, where the matching relationship of the matching pairs included in the matching set is the optimal matching relationship.
In an alternative manner, a matching result of the actual recognition result and the standard recognition result may be generated according to a matching pair included in the matching set with the largest matching score.
For example, if the matching set with the largest matching score is { A-1, B-2, C-3}, a matching result comprising the matching relationship may be generated, and the matching result comprises: a matches 1, B matches 2, and C matches 3.
The above process of determining a matching set is described below by way of an example.
Fig. 5 is a schematic diagram of determining a matching set and the matching score of the matching set. As shown in Fig. 5, assume the test voice corresponds to 7 second sub-results, i.e. the standard recognition result contains 7 standard sentences, while the intelligent device actually recognizes 5 first sub-results, i.e. the actual recognition result contains 5 recognized sentences. The table on the left of Fig. 5 shows the matching scores of the pairwise matches between the 7 second sub-results and the 5 first sub-results, and each entry in the table on the right of Fig. 5 shows the matching score of the corresponding matching set.
Let the matching score of the second matching set be represented by c[i][j], where i denotes the i-th of the 5 first sub-results and j denotes the j-th of the 7 second sub-results. Then c[i][j] can be calculated by the following formula (1):
c[i][j] = max(c[i-1][j-1] + p[i][j], c[i-1][j])    (1)
Here c[i-1][j-1] and c[i-1][j] represent the matching scores of first matching sets determined before the second matching set, and p[i][j] represents the matching score of the current first sub-result and the current second sub-result.
Referring to Fig. 5, assume i = 3 and j = 5. Then c[i-1][j-1] = c[2][4] = 12, c[i-1][j] = c[2][5] = 13, and p[i][j] = p[3][5] = 4, so according to formula (1), c[3][5] = max(12 + 4, 13) = 16.
Based on formula (1), the matching score of each matching set in Fig. 5 can be obtained. The maximum matching score, namely 27, is selected, and the matching pairs included in the matching set corresponding to that score are determined by path backtracking, as sketched below.
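The recurrence of formula (1) and the path backtracking can be sketched in Python as follows. This is one reading of the formula as stated, not the patent's reference implementation; the function name is invented for illustration, and the pairwise scores p are assumed to have been computed beforehand (for example with the longest-common-subsequence score described below).

```python
def best_matching(p):
    """Find the matching set with the largest matching score.

    p: m x n matrix in which p[i][j] is the matching score between the
    (i+1)-th first sub-result (actual) and the (j+1)-th second sub-result
    (standard). Returns the largest matching score and the matching pairs
    of the corresponding matching set as 1-based (i, j) indices.
    """
    m, n = len(p), len(p[0])
    c = [[0] * (n + 1) for _ in range(m + 1)]   # c[i][j] as in formula (1), 1-based
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            c[i][j] = max(c[i - 1][j - 1] + p[i - 1][j - 1], c[i - 1][j])

    # Cell holding the maximum matching score.
    best_i, best_j = max(
        ((i, j) for i in range(1, m + 1) for j in range(1, n + 1)),
        key=lambda ij: c[ij[0]][ij[1]],
    )

    # Path backtracking: recover the matching pairs of the best matching set.
    pairs, i, j = [], best_i, best_j
    while i > 0 and j > 0:
        if c[i][j] == c[i - 1][j - 1] + p[i - 1][j - 1]:
            pairs.append((i, j))   # first sub-result i matched with second sub-result j
            i, j = i - 1, j - 1
        else:
            i -= 1                 # first sub-result i left unmatched
    return c[best_i][best_j], pairs[::-1]
```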
Fig. 6 is a flowchart of a voice testing method for an intelligent device according to an embodiment of the present invention, and as shown in fig. 6, before step S202, a matching result between each first sub-result and each second sub-result may be determined by the following process:
s601, sequencing each first sub-result in the actual recognition result according to the generation time.
Because the standard recognition result is generated in advance from the test voice, its sub-results are already ordered by time. Therefore, in this step, the first sub-results in the actual recognition result are also sorted by generation time, which makes matching the standard recognition result against the actual recognition result more efficient.
S602, determining the matching result of each first sub-result and each second sub-result according to the longest common subsequence between each first sub-result and each second sub-result after sorting.
The longest common subsequence is the longest of the subsequences contained in both the first sub-result and the second sub-result.
For example, assume the first sub-result is "no need to take an umbrella when going out today" and the second sub-result is "no need to open an umbrella when going out tomorrow". The two sub-results share several common subsequences, and the longest of these is taken as the longest common subsequence.
If the matching result takes the form of the matching score described above, the length of the longest common subsequence may be used as the matching score; in the above example, the matching score of the first sub-result and the second sub-result is 5.
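For illustration, the longest-common-subsequence score can be computed with the classic dynamic-programming recurrence; the sketch below (helper names are hypothetical) produces the pairwise score matrix p that a matching-set computation such as the one sketched earlier would consume.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of two strings (classic DP)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ch_a in enumerate(a, 1):
        for j, ch_b in enumerate(b, 1):
            if ch_a == ch_b:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]


def pairwise_scores(first_sub_results, second_sub_results):
    """Matching score p[i][j] for every (first sub-result, second sub-result) pair,
    with the first sub-results already sorted by generation time (step S601)."""
    return [[lcs_length(actual, standard) for standard in second_sub_results]
            for actual in first_sub_results]
```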
In step S201, the testing device may obtain an actual recognition result of the testing voice by the intelligent device. As an alternative, the test device may obtain the actual recognition result through a test log.
Specifically, the test device may first obtain a test log generated when the intelligent device recognizes the test voice, and then, according to a preset keyword and preset regular expression information, screen out an actual recognition result of the intelligent device for the test voice from the test log.
Optionally, when recording the log related to the voice recognition, the intelligent device may first record the corresponding keyword, and therefore, the testing device may determine whether a certain log is a log of the voice recognition result according to the keyword. In addition, the log may record, in addition to the text of the voice recognition result, information such as the time of the result generation, which may be obtained by matching with a regular expression.
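A minimal sketch of this log-screening step is given below. The log line format, the keyword asr_final_result and the regular expression are hypothetical placeholders; in practice they would be configured to match the logging conventions of the device under test.

```python
import re

# Assumed log line format, e.g.:
# [2019-06-28 10:15:32.123] asr_final_result: no need to take an umbrella when going out today
ASR_KEYWORD = "asr_final_result"   # hypothetical keyword marking a recognition-result line
ASR_PATTERN = re.compile(r"\[(?P<time>[^\]]+)\]\s*asr_final_result:\s*(?P<text>.+)")


def extract_actual_results(log_path):
    """Screen the first sub-results (generation time, recognized text) out of a test log."""
    results = []
    with open(log_path, encoding="utf-8") as log_file:
        for line in log_file:
            if ASR_KEYWORD not in line:          # keyword pre-filter
                continue
            match = ASR_PATTERN.search(line)     # regular-expression extraction
            if match:
                results.append((match.group("time"), match.group("text").strip()))
    results.sort(key=lambda item: item[0])       # order by generation time (step S601)
    return results
```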
Fig. 7 is a block diagram of a voice testing apparatus for an intelligent device according to an embodiment of the present invention, and as shown in fig. 7, the apparatus includes:
the obtaining module 701 is configured to obtain an actual recognition result of the testing speech by the intelligent device, where the actual recognition result includes multiple first sub-results.
A first determining module 702, configured to determine a matching result between the actual recognition result and the standard recognition result according to a matching result between each first sub-result in the actual recognition result and each second sub-result in the standard recognition result corresponding to the test voice.
A second determining module 703 is configured to determine a voice test result of the intelligent device according to a matching result between the actual recognition result and the standard recognition result.
In another embodiment, the first determining module 702 is specifically configured to:
and determining a plurality of matching sets and a matching score of each matching set according to the matching result of each first sub-result and each second sub-result, wherein each matching set comprises a plurality of matching pairs, and each matching pair consists of one first sub-result and one second sub-result.
And determining the matching result of the actual recognition result and the standard recognition result according to the matching set with the maximum matching score.
In another embodiment, the matching result of each first sub-result and each second sub-result includes a matching score of each first sub-result and each second sub-result;
the first determining module 702 is specifically configured to:
determining a second matching set according to the determined matching scores of the first matching set and the matching scores of the current first sub-result and the current second sub-result;
the current first sub-result and the current second sub-result are the first sub-result and the second sub-result included in the matching pair that follows the matching pair containing the most recently generated first sub-result in the first matching set.
In another embodiment, the first determining module 702 is specifically configured to:
and generating a matching result of the actual recognition result and the standard recognition result according to the matching pair included in the matching set with the maximum matching score.
Fig. 8 is a block diagram of a voice testing apparatus for an intelligent device according to an embodiment of the present invention, and as shown in fig. 8, the apparatus further includes:
and the sorting module 704 is configured to sort the first sub-results in the actual recognition result according to the generation time.
The third determining module 705 is configured to determine a matching result between each first sub-result and each second sub-result according to the longest common subsequence between each sorted first sub-result and each second sub-result.
In another embodiment, the obtaining module 701 is specifically configured to:
acquiring a test log generated when the intelligent equipment identifies the test voice;
and screening out an actual recognition result of the intelligent equipment to the test voice from the test log according to preset keywords and preset regular expression information.
In another embodiment, the second determining module 703 is specifically configured to:
and determining the word accuracy and/or sentence accuracy of the intelligent equipment according to the matching result of the actual recognition result and the standard recognition result.
It should be noted that the division of the modules of the above apparatus is only a logical division, and the actual implementation may be wholly or partially integrated into one physical entity, or may be physically separated. And these modules can be realized in the form of software called by processing element; or may be implemented entirely in hardware; and part of the modules can be realized in the form of calling software by the processing element, and part of the modules can be realized in the form of hardware. For example, the determining module may be a processing element separately set up, or may be implemented by being integrated in a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code, and the function of the determining module is called and executed by a processing element of the apparatus. Other modules are implemented similarly. In addition, all or part of the modules can be integrated together or can be independently realized. The processing element described herein may be an integrated circuit having signal processing capabilities. In implementation, each step of the above method or each module above may be implemented by an integrated logic circuit of hardware in a processor element or an instruction in the form of software.
For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more Application Specific Integrated Circuits (ASICs), one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs). For another example, when some of the above modules are implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor that can call program code. As another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SOC).
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, cause the processes or functions described in accordance with the embodiments of the invention to occur, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website site, computer, server, or data center to another website site, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
Fig. 9 is a schematic structural diagram of an electronic device 900 according to an embodiment of the present invention. As shown in fig. 9, the electronic device may include: the system comprises a processor 91, a memory 92, a communication interface 93 and a system bus 94, wherein the memory 92 and the communication interface 93 are connected with the processor 91 through the system bus 94 and complete mutual communication, the memory 92 is used for storing computer execution instructions, the communication interface 93 is used for communicating with other devices, and the processor 91 implements the scheme of the embodiment shown in fig. 1 to 6 when executing the computer program.
The system bus mentioned in fig. 9 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The system bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus. The communication interface is used for realizing communication between the database access device and other equipment (such as a client, a read-write library and a read-only library). The memory may comprise Random Access Memory (RAM) and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
The processor may be a general-purpose processor, including a central processing unit CPU, a Network Processor (NP), and the like; but also a digital signal processor DSP, an application specific integrated circuit ASIC, a field programmable gate array FPGA or other programmable logic device, discrete gate or transistor logic, discrete hardware components.
Optionally, an embodiment of the present invention further provides a storage medium, where instructions are stored in the storage medium, and when the storage medium is run on a computer, the storage medium causes the computer to execute the method according to the embodiment shown in fig. 1 to 6.
Optionally, an embodiment of the present invention further provides a chip for executing the instruction, where the chip is configured to execute the method in the embodiment shown in fig. 1 to 6.
Embodiments of the present invention further provide a program product, where the program product includes a computer program, where the computer program is stored in a storage medium, and at least one processor may read the computer program from the storage medium, and when the at least one processor executes the computer program, the at least one processor may implement the method in the embodiments shown in fig. 1 to 6.
In the embodiments of the present invention, "at least one" means one or more, "a plurality" means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone, wherein A and B can be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship; in the formula, the character "/" indicates that the preceding and following related objects are in a relationship of "division". "at least one of the following" or similar expressions refer to any combination of these items, including any combination of the singular or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or multiple.
It is to be understood that the various numerical references referred to in the embodiments of the present invention are merely for convenience of description and distinction and are not intended to limit the scope of the embodiments of the present invention.
It should be understood that, in the embodiment of the present invention, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation on the implementation process of the embodiment of the present invention.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. A voice test method of intelligent equipment is characterized by comprising the following steps:
acquiring an actual recognition result of the intelligent equipment on the test voice, wherein the actual recognition result comprises a plurality of first sub-results;
determining a plurality of matching sets and a matching score of each matching set according to a matching result of each first sub-result and each second sub-result, wherein each matching set comprises a plurality of matching pairs, each matching pair consists of one first sub-result and one second sub-result, and the second sub-result is a result in the standard recognition result corresponding to the test voice;
determining a matching result of the actual recognition result and the standard recognition result according to the matching set with the maximum matching score;
and determining a voice test result of the intelligent equipment according to the matching result of the actual recognition result and the standard recognition result.
2. The method of claim 1, wherein the matching result of each first sub-result and each second sub-result comprises a matching score of each first sub-result and each second sub-result;
the determining a plurality of matching sets and a matching score of each matching set according to the matching result of each first sub-result and each second sub-result includes:
determining a second matching set according to the determined matching scores of the first matching set and the matching scores of the current first sub-result and the current second sub-result;
the current first sub-result and the current second sub-result are the first sub-result and the second sub-result included in the matching pair that follows the matching pair containing the most recently generated first sub-result in the first matching set.
3. The method according to claim 1 or 2, wherein the determining the matching result of the actual recognition result and the standard recognition result according to the matching set with the largest matching score comprises:
and generating a matching result of the actual recognition result and the standard recognition result according to the matching pair included in the matching set with the maximum matching score.
4. The method according to claim 3, wherein before determining the matching result between the actual recognition result and the standard recognition result according to the matching result between each first sub-result in the actual recognition result and each second sub-result in the standard recognition result corresponding to the test speech, the method further comprises:
sequencing each first sub-result in the actual recognition result according to the generation time;
and determining the matching result of each first sub-result and each second sub-result according to the longest common subsequence between each first sub-result and each second sub-result after sorting.
5. The method of claim 4, wherein obtaining the actual recognition result of the test speech by the smart device comprises:
acquiring a test log generated when the intelligent equipment identifies the test voice;
and screening out an actual recognition result of the intelligent equipment to the test voice from the test log according to preset keywords and preset regular expression information.
6. The method according to claim 5, wherein the determining the test result of the smart device according to the matching result of the actual recognition result and the standard recognition result comprises:
and determining the word accuracy and/or sentence accuracy of the intelligent equipment according to the matching result of the actual recognition result and the standard recognition result.
7. A speech testing device of intelligent equipment is characterized by comprising:
the intelligent device comprises an acquisition module, a processing module and a processing module, wherein the acquisition module is used for acquiring an actual recognition result of the intelligent device on the test voice, and the actual recognition result comprises a plurality of first sub-results;
a first determining module, configured to determine, according to the matching results of each first sub-result and each second sub-result, a plurality of matching sets and a matching score of each matching set, wherein each matching set includes a plurality of matching pairs, each matching pair consists of one first sub-result and one second sub-result, and the second sub-result is a result in the standard recognition result corresponding to the test voice; and to determine a matching result of the actual recognition result and the standard recognition result according to the matching set with the largest matching score;
and the second determining module is used for determining the voice test result of the intelligent equipment according to the matching result of the actual recognition result and the standard recognition result.
8. An electronic device, comprising:
a memory for storing program instructions;
a processor for invoking and executing program instructions in said memory for performing the method steps of any of claims 1-6.
9. A readable storage medium, characterized in that a computer program is stored in the readable storage medium for performing the method of any of claims 1-6.