CN110928583A - Terminal awakening method, device, equipment and computer readable storage medium

Info

Publication number: CN110928583A
Application number: CN201910959549.6A
Authority: CN
Inventors: 姜梦一; 马颖江; 张轶
Original assignee: Gree Electric Appliances Inc of Zhuhai
Current assignee: Gree Electric Appliances Inc of Zhuhai
Priority date: 2019-10-10
Filing date: 2019-10-10
Publication date: 2020-03-27
Anticipated expiration: 2039-10-10
Also published as: CN110928583B

Abstract

The invention discloses a terminal wake-up method, a terminal wake-up apparatus, a terminal device and a computer-readable storage medium. The method comprises: collecting fuzzy voice information and biometric information of a user; determining a voice matching degree between the fuzzy voice information and preset standard voice information, and a feature similarity between the biometric information and preset standard biometric information; determining a confidence of the user according to the voice matching degree and the feature similarity; and, if the confidence of the user is greater than a preset confidence threshold, executing a terminal wake-up operation. When the user forgets the wake-up word or remembers it inaccurately, the confidence of the user is determined by voice matching supplemented with biometric feature matching, and the terminal device is woken up only when that confidence is greater than the confidence threshold. This improves the success rate of terminal wake-up, keeps the wake-up process convenient to operate, and improves the user experience.

Description

Terminal awakening method, device, equipment and computer readable storage medium

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a terminal wake-up method, apparatus, device, and computer-readable storage medium.

Background

After a terminal enters sleep mode, it must be woken up before it can return to the working state. The traditional way to wake up a terminal is to press a physical button.

With the continuous development of voice recognition technology, waking up a terminal by voice has given users a new experience. However, existing voice-based wake-up methods have a low success rate. For example, voice recognition is performed on the wake-up word spoken by the user, and the terminal wake-up operation is executed only when the wake-up word is determined to be correct. When the user forgets the wake-up word or remembers it only vaguely, the terminal cannot be woken up, which is a great inconvenience.

Disclosure of Invention

The main object of the invention is to provide a terminal wake-up method, a terminal wake-up apparatus, a terminal device and a computer-readable storage medium, so as to solve the prior-art problem that waking up a terminal by voice recognition has a low success rate.

To address the above technical problem, the invention adopts the following technical solutions:

The invention provides a terminal wake-up method, comprising: collecting fuzzy voice information and biometric information of a user; determining a voice matching degree between the fuzzy voice information and preset standard voice information, and a feature similarity between the biometric information and preset standard biometric information; determining the confidence of the user according to the voice matching degree and the feature similarity; and, if the confidence of the user is greater than a preset confidence threshold, executing a terminal wake-up operation.

Wherein the voice matching degree comprises at least one of the following: character similarity and semantic relevance.

Wherein the determining the confidence level of the user according to the voice matching degree and the feature similarity comprises: taking the weighted sum of the character similarity, the semantic relevance and the feature similarity as the confidence of the user; or determining reference values corresponding to the character similarity, the semantic relevance and the feature similarity respectively, and taking the weighted sum of the reference values corresponding to the character similarity, the semantic relevance and the feature similarity respectively as the confidence of the user.

Wherein the method further comprises: if the character similarity between the fuzzy voice information and the standard voice information is zero and the semantic relevance between them is smaller than a preset relevance threshold, prohibiting the terminal wake-up operation; and if the confidence of the user is less than or equal to the confidence threshold, prohibiting the terminal wake-up operation.

Wherein collecting the fuzzy voice information and biometric information of the user comprises: collecting, multiple times, fuzzy voice information of the user and at least one piece of biometric information of the user. Determining the voice matching degree and the feature similarity comprises: determining a voice matching degree between each piece of fuzzy voice information and the standard voice information, and a feature similarity between each piece of biometric information and the standard biometric information of the corresponding kind. Determining the confidence of the user according to the voice matching degree and the feature similarity comprises: for each voice matching degree, determining a candidate confidence according to that voice matching degree and each feature similarity; and selecting the largest of the determined candidate confidences as the confidence of the user, or taking the average of the candidate confidences as the confidence of the user.

Before the fuzzy voice information and the biological characteristic information of the user are collected, the method further comprises the following steps: collecting accurate voice information of a user; if the accurate voice information is matched with the standard voice information, executing terminal awakening operation; and if the accurate voice information acquired for the continuous preset times is not matched with the standard voice information, acquiring fuzzy voice information and biological characteristic information of the user.

Wherein the kinds of biometric information comprise at least one of the following: face feature information, tone feature information, iris feature information, interpupillary distance feature information, and voiceprint feature information.

The invention also provides a terminal awakening device, which comprises: the acquisition module is used for acquiring fuzzy voice information and biological characteristic information of a user; the first determining module is used for determining the voice matching degree between the fuzzy voice information and preset standard voice information and the feature similarity between the biological feature information and preset standard biological feature information; the second determining module is used for determining the confidence of the user according to the voice matching degree and the feature similarity; and the awakening module is used for executing terminal awakening operation under the condition that the confidence coefficient of the user is greater than a preset confidence coefficient threshold value.

The invention also provides terminal awakening equipment, which comprises a processor and a memory; the processor is used for executing the terminal awakening program stored in the memory so as to realize the terminal awakening method.

The present invention also provides a computer-readable storage medium storing one or more programs, which are executable by one or more processors to implement the above-described terminal wake-up method.

The invention has the following beneficial effects:

In the invention, when the user forgets the wake-up word or remembers it inaccurately, the confidence of the user is determined by voice matching supplemented with biometric feature matching, and the terminal device is woken up only when that confidence is greater than the confidence threshold. This improves the success rate of terminal wake-up, keeps the wake-up process convenient to operate, and improves the user experience.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:

Fig. 1 is a flowchart of a terminal wake-up method according to an embodiment of the present invention;

Fig. 2 is a detailed flowchart of a terminal wake-up method according to an embodiment of the present invention;

Fig. 3 is a structural diagram of a terminal wake-up apparatus according to an embodiment of the present invention;

Fig. 4 is a structural diagram of a terminal wake-up device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings and specific embodiments.

According to an embodiment of the invention, a terminal wake-up method is provided. Fig. 1 is a flowchart illustrating a terminal wake-up method according to an embodiment of the invention.

Step S110, collecting fuzzy voice information and biological characteristic information of the user.

Specifically, the fuzzy speech information of the user and the at least one biometric information of the user may be collected multiple times.

The kinds of biometric information comprise at least one of the following: face feature information, tone feature information, iris feature information, interpupillary distance feature information, and voiceprint feature information.

Step S120, determining a voice matching degree between the fuzzy voice information and preset standard voice information, and a feature similarity between the biometric information and preset standard biometric information.

The standard voice information is the wake-up word entered in advance by a legitimate user.

The standard biometric information is biometric information of a legitimate user collected in advance.

Specifically, if a plurality of pieces of fuzzy speech information and a plurality of pieces of biometric information are collected, a speech matching degree between each piece of fuzzy speech information and the standard speech information and a feature similarity between each piece of biometric information and a corresponding kind of standard biometric information are determined.

Further, the voice matching degree comprises at least one of the following: character similarity and semantic relevance.

The character similarity is the degree to which the characters of the fuzzy voice information match those of the standard voice information. It is determined using a preset character similarity algorithm. For example, let n be the number of characters in the standard voice information and m the number of characters that the fuzzy voice information and the standard voice information have in common; the character similarity is then m/n. With a different character similarity algorithm, the identical character segments shared by the fuzzy voice information and the standard voice information can be detected and the total number k of characters contained in all the identical segments determined; the character similarity is then k/n.
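The m/n ratio can be sketched in Python as follows. This is a minimal illustration that assumes both pieces of voice information have already been transcribed to text; the function name `char_similarity_shared`, the multiset reading of "characters in common", and the sample strings are assumptions made here, since the text does not fix a particular counting rule.

```python
from collections import Counter


def char_similarity_shared(fuzzy: str, standard: str) -> float:
    """Return m/n: characters shared with the standard wake-up word, over its length n."""
    n = len(standard)
    if n == 0:
        raise ValueError("standard wake-up word must not be empty")
    # Multiset intersection: each character of the standard word is matched at most once
    # (an assumed interpretation of "characters in common").
    shared = Counter(fuzzy) & Counter(standard)
    m = sum(shared.values())
    return m / n
```

With the hypothetical inputs `"hello tree"` (fuzzy) and `"hello gree"` (standard), nine of the ten standard characters are shared, so the function returns 0.9.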

The semantic relevance is the degree to which the fuzzy voice information and the standard voice information are related in meaning. It is determined using a preset semantic relevance algorithm, for example a tree-based semantic relevance algorithm.

Step S130, determining the confidence of the user according to the voice matching degree and the feature similarity.

Specifically, if several pieces of fuzzy voice information and several pieces of biometric information are collected, so that several voice matching degrees and several feature similarities are obtained, then for each voice matching degree a candidate confidence is determined according to that voice matching degree and each feature similarity; the largest of the determined candidate confidences is selected as the confidence of the user, or the average of the candidate confidences is taken as the confidence of the user.

Further, when the voice matching degree includes both character similarity and semantic relevance, determining a candidate confidence for each voice matching degree according to that voice matching degree and each feature similarity comprises: taking the weighted sum of the character similarity, the semantic relevance and each feature similarity as the candidate confidence; or determining reference values corresponding to the character similarity, the semantic relevance and each feature similarity respectively, and taking the weighted sum of these reference values as the candidate confidence.
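The choice between the maximum and the average of the candidate confidences can be sketched as follows. This is an illustrative Python sketch only: the function names, the `strategy` parameter, and the use of a plain sum (all weights equal to 1) to combine one voice matching degree with the feature similarities are assumptions, not details fixed by the text.

```python
from statistics import mean
from typing import Sequence


def candidate_confidence(voice_match: float, feature_sims: Sequence[float]) -> float:
    """Combine one voice matching degree with every feature similarity (assumed weights of 1)."""
    return voice_match + sum(feature_sims)


def user_confidence(
    voice_matches: Sequence[float],
    feature_sims: Sequence[float],
    strategy: str = "max",
) -> float:
    """Reduce the candidate confidences to one user confidence by maximum or average."""
    candidates = [candidate_confidence(v, feature_sims) for v in voice_matches]
    if not candidates:
        raise ValueError("at least one voice matching degree is required")
    if strategy == "max":
        return max(candidates)
    if strategy == "mean":
        return mean(candidates)
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Taking the maximum favours the user's best attempt, while the average smooths over all attempts; which is preferable depends on how strict the wake-up policy should be.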

Step S140, if the confidence of the user is greater than a preset confidence threshold, performing a terminal wake-up operation.

The confidence threshold may be used to gauge whether the user is a legitimate user. The confidence threshold may be an empirical value or a value obtained experimentally.

If the confidence of the user is greater than the preset confidence threshold, the user is regarded as a legitimate user, and the terminal can be woken up for the user so that it returns to the working state.

If the confidence of the user is less than or equal to the confidence threshold, the user is regarded as not being a legitimate user, and the terminal wake-up operation is prohibited.

In this embodiment, when the user forgets the wake-up word or remembers it inaccurately, the confidence of the user is determined by voice matching supplemented with feature matching, and the terminal device is woken up only when that confidence is greater than the confidence threshold. This improves the accuracy and security of terminal wake-up as well as its success rate, keeps the wake-up process convenient to operate, and gives a good user experience.

A more specific embodiment is provided below to further describe the terminal wake-up method of the present invention.

Fig. 2 is a specific flowchart of a terminal wake-up method according to an embodiment of the invention.

Step S210, collecting accurate voice information of the user.

Before the accurate voice information of the user is collected, the method further comprises: using a liveness detection technique to detect whether the input object is a living body; if so, the input object is determined to be a user, and collection of the user's accurate voice information begins.

Step S220, judging whether the accurate voice information is matched with the standard voice information; if yes, go to step S280; if not, step S230 is performed.

If the accurate voice information is identical to the standard voice information, the accurate voice information is judged to match the standard voice information; otherwise it is judged not to match.

Step S230, if the accurate voice information is not matched with the standard voice information, acquiring fuzzy voice information and biological characteristic information of the user.

Of course, the user may also be given several attempts to provide accurate voice information; if the accurate voice information collected a preset number of consecutive times fails to match the standard voice information, the fuzzy voice information and biometric information of the user are collected.

Further, if the accurate voice information does not match the standard voice information, the number of mismatches is accumulated, and it is judged whether the number of mismatches is greater than a preset count threshold; if yes, go to step S240; if not, step S210 is performed. For example, if the count threshold is three and the accurate voice information collected three consecutive times fails to match the standard voice information, the terminal enters fuzzy wake-up mode and starts collecting the fuzzy voice information and biometric information of the user.
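The mismatch counting can be sketched as a small state check. This is an illustrative Python sketch that follows the three-consecutive-failures example; the function name `exact_stage`, the returned labels, and the assumption that attempts arrive as transcribed text are all hypothetical.

```python
from typing import Iterable


def exact_stage(attempts: Iterable[str], standard: str, max_failures: int = 3) -> str:
    """Classify a run of exact wake-up attempts.

    Returns "wake" if an attempt equals the standard wake-up word,
    "fuzzy" once `max_failures` consecutive attempts have failed,
    and "retry" if the attempts run out before either happens.
    """
    failures = 0
    for spoken in attempts:
        if spoken == standard:
            return "wake"  # exact match: execute the wake-up operation directly
        failures += 1
        if failures >= max_failures:
            return "fuzzy"  # switch to collecting fuzzy voice and biometric information
    return "retry"  # keep collecting accurate voice information
```

Because the function returns as soon as an attempt matches, every failure counted before a "fuzzy" result is consecutive.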

Step S240, determining the character similarity between the fuzzy speech information and the standard speech information by using a preset character similarity algorithm.

The identical character segments shared by the fuzzy voice information and the standard voice information are determined, and the ratio k/n of the number k of characters contained in all the identical segments to the number n of characters in the standard voice information is taken as the character similarity.
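One way to compute the k/n ratio is sketched below. It uses Python's standard `difflib.SequenceMatcher` to find non-overlapping matching blocks as a stand-in for the shared character segments; that choice, the function name `char_similarity_segments`, the `min_len` parameter and the sample strings are assumptions for illustration, and the inputs are assumed to be transcribed text.

```python
from difflib import SequenceMatcher


def char_similarity_segments(fuzzy: str, standard: str, min_len: int = 1) -> float:
    """Return k/n: characters in shared segments over the standard word's length n."""
    n = len(standard)
    if n == 0:
        raise ValueError("standard wake-up word must not be empty")
    matcher = SequenceMatcher(None, standard, fuzzy, autojunk=False)
    # Matching blocks are non-overlapping and in order, so k never exceeds n.
    # Blocks shorter than min_len (including the zero-size sentinel) are ignored.
    k = sum(
        block.size
        for block in matcher.get_matching_blocks()
        if block.size >= min_len
    )
    return k / n
```

For the hypothetical pair `"hello tree"` and `"hello gree"`, the shared segments are `"hello "` and `"ree"`, so k = 9 and the result is 0.9; raising `min_len` to 4 keeps only `"hello "` and gives 0.6.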

Step S250, determining semantic relevance between the fuzzy voice information and the standard voice information by using a preset semantic relevance algorithm.

Step S260, determining the confidence of the user according to the character similarity, the semantic relevance and the feature similarity.

The weighted sum of the character similarity, the semantic relevance and the feature similarity is taken as the confidence of the user; or reference values corresponding to the character similarity, the semantic relevance and the feature similarity are determined respectively, and the weighted sum of these reference values is taken as the confidence of the user. The respective weights of the character similarity, the semantic relevance and the feature similarity may each be set to 1 or determined as required. Likewise, the weight of each reference value may be set to 1 or determined as required.
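The weighted sum can be sketched as follows. This is a minimal Python illustration; the function and parameter names are hypothetical, and the default weight of 1 follows the option mentioned in the text.

```python
from typing import Optional, Sequence


def weighted_confidence(
    char_sim: float,
    semantic_rel: float,
    feature_sims: Sequence[float],
    w_char: float = 1.0,
    w_sem: float = 1.0,
    feature_weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted sum of character similarity, semantic relevance and feature similarities."""
    if feature_weights is None:
        feature_weights = [1.0] * len(feature_sims)  # default: every weight is 1
    if len(feature_weights) != len(feature_sims):
        raise ValueError("one weight is required per feature similarity")
    total = w_char * char_sim + w_sem * semantic_rel
    total += sum(w * s for w, s in zip(feature_weights, feature_sims))
    return total
```

The same function applies unchanged to the reference-value variant: pass the reference values in place of the raw similarities.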

Further, a character reference value is determined according to the character similarity; a semantic reference value is determined according to the semantic relevance; a feature reference value is determined according to the feature similarity; and the weighted sum of the character reference value, the semantic reference value and the feature reference value is taken as the confidence of the user.

The character reference value may be set equal to the character similarity, the semantic reference value equal to the semantic relevance, and the feature reference value equal to the feature similarity. Alternatively, a plurality of character similarity ranges may be set, each corresponding to one character reference value, and the character reference value is determined according to the range in which the character similarity falls. Similarly, a plurality of semantic relevance ranges may be set, each corresponding to one semantic reference value, and the semantic reference value is determined according to the range in which the semantic relevance falls. For each kind of biometric information, a plurality of feature similarity ranges may be set, each corresponding to one feature reference value, and the feature reference value is determined according to the range in which the feature similarity falls.

The higher the range, the larger its reference value. That is, the greater the character similarity, the greater the character reference value; the greater the semantic relevance, the greater the semantic reference value; and the greater the feature similarity, the greater the feature reference value.
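The range-to-reference-value lookup can be sketched with the standard `bisect` module. This is an illustrative Python sketch: the function name `reference_value`, the convention that a score equal to a cut point belongs to the higher range, and the sample cut points and values are all assumptions, since the text gives no concrete ranges.

```python
from bisect import bisect_right
from typing import Sequence


def reference_value(
    score: float, boundaries: Sequence[float], values: Sequence[float]
) -> float:
    """Map a similarity score to the reference value of the range it falls in.

    `boundaries` are ascending cut points; `values` holds one reference value
    per range, so it has exactly one more entry than `boundaries`.
    """
    if len(values) != len(boundaries) + 1:
        raise ValueError("need exactly one reference value per range")
    if any(a > b for a, b in zip(values, values[1:])):
        raise ValueError("reference values must not decrease with higher ranges")
    # bisect_right returns the index of the range containing `score`.
    return values[bisect_right(boundaries, score)]
```

With hypothetical cut points `[0.3, 0.6, 0.9]` and reference values `[0, 1, 2, 3]`, a score of 0.5 maps to 1 and a score of 0.95 maps to 3.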

Step S270, judging whether the confidence of the user is greater than a preset confidence threshold; if yes, go to step S280; if not, step S290 is performed.

Step S280, if the accurate voice information is matched with the standard voice information, or the confidence of the user is greater than a preset confidence threshold, a terminal awakening operation is executed.

Step S290, if the confidence of the user is less than or equal to the confidence threshold, prohibiting the terminal wakeup operation.

In this embodiment, if the character similarity between the fuzzy voice information and the standard voice information is zero, and the semantic relevance between the fuzzy voice information and the standard voice information is smaller than a preset relevance threshold, the terminal wake-up operation is prohibited. A character similarity of zero indicates that the fuzzy voice information and the standard voice information share no identical character.
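The prohibition rule and the confidence comparison can be combined in one decision function, sketched below in Python. The function and parameter names are hypothetical, and applying the prohibition check before the threshold comparison is an assumption about ordering that the text does not spell out.

```python
def wake_decision(
    char_sim: float,
    semantic_rel: float,
    confidence: float,
    relevance_threshold: float,
    confidence_threshold: float,
) -> bool:
    """Return True if the terminal wake-up operation should be executed."""
    # Prohibition rule: no shared character and too little semantic relevance
    # blocks the wake-up regardless of the biometric contribution.
    if char_sim == 0 and semantic_rel < relevance_threshold:
        return False
    # Strict comparison: a confidence equal to the threshold does not wake the terminal.
    return confidence > confidence_threshold
```

The early return means that a strong biometric match alone cannot wake the terminal when the spoken phrase is unrelated to the wake-up word.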

This embodiment targets terminal devices that have an intelligent voice-recognition wake-up function and takes the user's experience as its starting point. When the user forgets the wake-up word or remembers it inaccurately, the terminal device is woken up using a confidence-based approach that combines voice matching with supplementary feature matching. This increases the success rate and security of terminal wake-up, makes the wake-up process more convenient to operate, and improves the user experience.

The following provides a terminal wake-up apparatus. The apparatus may be provided on the side of a terminal device that has an intelligent voice-recognition wake-up function.

Fig. 3 is a block diagram of a terminal wake-up apparatus according to an embodiment of the invention.

This terminal awakening device includes: an acquisition module 310, a first determination module 320, a second determination module 330, and a wake-up module 340.

The acquisition module 310 is configured to collect the fuzzy voice information and the biometric information of the user.

The first determining module 320 is configured to determine a voice matching degree between the fuzzy voice information and preset standard voice information, and a feature similarity between the biometric information and preset standard biometric information.

The second determining module 330 is configured to determine a confidence of the user according to the voice matching degree and the feature similarity.

The wake-up module 340 is configured to execute a terminal wake-up operation when the confidence of the user is greater than a preset confidence threshold.

The second determining module 330 is configured to use a weighted sum of the character similarity, the semantic relevance, and the feature similarity as the confidence of the user; or determining reference values corresponding to the character similarity, the semantic relevance and the feature similarity respectively, and taking the weighted sum of the reference values corresponding to the character similarity, the semantic relevance and the feature similarity respectively as the confidence of the user.

The wake-up module 340 is configured to prohibit the terminal wake-up operation when the character similarity between the fuzzy voice information and the standard voice information is zero and the semantic relevance between them is smaller than a preset relevance threshold; and to prohibit the terminal wake-up operation when the confidence of the user is less than or equal to the confidence threshold.

The acquisition module 310 is configured to collect, multiple times, fuzzy voice information of the user and at least one piece of biometric information of the user; the first determining module 320 is configured to determine a voice matching degree between each piece of fuzzy voice information and the standard voice information, and a feature similarity between each piece of biometric information and the standard biometric information of the corresponding kind; the second determining module 330 is configured to determine, for each voice matching degree, a candidate confidence according to that voice matching degree and each feature similarity, and to select the largest of the determined candidate confidences as the confidence of the user, or to take the average of the candidate confidences as the confidence of the user.

The acquisition module 310 is configured to collect accurate voice information of the user before collecting the fuzzy voice information and biometric information of the user; the wake-up module 340 is configured to execute a terminal wake-up operation when the accurate voice information matches the standard voice information; the acquisition module 310 is further configured to collect the fuzzy voice information and the biometric information of the user when the accurate voice information collected a preset number of consecutive times fails to match the standard voice information.
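The division of work among the four modules can be sketched as a small Python class. This is only a structural illustration: the class name `TerminalWakeApparatus`, the callable signatures, and the reduction of each module to a single function are assumptions, and real acquisition and matching would depend on hardware and models the text does not describe.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class TerminalWakeApparatus:
    # Acquisition module 310: returns (fuzzy voice text, biometric sample).
    acquire: Callable[[], Tuple[str, Any]]
    # First determining module 320: returns (voice matching degree, feature similarity).
    match: Callable[[str, Any], Tuple[float, float]]
    # Second determining module 330: turns the two scores into a confidence.
    score: Callable[[float, float], float]
    # Preset confidence threshold used by the wake-up module 340.
    threshold: float

    def try_wake(self) -> bool:
        """Run the four modules in order; True means the wake-up operation is executed."""
        fuzzy_text, biometric = self.acquire()
        voice_match, feature_sim = self.match(fuzzy_text, biometric)
        confidence = self.score(voice_match, feature_sim)
        return confidence > self.threshold  # wake-up module 340
```

Passing the modules in as callables keeps the control flow testable with stubs, for example a fixed `acquire` result and a `score` that simply adds its two inputs.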

The embodiment provides a terminal wake-up device. Fig. 4 is a block diagram of a terminal wake-up device according to an embodiment of the present invention.

In this embodiment, the terminal wake-up device includes, but is not limited to, a processor 410 and a memory 420.

The processor 410 is configured to execute the terminal wake-up program stored in the memory 420 to implement the terminal wake-up method described above.

Specifically, the processor 410 is configured to execute the terminal wake-up program stored in the memory 420 to implement the following steps: acquiring fuzzy voice information and biological characteristic information of a user; determining a voice matching degree between the fuzzy voice information and preset standard voice information and a feature similarity between the biological feature information and preset standard biological feature information; determining the confidence of the user according to the voice matching degree and the feature similarity; and if the confidence of the user is greater than a preset confidence threshold, executing terminal awakening operation.

An embodiment of the invention also provides a computer-readable storage medium storing one or more programs. The computer-readable storage medium may include volatile memory, such as random access memory; it may also include non-volatile memory, such as read-only memory, flash memory, a hard disk, or a solid-state disk; it may also comprise a combination of the above kinds of memory.

The one or more programs in the computer-readable storage medium are executable by one or more processors to implement the terminal wake-up method described above.

Specifically, the processor is configured to execute a terminal wake-up program stored in the memory to implement the following steps: acquiring fuzzy voice information and biological characteristic information of a user; determining a voice matching degree between the fuzzy voice information and preset standard voice information and a feature similarity between the biological feature information and preset standard biological feature information; determining the confidence of the user according to the voice matching degree and the feature similarity; and if the confidence of the user is greater than a preset confidence threshold, executing terminal awakening operation.

The above description is only an example of the present invention, and is not intended to limit the present invention, and it is obvious to those skilled in the art that various modifications and variations can be made in the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims

1. A terminal wake-up method is characterized by comprising the following steps:

acquiring fuzzy voice information and biological characteristic information of a user;

determining a voice matching degree between the fuzzy voice information and preset standard voice information and a feature similarity between the biological feature information and preset standard biological feature information;

determining the confidence of the user according to the voice matching degree and the feature similarity;

and if the confidence of the user is greater than a preset confidence threshold, executing terminal awakening operation.

2. The method of claim 1, wherein

the voice matching degree comprises at least one of the following: character similarity and semantic relevance.

3. The method of claim 2, wherein determining the confidence level of the user according to the speech matching degree and the feature similarity comprises:

taking the weighted sum of the character similarity, the semantic relevance and the feature similarity as the confidence of the user; or

determining reference values corresponding to the character similarity, the semantic relevance and the feature similarity respectively, and taking the weighted sum of the reference values corresponding to the character similarity, the semantic relevance and the feature similarity respectively as the confidence of the user.

4. The method of claim 2, further comprising:

if the character similarity between the fuzzy voice information and the standard voice information is zero and the semantic relevance between the fuzzy voice information and the standard voice information is smaller than a preset relevance threshold, prohibiting the terminal wake-up operation from being executed;

and if the confidence of the user is less than or equal to the confidence threshold, prohibiting the terminal awakening operation.

5. The method of claim 1, wherein

the collecting of the fuzzy voice information and the biometric information of the user comprises:

collecting, multiple times, fuzzy voice information of the user and at least one piece of biometric information of the user;

the determining of the voice matching degree between the fuzzy voice information and the preset standard voice information and of the feature similarity between the biometric information and the preset standard biometric information comprises:

determining a voice matching degree between each piece of fuzzy voice information and the standard voice information, and a feature similarity between each piece of biometric information and the standard biometric information of the corresponding kind;

the determining of the confidence of the user according to the voice matching degree and the feature similarity comprises:

for each voice matching degree, determining a candidate confidence according to that voice matching degree and each feature similarity;

and selecting the largest of the determined candidate confidences as the confidence of the user, or taking the average of the candidate confidences as the confidence of the user.

6. The method of claim 1, further comprising, prior to collecting the fuzzy voice information and biometric information of the user:

collecting accurate voice information of a user;

if the accurate voice information is matched with the standard voice information, executing terminal awakening operation;

and if the accurate voice information acquired for the continuous preset times is not matched with the standard voice information, acquiring fuzzy voice information and biological characteristic information of the user.

7. The method according to any one of claims 1 to 6, wherein the category of the biometric information includes at least one of:

face feature information, tone feature information, iris feature information, interpupillary distance feature information, and voiceprint feature information.

8. A terminal wake-up apparatus, comprising:

the acquisition module is used for acquiring fuzzy voice information and biological characteristic information of a user;

the first determining module is used for determining the voice matching degree between the fuzzy voice information and preset standard voice information and the feature similarity between the biological feature information and preset standard biological feature information;

the second determining module is used for determining the confidence of the user according to the voice matching degree and the feature similarity;

and the awakening module is used for executing terminal awakening operation under the condition that the confidence coefficient of the user is greater than a preset confidence coefficient threshold value.

9. A terminal wake-up device, characterized in that the terminal wake-up device comprises a processor and a memory; the processor is configured to execute the terminal wake-up program stored in the memory to implement the terminal wake-up method of any one of claims 1 to 7.

10. A computer readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the terminal wake-up method of any one of claims 1 to 7.