CN117612527A - Device wake-up method and apparatus, storage medium and electronic device - Google Patents

Device wake-up method and apparatus, storage medium and electronic device

Info

Publication number
CN117612527A
CN117612527A
Authority
CN
China
Prior art keywords
voice
user
awakening
wake
voices
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311582019.7A
Other languages
Chinese (zh)
Inventor
李其浪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen TCL New Technology Co Ltd
Original Assignee
Shenzhen TCL New Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen TCL New Technology Co Ltd
Priority to CN202311582019.7A
Publication of CN117612527A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/4401 Bootstrapping
    • G06F9/4418 Suspend and resume; Hibernate and awake
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Electric Clocks (AREA)

Abstract

The application discloses a device wake-up method and apparatus, a storage medium, and an electronic device, relating to the technical field of the Internet of Things. The method includes: obtaining wake-up evaluation information corresponding to a plurality of user voices of a target user; extracting, from the wake-up evaluation information corresponding to the plurality of user voices, voice feature information that at least includes the wake-up confidence of each user voice, where the wake-up confidence is obtained by a voice wake-up model analyzing the user voice; and analyzing the voice feature information to obtain a user voice wake-up threshold corresponding to the target user, where the user voice wake-up threshold is used to decide whether to wake the device in response to the target user's wake-up voice. This can improve the voice wake-up performance of the device and the user experience.

Description

Device wake-up method and apparatus, storage medium and electronic device
Technical Field
The application relates to the technical field of the Internet of Things, and in particular to a device wake-up method and apparatus, a storage medium, and an electronic device.
Background
In a far-field voice wake-up scenario, the collected audio data is fed into a voice wake-up model in real time for analysis. During this continuous analysis, the model scores whether the user's speech contains a wake word (for example, a phrase such as "Xiao T, Xiao T"), and when the wake-up confidence reaches a preset voice wake-up threshold, the wake-up action is triggered and the device is woken.
At present, the preset voice wake-up threshold is typically set when the voice wake-up model is initialized, and the same preset threshold is used for every deployment of a given model version, so the setting is fixed and uniform. In practical applications, the applicant has found that such a fixed threshold cannot adapt to the actual situation of an individual user; as a result, the voice wake-up performance of the device is poor and the experience differs greatly between users.
Disclosure of Invention
The embodiments of the present application provide a scheme that can effectively improve the voice wake-up performance of a device and improve the user experience.
The embodiment of the application provides the following technical scheme:
According to one embodiment of the present application, a device wake-up method includes: obtaining wake-up evaluation information corresponding to a plurality of user voices of a target user; extracting, from the wake-up evaluation information corresponding to the plurality of user voices, voice feature information that at least includes the wake-up confidence of each user voice, where the wake-up confidence is obtained by a voice wake-up model analyzing the user voices; and analyzing the voice feature information to obtain a user voice wake-up threshold corresponding to the target user, the user voice wake-up threshold being used to decide whether to wake the device in response to the target user's wake-up voice.
In some embodiments of the present application, analyzing the voice feature information to obtain the user voice wake-up threshold corresponding to the target user includes: averaging the wake-up confidences of the user voices contained in the voice feature information to obtain a first average; and obtaining the user voice wake-up threshold corresponding to the target user from the first average.
In some embodiments of the present application, the voice feature information further includes the wake-up scene corresponding to each user voice, and analyzing the voice feature information to obtain the user voice wake-up threshold corresponding to the target user includes: dividing the wake-up confidences of the user voices contained in the voice feature information into groups by the wake-up scene corresponding to each user voice; averaging the wake-up confidences within each wake-up scene to obtain a second average for each wake-up scene; and obtaining, from the second average for each wake-up scene, a user voice wake-up threshold corresponding to the target user in that wake-up scene.
In some embodiments of the present application, the voice feature information further includes the volume of each user voice, and averaging the wake-up confidences of the user voices contained in the voice feature information to obtain the first average includes: looking up a volume weighting coefficient for each user voice according to its volume; and computing a weighted average of the wake-up confidences and the volume weighting coefficients of the user voices to obtain the first average.
In some embodiments of the present application, the voice feature information further includes the environmental noise of each user voice, and averaging the wake-up confidences of the user voices contained in the voice feature information to obtain the first average includes: looking up an environment weighting coefficient for each user voice according to its environmental noise; and computing a weighted average of the wake-up confidences and the environment weighting coefficients of the user voices to obtain the first average.
In some embodiments of the present application, obtaining the wake-up evaluation information corresponding to the plurality of user voices of the target user includes: collecting the wake-up evaluation information of those user voices whose wake-up confidence is greater than or equal to a suspected voice wake-up threshold, where the suspected voice wake-up threshold is smaller than the preset voice wake-up threshold of the voice wake-up model.
In some embodiments of the present application, after the user voice wake-up threshold corresponding to the target user is obtained, the method further includes: receiving a wake-up voice of the target user directed at a target device; analyzing the wake-up voice with the voice wake-up model to obtain the wake-up confidence of the wake-up voice; and waking the target device if the wake-up confidence of the wake-up voice is greater than or equal to the user voice wake-up threshold.
According to one embodiment of the present application, a device wake-up apparatus includes: an acquisition module configured to obtain wake-up evaluation information corresponding to a plurality of user voices of a target user; an extraction module configured to extract, from the wake-up evaluation information corresponding to the plurality of user voices, voice feature information that at least includes the wake-up confidence of each user voice, where the wake-up confidence is obtained by a voice wake-up model analyzing the user voices; and an analysis module configured to analyze the voice feature information to obtain a user voice wake-up threshold corresponding to the target user, the user voice wake-up threshold being used to decide whether to wake the device in response to the target user's wake-up voice.
In some embodiments of the present application, the analysis module is configured to: average the wake-up confidences of the user voices contained in the voice feature information to obtain a first average; and obtain the user voice wake-up threshold corresponding to the target user from the first average.
In some embodiments of the present application, the voice feature information further includes the wake-up scene corresponding to each user voice, and the analysis module is configured to: divide the wake-up confidences of the user voices contained in the voice feature information into groups by the wake-up scene corresponding to each user voice; average the wake-up confidences within each wake-up scene to obtain a second average for each wake-up scene; and obtain, from the second average for each wake-up scene, a user voice wake-up threshold corresponding to the target user in that wake-up scene.
In some embodiments of the present application, the voice feature information further includes the volume of each user voice, and the analysis module is configured to: look up a volume weighting coefficient for each user voice according to its volume; and compute a weighted average of the wake-up confidences and the volume weighting coefficients of the user voices to obtain the first average.
In some embodiments of the present application, the voice feature information further includes the environmental noise of each user voice, and the analysis module is configured to: look up an environment weighting coefficient for each user voice according to its environmental noise; and compute a weighted average of the wake-up confidences and the environment weighting coefficients of the user voices to obtain the first average.
In some embodiments of the present application, the acquisition module is configured to: collect the wake-up evaluation information of those user voices whose wake-up confidence is greater than or equal to a suspected voice wake-up threshold, where the suspected voice wake-up threshold is smaller than the preset voice wake-up threshold of the voice wake-up model.
In some embodiments of the present application, the apparatus further includes a wake-up module configured to, after the user voice wake-up threshold corresponding to the target user is obtained: receive a wake-up voice of the target user directed at a target device; analyze the wake-up voice with the voice wake-up model to obtain the wake-up confidence of the wake-up voice; and wake the target device if the wake-up confidence of the wake-up voice is greater than or equal to the user voice wake-up threshold.
According to another embodiment of the present application, a storage medium has stored thereon a computer program which, when executed by a processor of a computer, causes the computer to perform the method described in the embodiments of the present application.
According to another embodiment of the present application, an electronic device may include: a memory storing a computer program; and the processor reads the computer program stored in the memory to execute the method according to the embodiment of the application.
According to another embodiment of the present application, a computer program product or computer program includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the methods provided in the various optional implementations described in the embodiments of the present application.
In the embodiments of the present application, wake-up evaluation information corresponding to a plurality of user voices of a target user is obtained; voice feature information that at least includes the wake-up confidence of each user voice is extracted from the wake-up evaluation information, where the wake-up confidence is obtained by a voice wake-up model analyzing the user voices; and the voice feature information is analyzed to obtain a user voice wake-up threshold corresponding to the target user, which is used to decide whether to wake the device in response to the target user's wake-up voice.
In this way, for a target user, voice feature information including the wake-up confidences of a plurality of that user's voices is collected, and the collected information is analyzed to obtain a user voice wake-up threshold corresponding to the target user. A wake-up threshold better matched to the target user is thus generated dynamically, achieving a per-user customized threshold. When the target user subsequently wakes a device by voice, this personalized threshold better reflects the user's actual situation, so the voice wake-up performance of the device is effectively improved and the user experience is improved.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and a person skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 shows a flow chart of a device wake-up method according to an embodiment of the present application.
Fig. 2 shows a block diagram of a device wake-up apparatus according to an embodiment of the present application.
Fig. 3 shows a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present disclosure is further described in detail below with reference to the drawings and examples. It should be understood that the examples provided herein are merely illustrative of the present disclosure and are not intended to limit the present disclosure. In addition, the embodiments provided below are some of the embodiments for implementing the present disclosure, and not all of the embodiments for implementing the present disclosure, and the technical solutions described in the embodiments of the present disclosure may be implemented in any combination without conflict.
It should be noted that, in the embodiments of the present disclosure, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, so that a method or apparatus comprising a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to that method or apparatus. Without further limitation, an element preceded by the phrase "comprising a …" does not exclude the presence of other related elements in the method or apparatus that comprises it (for example, another step in the method or another unit in the apparatus; a unit may be part of a circuit, part of a processor, part of a program or software, and so on).
For example, the device wake-up method provided in the embodiments of the present disclosure comprises a series of steps, but it is not limited to the described steps; similarly, the device wake-up apparatus comprises a series of units, but it is not limited to the expressly described units and may also include units needed to acquire related information or to perform processing based on that information.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure.
Fig. 1 schematically shows a flow chart of a device wake-up method according to an embodiment of the present application. The method may be executed by any device with processing capability, such as a computer, mobile phone, smart watch, or household appliance, or by a server, such as a cloud server or a physical server.
As shown in fig. 1, the device wake-up method may include steps S110 to S130.
Step S110: obtain wake-up evaluation information corresponding to a plurality of user voices of a target user. Step S120: extract, from the wake-up evaluation information corresponding to the plurality of user voices, voice feature information that at least includes the wake-up confidence of each user voice, where the wake-up confidence is obtained by a voice wake-up model analyzing the user voices. Step S130: analyze the voice feature information to obtain a user voice wake-up threshold corresponding to the target user, where the user voice wake-up threshold is used to decide whether to wake the device in response to the target user's wake-up voice.
For the target user, wake-up evaluation information can be obtained for each of the plurality of user voices; it may include the wake-up confidence, the wake-up scene, the volume, and other information for that user voice. The voice wake-up model analyzes each of the user voices to obtain the wake-up confidence of each.
The required information, such as the wake-up confidence, is extracted from the wake-up evaluation information of the plurality of user voices to obtain voice feature information that at least includes the wake-up confidence of each user voice. This voice feature information is then analyzed to obtain the personalized user voice wake-up threshold corresponding to the target user.
The user voice wake-up threshold is used to decide whether to wake the device in response to the target user's wake-up voice. For example, after a wake-up voice of the target user is received, its audio data can be input into the voice wake-up model for analysis to obtain the wake-up confidence of the wake-up voice; if that confidence is greater than or equal to the user voice wake-up threshold, the device targeted by the wake-up voice is woken.
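As a minimal illustrative sketch (the function name and values are assumptions, not from the patent), the decision just described reduces to comparing the model's wake-up confidence for the wake-up voice against the personalized threshold:

```python
def should_wake(wake_confidence: float, user_threshold: float) -> bool:
    """Wake the device only when the model's wake-up confidence for the
    wake-up voice reaches the user's personalized voice wake-up threshold."""
    return wake_confidence >= user_threshold

# With a personalized threshold of 85, a confidence of 88 wakes the
# device, while a confidence of 80 does not.
print(should_wake(88.0, 85.0))  # True
print(should_wake(80.0, 85.0))  # False
```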
In this way, through steps S110 to S130, voice feature information including the wake-up confidences of a plurality of the target user's voices is collected and analyzed to obtain a user voice wake-up threshold corresponding to the target user; a wake-up threshold better matched to the target user is generated dynamically, achieving a per-user customized threshold.
Further optional implementations of the steps of the embodiment of Fig. 1 are described below.
In one embodiment, obtaining the wake-up evaluation information corresponding to the plurality of user voices of the target user includes: collecting the wake-up evaluation information of those user voices whose wake-up confidence is greater than or equal to a suspected voice wake-up threshold, where the suspected voice wake-up threshold is smaller than the preset voice wake-up threshold of the voice wake-up model.
The voice wake-up model analyzes each voice of the target user to obtain its wake-up confidence; a voice whose wake-up confidence is greater than or equal to the suspected voice wake-up threshold is treated as a user voice to be collected.
For example, if the preset voice wake-up threshold of the model is 90 and the suspected voice wake-up threshold is 80, then the wake-up evaluation information of any voice whose wake-up confidence is greater than or equal to 80 is recorded.
The suspected voice wake-up threshold can be set according to the actual situation. Collecting only the evaluation information of voices that reach this threshold further improves the accuracy of the user voice wake-up threshold derived from the analysis.
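A hedged sketch of this filtering step, assuming the evaluation records are simple dictionaries (the record fields and the thresholds 90/80 follow the text's example; the names are illustrative):

```python
SUSPECTED_THRESHOLD = 80  # below the preset threshold of 90, per the example

def collect_wake_evaluations(evaluations, suspected_threshold=SUSPECTED_THRESHOLD):
    """Keep only the wake-up evaluation records whose wake-up confidence
    reaches the suspected voice wake-up threshold."""
    return [e for e in evaluations if e["confidence"] >= suspected_threshold]

samples = [
    {"confidence": 92, "volume": 60},  # kept: would also have woken the device
    {"confidence": 83, "volume": 70},  # kept: suspected wake-up, below preset 90
    {"confidence": 75, "volume": 55},  # dropped: below the suspected threshold
]
print(collect_wake_evaluations(samples))
```

Collecting these near-miss "suspected" wake-ups is what gives the later averaging step enough per-user data to work with.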
In one embodiment, analyzing the voice feature information to obtain the user voice wake-up threshold corresponding to the target user includes:
averaging the wake-up confidences of the user voices contained in the voice feature information to obtain a first average; and obtaining the user voice wake-up threshold corresponding to the target user from the first average.
For example, if there are 100 user voices, the average of their 100 wake-up confidences is the first average.
Obtaining the user voice wake-up threshold from the first average may specifically include: if the first average is greater than or equal to the preset voice wake-up threshold of the voice wake-up model, the preset voice wake-up threshold is used as the final user voice wake-up threshold for the target user; if the first average is smaller than the preset voice wake-up threshold, the first average itself is used as the user voice wake-up threshold for the target user.
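The clamping rule just described can be sketched as follows (an illustrative sketch; the default preset threshold of 90 is taken from the text's example):

```python
def user_wake_threshold(first_average: float, preset_threshold: float = 90.0) -> float:
    """Use the first average as the personalized threshold, but never let it
    exceed the model's preset voice wake-up threshold."""
    if first_average >= preset_threshold:
        return preset_threshold
    return first_average

print(user_wake_threshold(95.0))  # 90.0 (clamped to the preset threshold)
print(user_wake_threshold(82.0))  # 82.0 (the first average itself)
```

The effect is that personalization can only lower the threshold relative to the preset value, never raise it.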
Further, in one implementation, the voice feature information also includes the volume of each user voice, and averaging the wake-up confidences to obtain the first average includes: looking up a volume weighting coefficient for each user voice according to its volume; and computing a weighted average of the wake-up confidences and the volume weighting coefficients to obtain the first average.
Specifically, in this implementation, the first average may be calculated as Y1 = (A1×X1 + A2×X2 + … + An×Xn) / n, where Y1 is the first average, A1 to An are the wake-up confidences of the user voices, and X1 to Xn are the corresponding volume weighting coefficients.
In this implementation, the volume of each user voice is additionally taken into account when deriving the user voice wake-up threshold, which further improves the accuracy of the calculated threshold and the wake-up performance.
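A sketch of the volume-weighted first average Y1 = (A1×X1 + … + An×Xn)/n. The volume bands and coefficient values used for the lookup are illustrative assumptions, since the patent does not specify the table:

```python
def volume_coefficient(volume_db: float) -> float:
    """Hypothetical lookup: quieter utterances get a larger coefficient
    (the actual table is not specified in the patent)."""
    if volume_db < 50:
        return 1.25
    if volume_db < 70:
        return 1.0
    return 0.75

def volume_weighted_first_average(confidences, volumes):
    """Y1 = (A1*X1 + A2*X2 + ... + An*Xn) / n."""
    n = len(confidences)
    return sum(a * volume_coefficient(v) for a, v in zip(confidences, volumes)) / n

# Two utterances with confidences 80 and 90, spoken at 60 dB and 40 dB.
print(volume_weighted_first_average([80.0, 90.0], [60.0, 40.0]))  # 96.25
```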
Further, in one embodiment, the voice feature information also includes the environmental noise of each user voice, and averaging the wake-up confidences to obtain the first average includes:
looking up an environment weighting coefficient for each user voice according to its environmental noise; and computing a weighted average of the wake-up confidences and the environment weighting coefficients to obtain the first average.
Specifically, in this embodiment, the first average may be calculated as Y1 = (A1×M1 + A2×M2 + … + An×Mn) / n, where Y1 is the first average, A1 to An are the wake-up confidences of the user voices, and M1 to Mn are the corresponding environment weighting coefficients. The environmental noise level reflects the amount of noise mixed into the voice: the higher the environmental noise, the noisier the captured voice.
In this embodiment, the environmental noise of each user voice is additionally taken into account when deriving the user voice wake-up threshold, which further improves the accuracy of the calculated threshold and the wake-up performance.
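The environment-weighted variant Y1 = (A1×M1 + … + An×Mn)/n differs only in how the coefficient is looked up. The noise bands, coefficient values, and the assumption that noisier surroundings get a larger coefficient are all hypothetical, as the patent leaves the table unspecified:

```python
def environment_coefficient(noise_db: float) -> float:
    """Hypothetical bands: noisier surroundings get a larger coefficient,
    on the assumption that the same utterance scores lower in heavy noise."""
    if noise_db < 40:
        return 1.0
    if noise_db < 60:
        return 1.125
    return 1.25

def environment_weighted_first_average(confidences, noise_levels):
    """Y1 = (A1*M1 + A2*M2 + ... + An*Mn) / n."""
    n = len(confidences)
    return sum(a * environment_coefficient(m) for a, m in zip(confidences, noise_levels)) / n

# Two utterances with confidences 88 and 80, captured at 30 dB and 65 dB of noise.
print(environment_weighted_first_average([88.0, 80.0], [30.0, 65.0]))  # 94.0
```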
Further, in one embodiment, averaging the wake-up confidences of the user voices contained in the voice feature information to obtain the first average includes: calculating the first average as Y1 = (A1 + A2 + … + An) / n, where Y1 is the first average and A1 to An are the wake-up confidences of the user voices.
In one embodiment, the voice feature information further includes wake scenes corresponding to the voices of the users; the step of analyzing and processing the voice characteristic information to obtain a user voice wake-up threshold corresponding to the target user comprises the following steps: dividing the awakening confidence coefficient of the user voice included in the voice characteristic information into awakening confidence coefficients in different awakening scenes according to the awakening scenes corresponding to the user voice; averaging the awakening confidence degrees under different awakening scenes respectively to obtain second average values under different awakening scenes respectively; and obtaining a user voice wake-up threshold corresponding to the target user in different wake-up scenes according to the second average value in the different wake-up scenes.
Different wake-up scenarios may be divided according to the actual situation. For example, the different wake-up scenarios may be home, factory, or office, etc.; alternatively, they may be divided by the number of people present into high-traffic, medium-traffic, or low-traffic scenarios, etc.
According to the wake-up scenes corresponding to the user voices, the wake-up confidence coefficient of the user voices contained in the voice characteristic information is divided into wake-up confidence coefficients in different wake-up scenes. And respectively averaging the awakening confidence degrees in different awakening scenes to respectively obtain second average values in different awakening scenes.
According to the second average values under the different wake-up scenes, the user voice wake-up thresholds corresponding to the target user in the different wake-up scenes can be obtained, which specifically includes: if the second average value in a wake-up scene is greater than or equal to the preset voice wake-up threshold corresponding to the voice wake-up model, the preset voice wake-up threshold is used as the final user voice wake-up threshold corresponding to the target user in that wake-up scene; and if the second average value in a wake-up scene is smaller than the preset voice wake-up threshold corresponding to the voice wake-up model, the second average value is used as the user voice wake-up threshold corresponding to the target user in that wake-up scene.
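A minimal sketch of the threshold-selection rule just described, assuming Python and hypothetical names; the per-scene threshold is effectively the smaller of the second average and the model's preset threshold.

```python
def scene_wake_threshold(second_average, preset_threshold):
    # If the scene's second average meets or exceeds the model's preset
    # threshold, keep the preset threshold; otherwise lower the user's
    # threshold for this scene to the second average.
    if second_average >= preset_threshold:
        return preset_threshold
    return second_average
```

This way a user whose typical confidence in a scene falls below the factory default gets an easier-to-reach threshold there, while the threshold is never raised above the preset value.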
Further, in one implementation manner, the voice characteristic information further includes a volume corresponding to each of the user voices; the step of respectively averaging the wake-up confidence coefficients under different wake-up scenes to respectively obtain second average values under different wake-up scenes comprises the following steps: under each awakening scene, inquiring the volume weighting coefficient of each user voice according to the volume corresponding to each user voice under each awakening scene; and carrying out weighted average on the awakening confidence coefficient and the volume weighting coefficient corresponding to the user voice included in the voice characteristic information to obtain a second average value in the awakening scene.
Further, in an embodiment, the voice feature information further includes the environmental noise corresponding to each of the user voices; the step of respectively averaging the wake-up confidence degrees under different wake-up scenes to respectively obtain second average values under the different wake-up scenes includes: in each wake-up scene, querying the environment weighting coefficient of each user voice according to the environmental noise corresponding to that user voice in the wake-up scene; and performing a weighted average of the wake-up confidence degrees and the environment weighting coefficients corresponding to the user voices included in the voice feature information to obtain the second average value in the wake-up scene.
Further, in an embodiment, the averaging the wake-up confidence degrees under different wake-up scenes respectively to obtain second average values under the different wake-up scenes includes: in each wake-up scene, calculating the second average value in that wake-up scene according to the formula Y2 = (B1 + B2 + … + Bn)/n, where Y2 is the second average value in the wake-up scene and B1 to Bn are the wake-up confidence degrees corresponding to the user voices in that wake-up scene, respectively.
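The partition-then-average step above can be sketched as follows; this is an illustrative Python fragment with assumed names and data shapes, not the patent's implementation.

```python
from collections import defaultdict

def second_averages(samples):
    """samples: iterable of (scene, confidence) pairs, one per user voice.
    Returns a dict mapping each wake-up scene to its second average
    Y2 = (B1 + B2 + ... + Bn) / n over that scene's confidences."""
    by_scene = defaultdict(list)
    for scene, confidence in samples:
        by_scene[scene].append(confidence)
    return {scene: sum(vals) / len(vals) for scene, vals in by_scene.items()}
```

For instance, three voices tagged ("home", 0.8), ("home", 0.6), ("office", 0.9) yield a second average of 0.7 for the home scene and 0.9 for the office scene.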
In one embodiment, after the analyzing the voice feature information to obtain the user voice wake-up threshold corresponding to the target user, the method further includes: receiving wake-up voice of the target user aiming at target equipment; analyzing and processing the awakening voice by adopting the voice awakening model to obtain awakening confidence corresponding to the awakening voice; and if the awakening confidence coefficient corresponding to the awakening voice is greater than or equal to the user voice awakening threshold value, awakening the target equipment.
After the wake-up voice of the target user is received, the audio data of the wake-up voice can be input into the voice wake-up model for analysis to obtain the wake-up confidence output by the voice wake-up model for the wake-up voice; if this wake-up confidence is greater than or equal to the user voice wake-up threshold, the device that the wake-up voice is intended to wake can be woken up.
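The wake-up decision itself reduces to a single comparison; the sketch below uses Python and assumed names purely for illustration.

```python
def should_wake(wake_confidence, user_threshold):
    # Wake the target device only when the voice wake-up model's
    # confidence for the incoming wake-up voice meets or exceeds the
    # user's personalised voice wake-up threshold.
    return wake_confidence >= user_threshold
```

Because the threshold is per-user (and, optionally, per-scene), the same model confidence can wake a device for one user but not for another.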
The dynamically generated threshold better matches the target user, achieving a one-user-one-threshold customization effect. Furthermore, when the target user wakes a device by voice, the user voice wake-up threshold better reflects the user's actual situation, so the device's voice wake-up performance is good.
In order to facilitate better implementation of the device wake-up method provided by the embodiments of the application, the embodiments of the application also provide a device wake-up apparatus based on the device wake-up method. The terms therein have the same meanings as in the device wake-up method described above; for specific implementation details, refer to the description in the method embodiments. Fig. 2 shows a block diagram of a device wake-up apparatus according to an embodiment of the present application.
As shown in fig. 2, the device wake-up apparatus 200 may include: the obtaining module 210 may be configured to obtain wake assessment information corresponding to a plurality of user voices of the target user; the extracting module 220 may be configured to extract, from wake assessment information corresponding to the plurality of user voices, voice feature information at least including wake confidence degrees of the user voices, where the wake confidence degrees are obtained by analyzing the user voices by a voice wake model; the analysis module 230 may be configured to perform analysis processing on the voice feature information to obtain a user voice wake-up threshold corresponding to the target user, where the user voice wake-up threshold is used to determine whether to wake up the device according to the wake-up voice of the target user.
In some embodiments of the present application, the analysis module is configured to: averaging the awakening confidence coefficient of the user voice contained in the voice characteristic information to obtain a first average value; and obtaining a user voice wake-up threshold corresponding to the target user according to the first average value.
In some embodiments of the present application, the voice feature information further includes wake scenes corresponding to the voices of the users; the analysis module is used for: dividing the awakening confidence coefficient of the user voice included in the voice characteristic information into awakening confidence coefficients in different awakening scenes according to the awakening scenes corresponding to the user voice; averaging the awakening confidence degrees under different awakening scenes respectively to obtain second average values under different awakening scenes respectively; and obtaining a user voice wake-up threshold corresponding to the target user in different wake-up scenes according to the second average value in the different wake-up scenes.
In some embodiments of the present application, the voice feature information further includes a volume corresponding to each of the user voices; the analysis module is used for: inquiring the volume weighting coefficient of each user voice according to the volume corresponding to each user voice; and carrying out weighted average on the wake-up confidence coefficient and the volume weighting coefficient corresponding to the user voice included in the voice characteristic information to obtain the first average value.
In some embodiments of the present application, the voice feature information further includes an environmental noise corresponding to each of the user voices; the analysis module is used for: inquiring the environment weighting coefficient of each user voice according to the environment noise corresponding to each user voice; and carrying out weighted average on the wake-up confidence coefficient and the environment weighting coefficient corresponding to the user voice included in the voice characteristic information to obtain the first average value.
In some embodiments of the present application, the acquiring module is configured to: acquire wake-up evaluation information corresponding to user voices whose wake-up confidence is greater than or equal to a suspected voice wake-up threshold, thereby obtaining the wake-up evaluation information corresponding to the plurality of user voices, wherein the suspected voice wake-up threshold is smaller than a preset voice wake-up threshold corresponding to the voice wake-up model.
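The collection rule — keep only voices whose confidence reaches the suspected threshold, which itself sits below the model's preset threshold — could be sketched as follows (hypothetical names and data shape, Python for illustration):

```python
def collect_wake_evaluations(evaluations, suspected_threshold, preset_threshold):
    """evaluations: iterable of dicts with a 'confidence' key (assumed
    shape). Keeps near-miss attempts as well as successful ones, since
    the suspected threshold lies below the model's preset threshold."""
    assert suspected_threshold < preset_threshold
    return [e for e in evaluations if e["confidence"] >= suspected_threshold]
```

Including near-miss attempts is what lets the later averaging pull the personalised threshold below the factory preset for users the model systematically under-scores.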
In some embodiments of the present application, after the analyzing and processing the voice feature information to obtain a user voice wake-up threshold corresponding to the target user, the apparatus further includes a wake-up module configured to: receiving wake-up voice of the target user aiming at target equipment; analyzing and processing the awakening voice by adopting the voice awakening model to obtain awakening confidence corresponding to the awakening voice; and if the awakening confidence coefficient corresponding to the awakening voice is greater than or equal to the user voice awakening threshold value, awakening the target equipment.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functions of two or more modules or units described above may be embodied in one module or unit, in accordance with embodiments of the present application. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
In addition, the embodiment of the application further provides an electronic device, as shown in fig. 3, which shows a schematic structural diagram of the electronic device according to the embodiment of the application, specifically:
the electronic device may include a processor 301 having one or more processing cores, a memory 302 having one or more computer-readable storage media, a power supply 303, and an input unit 304, among other components. Those skilled in the art will appreciate that the electronic device structure shown in fig. 3 does not limit the electronic device, which may include more or fewer components than shown, combine certain components, or adopt a different arrangement of components.
Wherein:
the processor 301 is the control center of the electronic device; it connects the various parts of the entire electronic device using various interfaces and lines, and performs the various functions of the electronic device and processes data by running or executing the software programs and/or modules stored in the memory 302 and invoking the data stored in the memory 302, thereby monitoring the electronic device as a whole. Optionally, the processor 301 may include one or more processing cores; preferably, the processor 301 may integrate an application processor and a modem processor, wherein the application processor primarily handles the operating system, user interfaces, applications, and the like, and the modem processor primarily handles wireless communication. It will be appreciated that the modem processor may also not be integrated into the processor 301.
The memory 302 may be used to store software programs and modules, and the processor 301 executes various functional applications and performs data processing by running the software programs and modules stored in the memory 302. The memory 302 may mainly include a program storage area and a data storage area, wherein the program storage area may store the operating system, an application program required for at least one function (such as a sound playing function or an image playing function), and the like; the data storage area may store data created according to the use of the device, and the like. In addition, the memory 302 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. Accordingly, the memory 302 may also include a memory controller to provide the processor 301 with access to the memory 302.
The electronic device further includes a power supply 303 for supplying power to the various components. Preferably, the power supply 303 is logically connected to the processor 301 through a power management system, so that functions such as managing charging, discharging, and power consumption are performed through the power management system. The power supply 303 may further include one or more of: a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and other components.
In particular, in this embodiment, the processor 301 in the electronic device loads executable files corresponding to the processes of one or more computer programs into the memory 302 according to the following instructions, and the processor 301 executes the computer programs stored in the memory 302, so as to implement the functions in the foregoing embodiments of the present application, where the processor 301 may perform the following steps:
obtaining wake-up evaluation information corresponding to a plurality of user voices of a target user; extracting voice characteristic information at least comprising the awakening confidence coefficient of each user voice from awakening evaluation information corresponding to the plurality of user voices, wherein the awakening confidence coefficient is obtained by analyzing the user voices by a voice awakening model; and analyzing and processing the voice characteristic information to obtain a user voice awakening threshold corresponding to the target user, wherein the user voice awakening threshold is used for judging whether equipment awakening is carried out according to awakening voice of the target user.
In some embodiments of the present application, the analyzing the voice feature information to obtain a user voice wake-up threshold corresponding to the target user includes: averaging the awakening confidence coefficient of the user voice contained in the voice characteristic information to obtain a first average value; and obtaining a user voice wake-up threshold corresponding to the target user according to the first average value.
In some embodiments of the present application, the voice feature information further includes wake scenes corresponding to the voices of the users; the step of analyzing and processing the voice characteristic information to obtain a user voice wake-up threshold corresponding to the target user comprises the following steps: dividing the awakening confidence coefficient of the user voice included in the voice characteristic information into awakening confidence coefficients in different awakening scenes according to the awakening scenes corresponding to the user voice; averaging the awakening confidence degrees under different awakening scenes respectively to obtain second average values under different awakening scenes respectively; and obtaining a user voice wake-up threshold corresponding to the target user in different wake-up scenes according to the second average value in the different wake-up scenes.
In some embodiments of the present application, the voice feature information further includes a volume corresponding to each of the user voices; the step of averaging the awakening confidence coefficient of the user voice included in the voice characteristic information to obtain a first average value includes: inquiring the volume weighting coefficient of each user voice according to the volume corresponding to each user voice; and carrying out weighted average on the wake-up confidence coefficient and the volume weighting coefficient corresponding to the user voice included in the voice characteristic information to obtain the first average value.
In some embodiments of the present application, the voice feature information further includes an environmental noise corresponding to each of the user voices; the step of averaging the awakening confidence coefficient of the user voice included in the voice characteristic information to obtain a first average value includes: inquiring the environment weighting coefficient of each user voice according to the environment noise corresponding to each user voice; and carrying out weighted average on the wake-up confidence coefficient and the environment weighting coefficient corresponding to the user voice included in the voice characteristic information to obtain the first average value.
In some embodiments of the present application, the obtaining wake-up evaluation information corresponding to a plurality of user voices of a target user includes: acquiring wake-up evaluation information corresponding to user voices whose wake-up confidence is greater than or equal to a suspected voice wake-up threshold, thereby obtaining the wake-up evaluation information corresponding to the plurality of user voices, wherein the suspected voice wake-up threshold is smaller than a preset voice wake-up threshold corresponding to the voice wake-up model.
In some embodiments of the present application, after the analyzing the voice feature information to obtain the user voice wake-up threshold corresponding to the target user, the method further includes: receiving wake-up voice of the target user aiming at target equipment; analyzing and processing the awakening voice by adopting the voice awakening model to obtain awakening confidence corresponding to the awakening voice; and if the awakening confidence coefficient corresponding to the awakening voice is greater than or equal to the user voice awakening threshold value, awakening the target equipment.
It will be appreciated by those of ordinary skill in the art that all or part of the steps of the various methods of the above embodiments may be completed by a computer program, or by related hardware controlled by a computer program; the computer program may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, the present embodiments also provide a storage medium having stored therein a computer program that can be loaded by a processor to perform the steps of any of the methods provided by the embodiments of the present application.
Wherein the storage medium may include: read-only memory (ROM), random access memory (RAM), magnetic disk, optical disc, and the like.
Since the computer program stored in the storage medium can perform any of the steps in the methods provided in the embodiments of the present application, it can achieve the beneficial effects achievable by those methods; for details, refer to the foregoing embodiments, which are not repeated here.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations that follow the general principles of the application and include common general knowledge or customary technical practice in the art not disclosed herein.
It will be understood that the present application is not limited to the embodiments that have been described above and shown in the drawings, but that various modifications and changes can be made without departing from the scope thereof.

Claims (10)

1. A method of waking up a device, comprising:
obtaining wake-up evaluation information corresponding to a plurality of user voices of a target user;
extracting voice characteristic information at least comprising the awakening confidence coefficient of each user voice from awakening evaluation information corresponding to the plurality of user voices, wherein the awakening confidence coefficient is obtained by analyzing the user voices by a voice awakening model;
and analyzing and processing the voice characteristic information to obtain a user voice awakening threshold corresponding to the target user, wherein the user voice awakening threshold is used for judging whether equipment awakening is carried out according to awakening voice of the target user.
2. The method of claim 1, wherein the analyzing the voice feature information to obtain the user voice wake-up threshold corresponding to the target user comprises:
averaging the awakening confidence coefficient of the user voice contained in the voice characteristic information to obtain a first average value;
and obtaining a user voice wake-up threshold corresponding to the target user according to the first average value.
3. The method of claim 1, wherein the voice feature information further comprises wake scenes corresponding to each of the user voices;
the step of analyzing and processing the voice characteristic information to obtain a user voice wake-up threshold corresponding to the target user comprises the following steps:
dividing the awakening confidence coefficient of the user voice included in the voice characteristic information into awakening confidence coefficients in different awakening scenes according to the awakening scenes corresponding to the user voice;
averaging the awakening confidence degrees under different awakening scenes respectively to obtain second average values under different awakening scenes respectively;
and obtaining a user voice wake-up threshold corresponding to the target user in different wake-up scenes according to the second average value in the different wake-up scenes.
4. The method of claim 2, wherein the voice characteristic information further comprises a volume level corresponding to each of the user voices; the step of averaging the awakening confidence coefficient of the user voice included in the voice characteristic information to obtain a first average value includes:
inquiring the volume weighting coefficient of each user voice according to the volume corresponding to each user voice;
and carrying out weighted average on the wake-up confidence coefficient and the volume weighting coefficient corresponding to the user voice included in the voice characteristic information to obtain the first average value.
5. The method of claim 2, wherein the voice characteristic information further comprises an ambient noise corresponding to each of the user voices; the step of averaging the awakening confidence coefficient of the user voice included in the voice characteristic information to obtain a first average value includes:
inquiring the environment weighting coefficient of each user voice according to the environment noise corresponding to each user voice;
and carrying out weighted average on the wake-up confidence coefficient and the environment weighting coefficient corresponding to the user voice included in the voice characteristic information to obtain the first average value.
6. The method according to claim 1, wherein the obtaining wake up assessment information corresponding to a plurality of user voices of the target user includes:
and acquiring wake-up evaluation information corresponding to user voices whose wake-up confidence is greater than or equal to a suspected voice wake-up threshold, thereby obtaining the wake-up evaluation information corresponding to the plurality of user voices, wherein the suspected voice wake-up threshold is smaller than a preset voice wake-up threshold corresponding to the voice wake-up model.
7. The method according to any one of claims 1 to 6, wherein after the analyzing the voice feature information to obtain the user voice wake-up threshold corresponding to the target user, the method further comprises:
receiving wake-up voice of the target user aiming at target equipment;
analyzing and processing the awakening voice by adopting the voice awakening model to obtain awakening confidence corresponding to the awakening voice;
and if the awakening confidence coefficient corresponding to the awakening voice is greater than or equal to the user voice awakening threshold value, awakening the target equipment.
8. A device wake-up apparatus, comprising:
the acquisition module is used for acquiring wake-up evaluation information corresponding to a plurality of user voices of the target user;
the extraction module is used for extracting voice characteristic information at least comprising the awakening confidence coefficient of each user voice from the awakening evaluation information corresponding to the plurality of user voices, wherein the awakening confidence coefficient is obtained by analyzing the user voices by a voice awakening model;
the analysis module is used for analyzing and processing the voice characteristic information to obtain a user voice awakening threshold corresponding to the target user, and the user voice awakening threshold is used for judging whether equipment awakening is carried out according to awakening voice of the target user.
9. A storage medium having stored thereon a computer program which, when executed by a processor of a computer, causes the computer to perform the method of any of claims 1 to 7.
10. An electronic device, comprising: a memory storing a computer program; a processor reading a computer program stored in a memory to perform the method of any one of claims 1 to 7.
CN202311582019.7A 2023-11-23 2023-11-23 Equipment awakening method and device, storage medium and electronic equipment Pending CN117612527A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311582019.7A CN117612527A (en) 2023-11-23 2023-11-23 Equipment awakening method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311582019.7A CN117612527A (en) 2023-11-23 2023-11-23 Equipment awakening method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN117612527A 2024-02-27

Family

ID=89949214

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311582019.7A Pending CN117612527A (en) 2023-11-23 2023-11-23 Equipment awakening method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN117612527A (en)

Similar Documents

Publication Publication Date Title
CN107564518B (en) Intelligent device control method and device and computer device
CN107943583B (en) Application processing method and device, storage medium and electronic equipment
CN107567083B (en) Method and device for performing power-saving optimization processing
CN110070857B (en) Model parameter adjusting method and device of voice awakening model and voice equipment
CN111091813B (en) Voice wakeup model updating and wakeup method, system, device, equipment and medium
CN110634468B (en) Voice wake-up method, device, equipment and computer readable storage medium
KR20150084941A (en) Systems and methods for classification of audio environments
CN108847216B (en) Voice processing method, electronic device and storage medium
CN111722696B (en) Voice data processing method and device for low-power-consumption equipment
CN110737322A (en) Information processing method and electronic equipment
CN110706691B (en) Voice verification method and device, electronic equipment and computer readable storage medium
CN117395699A (en) Monitoring factor energy-saving communication method and system based on Internet of things
CN117612527A (en) Equipment awakening method and device, storage medium and electronic equipment
US20200264683A1 (en) Electronic device and method for determining operating frequency of processor
CN113055984A (en) Terminal control method and device, mobile terminal and storage medium
CN116012439A (en) Control method and control system of intelligent control console based on multiple sensors
CN115295004A (en) Noise detection method, terminal equipment and storage medium
CN113889109A (en) Method for adjusting voice wake-up mode, storage medium and electronic device
CN114090054A (en) Intelligent equipment upgrading method and device, storage medium and electronic equipment
CN106886486B (en) Method and device for evaluating user terminal use attribute
CN113051126A (en) Image construction method, device and equipment and storage medium
CN111464644A (en) Data transmission method and electronic equipment
CN113383311A (en) Application processing method and device, storage medium and electronic equipment
CN112163709B (en) Method and device for electricity utilization promotion, storage medium, and electronic device
CN116128463B (en) Item reminding method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination