CN112133296A - Full-duplex voice control method, device, storage medium and voice equipment - Google Patents

Publication number
CN112133296A
Authority
CN
China
Prior art keywords
target object
voice
information
pronunciation direction
condition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010881215.4A
Other languages
Chinese (zh)
Inventor
陈士勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd
Priority to CN202010881215.4A
Publication of CN112133296A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/20: Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161: Detection; Localisation; Normalisation
    • G06V40/166: Detection; Localisation; Normalisation using acquisition arrangements
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; Face representation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; Face representation
    • G06V40/171: Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223: Execution procedure of a spoken command

Abstract

The disclosure relates to a full-duplex voice control method, an apparatus, a storage medium and a voice device, which address the technical problem in the related art that full-duplex voice interaction is easily affected by environmental factors, leading to false recognition and false execution. The method comprises: when the voice device is in a sound reception state, collecting biometric information of a target object in response to receiving a voice instruction issued by the target object; acquiring pronunciation direction information of the target object when the biometric information matches preset feature information; determining, according to the pronunciation direction information, whether the pronunciation direction of the target object faces the voice device; when the pronunciation direction of the target object faces the voice device, executing the operation corresponding to the voice instruction and extending the sound reception duration of the voice device; and when the pronunciation direction of the target object does not face the voice device, discarding the operation corresponding to the voice instruction and shortening the sound reception duration of the voice device.

Description

Full-duplex voice control method, device, storage medium and voice equipment
Technical Field
The present disclosure relates to the field of voice interaction technologies, and in particular, to a full duplex voice control method, apparatus, storage medium, and voice device.
Background
Voice interaction has become an indispensable way for people to interact with devices: a single spoken sentence can turn on a light, change the TV channel, and so on. How to improve the voice interaction experience and make it more natural has therefore become a topic of concern to users, and full-duplex voice is one direction for making voice interaction more natural.
In the related art, full-duplex voice works by keeping the microphone receiving sound at all times, or by extending the reception time within a certain sound reception period. This approach is easily affected by environmental factors, which causes false recognition and false execution.
Disclosure of Invention
To overcome technical problems in the related art, the present disclosure provides a full-duplex voice control method, apparatus, storage medium, and voice device.
According to a first aspect of the embodiments of the present disclosure, a full-duplex voice control method is provided, including:
when the voice device is in a sound reception state, collecting biometric information of a target object in response to receiving a voice instruction issued by the target object;
acquiring pronunciation direction information of the target object when the biometric information matches preset feature information;
determining, according to the pronunciation direction information, whether the pronunciation direction of the target object faces the voice device;
when the pronunciation direction of the target object faces the voice device, executing the operation corresponding to the voice instruction and extending the sound reception duration of the voice device;
and when the pronunciation direction of the target object does not face the voice device, discarding the operation corresponding to the voice instruction and shortening the sound reception duration of the voice device.
Optionally, the collecting of the biometric information of the target object includes:
collecting voiceprint information of the target object according to the voice instruction.
Optionally, the acquiring of the pronunciation direction information of the target object includes:
collecting image information of the target object through a camera, and determining face feature information and mouth shape feature information of the target object according to the image information;
determining a face orientation of the target object according to the image information, wherein the pronunciation direction information comprises the face orientation.
Optionally, the collecting of the biometric information of the target object includes:
collecting image information of the target object through a camera, and determining face feature information and mouth shape feature information of the target object according to the image information;
and obtaining the face feature information and the mouth shape feature information of the target object.
Optionally, the acquiring of the pronunciation direction information of the target object includes:
obtaining the already collected face feature information and mouth shape feature information of the target object;
determining a face orientation of the target object according to the image information, wherein the pronunciation direction information comprises the face orientation.
Optionally, the extending of the sound reception duration of the voice device includes:
extending the sound reception duration according to a preset growth gradient, wherein the growth gradient comprises a plurality of growth proportions, and each growth proportion is larger than the one applied the previous time;
the shortening of the sound reception duration of the voice device includes:
shortening the sound reception duration according to a preset shortening gradient, wherein the shortening gradient comprises a plurality of shortening proportions, and each shortening proportion is larger than the one applied the previous time.
Optionally, after shortening the sound reception duration of the voice device, the method further includes:
controlling the voice device to stop receiving sound when the shortened sound reception duration is less than a preset shortest sound reception duration threshold.
According to a second aspect of the embodiments of the present disclosure, there is provided a full-duplex voice control apparatus, including:
a first information acquisition module configured to collect biometric information of a target object in response to receiving a voice instruction issued by the target object when the voice device is in a sound reception state;
a second information acquisition module configured to acquire pronunciation direction information of the target object when the biometric information matches preset feature information;
a judging module configured to determine, according to the pronunciation direction information, whether the pronunciation direction of the target object faces the voice device;
a first execution module configured to execute the operation corresponding to the voice instruction and extend the sound reception duration of the voice device when the pronunciation direction of the target object faces the voice device;
and a second execution module configured to discard the operation corresponding to the voice instruction and shorten the sound reception duration of the voice device when the pronunciation direction of the target object does not face the voice device.
Optionally, the first information acquisition module is configured to collect voiceprint information of the target object according to the voice instruction.
Optionally, the second information acquisition module is configured to collect image information of the target object through a camera, and determine face feature information and mouth shape feature information of the target object according to the image information;
and to determine a face orientation of the target object according to the image information, wherein the pronunciation direction information comprises the face orientation.
Optionally, the first information acquisition module is configured to collect image information of the target object through a camera, and determine face feature information and mouth shape feature information of the target object according to the image information;
and to obtain the face feature information and the mouth shape feature information of the target object.
Optionally, the second information acquisition module is configured to obtain the already collected face feature information and mouth shape feature information of the target object;
and to determine a face orientation of the target object according to the image information, wherein the pronunciation direction information comprises the face orientation.
Optionally, the first execution module is configured to extend the sound reception duration according to a preset growth gradient, where the growth gradient includes a plurality of growth proportions, and each growth proportion is larger than the one applied the previous time;
the second execution module is configured to shorten the sound reception duration according to a preset shortening gradient, where the shortening gradient includes a plurality of shortening proportions, and each shortening proportion is larger than the one applied the previous time.
Optionally, the apparatus further includes a first sound reception control module configured to control the voice device to stop receiving sound when the shortened sound reception duration is less than a preset shortest sound reception duration threshold.
According to a third aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the steps of the full-duplex voice control method provided by the first aspect of the present disclosure.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a full-duplex voice control apparatus, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
when the voice device is in a sound reception state, collect biometric information of a target object in response to receiving a voice instruction issued by the target object;
acquire pronunciation direction information of the target object when the biometric information matches preset feature information;
determine, according to the pronunciation direction information, whether the pronunciation direction of the target object faces the voice device;
when the pronunciation direction of the target object faces the voice device, execute the operation corresponding to the voice instruction and extend the sound reception duration of the voice device;
and when the pronunciation direction of the target object does not face the voice device, discard the operation corresponding to the voice instruction and shorten the sound reception duration of the voice device.
According to a fifth aspect of the embodiments of the present disclosure, there is provided a voice device, which includes the full-duplex voice control apparatus provided in the second aspect of the present disclosure.
The technical solutions provided by the embodiments of the present disclosure may have the following beneficial effects. By identifying the biometric information of the target object that issues the voice instruction, the voice device responds, during a continuous conversation with the user, only to voice instructions from the user specified by the preset feature information (for example, the person who woke the voice device), which reduces the probability of false recognition and false execution. In addition, when a voice instruction is issued by a user not specified by the preset feature information, the voice device can shorten the sound reception duration, which reduces the time the voice device spends receiving sound in a noisy environment and further reduces the probability of false recognition and false execution.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flow diagram illustrating a full duplex voice control method in accordance with an example embodiment.
Fig. 2 is another flow chart illustrating a full duplex voice control method in accordance with an example embodiment.
Fig. 3 is another flow chart illustrating a method of full duplex voice control according to an example embodiment.
Fig. 4 is a block diagram illustrating a full-duplex voice control apparatus according to an example embodiment.
Fig. 5 is another block diagram illustrating a full-duplex voice control apparatus according to an example embodiment.
Fig. 6 is another block diagram illustrating a full-duplex voice control apparatus according to an example embodiment.
Fig. 7 is a block diagram illustrating a full-duplex voice control apparatus according to an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Fig. 1 is a flowchart illustrating a full-duplex voice control method according to an exemplary embodiment. The method may be used in a voice device, which may be, for example, a mobile terminal, a smart speaker, or a voice-controlled TV; the disclosure is not limited thereto. As shown in Fig. 1, the method comprises the following steps:
in step S110, in a case that the voice device is in a sound receiving state, in response to receiving a voice instruction issued by a target object, collecting biometric information of the target object;
in step S120, acquiring pronunciation direction information of the target object when the biometric information matches preset feature information;
in step S130, determining whether the pronunciation direction of the target object is towards the voice device according to the pronunciation direction information;
in step S140, when the pronunciation direction of the target object is towards the voice device, executing an operation corresponding to the voice instruction, and extending a sound reception duration of the voice device;
in step S150, when the pronunciation direction of the target object is not oriented to the voice device, the operation corresponding to the voice instruction is discarded, and the sound reception duration of the voice device is shortened.
Optionally, the biometric information may include different types of feature information, such as voiceprint information, facial information, mouth shape feature information, and the like;
the preset feature information may include preset voiceprint information, preset face information, preset mouth shape feature information, and the like corresponding to the biometric feature information.
In this embodiment, voiceprint recognition, face recognition and mouth shape recognition are used to determine that the target object is the user specified by the preset feature information (for example, the person who woke the voice device), and the pronunciation direction of the target object is then used to determine whether the target object is issuing an instruction to the voice device. When the pronunciation direction of the target object faces the voice device, the operation corresponding to the voice instruction is executed and the sound reception duration of the voice device is extended. When the pronunciation direction of the target object does not face the voice device (for example, the target object may be talking with other users), the operation corresponding to the voice instruction is discarded and the sound reception duration of the voice device is shortened. This achieves intelligent sound reception: the time the voice device spends receiving sound in a noisy environment is reduced, the voice device responds only to instructions issued by the user specified by the preset feature information during the interaction, and the probability of false recognition and false execution is reduced. By combining voiceprint recognition, face recognition and mouth shape recognition with the determination of the sound reception time window, the voice device, once woken, can hold a continuous conversation with the user specified by the preset feature information, which improves the voice interaction experience.
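The decision flow of steps S120 to S150 can be sketched as follows. This is a minimal illustration, not the patented implementation: `Session`, `Outcome`, `handle_instruction` and the `extend`/`shorten` callables are hypothetical names, and the two boolean inputs stand in for the results of the biometric matcher and the pronunciation direction check. The step list does not say what happens when the biometric information does not match, so the sketch assumes the instruction is simply ignored and the duration left unchanged.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    IGNORED = "ignored"      # assumption: biometric mismatch, nothing happens
    EXECUTED = "executed"    # step S140
    DISCARDED = "discarded"  # step S150


@dataclass
class Session:
    reception_s: float  # current sound reception duration, in seconds


def handle_instruction(session, matches_preset, facing_device, extend, shorten):
    """Apply steps S120-S150 to one received voice instruction.

    `extend` and `shorten` are callables mapping the current duration to
    the new one; the policy they encode is left to the caller.
    """
    if not matches_preset:
        # S120 precondition not met; handling here is an assumption.
        return Outcome.IGNORED
    if facing_device:
        # S130 yes -> S140: execute the instruction and extend the window.
        session.reception_s = extend(session.reception_s)
        return Outcome.EXECUTED
    # S130 no -> S150: discard the instruction and shorten the window.
    session.reception_s = shorten(session.reception_s)
    return Outcome.DISCARDED
```

Passing the extension and shortening policies in as callables keeps the decision logic separate from how the duration is actually adjusted.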
Optionally, the collecting of the biometric information of the target object in step S110 may be implemented by:
collecting voiceprint information of the target object according to the voice instruction.
Optionally, the collecting of the biometric information of the target object in step S110 may also be implemented by:
collecting image information of the target object through a camera, and determining face feature information and mouth shape feature information of the target object according to the image information;
the preset feature information comprises preset voiceprint information, face feature information and mouth shape feature information of the person who woke the voice device.
Optionally, in step S120, acquiring the pronunciation direction information of the target object when the biometric information matches the preset feature information may be implemented by:
collecting image information of the target object through a camera, and determining face feature information and mouth shape feature information of the target object according to the image information;
determining a face orientation of the target object according to the image information, wherein the pronunciation direction information comprises the face orientation.
Optionally, in step S120, acquiring the pronunciation direction information of the target object when the biometric information matches the preset feature information may also be implemented by:
obtaining the already collected face feature information and mouth shape feature information of the target object;
determining a face orientation of the target object according to the image information, wherein the pronunciation direction information comprises the face orientation.
In this embodiment, when the target object is outside the capture range of the camera, voiceprint recognition is used to determine that the target object is the person who woke the voice device; when the target object is within the capture range of the camera, face recognition and mouth shape recognition are used together to make that determination. The voice device therefore responds only to instructions issued by the person who woke it. Because the identity of the target object is verified in whichever way suits the actual situation of the voice device and the target object in the current scene, identity verification becomes more flexible, and other objects are prevented from interfering with the interaction between the voice device and the person who woke it.
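The choice between the two verification paths can be sketched as below. The function name and its boolean parameters are hypothetical; each flag stands for the result of a matcher (voiceprint, face, mouth shape) that the text does not specify further.

```python
def is_waker(in_camera_range, voiceprint_match, face_match, mouth_shape_match):
    """Decide whether the target object is the person who woke the device.

    Within the camera's capture range, face and mouth shape recognition are
    combined; outside it, the decision falls back to the voiceprint alone.
    """
    if in_camera_range:
        return face_match and mouth_shape_match
    return voiceprint_match
```

For instance, a speaker standing outside the camera's view is accepted on a voiceprint match alone, while a speaker in view must pass both image-based checks.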
Further, when the target object is issuing an instruction to the voice device, the acquired pronunciation direction of the target object faces the voice device; the voice device then executes the operation corresponding to the voice instruction currently issued by the target object and extends its sound reception duration. When the target object is talking with other objects, the acquired face orientation of the target object does not face the voice device; the voice device then discards the operation corresponding to the voice instruction currently issued by the target object and shortens its sound reception duration. The voice device can thus recognize whether the target object is actually issuing an instruction, and avoids false recognition and false operation triggered by the conversation between the target object and other objects.
Optionally, in step S140, the sound reception duration of the voice device may be extended according to a preset growth gradient, where the growth gradient includes a plurality of growth proportions, and each growth proportion is larger than the one applied the previous time;
in step S150, the sound reception duration of the voice device may be shortened according to a preset shortening gradient, where the shortening gradient includes a plurality of shortening proportions, and each shortening proportion is larger than the one applied the previous time.
The preset growth gradient and the preset shortening gradient may be set in advance according to the specific circumstances of the human-computer interaction, and the disclosure does not specifically limit them.
For example, in this embodiment, the preset growth gradient is 5%, 10%, 15% and 30%, the preset shortening gradient is 5%, 10%, 15% and 30%, and the initial sound reception duration of the voice device is 10 s;
when an instruction from the target object is received for the first time and the pronunciation direction of the target object is determined to face the voice device, the operation corresponding to the voice instruction is executed and the sound reception duration of the voice device is extended to 10.5 s; when an instruction from the target object is received for the second time and the pronunciation direction of the target object is again determined to face the voice device, the operation corresponding to the voice instruction is executed and the sound reception duration of the voice device is extended to 11.5 s; and when an instruction from the target object is received for the third time and the pronunciation direction of the target object is determined not to face the voice device, the operation corresponding to the voice instruction is discarded and the sound reception duration of the voice device is shortened to 11 s.
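The figures in this example (10.5 s, 11.5 s, 11 s) are reproduced exactly if each proportion is applied to the initial 10 s duration and the growth and shortening gradients advance independently. The sketch below encodes that reading. It is an assumption, since the text does not state the base of the proportions; the class name `GradientWindow` and the choice to keep reusing the last proportion once a gradient is exhausted are also hypothetical.

```python
class GradientWindow:
    """Sound reception duration adjusted by growth and shortening gradients."""

    def __init__(self, initial_s=10.0,
                 growth_pct=(5, 10, 15, 30),
                 shorten_pct=(5, 10, 15, 30)):
        self.initial_s = initial_s
        self.duration_s = initial_s
        self.growth_pct = growth_pct
        self.shorten_pct = shorten_pct
        self._growth_step = 0    # how many extensions have been applied
        self._shorten_step = 0   # how many shortenings have been applied

    @staticmethod
    def _pick(gradient, step):
        # Assumption: once the gradient is exhausted, reuse its last entry.
        return gradient[min(step, len(gradient) - 1)]

    def extend(self):
        pct = self._pick(self.growth_pct, self._growth_step)
        self._growth_step += 1
        # Assumption: the proportion is taken of the initial duration.
        self.duration_s += self.initial_s * pct / 100
        return self.duration_s

    def shorten(self):
        pct = self._pick(self.shorten_pct, self._shorten_step)
        self._shorten_step += 1
        self.duration_s -= self.initial_s * pct / 100
        return self.duration_s


window = GradientWindow()
# Two instructions facing the device, then one not facing it.
trace = [window.extend(), window.extend(), window.shorten()]
print(trace)  # [10.5, 11.5, 11.0]
```

Applying each proportion to the current duration instead would give roughly 10.5 s, 11.55 s and 10.97 s, which agrees with the stated values only after rounding.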
In step S140, the sound reception duration of the voice device may alternatively be extended to a first preset sound reception duration;
in step S150, the sound reception duration of the voice device may alternatively be shortened to a second preset sound reception duration;
the first preset sound reception duration is longer than the second preset sound reception duration.
The first preset sound reception duration and the second preset sound reception duration may be set in advance according to the specific circumstances of the human-computer interaction, and the disclosure does not specifically limit them.
For example, in this embodiment, the first preset sound reception duration is 30 s, the second preset sound reception duration is 8 s, and the initial sound reception duration of the voice device is 10 s.
When the pronunciation direction of the target object is determined to face the voice device, the operation corresponding to the voice instruction is executed and the sound reception duration of the voice device is extended to 30 s; when the pronunciation direction of the target object is determined not to face the voice device, the operation corresponding to the voice instruction is discarded and the sound reception duration of the voice device is shortened to 8 s.
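The fixed-preset variant reduces to choosing one of two durations. A minimal sketch, assuming the 30 s and 8 s values from the example as defaults; the function name is hypothetical, and the check that the first preset exceeds the second mirrors the constraint stated above.

```python
def preset_duration(facing_device, first_preset_s=30.0, second_preset_s=8.0):
    """Return the new sound reception duration under the two-preset scheme."""
    if first_preset_s <= second_preset_s:
        raise ValueError("first preset must be longer than second preset")
    # Facing the device -> extend to the first preset (S140);
    # not facing the device -> shorten to the second preset (S150).
    return first_preset_s if facing_device else second_preset_s
```

With an initial duration of 10 s, a facing speaker moves the window to 30 s and a non-facing speaker moves it to 8 s, regardless of earlier adjustments.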
In this embodiment, the sound reception duration of the voice device is adjusted according to whether the pronunciation direction of the target object faces the voice device, so that the voice device receives sound more intelligently, responds more flexibly to voice instructions issued by the target object, and provides a better voice interaction experience.
Fig. 2 is another flowchart illustrating a full-duplex voice control method according to an exemplary embodiment, and as shown in fig. 2, the full-duplex voice control method may be used in a voice device, for example, a mobile terminal, a smart audio device, a voice tv, and the like, which is not limited by the present disclosure, and the method includes the following steps:
in step S110, in a case that the voice device is in a sound receiving state, in response to receiving a voice instruction issued by a target object, collecting biometric information of the target object;
in step S120, acquiring pronunciation direction information of the target object when the biometric information matches preset feature information;
in step S130, determining whether the pronunciation direction of the target object is towards the voice device according to the pronunciation direction information;
in step S140, when the pronunciation direction of the target object is towards the voice device, executing an operation corresponding to the voice instruction, and extending a sound reception duration of the voice device;
in step S150, when the pronunciation direction of the target object is not oriented to the voice device, the operation corresponding to the voice instruction is discarded, and the sound reception duration of the voice device is shortened.
in step S160, the voice device is controlled to stop receiving sound when the shortened sound reception duration is less than a preset shortest sound reception duration threshold.
In step S160, the shortest sound reception duration threshold may be the shortest time in which the voice device can capture a valid voice instruction, set according to the capability of the voice device, or may be a duration preset according to the specific circumstances of the human-computer interaction; the disclosure does not limit this.
For example, if the sound reception duration has been shortened to 1.5 s after step S150 is executed and this is below the shortest sound reception duration threshold, the voice device may be directly controlled to stop receiving sound.
In this embodiment, setting a shortest sound reception duration threshold avoids the situation in which the sound reception time of the voice device is too short to capture a complete instruction issued by the target object, and saves power consumption of the voice device.
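Step S160 can be sketched as a check applied right after the shortening of step S150. The function name and the `shorten` callable are hypothetical, and the 2.0 s threshold used in the usage lines is an illustrative value only, since the text does not give one.

```python
def shorten_and_check(duration_s, shorten, min_threshold_s):
    """Shorten the duration (S150), then decide whether to keep listening (S160).

    Returns (shortened_duration, keep_listening). Reception stops when the
    shortened duration is strictly below the shortest-duration threshold.
    """
    shortened = shorten(duration_s)
    return shortened, shortened >= min_threshold_s


# Hypothetical threshold of 2.0 s: shortening 2.0 s to 1.5 s stops reception.
print(shorten_and_check(2.0, lambda d: d - 0.5, 2.0))   # (1.5, False)
print(shorten_and_check(10.0, lambda d: d - 0.5, 2.0))  # (9.5, True)
```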
Fig. 3 is another flowchart illustrating a full-duplex voice control method according to an exemplary embodiment. As shown in Fig. 3, the method may be used in a voice device, for example a mobile terminal, a smart audio device, or a voice TV, which is not limited by the present disclosure. The method includes the following steps:
in step S110, in a case that the voice device is in a sound receiving state, in response to receiving a voice instruction issued by a target object, collecting biometric information of the target object;
in step S120, acquiring pronunciation direction information of the target object when the biometric information matches preset feature information;
in step S130, determining whether the pronunciation direction of the target object is towards the voice device according to the pronunciation direction information;
in step S140, when the pronunciation direction of the target object is towards the voice device, executing an operation corresponding to the voice instruction, and extending a sound reception duration of the voice device;
in step S150, when the pronunciation direction of the target object is not towards the voice device, discarding the operation corresponding to the voice instruction, and shortening the sound reception duration of the voice device;
in step S170, controlling the voice device to stop receiving sound at the end time of the sound reception duration.
For example, in this embodiment, the sound reception duration of the voice device may be 10 s; when the remaining time counts down to 0 s, the voice device is controlled to stop receiving sound.
In this embodiment, the voice device is controlled to stop receiving sound at the end time of the sound reception duration, so that it does not keep receiving sound after that time, which saves power consumption of the voice device.
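Step S170 amounts to tracking a deadline that earlier steps can move. The following Python sketch is one possible way to model it; the class `ReceptionWindow`, its method names, and the injectable clock are all assumptions made for illustration and are not part of the disclosure.

```python
import time
from typing import Callable


class ReceptionWindow:
    """Tracks when sound reception should end (hypothetical helper)."""

    def __init__(self, duration_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._deadline = clock() + duration_s

    def remaining(self) -> float:
        """Seconds left before the end time, never negative."""
        return max(0.0, self._deadline - self._clock())

    def adjust(self, delta_s: float) -> None:
        """Extend (positive) or shorten (negative) the window."""
        self._deadline += delta_s

    def should_stop(self) -> bool:
        """True once the end time of the reception duration is reached."""
        return self._clock() >= self._deadline
```

Passing a fake clock makes the 10 s example easy to check without waiting: the window reports 10 s remaining at the start and `should_stop()` becomes true once the clock has advanced by 10 s.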
Fig. 4 is a block diagram illustrating a full-duplex voice control apparatus according to an exemplary embodiment. The apparatus may be implemented as part or all of a voice device by software, hardware, or a combination of both. As shown in Fig. 4, the full-duplex voice control apparatus 400 includes:
a first information acquisition module 401 configured to, in a case where the voice device is in a sound reception state, collect biometric information of a target object in response to receiving a voice instruction issued by the target object;
a second information obtaining module 402 configured to obtain pronunciation direction information of the target object if the biometric information matches preset feature information;
a judging module 403 configured to determine whether the pronunciation direction of the target object is towards the voice device according to the pronunciation direction information;
a first execution module 404 configured to, in a case that the pronunciation direction of the target object is towards the voice device, execute an operation corresponding to the voice instruction, and extend a sound reception duration of the voice device;
a second execution module 405 configured to discard the operation corresponding to the voice instruction and shorten the sound reception duration of the voice device when the pronunciation direction of the target object is not towards the voice device.
In this embodiment, the full-duplex voice control apparatus first uses the biometric information of the target object to confirm that the target object is the user specified by the preset feature information (for example, the person who woke up the voice device), and then uses the pronunciation direction of the target object to determine whether the target object is issuing an instruction to the voice device. When the pronunciation direction is towards the voice device, the apparatus executes the operation corresponding to the voice instruction and extends the sound reception duration of the voice device. When the pronunciation direction is not towards the voice device (for example, when the target object is talking with other users), the apparatus discards the operation corresponding to the voice instruction and shortens the sound reception duration of the voice device. This realizes intelligent sound reception: the sound reception duration is reduced when there is excessive environmental noise, the voice device responds during the conversation only to instructions initiated by the user specified by the preset feature information, misrecognition and mis-execution are avoided, and the voice interaction experience is improved.
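The cooperation of modules 401 to 405 can be pictured as a small pipeline. The Python sketch below is illustrative only: the class `FullDuplexVoiceControl`, its field names, and the returned status strings are hypothetical, each module is reduced to a plain callable, and ignoring a non-matching speaker is an assumption rather than something the disclosure specifies.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FullDuplexVoiceControl:
    collect_biometric: Callable[[bytes], Any]     # role of module 401
    matches_preset: Callable[[Any], bool]         # match against preset features
    get_direction_info: Callable[[], Any]         # role of module 402
    is_toward_device: Callable[[Any], bool]       # role of module 403
    execute_and_extend: Callable[[bytes], None]   # role of module 404
    discard_and_shorten: Callable[[bytes], None]  # role of module 405

    def on_voice_instruction(self, audio: bytes) -> str:
        """Route one voice instruction through the five modules."""
        biometric = self.collect_biometric(audio)
        if not self.matches_preset(biometric):
            # Assumption: a non-matching speaker is simply ignored.
            return "ignored"
        info = self.get_direction_info()
        if self.is_toward_device(info):
            self.execute_and_extend(audio)
            return "executed"
        self.discard_and_shorten(audio)
        return "discarded"
```

Keeping each module as an injected callable mirrors the statement that the apparatus may be realized in software, hardware, or a combination of both, since any of the callables could wrap a hardware component.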
Optionally, the first information obtaining module 401 may be specifically configured to collect voiceprint information of the target object according to the voice instruction.
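The disclosure does not say how collected voiceprint information is compared with the preset feature information. One common approach, shown here purely as an assumption, is to represent each voiceprint as a numeric vector and accept the match when the cosine similarity exceeds a threshold. The function names and the 0.8 threshold are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def matches_preset(voiceprint: Sequence[float],
                   preset: Sequence[float],
                   threshold: float = 0.8) -> bool:
    """Return True when the voiceprint is close enough to the preset one."""
    if len(voiceprint) != len(preset):
        raise ValueError("voiceprint vectors must have the same length")
    return cosine_similarity(voiceprint, preset) >= threshold
```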
Optionally, the first information obtaining module 401 may be further specifically configured to obtain image information of the target object through a camera, determine face feature information and mouth shape feature information of the target object according to the image information,
and collect the face feature information and the mouth shape feature information of the target object as the biometric information.
Optionally, the second information obtaining module 402 may be specifically configured to collect image information of the target object through a camera, determine face feature information and mouth shape feature information of the target object according to the image information,
and determine a face orientation of the target object according to the image information, wherein the pronunciation direction information comprises the face orientation.
Optionally, the second information obtaining module 402 may be further configured to obtain the face feature information and mouth shape feature information of the target object that have already been collected,
and determine a face orientation of the target object according to the image information, wherein the pronunciation direction information comprises the face orientation.
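How the face orientation is turned into a "towards the voice device" decision is not spelled out. As a rough sketch under stated assumptions, suppose a head-pose estimator (not described in the disclosure) yields yaw and pitch angles in degrees relative to the camera on the voice device, with 0 meaning the face looks straight at it. The function name and the tolerance values below are hypothetical.

```python
def is_facing_device(yaw_deg: float,
                     pitch_deg: float,
                     max_yaw_deg: float = 30.0,
                     max_pitch_deg: float = 20.0) -> bool:
    """Treat the face as oriented towards the device when both head-pose
    angles stay within the assumed tolerances."""
    return abs(yaw_deg) <= max_yaw_deg and abs(pitch_deg) <= max_pitch_deg
```

A face turned 10 degrees sideways would count as towards the device under these tolerances, while one turned 45 degrees (as when talking to someone else) would not.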
Optionally, the first executing module 404 may be specifically configured to extend the sound reception duration according to a preset growth gradient, where the growth gradient includes a plurality of growth proportions, and each successive growth proportion is greater than the previous one.
Optionally, the second executing module 405 may be specifically configured to shorten the sound reception duration according to a preset shortening gradient, where the shortening gradient includes a plurality of shortening proportions, and each successive shortening proportion is greater than the previous one.
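The growth and shortening gradients can be sketched as increasing sequences of proportions that are applied one step at a time. In the Python sketch below, the proportion values, the choice to apply each proportion to the current duration, and the choice to reuse the last proportion once the sequence is exhausted are all assumptions; only the requirement that each proportion exceed the previous one comes from the text.

```python
from typing import Sequence

GROWTH_GRADIENT = (0.10, 0.20, 0.40)     # hypothetical growth proportions
SHORTEN_GRADIENT = (0.10, 0.25, 0.50)    # hypothetical shortening proportions


def _check_increasing(gradient: Sequence[float]) -> None:
    """Each proportion must be greater than the one before it."""
    if not gradient:
        raise ValueError("gradient must contain at least one proportion")
    if any(b <= a for a, b in zip(gradient, gradient[1:])):
        raise ValueError("proportions must be strictly increasing")


def extend(duration_s: float, times_extended: int,
           gradient: Sequence[float] = GROWTH_GRADIENT) -> float:
    """Extend the duration by the proportion for this extension step."""
    _check_increasing(gradient)
    ratio = gradient[min(times_extended, len(gradient) - 1)]
    return duration_s * (1 + ratio)


def shorten(duration_s: float, times_shortened: int,
            gradient: Sequence[float] = SHORTEN_GRADIENT) -> float:
    """Shorten the duration by the proportion for this shortening step."""
    _check_increasing(gradient)
    ratio = gradient[min(times_shortened, len(gradient) - 1)]
    return duration_s * (1 - ratio)
```

Under these assumed values, repeated off-device speech shrinks the window faster each time (10%, then 25%, then 50%), while repeated on-device instructions grow it faster each time.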
Fig. 5 is another block diagram illustrating a full-duplex voice control apparatus according to an exemplary embodiment. The apparatus may be implemented as part or all of a voice device by software, hardware, or a combination of both. As shown in Fig. 5, the full-duplex voice control apparatus 400 includes:
the first information acquisition module 401 is configured to, in a case where the voice device is in a sound reception state, acquire biometric information of a target object in response to receiving a voice instruction issued by the target object;
a second information obtaining module 402 configured to obtain pronunciation direction information of the target object if the biometric information matches preset feature information;
a judging module 403 configured to determine whether the pronunciation direction of the target object is towards the voice device according to the pronunciation direction information;
the first execution module 404 is configured to, in a case that the pronunciation direction of the target object is towards the voice device, execute an operation corresponding to the voice instruction, and extend a sound reception duration of the voice device;
the second executing module 405 is configured to discard the operation corresponding to the voice instruction and shorten the sound reception duration of the voice device when the pronunciation direction of the target object is not towards the voice device;
the first sound reception control module 406 is configured to control the voice device to stop receiving sound if the shortened sound reception duration is less than a preset shortest sound reception duration threshold.
By applying the shortest sound reception duration threshold, the first sound reception control module 406 stops sound reception instead of leaving the voice device with a window that is too short to receive a complete instruction issued by the target object.
Fig. 6 is another block diagram illustrating a full-duplex voice control apparatus according to an exemplary embodiment. The apparatus may be implemented as part or all of a voice device by software, hardware, or a combination of both. As shown in Fig. 6, the full-duplex voice control apparatus 400 includes:
the first information acquisition module 401 is configured to, in a case where the voice device is in a sound reception state, acquire biometric information of a target object in response to receiving a voice instruction issued by the target object;
a second information obtaining module 402 configured to obtain pronunciation direction information of the target object if the biometric information matches preset feature information;
a judging module 403 configured to determine whether the pronunciation direction of the target object is towards the voice device according to the pronunciation direction information;
the first execution module 404 is configured to, in a case that the pronunciation direction of the target object is towards the voice device, execute an operation corresponding to the voice instruction, and extend a sound reception duration of the voice device;
the second executing module 405 is configured to discard the operation corresponding to the voice instruction and shorten the sound reception duration of the voice device when the pronunciation direction of the target object is not towards the voice device;
the second sound reception control module 407 is configured to control the voice device to stop receiving sound at the end time of the sound reception duration.
The second sound reception control module 407 controls the voice device to stop receiving sound at the end time of the sound reception duration, so that the voice device does not keep receiving sound after that time.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
The present disclosure also provides a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the full-duplex voice control method provided by the present disclosure.
Specifically, the computer-readable storage medium may be a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, etc.
With regard to the computer-readable storage medium in the above-described embodiments, the method steps performed when the stored computer program is executed have been described in detail in the embodiments related to the method, and will not be elaborated here.
The present disclosure also provides a full-duplex voice control apparatus, which may be a computer, a platform device, or the like. The full-duplex voice control apparatus includes:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
in a case that the voice device is in a sound reception state, in response to receiving a voice instruction issued by a target object, collect biometric information of the target object;
acquire pronunciation direction information of the target object in a case that the biometric information matches preset feature information;
determine, according to the pronunciation direction information, whether the pronunciation direction of the target object is towards the voice device;
in a case that the pronunciation direction of the target object is towards the voice device, execute an operation corresponding to the voice instruction, and extend a sound reception duration of the voice device;
and in a case that the pronunciation direction of the target object is not towards the voice device, discard the operation corresponding to the voice instruction, and shorten the sound reception duration of the voice device.
The full-duplex voice control apparatus first uses the biometric information of the target object to confirm that the target object is the user specified by the preset feature information (for example, the person who woke up the voice device), and then uses the pronunciation direction of the target object to determine whether the target object is issuing an instruction to the voice device. When the pronunciation direction is towards the voice device, the apparatus executes the operation corresponding to the voice instruction and extends the sound reception duration of the voice device. When the pronunciation direction is not towards the voice device (for example, when the target object is talking with other users), the apparatus discards the operation corresponding to the voice instruction and shortens the sound reception duration of the voice device. This realizes intelligent sound reception: the sound reception duration is reduced when there is excessive environmental noise, the voice device responds during the conversation only to instructions initiated by the user specified by the preset feature information, misrecognition and mis-execution are avoided, and the voice interaction experience is improved.
Fig. 7 is a block diagram illustrating a full-duplex voice control apparatus 800 according to an example embodiment. As shown in fig. 7, the full-duplex voice control apparatus 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls the overall operations of the apparatus 800, such as operations associated with imaging and interactive recording. The processing component 802 may include one or more processors 820 to execute instructions to perform all or some of the steps of the full-duplex voice control method described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the apparatus 800. Examples of such data include instructions for any application or method operating on the apparatus 800, voiceprint information of the person who wakes up the voice device, face feature information, mouth shape feature information, and the like. The memory 804 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.
Power component 806 provides power to the various components of device 800. The power components 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 800.
In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. When the device 800 is in an operating mode, such as a shooting mode or a video mode, the front-facing camera and/or the rear-facing camera may capture image information of the target object. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, audio component 810 includes a Microphone (MIC) configured to receive voice commands from a target object when apparatus 800 is in an operational mode, such as a speech recognition mode. The received voice instructions may further be stored in memory 804 or transmitted via communications component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals responsive to voice commands.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, buttons, etc. These buttons may include, but are not limited to: a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the device 800. For example, the sensor component 814 can detect the relative positioning of the apparatus 800 and the target object. Sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact.
The communication component 816 is configured to facilitate communications between the apparatus 800 and other devices in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the full-duplex voice control methods described above.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, where the instructions are executable by the processor 820 of the apparatus 800 to perform the full-duplex voice control method described above. For example, the non-transitory computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In another exemplary embodiment, there is also provided a voice device including the full-duplex voice control apparatus described above.
The voice device first uses the biometric information of the target object to confirm that the target object is the user specified by the preset feature information (for example, the person who woke up the voice device), and then uses the pronunciation direction of the target object to determine whether the target object is issuing an instruction to the voice device. When the pronunciation direction is towards the voice device, the voice device executes the operation corresponding to the voice instruction and extends its sound reception duration. When the pronunciation direction is not towards the voice device (for example, when the target object is talking with other users), the voice device discards the operation corresponding to the voice instruction and shortens its sound reception duration. This realizes intelligent sound reception: the sound reception duration is reduced when there is excessive environmental noise, the voice device responds during the conversation only to instructions initiated by the user specified by the preset feature information, misrecognition and mis-execution are avoided, and the voice interaction experience is improved.
Optionally, the voice device may be a sound box, an air conditioner, a television, or the like.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (17)

1. A full duplex voice control method, the method comprising:
in a case that a voice device is in a sound reception state, in response to receiving a voice instruction issued by a target object, collecting biometric information of the target object;
acquiring pronunciation direction information of the target object in a case that the biometric information matches preset feature information;
determining, according to the pronunciation direction information, whether the pronunciation direction of the target object is towards the voice device;
in a case that the pronunciation direction of the target object is towards the voice device, executing an operation corresponding to the voice instruction, and extending a sound reception duration of the voice device;
and in a case that the pronunciation direction of the target object is not towards the voice device, discarding the operation corresponding to the voice instruction, and shortening the sound reception duration of the voice device.
2. The method of claim 1, wherein the acquiring biometric information of the target object comprises:
and acquiring the voiceprint information of the target object according to the voice instruction.
3. The method according to claim 1, wherein the obtaining pronunciation direction information of the target object comprises:
acquiring image information of the target object through a camera, and determining face characteristic information and mouth shape characteristic information of the target object according to the image information;
determining a face orientation of the target object according to the image information, wherein the pronunciation direction information comprises the face orientation.
4. The method of claim 1, wherein the acquiring biometric information of the target object comprises:
acquiring image information of the target object through a camera, and determining face characteristic information and mouth shape characteristic information of the target object according to the image information;
and acquiring the face characteristic information and the mouth shape characteristic information of the target object.
5. The method according to claim 4, wherein the obtaining of the pronunciation direction information of the target object comprises:
acquiring the acquired face characteristic information and mouth shape characteristic information of the target object;
determining a face orientation of the target object according to the image information, wherein the pronunciation direction information comprises the face orientation.
6. The method according to any one of claims 1 to 5, wherein
the extending of the sound reception duration of the voice device comprises: extending the sound reception duration according to a preset growth gradient, wherein the growth gradient comprises a plurality of growth proportions, and each successive growth proportion is greater than the previous one;
the shortening of the sound reception duration of the voice device comprises: shortening the sound reception duration according to a preset shortening gradient, wherein the shortening gradient comprises a plurality of shortening proportions, and each successive shortening proportion is greater than the previous one.
7. The method according to any one of claims 1-5, wherein after shortening the sound reception duration of the voice device, the method further comprises:
controlling the voice device to stop receiving sound in a case that the shortened sound reception duration is less than a preset shortest sound reception duration threshold.
8. A full-duplex voice control apparatus, the apparatus comprising:
a first information acquisition module configured to, in a case that a voice device is in a sound reception state, collect biometric information of a target object in response to receiving a voice instruction issued by the target object;
a second information acquisition module configured to acquire pronunciation direction information of the target object in a case that the biometric information matches preset feature information;
a judging module configured to determine whether the pronunciation direction of the target object is towards the voice device according to the pronunciation direction information;
a first execution module configured to execute an operation corresponding to the voice instruction and extend a sound reception duration of the voice device in a case that the pronunciation direction of the target object is towards the voice device;
and a second execution module configured to discard the operation corresponding to the voice instruction and shorten the sound reception duration of the voice device in a case that the pronunciation direction of the target object is not towards the voice device.
9. The apparatus of claim 8, wherein the first information obtaining module is configured to collect voiceprint information of the target object according to the voice instruction.
10. The apparatus according to claim 8, wherein the second information obtaining module is configured to collect image information of the target object through a camera, and determine face feature information and mouth shape feature information of the target object according to the image information;
determining a face orientation of the target object according to the image information, wherein the pronunciation direction information comprises the face orientation.
11. The apparatus according to claim 8, wherein the first information acquisition module is configured to acquire image information of the target object through a camera, and determine facial feature information and mouth shape feature information of the target object according to the image information;
and acquiring the face characteristic information and the mouth shape characteristic information of the target object.
12. The apparatus according to claim 11, wherein the second information obtaining module is configured to obtain the acquired face feature information and mouth shape feature information of the target object;
determining a face orientation of the target object according to the image information, wherein the pronunciation direction information comprises the face orientation.
13. The apparatus according to any one of claims 8-12, wherein the first execution module is configured to extend the sound reception time period according to a preset increase gradient, the increase gradient comprises a plurality of increase proportions, and the increase proportion of the next time is larger than the increase proportion of the previous time;
the second execution module is configured to shorten the sound reception time according to a preset shortening gradient, the shortening gradient comprises a plurality of shortening proportions, and the shortening proportion of the next time is larger than that of the previous time.
14. The apparatus according to any one of claims 8-12, further comprising a first sound reception control module configured to control the voice device to stop receiving sound if the shortened sound reception time is less than a preset shortest sound reception time threshold.
15. A computer readable storage medium having computer program instructions stored thereon, which when executed by a processor implement the steps of the full duplex voice control method of any of claims 1 to 7.
16. A full-duplex voice control apparatus, the apparatus comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
in a case that a voice device is in a sound reception state, in response to receiving a voice instruction issued by a target object, collect biometric information of the target object;
acquire pronunciation direction information of the target object in a case that the biometric information matches preset feature information;
determine, according to the pronunciation direction information, whether the pronunciation direction of the target object is towards the voice device;
in a case that the pronunciation direction of the target object is towards the voice device, execute an operation corresponding to the voice instruction, and extend a sound reception duration of the voice device;
and in a case that the pronunciation direction of the target object is not towards the voice device, discard the operation corresponding to the voice instruction, and shorten the sound reception duration of the voice device.
17. A voice device, characterized in that the voice device comprises the full-duplex voice control apparatus according to claim 16.
CN202010881215.4A 2020-08-27 2020-08-27 Full-duplex voice control method, device, storage medium and voice equipment Pending CN112133296A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010881215.4A CN112133296A (en) 2020-08-27 2020-08-27 Full-duplex voice control method, device, storage medium and voice equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010881215.4A CN112133296A (en) 2020-08-27 2020-08-27 Full-duplex voice control method, device, storage medium and voice equipment

Publications (1)

Publication Number Publication Date
CN112133296A (en) 2020-12-25

Family

ID=73848659

Family Applications (1)

Application Number Priority Date Filing Date Title
CN202010881215.4A Pending CN112133296A (en) 2020-08-27 2020-08-27 Full-duplex voice control method, device, storage medium and voice equipment

Country Status (1)

Country Link
CN (1) CN112133296A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113539295A (en) * 2021-06-10 2021-10-22 联想(北京)有限公司 Voice processing method and device
CN115086095A (en) * 2021-03-10 2022-09-20 Oppo广东移动通信有限公司 Equipment control method and related device
CN113539295B (en) * 2021-06-10 2024-04-23 联想(北京)有限公司 Voice processing method and device

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017083713A (en) * 2015-10-29 2017-05-18 シャープ株式会社 Interaction device, interaction equipment, control method for interaction device, control program, and recording medium
CN107291451A (en) * 2017-05-25 2017-10-24 深圳市冠旭电子股份有限公司 Voice awakening method and device
US20180146048A1 (en) * 2016-11-18 2018-05-24 Lenovo (Singapore) Pte. Ltd. Contextual conversation mode for digital assistant
CN109067628A (en) * 2018-09-05 2018-12-21 广东美的厨房电器制造有限公司 Sound control method, control device and the intelligent appliance of intelligent appliance
WO2019007245A1 (en) * 2017-07-04 2019-01-10 阿里巴巴集团控股有限公司 Processing method, control method and recognition method, and apparatus and electronic device therefor
US20190371343A1 (en) * 2018-06-05 2019-12-05 Samsung Electronics Co., Ltd. Voice assistant device and method thereof
CN110689889A (en) * 2019-10-11 2020-01-14 深圳追一科技有限公司 Man-machine interaction method and device, electronic equipment and storage medium
CN111367491A (en) * 2020-03-02 2020-07-03 成都极米科技股份有限公司 Voice interaction method and device, electronic equipment and storage medium
CN111370004A (en) * 2018-12-25 2020-07-03 阿里巴巴集团控股有限公司 Man-machine interaction method, voice processing method and equipment
CN111383633A (en) * 2018-12-29 2020-07-07 深圳Tcl新技术有限公司 Voice recognition continuity control method and device, intelligent terminal and storage medium
CN111402900A (en) * 2018-12-29 2020-07-10 华为技术有限公司 Voice interaction method, device and system
CN111583926A (en) * 2020-05-07 2020-08-25 珠海格力电器股份有限公司 Continuous voice interaction method and device based on cooking equipment and cooking equipment

Similar Documents

Publication Publication Date Title
CN107582028B (en) Sleep monitoring method and device
CN110730115B (en) Voice control method and device, terminal and storage medium
US10230891B2 (en) Method, device and medium of photography prompts
CN109087650B (en) Voice wake-up method and device
CN108806714B (en) Method and device for adjusting volume
CN105976821B (en) Animal language identification method and device
JP2017521024A (en) Audio signal optimization method and apparatus, program, and recording medium
CN106409317B (en) Method and device for extracting dream speech
CN111063354B (en) Man-machine interaction method and device
CN110619873A (en) Audio processing method, device and storage medium
CN111696553A (en) Voice processing method and device and readable medium
CN110705356B (en) Function control method and related equipment
CN115132224A (en) Abnormal sound processing method, device, terminal and storage medium
CN111009239A (en) Echo cancellation method, echo cancellation device and electronic equipment
CN109522058B (en) Wake-up method, device, terminal and storage medium
CN111580773A (en) Information processing method, device and storage medium
CN112133296A (en) Full-duplex voice control method, device, storage medium and voice equipment
CN112509596A (en) Wake-up control method and device, storage medium and terminal
CN105244037B (en) Audio signal processing method and device
CN111988704B (en) Sound signal processing method, device and storage medium
CN108830194B (en) Biological feature recognition method and device
US20160142885A1 (en) Voice call prompting method and device
CN112019948A (en) Intercommunication device communication method, intercommunication device and storage medium
CN112863511A (en) Signal processing method, signal processing apparatus, and storage medium
CN115731923A (en) Command word response method, control equipment and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination