CN109346067B - Voice information processing method and device and storage medium

Publication number: CN109346067B
Authority: CN (China)
Prior art keywords: voice information, voice, sound source, information, processing
Legal status: Active
Application number: CN201811307605.XA
Other languages: Chinese (zh)
Other versions: CN109346067A (en)
Inventor: 王慧君, 刘健军, 毛跃辉, 张新, 韩雪, 廖海霖, 郑文成, 李保水, 文皓
Current Assignee: Gree Electric Appliances Inc of Zhuhai
Original Assignee: Gree Electric Appliances Inc of Zhuhai
Priority date: 2018-11-05
Filing date: 2018-11-05
Application filed by Gree Electric Appliances Inc of Zhuhai
Priority to CN201811307605.XA
Publication of CN109346067A
Application granted
Publication of CN109346067B

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/20: Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L2015/088: Word spotting

Abstract

The invention provides a method and a device for processing voice information, and a storage medium. The method comprises: acquiring first voice information and second voice information generated within a preset range, wherein the first voice information carries a designated voice wake-up word; and performing deduplication processing on the first voice information and the second voice information according to a deduplication weight determined by the relative distance between the sound source position of the first voice information and the sound source position of the second voice information, to obtain third voice information. The invention addresses the low recognition accuracy of voice control information in the related art and thereby improves the voice recognition rate.

Description

Voice information processing method and device and storage medium
Technical Field
The invention relates to the field of computers, in particular to a method and a device for processing voice information and a storage medium.
Background
Because sound waves attenuate as they propagate, the range over which a single positioned microphone can pick up voice commands is limited. In a single-user environment the ambient noise is low, and a single microphone acquisition device is enough to meet the accuracy required for semantic recognition of voice control. In a multi-user environment, however, there are more users and the environment is relatively noisy, so a single microphone acquisition device that is turned on based on the user's position recognizes voice control information with low accuracy and is prone to false recognition.
In view of the above problems in the related art, no effective solution exists at present.
Disclosure of Invention
The embodiments of the invention provide a method and a device for processing voice information, and a storage medium, to at least solve the problem of low recognition accuracy of voice control information in the related art.
According to an embodiment of the present invention, there is provided a method for processing voice information, including: acquiring first voice information and second voice information generated within a preset range, wherein the first voice information carries a designated voice wake-up word; and performing deduplication processing on the first voice information and the second voice information according to a deduplication weight determined by the relative distance between the sound source position of the first voice information and the sound source position of the second voice information, to obtain third voice information.
Optionally, performing deduplication processing on the first voice information and the second voice information according to the deduplication weight determined by the relative distance between the sound source position of the first voice information and the sound source position of the second voice information to obtain the third voice information includes: acquiring coincident voice information, in which the first voice information and the second voice information coincide with each other; and performing deduplication processing on the first voice information according to the deduplication weight and the coincident voice information, to obtain the processed third voice information.
Optionally, the magnitude of the deduplication weight is inversely proportional to the relative distance.
Optionally, acquiring the first voice information and the second voice information generated within the preset range includes: turning on the microphone closest to the sound source position of the first voice information to acquire the first voice information; and turning on the microphone closest to the sound source position of the second voice information to acquire the second voice information.
Optionally, the relative distance is obtained by: acquiring the sound source position of the first voice information and the sound source position of the second voice information by camera positioning and/or voice position analysis; and determining the relative distance according to the two sound source positions.
According to another aspect of the present invention, there is provided a device for processing voice information, including: an acquisition module, configured to acquire first voice information and second voice information generated within a preset range, wherein the first voice information carries a designated voice wake-up word; and a processing module, configured to perform deduplication processing on the first voice information and the second voice information according to a deduplication weight determined by the relative distance between the sound source position of the first voice information and the sound source position of the second voice information, to obtain third voice information.
Optionally, the processing module includes: a first obtaining unit, configured to obtain coincident voice information, in which the first voice information and the second voice information coincide with each other; and a processing unit, configured to perform deduplication processing on the first voice information according to the deduplication weight and the coincident voice information, to obtain the processed third voice information.
Optionally, the magnitude of the deduplication weight is inversely proportional to the relative distance.
Optionally, the acquisition module includes: a first acquisition unit, configured to turn on the microphone closest to the sound source position of the first voice information so as to acquire the first voice information; and a second acquisition unit, configured to turn on the microphone closest to the sound source position of the second voice information so as to acquire the second voice information.
Optionally, the processing module further includes: a second obtaining unit, configured to acquire the sound source position of the first voice information and the sound source position of the second voice information by camera positioning and/or voice position analysis; and a determining unit, configured to determine the relative distance according to the two sound source positions.
According to a further embodiment of the present invention, there is also provided a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
According to the invention, first voice information carrying a designated voice wake-up word and second voice information are acquired within a preset range, and deduplication processing is then performed on them according to a deduplication weight determined by the relative distance between the sound source position of the first voice information and that of the second voice information. The deduplication processing filters the noise out of the first voice information, yielding purer third voice information. The third voice information can therefore be recognized more accurately, which improves the recognition rate of voice information and solves the problem of low recognition accuracy of voice control information in the related art.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
Fig. 1 is a block diagram of the hardware configuration of a terminal for a voice information processing method according to an embodiment of the present invention;
Fig. 2 is a flowchart of a method for processing voice information according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a device for processing voice information according to an embodiment of the present invention.
Detailed Description
The invention will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Example 1
The method provided by the first embodiment of the present application may be executed in a terminal, a computer terminal, or a similar computing device. Taking execution on a terminal as an example, Fig. 1 is a block diagram of the hardware structure of a terminal for the method for processing voice information according to the embodiment of the present invention. As shown in Fig. 1, the terminal 10 may include one or more processors 102 (only one is shown in Fig. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data, and optionally may also include a transmission device 106 for communication functions and an input-output device 108. It will be understood by those skilled in the art that the structure shown in Fig. 1 is only an illustration and is not intended to limit the structure of the terminal. For example, the terminal 10 may include more or fewer components than shown in Fig. 1, or have a different configuration from that shown in Fig. 1.
The memory 104 may be used to store computer programs, for example, software programs and modules of application software, such as a computer program corresponding to the method for processing voice information in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer programs stored in the memory 104, so as to implement the method described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the terminal 10. In one example, the transmission device 106 includes a Network Interface Controller (NIC), which can be connected to other network devices through a base station so as to communicate with the internet. In another example, the transmission device 106 may be a Radio Frequency (RF) module, which is used for communicating with the internet wirelessly.
This embodiment provides a method for processing voice information that runs on the terminal described above. Fig. 2 is a flowchart of the method for processing voice information according to an embodiment of the present invention. As shown in Fig. 2, the flow includes the following steps:
Step S202, acquiring first voice information and second voice information generated within a preset range, wherein the first voice information carries a designated voice wake-up word;
Step S204, performing deduplication processing on the first voice information and the second voice information according to a deduplication weight determined by the relative distance between the sound source position of the first voice information and the sound source position of the second voice information, to obtain third voice information.
Through steps S202 and S204, first voice information carrying a designated voice wake-up word and second voice information are acquired within the preset range, and deduplication processing is then performed on them according to the deduplication weight determined by the relative distance between their sound source positions. The deduplication processing filters the noise out of the first voice information, yielding purer third voice information. The third voice information can therefore be recognized more accurately, which improves the recognition rate of voice information and solves the problem of low recognition accuracy of voice control information in the related art.
In an optional implementation of this embodiment, the deduplication processing of step S204, which is performed on the first voice information and the second voice information according to the deduplication weight determined by the relative distance between their sound source positions to obtain the third voice information, may be implemented as follows:
Step S204-1, acquiring coincident voice information, in which the first voice information and the second voice information coincide with each other;
Step S204-2, performing deduplication processing on the first voice information according to the deduplication weight and the coincident voice information, to obtain the processed third voice information; wherein the magnitude of the deduplication weight is inversely proportional to the relative distance.
Here, "inversely proportional" means the following: the larger the relative distance between the sound source position of the first voice information and the sound source position of the second voice information, the smaller the deduplication weight; the smaller that relative distance, the larger the deduplication weight. The specific values can be set according to the actual situation, as long as this rule is met.
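Steps S204-1 and S204-2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: voice information is represented as a list of per-frequency-bin magnitudes, the coincident part is taken as the bin-wise minimum, and the weight formula `1 / (1 + d / scale)` is a hypothetical choice that merely satisfies the rule that a larger relative distance gives a smaller weight. The names `dedup_weight`, `coincident_part`, `deduplicate`, and `scale_m` are all introduced here for illustration.

```python
from typing import List, Sequence


def dedup_weight(relative_distance_m: float, scale_m: float = 1.0) -> float:
    """Hypothetical weight in (0, 1]: 1.0 at zero distance, smaller as distance grows."""
    if relative_distance_m < 0:
        raise ValueError("relative distance must be non-negative")
    if scale_m <= 0:
        raise ValueError("scale must be positive")
    return 1.0 / (1.0 + relative_distance_m / scale_m)


def coincident_part(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """S204-1 (assumed model): the energy both signals share in each frequency bin."""
    return [min(a, b) for a, b in zip(first, second)]


def deduplicate(first: Sequence[float], second: Sequence[float],
                relative_distance_m: float) -> List[float]:
    """S204-2 (assumed model): subtract the weighted coincident part from the first signal."""
    weight = dedup_weight(relative_distance_m)
    overlap = coincident_part(first, second)
    # Clamp at zero so a magnitude never becomes negative.
    return [max(a - weight * o, 0.0) for a, o in zip(first, overlap)]
```

With this model, a second sound source right next to the first one has its full overlap removed, while a distant one is subtracted only lightly. Any other monotonically decreasing weight function would fit the stated rule equally well.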
In another optional implementation of this embodiment, acquiring the first voice information and the second voice information generated within the preset range in step S202 may be implemented as follows:
Step S202-1, turning on the microphone closest to the sound source position of the first voice information to acquire the first voice information;
Step S202-2, turning on the microphone closest to the sound source position of the second voice information to acquire the second voice information.
In a specific application scenario, steps S202-1 and S202-2 may work as follows: a plurality of microphones are arranged within the preset range, and when a piece of voice information is generated, the microphone closest to its sound source position is turned on; every other piece of voice information is handled in the same way. With multiple microphones installed, they therefore do not all have to be on at the same time, which saves power, and the voice can also be acquired more accurately.
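The nearest-microphone selection described above can be sketched as follows. The sketch assumes 2-D coordinates in a shared room frame and string microphone identifiers; `nearest_microphone` and `microphones_to_enable` are hypothetical helper names, and how a microphone is actually switched on is outside the scope of the example.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def nearest_microphone(source: Point, microphones: Dict[str, Point]) -> str:
    """Return the id of the microphone closest to a sound source position."""
    if not microphones:
        raise ValueError("no microphones are installed")
    return min(microphones, key=lambda mic_id: math.dist(source, microphones[mic_id]))


def microphones_to_enable(sources: Dict[str, Point],
                          microphones: Dict[str, Point]) -> Dict[str, str]:
    """Map each sound source to the single microphone that should be turned on for it."""
    return {name: nearest_microphone(pos, microphones) for name, pos in sources.items()}
```

Only the microphones named in the returned mapping need to be powered; all others can stay off, which is the power saving the text refers to.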
It should be noted that the relative distance involved in this embodiment is obtained by: acquiring a first voice information sound source position and a second voice information sound source position in a camera positioning and/or voice position analysis mode; and determining the relative distance according to the position of the first voice information sound source and the position of the second voice information sound source.
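A minimal sketch of this distance computation is shown below. It assumes each localization method yields an optional 2-D position estimate, and it treats "and/or" as: average the two estimates when both are available, otherwise use whichever exists. That fusion rule and the names `resolve_position` and `relative_distance` are assumptions made for illustration; the text does not specify how the two methods are combined.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]


def resolve_position(camera: Optional[Point], audio: Optional[Point]) -> Point:
    """Combine camera-based and voice-based position estimates (assumed fusion rule)."""
    if camera is not None and audio is not None:
        return ((camera[0] + audio[0]) / 2.0, (camera[1] + audio[1]) / 2.0)
    if camera is not None:
        return camera
    if audio is not None:
        return audio
    raise ValueError("no position estimate available")


def relative_distance(first_source: Point, second_source: Point) -> float:
    """Euclidean distance between the two sound source positions."""
    return math.dist(first_source, second_source)
```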
The present embodiment is described in detail below with reference to a specific implementation.
This implementation provides a multi-microphone voice acquisition control and noise reduction method, which comprises the following steps:
Step S302, acquiring the position of each user based on camera positioning recognition or voice position analysis technology.
Step S302, acquiring the voice wake-up word, where the voice information corresponding to the voice wake-up word is the control voice information (corresponding to the first voice information in the above embodiment); it is acquired by turning on the microphone device closest to that user, and the collected control voice information of the user serves as the control sound source.
Step S304, turning on the microphone devices closest to the positions of the other users respectively, to collect the voice information of those sound sources (corresponding to the second voice information in the above embodiment) as noise sources.
It should be noted that in the case of a single user, no noise source is acquired.
Step S306, determining the degree of coincidence between the noise source and the control sound source, and removing the part in which the noise source coincides with the control sound source, thereby obtaining a de-duplicated sound source.
Step S308, performing attenuation processing on the de-duplicated sound source based on the distance between the noise source user and the control sound source user.
The larger the distance between the noise source user and the control sound source user, the larger the attenuation weight of the de-duplicated sound source; the smaller that distance, the smaller the attenuation weight.
Step S310, using the control sound source to screen the de-duplicated sound source, to obtain a relatively pure processed sound source sample.
Step S312, performing subsequent sound source processing and semantic analysis based on the processed sound source sample.
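One possible reading of steps S306 to S310 for several noise users is sketched below. It rests on assumptions the text does not state: signals are per-frequency-bin magnitude lists, the coincident part is the bin-wise minimum, the attenuation weight is `d / (d + scale)`, and the share of the overlap actually removed from the control sound source is the complement of that attenuation weight. The names `attenuation_weight` and `purify_control` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Spectrum = Sequence[float]


def attenuation_weight(distance_m: float, scale_m: float = 1.0) -> float:
    """Hypothetical rule: 0.0 at zero distance, approaching 1.0 as distance grows."""
    if distance_m < 0 or scale_m <= 0:
        raise ValueError("distance must be >= 0 and scale must be > 0")
    return distance_m / (distance_m + scale_m)


def purify_control(control: Spectrum, control_pos: Point,
                   noises: Sequence[Tuple[Spectrum, Point]]) -> List[float]:
    """Remove each noise user's overlap from the control source, scaled by distance."""
    cleaned = [float(v) for v in control]
    for noise, noise_pos in noises:
        distance = math.dist(control_pos, noise_pos)
        # Assumption: a strongly attenuated (distant) noise source removes little.
        removal_share = 1.0 - attenuation_weight(distance)
        for i, noise_bin in enumerate(noise[:len(cleaned)]):
            overlap = min(cleaned[i], noise_bin)  # coincident part, as in S306
            cleaned[i] = max(cleaned[i] - removal_share * overlap, 0.0)
    return cleaned
```

With an empty `noises` sequence the control sound source is returned unchanged, which matches the single-user case in which no noise source is acquired.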
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
Example 2
This embodiment further provides a device for processing voice information. The device is used to implement the foregoing embodiments and preferred implementations, and what has already been described is not repeated here. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the device described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 3 is a schematic structural diagram of a device for processing voice information according to an embodiment of the present invention. As shown in Fig. 3, the device includes: an acquisition module 32, configured to acquire first voice information and second voice information generated within a preset range, wherein the first voice information carries a designated voice wake-up word; and a processing module 34, coupled to the acquisition module 32 and configured to perform deduplication processing on the first voice information and the second voice information according to the deduplication weight determined by the relative distance between the sound source position of the first voice information and the sound source position of the second voice information, to obtain third voice information.
Optionally, the processing module 34 in this embodiment includes: a first obtaining unit, configured to obtain coincident voice information, in which the first voice information and the second voice information coincide with each other; and a processing unit, coupled to the first obtaining unit and configured to perform deduplication processing on the first voice information according to the deduplication weight and the coincident voice information, to obtain the processed third voice information; wherein the magnitude of the deduplication weight is inversely proportional to the relative distance.
Optionally, the obtaining module 32 in this embodiment includes: the first acquisition unit is used for starting a microphone closest to the position of a first voice information sound source so as to acquire first voice information; and the second acquisition unit is used for starting a microphone closest to the second voice information sound source position so as to acquire the second voice information.
Optionally, the processing module 34 in this embodiment may further include: the second acquisition unit is used for acquiring the first voice information sound source position and the second voice information sound source position in a camera positioning and/or voice position analysis mode; and the determining unit is used for determining the relative distance according to the position of the first voice information sound source and the position of the second voice information sound source.
It should be noted that the above modules may be implemented by software or hardware. In the latter case, they may be implemented in, but are not limited to, the following ways: the modules are all located in the same processor; or the modules are located in different processors in any combination.
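A software realization of the two coupled modules might be organized as in the sketch below. This is an illustrative skeleton only: the class names mirror acquisition module 32 and processing module 34, but `VoiceInfo`, its fields, the injected `capture` callable, and the injected `deduplicate` callable are all hypothetical, and the sketch assumes that exactly the wake-word carrier is the first voice information.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class VoiceInfo:
    spectrum: List[float]        # assumed representation of the captured audio
    source_position: Point       # sound source position of this voice information
    has_wake_word: bool = False  # True for the first (control) voice information


class AcquisitionModule:
    """Stands in for acquisition module 32; the capture backend is injected."""

    def __init__(self, capture: Callable[[], Sequence[VoiceInfo]]) -> None:
        self._capture = capture

    def acquire(self) -> Tuple[VoiceInfo, VoiceInfo]:
        captured = list(self._capture())
        # next() raises StopIteration if no matching item was captured.
        first = next(v for v in captured if v.has_wake_word)
        second = next(v for v in captured if not v.has_wake_word)
        return first, second


class ProcessingModule:
    """Stands in for processing module 34, coupled to the acquisition module."""

    def __init__(self, acquisition: AcquisitionModule,
                 deduplicate: Callable[[VoiceInfo, VoiceInfo], VoiceInfo]) -> None:
        self._acquisition = acquisition
        self._deduplicate = deduplicate

    def run(self) -> VoiceInfo:
        first, second = self._acquisition.acquire()
        return self._deduplicate(first, second)  # the third voice information
```

Injecting the capture and deduplication callables keeps the two modules independently testable, and either one could later be replaced by a hardware-backed implementation without changing the other.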
Example 3
Embodiments of the present invention also provide a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Optionally, in the present embodiment, the storage medium may be configured to store a computer program for executing the following steps:
S1, acquiring first voice information and second voice information generated within a preset range, wherein the first voice information carries a designated voice wake-up word;
S2, performing deduplication processing on the first voice information and the second voice information according to the deduplication weight determined by the relative distance between the sound source position of the first voice information and the sound source position of the second voice information, to obtain third voice information.
Optionally, the storage medium is further arranged to store a computer program for performing the following steps:
S1, acquiring coincident voice information, in which the first voice information and the second voice information coincide with each other;
S2, performing deduplication processing on the first voice information according to the deduplication weight and the coincident voice information, to obtain the processed third voice information; wherein the magnitude of the deduplication weight is inversely proportional to the relative distance.
Optionally, the storage medium is further arranged to store a computer program for performing the following steps:
S1, turning on the microphone closest to the sound source position of the first voice information to acquire the first voice information;
S2, turning on the microphone closest to the sound source position of the second voice information to acquire the second voice information.
Optionally, in this embodiment, the storage medium may include, but is not limited to, various media capable of storing computer programs, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general-purpose computing device. They may be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases, the steps shown or described may be performed in an order different from that described herein. They may also be fabricated into individual integrated circuit modules, or multiple modules or steps among them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. A method for processing voice information, comprising:
acquiring first voice information and second voice information generated within a preset range, wherein the first voice information carries a designated voice wake-up word;
performing deduplication processing on the first voice information and the second voice information according to a deduplication weight determined by the relative distance between the sound source position of the first voice information and the sound source position of the second voice information, to obtain third voice information;
wherein acquiring the first voice information and the second voice information generated within the preset range comprises:
turning on the microphone closest to the sound source position of the first voice information to acquire the first voice information;
and turning on the microphone closest to the sound source position of the second voice information to acquire the second voice information.
2. The method according to claim 1, wherein performing deduplication processing on the first voice information and the second voice information according to the deduplication weight determined by the relative distance between the sound source position of the first voice information and the sound source position of the second voice information to obtain the third voice information comprises:
acquiring coincident voice information, in which the first voice information and the second voice information coincide with each other;
and performing deduplication processing on the first voice information according to the deduplication weight and the coincident voice information, to obtain the processed third voice information.
3. The method of claim 2, wherein the magnitude of the deduplication weight is inversely proportional to the relative distance.
4. The method of claim 1, wherein the relative distance is obtained by:
acquiring the sound source position of the first voice information and the sound source position of the second voice information by camera positioning and/or voice position analysis;
and determining the relative distance according to the sound source position of the first voice information and the sound source position of the second voice information.
5. An apparatus for processing voice information, comprising:
an acquisition module, configured to acquire first voice information and second voice information generated within a preset range, wherein the first voice information carries a designated voice awakening word;
a processing module, configured to perform de-duplication processing on the first voice information and the second voice information according to a de-duplication weight determined by the relative distance between the sound source position of the first voice information and the sound source position of the second voice information, to obtain third voice information;
wherein the acquisition module comprises:
a first acquisition unit, configured to turn on a microphone closest to the first voice information sound source position so as to acquire the first voice information;
and a second acquisition unit, configured to turn on a microphone closest to the second voice information sound source position so as to acquire the second voice information.
6. The apparatus of claim 5, wherein the processing module comprises:
a first obtaining unit, configured to obtain coincident voice information, being the portion in which the first voice information and the second voice information coincide with each other;
and a processing unit, configured to perform de-duplication processing on the first voice information according to the de-duplication weight and the coincident voice information to obtain the processed third voice information.
7. The apparatus of claim 6, wherein the magnitude of the de-duplication weight is inversely proportional to the relative distance.
8. The apparatus of claim 5, wherein the processing module further comprises:
a second obtaining unit, configured to obtain the first voice information sound source position and the second voice information sound source position by means of camera positioning and/or voice position analysis;
and a determining unit, configured to determine the relative distance according to the first voice information sound source position and the second voice information sound source position.
9. A storage medium, in which a computer program is stored, wherein the computer program is arranged to perform the method of any of claims 1 to 4 when executed.
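The steps recited in the claims above (selecting the closest microphone, determining the relative distance between the two sound source positions, deriving a de-duplication weight that is inversely proportional to that distance, and de-duplicating the first voice information against the coincident voice information) can be illustrated with a minimal sketch. The claims do not fix any concrete formula, so everything below is an assumption made for illustration only: the 2D coordinates, the Euclidean distance, the proportionality constant `scale`, the cap `max_weight`, and the subtraction-based de-duplication rule are hypothetical, as are all function names.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # assumed 2D position in metres (hypothetical)


def nearest_microphone(source: Point, microphones: Sequence[Point]) -> int:
    """Return the index of the microphone closest to the sound source."""
    return min(range(len(microphones)),
               key=lambda i: math.dist(source, microphones[i]))


def relative_distance(first_source: Point, second_source: Point) -> float:
    """Relative distance between the two sound source positions (Euclidean, assumed)."""
    return math.dist(first_source, second_source)


def dedup_weight(distance: float, scale: float = 1.0,
                 max_weight: float = 1.0) -> float:
    """De-duplication weight inversely proportional to the relative distance.

    `scale` and `max_weight` are illustrative parameters, not from the claims.
    The cap avoids division by zero and unbounded weights at small distances.
    """
    if distance <= 0:
        return max_weight
    return min(max_weight, scale / distance)


def deduplicate(first: Sequence[float], coincident: Sequence[float],
                weight: float) -> List[float]:
    """Subtract the weighted coincident component from the first voice signal.

    Plain weighted subtraction is an assumed stand-in for the de-duplication
    processing; the claims only state that the weight and the coincident
    voice information are both used.
    """
    if len(first) != len(coincident):
        raise ValueError("signals must have the same number of samples")
    return [x - weight * c for x, c in zip(first, coincident)]
```

With this sketch, a second speaker standing close to the first yields a larger weight, so more of the coincident component is removed from the first voice information; a distant second speaker yields a smaller weight and the first voice information is left largely unchanged.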
CN201811307605.XA 2018-11-05 2018-11-05 Voice information processing method and device and storage medium Active CN109346067B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811307605.XA CN109346067B (en) 2018-11-05 2018-11-05 Voice information processing method and device and storage medium

Publications (2)

Publication Number Publication Date
CN109346067A CN109346067A (en) 2019-02-15
CN109346067B CN109346067B (en) 2021-02-26

Family

ID=65313730

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811307605.XA Active CN109346067B (en) 2018-11-05 2018-11-05 Voice information processing method and device and storage medium

Country Status (1)

Country Link
CN (1) CN109346067B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001318687A (en) * 2000-02-28 2001-11-16 Mitsubishi Electric Corp Speech recognition device
CN101510426A (en) * 2009-03-23 2009-08-19 北京中星微电子有限公司 Method and system for eliminating noise
CN102456351A (en) * 2010-10-14 2012-05-16 清华大学 Voice enhancement system
CN106448697A (en) * 2016-09-28 2017-02-22 惠州Tcl移动通信有限公司 Double-microphone noise elimination implementation method and system and smart glasses
CN106603878A (en) * 2016-12-09 2017-04-26 奇酷互联网络科技(深圳)有限公司 Voice positioning method, device and system
CN107316649A (en) * 2017-05-15 2017-11-03 百度在线网络技术(北京)有限公司 Audio recognition method and device based on artificial intelligence
CN107333093A (en) * 2017-05-24 2017-11-07 苏州科达科技股份有限公司 A kind of sound processing method, device, terminal and computer-readable recording medium
CN107577449A (en) * 2017-09-04 2018-01-12 百度在线网络技术(北京)有限公司 Wake up pick-up method, device, equipment and the storage medium of voice

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2961916B2 (en) * 1991-03-08 1999-10-12 三菱電機株式会社 Voice recognition device

Also Published As

Publication number Publication date
CN109346067A (en) 2019-02-15

Similar Documents

Publication Publication Date Title
US11450337B2 (en) Multi-person speech separation method and apparatus using a generative adversarial network model
CN109961780B (en) Man-machine interaction method, device, server and storage medium
CN110265052B (en) Signal-to-noise ratio determining method and device for radio equipment, storage medium and electronic device
CN108307069B (en) Navigation operation method, navigation operation device and mobile terminal
CN112037789A (en) Equipment awakening method and device, storage medium and electronic device
CN110290280B (en) Terminal state identification method and device and storage medium
CN109509465A (en) Processing method, component, equipment and the medium of voice signal
CN110751960B (en) Method and device for determining noise data
CN108932947B (en) Voice control method and household appliance
CN105975063B (en) A kind of method and apparatus controlling intelligent terminal
CN110428835B (en) Voice equipment adjusting method and device, storage medium and voice equipment
CN109067883B (en) Information pushing method and device
CN112908321A (en) Device control method, device, storage medium, and electronic apparatus
US9552813B2 (en) Self-adaptive intelligent voice device and method
CN106782498A (en) Voice messaging player method, device and terminal
CN107680598B (en) Information interaction method, device and equipment based on friend voiceprint address list
CN109346067B (en) Voice information processing method and device and storage medium
CN111739515B (en) Speech recognition method, equipment, electronic equipment, server and related system
US9626967B2 (en) Information processing method and electronic device
CN107154996B (en) Incoming call interception method and device, storage medium and terminal
CN113889116A (en) Voice information processing method and device, storage medium and electronic device
CN112002339B (en) Speech noise reduction method and device, computer-readable storage medium and electronic device
CN113436613A (en) Voice recognition method and device, electronic equipment and storage medium
CN105374364B (en) Signal processing method and electronic equipment
CN109753659B (en) Semantic processing method, semantic processing device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant