CN109602333B - Voice denoising method and chip based on cleaning robot - Google Patents

Info

Publication number
CN109602333B
Authority
CN
China
Prior art keywords
target
signal
denoising
noise
confidence value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811512538.5A
Other languages
Chinese (zh)
Other versions
CN109602333A (en)
Inventor
许登科
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Amicro Semiconductor Co Ltd
Original Assignee
Zhuhai Amicro Semiconductor Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Amicro Semiconductor Co Ltd
Priority to CN201811512538.5A
Publication of CN109602333A
Application granted
Publication of CN109602333B

Classifications

    • A HUMAN NECESSITIES
    • A47 FURNITURE; DOMESTIC ARTICLES OR APPLIANCES; COFFEE MILLS; SPICE MILLS; SUCTION CLEANERS IN GENERAL
    • A47L DOMESTIC WASHING OR CLEANING; SUCTION CLEANERS IN GENERAL
    • A47L11/00 Machines for cleaning floors, carpets, furniture, walls, or wall coverings
    • A47L11/40 Parts or details of machines not provided for in groups A47L11/02 - A47L11/38, or not restricted to one of these groups, e.g. handles, arrangements of switches, skirts, buffers, levers
    • A47L11/4011 Regulation of the cleaning machine by electric means; Control systems and remote control systems therefor

Landscapes

  • Manipulator (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)

Abstract

The invention discloses a voice denoising method and a chip based on a cleaning robot. The method comprises the following steps: step 1, determining a target voice signal from the voice signals acquired by a microphone array, and acquiring the corresponding target confidence value; step 2, judging whether the noise database contains preset noise data whose confidence value differs from the target confidence value by an absolute value smaller than a preset noise threshold, and if so, entering step 3; step 3, phase-inverting the preset noise data to obtain an inverted noise signal, and then mixing and superposing the inverted noise signal with the target voice signal to obtain a pre-denoising processing result; step 4, according to the relation between the pre-denoising processing result and a preset threshold, marking the voiced frame corresponding to the pre-denoising processing result as a denoised voiced frame in the target voice signal; wherein the target voice signal comprises voiced frames associated with control instructions. By pre-denoising the target voice signal, the invention improves the denoising precision of the voice signal.

Description

Voice denoising method and chip based on cleaning robot
Technical Field
The invention belongs to the technical field of robots, and particularly relates to a voice denoising method and a chip based on a cleaning robot.
Background
Although the voice pickup equipment available on the market can pick up the voice signals uttered by a user, it generally also picks up the noise that the robot generates while working. The picked-up voice signal is therefore mixed with a large amount of external noise and the corresponding voice recognition accuracy is low. This seriously impairs the robot's recognition of the external voice (the effective signal) and the logical decisions it makes by interpreting that voice (for example, executing the related path planning).
In the prior art, a cleaning robot with a voice recognition function does not preprocess the collected voice signal during front-end denoising, which reduces the accuracy of voice recognition.
Disclosure of Invention
In order to overcome the technical defects, the invention provides the following technical scheme:
A voice denoising method based on a cleaning robot, applied to a mobile robot whose base carries a microphone array with a fixed orientation, comprises the following steps:
step 1: determining a target voice signal from the voice signals acquired by the microphone array, and acquiring the corresponding target confidence value;
step 2: judging whether the noise database contains preset noise data whose confidence value differs from the target confidence value by an absolute value smaller than a preset noise threshold, and if so, entering step 3;
step 3: phase-inverting the preset noise data to obtain an inverted noise signal, and then mixing and superposing the inverted noise signal with the target voice signal to obtain a pre-denoising processing result;
step 4: according to the relation between the pre-denoising processing result and a preset threshold, marking the voiced frame corresponding to the pre-denoising processing result as a denoised voiced frame in the target voice signal;
wherein the target voice signal comprises voiced frames associated with control instructions.
Through the pre-denoising processing, the method selectively applies coarse denoising to the voiced frames of the target voice signal, and then flexibly adjusts the confidence value according to the real-time degree of matching between the noise signal and the noise database in order to apply fine denoising, which improves the denoising precision.
Further, step 1 specifically includes: recognizing, through a voice engine, the voiced frames of the voice signal acquired from the microphone array; when the signal-to-noise ratio value of a voiced frame is greater than a preset signal-to-noise ratio threshold, determining the voice signal corresponding to that voiced frame as the target voice signal; and then extracting the target confidence value corresponding to the target voice signal from the voiced frame, wherein the voiced frame comprises the confidence value and the signal-to-noise ratio value based on the voice recognition signal. Screening out the target voice signal with a preset signal-to-noise ratio threshold allows specific voice signals to be recognized and processed in a targeted manner, which improves the accuracy of voice recognition in a noisy environment.
Further, step 3 specifically includes: step 301, judging whether the pre-denoising processing result is larger than the preset threshold, and if so, entering step 302, otherwise entering step 303; step 302, marking the voiced frame corresponding to the pre-denoising processing result as a denoised voiced frame in the target voice signal; step 303, judging whether the absolute value of the difference between the confidence value of the pre-denoising processing result and the target confidence value is smaller than a confidence threshold, and if so, marking the voiced frame corresponding to the pre-denoising processing result as a denoised voiced frame in the target voice signal; otherwise, adjusting the target confidence value and returning to step 2. Step 3 thus judges the pre-denoising processing result twice and processes every voiced frame in the target voice signal, which makes the denoising more complete and more accurate.
Further, the method for adjusting the target confidence value comprises: increasing or decreasing the current target confidence value according to the difference between the confidence value of an unmarked voiced frame in the target voice signal and the current target confidence value. This supports the subsequent judgment and screening of the unmarked voiced frames in the target voice signal and improves the accuracy of the iterative process.
A chip stores program code corresponding to the voice denoising method. Adding the pre-denoising function to the chip improves the denoising precision of the voice signal.
Compared with the prior art, the technical scheme of the invention, after obtaining the target voice signal, selectively denoises the voiced frames of the target voice signal during denoising preprocessing. It sets the threshold and marks the currently denoised voiced frames according to the real-time degree of matching between the confidence value of the noise signal and the confidence values in the noise database, and it flexibly adjusts the confidence value to improve denoising efficiency, so that the denoising is more thorough.
Drawings
Fig. 1 is a flowchart of a voice denoising method based on a cleaning robot according to an embodiment of the present invention.
Fig. 2 is a flowchart of a voice denoising method based on a cleaning robot according to another embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail below with reference to the accompanying drawings in the embodiments of the present invention.
Referring to Fig. 1, an embodiment of the present invention provides a voice denoising method based on a cleaning robot, specifically including:
Step S101: a voice signal arriving from a specific direction is acquired from the microphone array on the base of the cleaning robot, and a target voice signal is determined by information-domain analysis against a database pre-stored by the voice engine, so that directional voice pickup is achieved and external noise interference is reduced. The target voice signal comprises a control command spoken by a user or voice data input by a machine, and a target confidence value is obtained from it. In this embodiment, the target confidence value expresses how far the mobile robot trusts a specific voice signal and serves as a numerical measure of the credibility of the preliminary voice recognition result. To reduce misjudgment, the correctness of the recognition result is judged against a confidence threshold before the result is presented. For example, if the target voice signal spoken by the user is "call back charging", the target confidence value returned during voice data recognition includes the sentence confidence N. The process then proceeds to step S102.
Optionally, the voice engine identifies the voiced frames of the voice signals acquired by the microphone array, and the microphone array may apply a relevant voice feature detection algorithm. Because the target voice signal comprises voiced frames associated with control instructions, it can be converted into a plurality of voice frames associated with the user's utterance. The voice frames may include voiced frames and unvoiced frames, and the classification may be performed by various known techniques. When the signal-to-noise ratio value of a voiced frame is greater than a preset signal-to-noise ratio threshold, the voice signal corresponding to that voiced frame is determined as the target voice signal, and the target confidence value corresponding to the target voice signal is then extracted from the voiced frame, wherein the voiced frame comprises the confidence value and the signal-to-noise ratio value based on the voice recognition signal.
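The screening described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Frame` record, the function names, and the choice to average the per-frame confidence values into a single target confidence value are all hypothetical. The text only states that each voiced frame carries a confidence value and a signal-to-noise ratio value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    """One voice frame as reported by a hypothetical voice engine."""
    voiced: bool        # True if the engine classified the frame as voiced
    snr_db: float       # signal-to-noise ratio value of the frame, in dB
    confidence: float   # recognition confidence value carried by the frame


def select_target_frames(frames: List[Frame], snr_threshold_db: float) -> List[Frame]:
    """Keep only voiced frames whose SNR exceeds the preset SNR threshold."""
    return [f for f in frames if f.voiced and f.snr_db > snr_threshold_db]


def target_confidence(frames: List[Frame]) -> Optional[float]:
    """Assumed aggregation: mean confidence of the selected frames."""
    if not frames:
        return None
    return sum(f.confidence for f in frames) / len(frames)
```

Unvoiced frames and voiced frames below the threshold are simply dropped here; a real system might keep them for later noise estimation.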
It should be noted that the noise energy level contained in a voiced frame can be measured by the signal-to-noise ratio, which is the ratio of the power of the voice data to the power of the noise data and is usually expressed in decibels. For a given voice power, a higher signal-to-noise ratio indicates a smaller noise power, and vice versa. The noise energy level reflects the amount of noise data energy in the user's voice data. The signal-to-noise ratio and the noise energy level together indicate the noise level.
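As a worked illustration of the definition above, the signal-to-noise ratio in decibels is 10 * log10(P_voice / P_noise), where P is the mean power of the samples. The sketch below assumes the voice and noise samples are available separately, which the text does not specify; the function names are hypothetical.

```python
import math
from typing import Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Mean power of a block of samples (average of squared amplitudes)."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(s * s for s in samples) / len(samples)


def snr_db(voice: Sequence[float], noise: Sequence[float]) -> float:
    """Signal-to-noise ratio in decibels: 10 * log10(P_voice / P_noise)."""
    noise_power = mean_power(noise)
    if noise_power == 0.0:
        raise ValueError("noise power is zero, SNR is unbounded")
    return 10.0 * math.log10(mean_power(voice) / noise_power)
```

With the voice power held fixed, shrinking the noise amplitude raises the returned value, which matches the statement that a higher signal-to-noise ratio indicates a smaller noise power.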
Step S102: according to the noise data corresponding to the voiced frames contained in the target voice signal, preset noise data is searched for in the noise database, and it is judged whether the absolute value of the difference between the target confidence value and the confidence value of the preset noise data is smaller than a preset noise threshold. If so, the preset noise data is determined to be the noise data matched with the target confidence value, and the process proceeds to step S103.
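The matching test in step S102 can be sketched as below. The `NoiseEntry` record and the function name are hypothetical, and returning the closest candidate when several entries pass the test is an assumption: the text only requires that an entry exist whose confidence value is within the preset noise threshold of the target confidence value.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class NoiseEntry:
    """One record of a hypothetical noise database."""
    name: str
    confidence: float
    samples: Sequence[float]


def find_matching_noise(db: List[NoiseEntry],
                        target_confidence: float,
                        noise_threshold: float) -> Optional[NoiseEntry]:
    """Return an entry with |confidence - target| < threshold, or None."""
    candidates = [e for e in db
                  if abs(e.confidence - target_confidence) < noise_threshold]
    if not candidates:
        return None
    # Assumption: prefer the entry closest to the target confidence value.
    return min(candidates, key=lambda e: abs(e.confidence - target_confidence))
```

A `None` result corresponds to the case where no preset noise data matches, so the method would not enter the pre-denoising step with the current target confidence value.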
In this embodiment of the invention, the noise generated in the working environment of the robot is relatively stable. The target voice signal collected by the microphone array also includes sound transmitted from the inside of the cleaning robot to the outside, specifically the noise produced when an executing component (such as a motor) inside the robot operates and the noise produced by internal mechanical friction or vibration while the robot moves, both of which are transmitted to the outside of the robot body.
Preferably, the target voice signal may be compared with all the noise data in the noise database to obtain a voice similarity value for each entry, and the preset noise threshold may then be determined from a weighted average of all the voice similarity values. In addition, multiple noise databases may be employed, with the result that has the highest recognition rate across the databases selected as the final matching result, thereby improving the recognition rate of the robot's working noise.
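One way to derive a threshold from a weighted average of similarity values is sketched below. The text does not define the similarity measure or the weights, so cosine similarity between sample vectors and caller-supplied weights are assumptions made purely for illustration, as are the function names.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity measure: cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def weighted_average(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of values; the weights need not sum to one."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


def noise_threshold(target: Sequence[float],
                    noise_db: List[Sequence[float]],
                    weights: Sequence[float]) -> float:
    """Threshold taken as the weighted average of all similarity values."""
    similarities = [cosine_similarity(target, noise) for noise in noise_db]
    return weighted_average(similarities, weights)
```

Weighting lets entries that are known to dominate the robot's working noise, such as motor noise, contribute more to the threshold than rare ones.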
Step S103: the preset noise data and the unmarked voiced frames in the target voice signal take part in pre-denoising processing to obtain the pre-denoising processing result corresponding to the preset noise data. Specifically, the pre-denoising method is as follows: first, the preset noise data is phase-inverted to obtain an inverted noise signal; then the inverted noise signal and the target voice signal are mixed and superposed to obtain the pre-denoising processing result corresponding to the preset noise data, so that the noise signal in the target voice signal is eliminated and the pre-denoised voice information is obtained.
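The inversion and superposition in step S103 amount to a sample-by-sample operation, sketched below. The function names are hypothetical, and the sketch assumes the stored noise is time-aligned with the captured signal and has the same length, which the text does not state; in practice the cancellation is only as good as that alignment.

```python
from typing import List, Sequence


def invert(noise: Sequence[float]) -> List[float]:
    """Phase inversion: flip the sign of every sample."""
    return [-x for x in noise]


def superpose(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Mix two equally long signals by adding them sample by sample."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    return [x + y for x, y in zip(a, b)]


def pre_denoise(target: Sequence[float], preset_noise: Sequence[float]) -> List[float]:
    """Superpose the target signal with the inverted preset noise."""
    return superpose(target, invert(preset_noise))
```

If the captured signal is exactly voice plus the stored noise, `pre_denoise` recovers the voice; any mismatch between the stored and actual noise remains in the result as residual noise.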
Step S104: according to the relation between the pre-denoising processing result and a preset threshold, the voiced frame corresponding to the pre-denoising processing result is marked as a denoised voiced frame in the target voice signal, wherein the target voice signal comprises voiced frames associated with control instructions.
In another embodiment, as shown in Fig. 2, step S104 may include the following. Step S1041: judging whether the pre-denoising processing result is larger than the preset threshold, and if so, entering step S1042, otherwise entering step S1043. The preset threshold is stored in advance and is used to measure the distortion of the voice signal. The remaining unmarked voiced frames of the target voice signal may already have been denoised by the pre-denoising processing without satisfying the condition that the pre-denoising processing result is larger than the preset threshold, so further judgment and screening are needed to reduce misjudgment.
Step S1042: marking the voiced frame corresponding to the pre-denoising processing result as a denoised voiced frame in the target voice signal. A pre-denoising processing result larger than the preset threshold indicates that the unwanted noise has been removed from the voiced frame of the target voice signal, that is, the influence of that noise on the recognition of this part of the voice signal has been eliminated.
If the pre-denoising processing result is smaller than the preset threshold, further adjusted denoising is needed to ensure that every voiced frame in the received target voice signal is processed, so that the voice signal is denoised more thoroughly, which improves both the completeness of denoising and the accuracy of recognizing the target voice signal.
Step S1043: judging whether the absolute value of the difference between the confidence value of the pre-denoising processing result and the target confidence value is smaller than a confidence threshold, and if so, entering step S1044, otherwise entering step S1045. The confidence value of the pre-denoising processing result measures the credibility of the recognition result for the pre-denoised target voice signal, given that the pre-denoising processing result is smaller than the preset threshold, and the confidence threshold can serve as an evaluation index of the correct recognition rate of the interfered target voice signal. This judgment further processes the noise in the remaining unmarked voiced frames of the target voice signal, which makes the denoising of the target voice signal more comprehensive and more accurate.
Step S1044: marking the voiced frame corresponding to the pre-denoising processing result as a denoised voiced frame in the target voice signal, thereby denoising the voiced frames that were not marked by the screening in step S1042 and improving the precision of voice recognition. If the target voice signal still contains unmarked voiced frames at this point, the pre-denoising effect of the noise data matched with the current target confidence value on those frames is limited.
Step S1045: according to the difference between the confidence value of an unmarked voiced frame in the target voice signal and the current target confidence value, the current target confidence value is increased or decreased. In this embodiment, when the confidence value of the unmarked voiced frame is greater than the current target confidence value, the current target confidence value is increased accordingly; otherwise it is decreased. The process then returns to step S102, and the noise data matched with the adjusted target confidence value is selected for further denoising. This is a parameter correction process based on the current target confidence value: the voiced frames are judged again with the corrected parameter, and the cycle repeats over multiple iterations until all the voiced frames in the target voice signal have been denoised. Flexibly adjusting the confidence value according to the real-time degree of matching between the noise signal and the noise database improves denoising efficiency. The denoised voiced frames in the target voice signal are then converted into a voice control instruction to control the mobile robot. Because the target voice signal contains periodic components, the method follows a periodic iteration pattern, which prevents the target confidence value from being corrected at random, speeds up the judgment of the target voice signal, and improves denoising efficiency.
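The two judgments of steps S1041 to S1044 and the correction of step S1045 can be sketched as two small functions. All names are hypothetical. The text does not say how the pre-denoising processing result is reduced to a number comparable with the preset threshold, nor by how much the target confidence value changes, so a scalar `result_value` and a fixed `step` are assumptions.

```python
from enum import Enum


class Decision(Enum):
    MARK_DENOISED = "mark"        # steps S1042 and S1044
    ADJUST_AND_RETRY = "adjust"   # step S1045, then back to step S102


def judge_frame(result_value: float,
                result_confidence: float,
                target_confidence: float,
                preset_threshold: float,
                confidence_threshold: float) -> Decision:
    """Apply the two judgments of steps S1041 and S1043 to one voiced frame."""
    # S1041: first judgment against the preset threshold.
    if result_value > preset_threshold:
        return Decision.MARK_DENOISED
    # S1043: second judgment on the confidence difference.
    if abs(result_confidence - target_confidence) < confidence_threshold:
        return Decision.MARK_DENOISED
    return Decision.ADJUST_AND_RETRY


def adjust_target_confidence(current: float,
                             frame_confidence: float,
                             step: float = 0.05) -> float:
    """S1045: raise the target if the unmarked frame's confidence is higher,
    otherwise lower it. The fixed step size is an assumption."""
    if frame_confidence > current:
        return current + step
    return current - step
```

A caller would loop over the unmarked voiced frames, re-run the noise matching of step S102 with the adjusted target confidence value whenever `ADJUST_AND_RETRY` is returned, and in practice cap the number of rounds so the loop ends even if some frame never passes either judgment.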
According to this technical scheme, a target voice signal uttered by a user is obtained while the robot is working in a noisy scene, and the pre-stored empirical data of the noise database is phase-inverted and applied to the target voice signal to suppress its noise. Meanwhile, the related confidence value is flexibly adjusted according to the real-time degree of matching between the noise signal and the noise database, and the denoised voiced frames are screened out by judgment, which greatly improves the thoroughness of denoising and the voice recognition rate in a noisy environment.
A chip stores program code corresponding to the voice denoising method. The chip is a dedicated integrated control chip that can parse internal or external control instructions and output the corresponding control signals, so as to control the executing components of the robot to perform the corresponding actions.
Compared with the prior art, the voice denoising method selects matched noise data from the noise database to take part in the pre-denoising processing by inverted-phase superposition, which improves the denoising precision. The pre-denoising processing may perform the signal subtraction with a subtraction circuit, or with a combination of an inverter and an addition circuit. These circuits may be integrated with a processor into a dedicated processing chip and configured according to design requirements. After the internal noise interference is filtered out, the processor parses the filtered signal to obtain the external voice signal and converts it into the matching control instruction to control the robot. How the robot parses the external voice signal is existing, implementable technology and is not described further here.
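The two circuit options mentioned above produce the same output, which can be checked sample by sample in software. The sketch below is only a numerical illustration of that equivalence with hypothetical function names; it does not model any real circuit.

```python
from typing import List, Sequence


def subtract(signal: Sequence[float], noise: Sequence[float]) -> List[float]:
    """Subtraction circuit: output = signal - noise."""
    return [s - n for s, n in zip(signal, noise)]


def invert_then_add(signal: Sequence[float], noise: Sequence[float]) -> List[float]:
    """Inverter followed by an adder: output = signal + (-noise)."""
    inverted = [-n for n in noise]
    return [s + i for s, i in zip(signal, inverted)]
```

Both functions return identical sample values for the same inputs, so the choice between the two arrangements is a hardware design decision rather than a change to the denoising result.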
Finally, it should be noted that the above examples are intended only to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art will understand that modifications to the specific embodiments, or equivalent substitutions for some of the technical features, may be made without departing from the spirit of the present invention, and all such modifications and substitutions are intended to be covered by the scope of the appended claims.

Claims (5)

1. A voice denoising method based on a cleaning robot, characterized by comprising the following steps:
step 1: determining a target voice signal from the voice signals acquired by the microphone array, and acquiring the corresponding target confidence value;
step 2: judging whether the noise database contains preset noise data whose confidence value differs from the target confidence value by an absolute value smaller than a preset noise threshold, and if so, entering step 3;
step 3: phase-inverting the preset noise data to obtain an inverted noise signal, and then mixing and superposing the inverted noise signal with the target voice signal to obtain a pre-denoising processing result;
step 4: according to the relation between the pre-denoising processing result and a preset threshold, marking the voiced frame corresponding to the pre-denoising processing result as a denoised voiced frame in the target voice signal;
wherein the target voice signal comprises voiced frames associated with control instructions.
2. The speech denoising method according to claim 1, wherein the step 1 specifically comprises:
recognizing a voiced frame of a voice signal acquired from the microphone array through a voice engine, determining the voice signal corresponding to the voiced frame as the target voice signal when the signal-to-noise ratio value of the voiced frame is greater than a preset signal-to-noise ratio threshold value, and then extracting a target confidence value corresponding to the target voice signal from the voiced frame, wherein the voiced frame comprises the confidence value and the signal-to-noise ratio value based on the voice recognition signal.
3. The speech denoising method according to claim 1, wherein the step 3 specifically comprises:
step 301, judging whether the pre-denoising processing result is larger than the preset threshold, and if so, entering step 302, otherwise entering step 303;
step 302, marking the voiced frame corresponding to the pre-denoising processing result as a denoised voiced frame in the target voice signal;
step 303, judging whether the absolute value of the difference between the confidence value of the pre-denoising processing result and the target confidence value is smaller than a confidence threshold, and if so, marking the voiced frame corresponding to the pre-denoising processing result as a denoised voiced frame in the target voice signal; otherwise, adjusting the target confidence value and returning to step 2.
4. The voice denoising method according to claim 3, wherein the method for adjusting the target confidence value comprises: increasing or decreasing the current target confidence value according to the difference between the confidence value of an unmarked voiced frame in the target voice signal and the current target confidence value.
5. A chip, characterized in that the chip is used for storing the program code corresponding to the voice denoising method of any one of claims 1 to 4.
CN201811512538.5A 2018-12-11 2018-12-11 Voice denoising method and chip based on cleaning robot Active CN109602333B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811512538.5A CN109602333B (en) 2018-12-11 2018-12-11 Voice denoising method and chip based on cleaning robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811512538.5A CN109602333B (en) 2018-12-11 2018-12-11 Voice denoising method and chip based on cleaning robot

Publications (2)

Publication Number Publication Date
CN109602333A (en) 2019-04-12
CN109602333B (en) 2020-11-03

Family

ID=66007748

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811512538.5A Active CN109602333B (en) 2018-12-11 2018-12-11 Voice denoising method and chip based on cleaning robot

Country Status (1)

Country Link
CN (1) CN109602333B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112786035A (en) * 2019-11-08 2021-05-11 珠海市一微半导体有限公司 Voice recognition method, system and chip of cleaning robot

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1868183A1 (en) * 2006-06-12 2007-12-19 Lockheed Martin Corporation Speech recognition and control system, program product, and related methods
CN105261357A (en) * 2015-09-15 2016-01-20 百度在线网络技术(北京)有限公司 Voice endpoint detection method and device based on statistics model
CN106388700A (en) * 2016-06-06 2017-02-15 北京小米移动软件有限公司 Active noise reduction device for automatic cleaning equipment and automatic cleaning equipment
CN106971739A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 The method and system and intelligent terminal of a kind of voice de-noising
CN106971714A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 A kind of speech de-noising recognition methods and device applied to robot
CN107767881A (en) * 2016-08-15 2018-03-06 中国移动通信有限公司研究院 A kind of acquisition methods and device of the satisfaction of voice messaging

Also Published As

Publication number Publication date
CN109602333A (en) 2019-04-12

Similar Documents

Publication Publication Date Title
CN110600017B (en) Training method of voice processing model, voice recognition method, system and device
CN102687196B (en) Method for the detection of speech segments
JP6153142B2 (en) Method for processing an acoustic signal
CN109545238B (en) Voice denoising device based on cleaning robot
US9997168B2 (en) Method and apparatus for signal extraction of audio signal
CN111081223B (en) Voice recognition method, device, equipment and storage medium
CN109346099B (en) Iterative denoising method and chip based on voice recognition
CN109602333B (en) Voice denoising method and chip based on cleaning robot
CN110837758A (en) Keyword input method and device and electronic equipment
CN109065026B (en) Recording control method and device
CN109410928B (en) Denoising method and chip based on voice recognition
CN114822578A (en) Voice noise reduction method, device, equipment and storage medium
CN111681649B (en) Speech recognition method, interaction system and achievement management system comprising system
CN109360580B (en) Iteration denoising device and cleaning robot based on voice recognition
CN109584899B (en) Denoising device and cleaning robot based on voice recognition
CN114822531A (en) Liquid crystal television based on AI voice intelligent control
CN115547352A (en) Electronic device, method, apparatus and medium for processing noise thereof
CN112002307B (en) Voice recognition method and device
CN113692618B (en) Voice command recognition method and device
JP6106618B2 (en) Speech section detection device, speech recognition device, method thereof, and program
CN113539266A (en) Command word recognition method and device, electronic equipment and storage medium
KR102218151B1 (en) Target voice signal output apparatus for improving voice recognition and method thereof
CN117727298B (en) Deep learning-based portable computer voice recognition method and system
Popović et al. Speech Enhancement Using Augmented SSL CycleGAN
Bharathi et al. Speaker verification in a noisy environment by enhancing the speech signal using various approaches of spectral subtraction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant