CN106067302B - Denoising device and method - Google Patents
- Publication number
- CN106067302B (application CN201610370200.5A)
- Authority
- CN
- China
- Prior art keywords
- sentence
- neighboring
- similarity
- noise
- audio
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/84—Detection of presence or absence of voice signals for discriminating voice from noise
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/194—Calculation of difference between files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Abstract
The invention discloses a denoising device, comprising: a conversion module, for performing speech recognition on an audio-video file and converting the audio-video file into a text file; a computing module, for calculating the similarity between each two adjacent sentences in the text file; a judgment module, for judging, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences; a determining module, for determining, according to a preset strategy, that one of the two adjacent sentences is a noise sentence when a noise sentence exists in the two adjacent sentences; and a noise reduction module, for filtering the noise sentence out of the audio-video file. The invention also discloses a denoising method. With the present invention, noise sentences in an audio-video file can be identified more objectively and without being affected by the surrounding environment, which greatly improves the accuracy of noise removal.
Description
Technical field
The present invention relates to the field of audio processing technology, and in particular to a denoising device and method.
Background art
With the development of mobile communication technology and the continuous improvement of living standards, people often need to use a recording device to record live sound on various occasions, such as interviews, meetings, and training sessions, and thereby generate audio-video files. However, because recording scenes are complex and changeable, the quality and content of a recording are affected by changes in the surrounding environment. For example, when recording a meeting, the user starts the recording device and stops recording only after the meeting ends, so the recording also contains what was recorded during breaks in the meeting. The audio-video file recorded by the recording device therefore needs to be denoised to remove the unimportant sound.
In the prior art, audio-video files are generally denoised according to the recording environment; for example, sound is assumed to be noisier during meeting breaks and cleaner while the meeting is in session. The drawback of this approach is that it depends too heavily on the surrounding environment, which leads to low denoising accuracy: very noisy sound can also occur while the meeting is in session.
Summary of the invention
The main object of the present invention is to provide a denoising device and method, intended to solve the technical problem in the prior art that denoising a recorded audio-video file according to the recording environment yields low denoising accuracy.
To achieve the above object, the present invention provides a denoising device, comprising:
a conversion module, for performing speech recognition on an audio-video file and converting the audio-video file into a text file;
a computing module, for calculating the similarity between each two adjacent sentences in the text file;
a judgment module, for judging, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences;
a determining module, for determining, according to a preset strategy, that one of the two adjacent sentences is a noise sentence when a noise sentence exists in the two adjacent sentences;
a noise reduction module, for filtering the noise sentence out of the audio-video file.
Optionally, the denoising device further comprises a word segmentation module, for segmenting each sentence in the text file into words to obtain the words of each sentence.
The computing module comprises:
an acquiring unit, for obtaining, according to a number dictionary, the numbers corresponding to the words of each of the two adjacent sentences;
an establishing unit, for establishing a vector model for each of the two adjacent sentences according to the numbers corresponding to its words;
a first computing unit, for calculating the Euclidean distance between the two adjacent sentences according to their vector models;
a second computing unit, for obtaining the similarity between the two adjacent sentences according to the Euclidean distance between them.
Optionally, the similarity between two adjacent sentences is calculated by the following formula:
Sim = 1 / (1 + D), where Sim denotes the similarity of the two adjacent sentences and D denotes the Euclidean distance between the two adjacent sentences.
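The formula above can be illustrated with a minimal Python sketch. It assumes the two sentence vectors already have the same number of dimensions; the function names `euclidean_distance` and `similarity` are illustrative and do not come from the patent.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance D between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def similarity(a, b):
    """Sim = 1 / (1 + D): 1.0 for identical vectors, approaching 0 as D grows."""
    return 1.0 / (1.0 + euclidean_distance(a, b))

print(similarity([1, 2, 3], [1, 2, 3]))  # D = 0, so Sim = 1.0
print(similarity([0, 0], [3, 4]))        # D = 5, so Sim = 1/6
```

Because D is never negative, Sim always lies in the interval (0, 1], which makes a fixed similarity threshold meaningful.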
Optionally, the judgment module comprises:
a judging unit, for judging whether the similarity between the two adjacent sentences is less than a preset similarity threshold;
a first determination unit, for determining that a noise sentence exists in the two adjacent sentences when the similarity between the two adjacent sentences is less than the preset similarity threshold.
Optionally, the determining module comprises:
a third computing unit, for calculating, when a noise sentence exists in the two adjacent sentences, the similarity between the first sentence of the two adjacent sentences and a predetermined number of sentences in the text file counted from the first sentence of the text file, and the similarity between the second sentence of the two adjacent sentences and the same predetermined number of sentences counted from the first sentence of the text file;
a second determination unit, for determining that the first sentence or the second sentence of the two adjacent sentences is the noise sentence according to these two similarities.
In addition, to achieve the above object, the present invention also provides a denoising method, comprising:
performing speech recognition on an audio-video file and converting the audio-video file into a text file;
calculating the similarity between each two adjacent sentences in the text file, and judging, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences;
when a noise sentence exists in the two adjacent sentences, determining, according to a preset strategy, that one of the two adjacent sentences is the noise sentence, and filtering the noise sentence out of the audio-video file.
Optionally, before the step of calculating the similarity between each two adjacent sentences in the text file and judging, according to that similarity, whether a noise sentence exists in the two adjacent sentences, the denoising method comprises: segmenting each sentence in the text file into words to obtain the words of each sentence.
The step of calculating the similarity between each two adjacent sentences in the text file comprises:
obtaining, according to a number dictionary, the numbers corresponding to the words of each of the two adjacent sentences;
establishing a vector model for each of the two adjacent sentences according to the numbers corresponding to its words;
calculating the Euclidean distance between the two adjacent sentences according to their vector models;
obtaining the similarity between the two adjacent sentences according to the Euclidean distance between them.
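The first two steps, from segmented words to a vector model, can be sketched as follows. This is only an illustration under stated assumptions: the patent does not say how the number dictionary is built, so here each distinct word simply receives the next positive integer in order of first appearance, unknown words map to 0, and the sentences are assumed to be segmented already. The helper names `build_number_dictionary` and `sentence_vector` are hypothetical.

```python
def build_number_dictionary(tokenized_sentences):
    """Assign each distinct word a positive integer in order of first
    appearance (an assumed numbering scheme, not taken from the patent)."""
    numbers = {}
    for words in tokenized_sentences:
        for word in words:
            if word not in numbers:
                numbers[word] = len(numbers) + 1
    return numbers

def sentence_vector(words, numbers):
    """Vector model of a sentence: the number of each word, in word order.
    Words missing from the dictionary map to 0 (assumption)."""
    return [numbers.get(word, 0) for word in words]

# Two already-segmented adjacent sentences (segmentation itself is out of scope here).
sentences = [["the", "budget", "is", "approved"],
             ["the", "budget", "is", "final"]]
numbers = build_number_dictionary(sentences)
vectors = [sentence_vector(s, numbers) for s in sentences]
print(vectors)  # [[1, 2, 3, 4], [1, 2, 3, 5]]
```

The resulting vectors can then be passed to a Euclidean-distance computation to obtain the similarity.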
Optionally, the similarity between two adjacent sentences is calculated by the following formula:
Sim = 1 / (1 + D), where Sim denotes the similarity of the two adjacent sentences and D denotes the Euclidean distance between the two adjacent sentences.
Optionally, the step of judging, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences comprises:
judging whether the similarity between the two adjacent sentences is less than a preset similarity threshold;
when the similarity between the two adjacent sentences is less than the preset similarity threshold, determining that a noise sentence exists in the two adjacent sentences.
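The threshold test above can be sketched in a few lines of Python. The function name `flag_noisy_pairs` and the threshold value 0.3 are illustrative assumptions; the patent only says the threshold is preset.

```python
def flag_noisy_pairs(similarities, threshold):
    """Return every index i for which the pair (sentence i, sentence i + 1)
    has a similarity strictly below the preset threshold."""
    return [i for i, sim in enumerate(similarities) if sim < threshold]

# similarities[i] is the similarity between sentence i and sentence i + 1.
similarities = [0.8, 0.75, 0.1, 0.6]
print(flag_noisy_pairs(similarities, threshold=0.3))  # [2]
```

A pair whose similarity equals the threshold is not flagged, matching the "less than" wording of the step.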
Optionally, the step of determining, according to a preset strategy, that one of the two adjacent sentences is the noise sentence when a noise sentence exists in the two adjacent sentences comprises:
when a noise sentence exists in the two adjacent sentences, calculating the similarity between the first sentence of the two adjacent sentences and a predetermined number of sentences in the text file counted from the first sentence of the text file, and calculating the similarity between the second sentence of the two adjacent sentences and the same predetermined number of sentences counted from the first sentence of the text file;
determining that the first sentence or the second sentence of the two adjacent sentences is the noise sentence according to these two similarities.
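One possible reading of this strategy is sketched below. The passage does not state how the two similarities are combined or compared, so the sketch makes explicit assumptions: each sentence of the flagged pair is compared with the first `k` sentences of the text file, the similarities are averaged, and the sentence with the lower average is treated as noise (ties go to the second sentence, an arbitrary choice). All vectors are assumed to have equal length, and the name `pick_noise_sentence` is hypothetical.

```python
import math

def similarity(a, b):
    """Sim = 1 / (1 + D) with D the Euclidean distance (equal-length vectors)."""
    return 1.0 / (1.0 + math.dist(a, b))

def pick_noise_sentence(vectors, i, k):
    """For the flagged pair (i, i + 1), return the index judged to be noise.

    Assumption: the sentence whose mean similarity to the first k sentences
    of the file is lower is the noise sentence; ties go to index i + 1.
    """
    reference = vectors[:k]
    mean_first = sum(similarity(vectors[i], r) for r in reference) / len(reference)
    mean_second = sum(similarity(vectors[i + 1], r) for r in reference) / len(reference)
    return i if mean_first < mean_second else i + 1

vectors = [[1, 2, 3], [1, 2, 4], [1, 2, 3], [9, 9, 9]]
# The pair (2, 3) was flagged; sentence 3 is far from the opening sentences.
print(pick_noise_sentence(vectors, i=2, k=2))  # 3
```

Using the opening sentences as a reference rests on the idea that a recording usually starts on topic; whether that holds depends on the recording.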
The denoising device and method of the present invention perform speech recognition on an audio-video file and convert it into a text file; calculate the similarity between each two adjacent sentences in the text file and judge, according to that similarity, whether a noise sentence exists in the two adjacent sentences; and, when a noise sentence exists, determine according to a preset strategy that one of the two adjacent sentences is the noise sentence and filter it out of the audio-video file. Because the audio-video file is first converted into a text file, noise sentences are determined from the similarity between sentences in the text file, and the noise sentences are then filtered out of the audio-video file, noise sentences can be identified more objectively and without being affected by the surrounding environment, which greatly improves the accuracy of noise removal.
Brief description of the drawings
Fig. 1 is a schematic diagram of the hardware structure of an optional mobile terminal for implementing the embodiments of the present invention;
Fig. 2 is the module diagram of the first embodiment of denoising device of the present invention;
Fig. 3 is the module diagram of the second embodiment of denoising device of the present invention;
Fig. 4 is the module diagram of the 3rd embodiment of denoising device of the present invention;
Fig. 5 is the module diagram of the fourth embodiment of denoising device of the present invention;
Fig. 6 is the module diagram of the 5th embodiment of denoising device of the present invention;
Fig. 7 is the schematic diagram of the prompt information in denoising device of the present invention;
Fig. 8 is the flow diagram of the first embodiment of noise-reduction method of the present invention;
Fig. 9 is the flow diagram of the second embodiment of noise-reduction method of the present invention;
Fig. 10 is the flow diagram of the third embodiment of the denoising method of the present invention;
Fig. 11 is the flow diagram of the fourth embodiment of the denoising method of the present invention;
Fig. 12 is the flow diagram of the fifth embodiment of the denoising method of the present invention.
The realization of the object, the functional features, and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein merely illustrate the present invention and are not intended to limit it.
The mobile terminal for implementing the embodiments of the present invention will now be described with reference to the drawings. In the following description, suffixes such as "module", "component", or "unit" used to denote elements serve only to facilitate the description of the invention and have no specific meaning in themselves. Therefore, "module" and "component" may be used interchangeably.
A mobile terminal can be implemented in various forms. For example, the terminal described in the present invention may include mobile terminals such as mobile phones, smartphones, laptops, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable media players), and navigation devices, as well as fixed terminals such as digital TVs and desktop computers. Hereinafter, the terminal is assumed to be a mobile terminal. However, those skilled in the art will understand that, apart from elements used specifically for mobile purposes, the configuration according to the embodiments of the present invention can also be applied to fixed terminals.
Fig. 1 is a schematic diagram of the hardware structure of an optional mobile terminal for implementing the embodiments of the present invention.
The mobile terminal 100 may include a wireless communication unit 110, an A/V (audio/video) input unit 120, a user input unit 130, a sensing unit 140, an output unit 150, a memory 160, an interface unit 170, a controller 180, a power supply unit 190, and so on. Fig. 1 shows a mobile terminal with various components, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead. The elements of the mobile terminal are described in detail below. The controller 180 can control the A/V (audio/video) input unit 120 to record, generate an audio-video file, and store the audio-video file in the memory 160. The controller 180 performs speech recognition on the audio-video file, converts it into a text file, and stores the text file in the memory 160. The controller 180 calculates the similarity between each two adjacent sentences in the text file and judges, according to that similarity, whether a noise sentence exists in the two adjacent sentences; when a noise sentence exists, it determines according to a preset strategy that one of the two adjacent sentences is the noise sentence and then filters the noise sentence out of the audio-video file.
The wireless communication unit 110 typically includes one or more components that allow radio communication between the mobile terminal 100 and a wireless communication device or network.
The A/V input unit 120 is used to receive audio or video signals. The user input unit 130 can generate key input data according to commands entered by the user in order to control various operations of the mobile terminal. The user input unit 130 allows the user to enter various types of information and may include a keyboard, a dome switch, a touch pad (for example, a touch-sensitive component that detects changes in resistance, pressure, capacitance, and the like caused by contact), a scroll wheel, a joystick, and so on. In particular, when the touch pad is superimposed on the display unit 151 as a layer, a touch screen can be formed.
The sensing unit 140 detects the current state of the mobile terminal 100 (for example, whether the mobile terminal 100 is open or closed), the position of the mobile terminal 100, whether the user is in contact with the mobile terminal 100 (that is, touch input), the orientation of the mobile terminal 100, the acceleration or deceleration movement and direction of the mobile terminal 100, and so on, and generates commands or signals for controlling the operation of the mobile terminal 100. For example, when the mobile terminal 100 is implemented as a slide-type mobile phone, the sensing unit 140 can sense whether the slide-type phone is open or closed. In addition, the sensing unit 140 can detect whether the power supply unit 190 is supplying power and whether the interface unit 170 is coupled to an external device.
The interface unit 170 serves as an interface through which at least one external device can connect to the mobile terminal 100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and so on. The identification module may store various information for verifying the user's use of the mobile terminal 100 and may include a user identity module (UIM), a subscriber identity module (SIM), a universal subscriber identity module (USIM), and the like. In addition, a device having an identification module (hereinafter referred to as an "identification device") may take the form of a smart card; the identification device can therefore be connected to the mobile terminal 100 via a port or other connecting means. The interface unit 170 can be used to receive input (for example, data, power, and so on) from an external device and pass the received input to one or more elements within the mobile terminal 100, or to transfer data between the mobile terminal and the external device.
In addition, when the mobile terminal 100 is connected to an external cradle, the interface unit 170 may serve as a path through which power is supplied from the cradle to the mobile terminal 100, or as a path through which various command signals input from the cradle are transferred to the mobile terminal. The various command signals or the power input from the cradle may serve as signals for recognizing whether the mobile terminal is accurately mounted on the cradle. The output unit 150 is configured to provide output signals (for example, audio signals, video signals, alarm signals, vibration signals, and so on) in a visual, audio, and/or tactile manner.
The output unit 150 may include a display unit 151 and the like.
The display unit 151 can display information processed in the mobile terminal 100. For example, when the mobile terminal 100 is in a phone call mode, the display unit 151 can display a user interface (UI) or graphical user interface (GUI) related to the call or to other communication (for example, text messaging, multimedia file downloading, and so on). When the mobile terminal 100 is in a video call mode or an image capture mode, the display unit 151 can display captured and/or received images, and a UI or GUI showing video or images and related functions, and so on.
Meanwhile when display unit 151 and touch tablet in the form of layer it is superposed on one another to form touch screen when, display unit
151 may be used as input unit and output device.Display unit 151 may include liquid crystal display (LCD), thin film transistor (TFT)
In LCD (TFT-LCD), Organic Light Emitting Diode (OLED) display, flexible display, three-dimensional (3D) display etc. at least
It is a kind of.Some in these displays may be constructed such that transparence to allow user to watch from outside, this is properly termed as transparent
Display, typical transparent display can be, for example, TOLED (transparent organic light emitting diode) display etc..According to specific
Desired embodiment, mobile terminal 100 may include two or more display units (or other display devices), for example, moving
Dynamic terminal may include outernal display unit (not shown) and inner display unit (not shown).Touch screen can be used for detecting touch
Input pressure and touch input position and touch input area.
The memory 160 can store software programs for the processing and control operations executed by the controller 180, or can temporarily store data that has been output or is to be output (for example, a phone book, messages, still images, video, and so on). In addition, the memory 160 can store data about the various patterns of vibration and audio signals that are output when a touch is applied to the touch screen.
The memory 160 may include at least one type of storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, and so on. Moreover, the mobile terminal 100 can cooperate with a network storage device that performs the storage function of the memory 160 over a network connection.
The controller 180 typically controls the overall operation of the mobile terminal. For example, the controller 180 performs control and processing related to voice calls, data communication, video calls, and so on. In addition, the controller 180 may include a multimedia module 181 for reproducing (or playing back) multimedia data; the multimedia module 181 may be built into the controller 180 or may be constructed separately from it. The controller 180 can also perform pattern recognition processing to recognize handwriting input or drawing input performed on the touch screen as characters or images.
The power supply unit 190 receives external or internal power under the control of the controller 180 and supplies the appropriate power required to operate each element and component.
The various embodiments described herein can be implemented in a computer-readable medium using, for example, computer software, hardware, or any combination thereof. For a hardware implementation, the embodiments described herein can be implemented using at least one of an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a processor, a controller, a microcontroller, a microprocessor, and an electronic unit designed to perform the functions described herein; in some cases, such embodiments can be implemented in the controller 180. For a software implementation, embodiments such as procedures or functions can be implemented with separate software modules, each of which performs at least one function or operation. The software code can be implemented as a software application (or program) written in any suitable programming language, stored in the memory 160, and executed by the controller 180.
So far, the mobile terminal has been described in terms of its functions. Hereinafter, for the sake of brevity, a slide-type mobile terminal, among various types of mobile terminals such as folder-type, bar-type, swing-type, and slide-type terminals, will be described as an example. The present invention can nevertheless be applied to any type of mobile terminal and is not limited to slide-type mobile terminals.
The mobile terminal 100 shown in Fig. 1 may be constructed to operate with wired and wireless communication devices and satellite-based communication devices that transmit data via frames or packets.
Based on the above mobile terminal hardware structure, the embodiments of the denoising device of the present invention are proposed.
Referring to Fig. 2, Fig. 2 is the module diagram of the first embodiment of the denoising device of the present invention. The denoising device comprises:
a conversion module 10, for performing speech recognition on an audio-video file and converting the audio-video file into a text file;
a computing module 20, for calculating the similarity between each two adjacent sentences in the text file;
a judgment module 30, for judging, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences;
a determining module 40, for determining, according to a preset strategy, that one of the two adjacent sentences is a noise sentence when a noise sentence exists in the two adjacent sentences;
a noise reduction module 50, for filtering the noise sentence out of the audio-video file.
The audio-video file may be an audio file recorded by a recording device. The recording device may be a voice recorder or a mobile terminal with a recording function, such as a smartphone or tablet computer.
During a meeting, a training session, or any other occasion that needs to be recorded, the recording device is started and records; when recording is complete, an audio-video file is generated.
The conversion module 10 can obtain the audio-video file recorded by the recording device in a wired or wireless manner; in one embodiment, for example, it obtains the file over WiFi. Optionally, the conversion module 10 may perform speech recognition on the audio-video file while the recording device is still recording; alternatively, it may perform speech recognition on the audio-video file after the recording device has finished recording.
The conversion module 10 performs speech recognition on the audio-video file to obtain a text file. The text file contains multiple sentences and the position of each sentence in the audio-video file. Specifically, the conversion module 10 applies speech recognition technology to the audio-video file: for example, it divides the audio-video file into frames according to a predetermined frame interval, calls the speech recognition technology to convert the framed audio-video file into text piece by piece to obtain sentences, and then saves each sentence's position in the audio-video file together with its text as one segment of the text file. The text file thus contains all the sentences in the audio-video file and the position of each sentence in the audio-video file. In one embodiment, for example, speech recognition on the audio-video file yields 1000 sentences, so the text file has 1000 segments in recognition order, each corresponding to one recognized sentence. Within the text file, the position of each sentence in the audio-video file can be recorded at the very beginning or the very end of the sentence. In one embodiment, for example, the position is recorded at the very beginning: in any segment of the text file, the position of that segment's sentence in the audio-video file is written first, followed by the sentence itself.
The position of each sentence in the audio-video file is its position on the time axis of the audio-video file; for example, the position of a sentence in the audio-video file may be: from the 5th second to the 8th second.
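A segment of the text file can be pictured with a small Python sketch. The patent does not define a concrete file layout, so the `start-end<TAB>text` line format, the `Segment` class, and the two helper functions below are all hypothetical; they only illustrate the idea of writing the time-axis position first and the recognized sentence after it.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording
    text: str     # the recognized sentence

def format_segment(seg):
    """Write the position first, then the sentence (assumed line layout)."""
    return f"{seg.start:g}-{seg.end:g}\t{seg.text}"

def parse_segment(line):
    """Inverse of format_segment for non-negative start and end times."""
    position, text = line.rstrip("\n").split("\t", 1)
    start, end = position.split("-", 1)
    return Segment(float(start), float(end), text)

line = format_segment(Segment(5, 8, "Let us begin the meeting."))
print(line)
print(parse_segment(line))
```

Keeping the time range with each sentence is what later allows a sentence judged to be noise to be cut from the corresponding part of the recording.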
The conversion module 10 converts the audio-video file into a text file. Optionally, the file name of the text file is the same as that of the audio-video file, so that the user can easily tell which audio-video file the text file corresponds to.
The computing module 20 calculates the similarity between each two adjacent sentences in the text file. Specifically, it converts each sentence in the text file into a vector model and calculates the similarity of two adjacent sentences from their vector models. The vector models of two adjacent sentences have the same number of dimensions. For example, the vector model of one sentence is expressed as $a = (x_{11}, x_{21}, x_{31}, \ldots, x_{n1})$ and that of the other as $b = (x_{12}, x_{22}, x_{32}, \ldots, x_{n2})$, where $x_{n1}$ denotes the $n$-th component of vector $a$, $x_{n2}$ denotes the $n$-th component of vector $b$, and both $a$ and $b$ have $n$ dimensions. When the vector models of the two adjacent sentences differ in dimension, the vector model with fewer dimensions is padded so that both have the same number of dimensions; each padded dimension takes the value 0. In one embodiment, for example, the vector model of one sentence is $a = (x_{11}, x_{21}, x_{31}, \ldots, x_{n1})$ and that of the other is $b = (x_{12}, x_{22}, x_{32}, \ldots, x_{j2})$ with $j < n$. Vector model $b$ is then modified to $b' = (x_{12}, x_{22}, x_{32}, \ldots, x_{j2}, 0, 0, \ldots, 0)$, so that the modified vector model $b'$ has the same number of dimensions as vector model $a$.
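The zero-padding rule can be sketched as follows; the function name `pad_to_same_dimension` is illustrative and not taken from the patent.

```python
def pad_to_same_dimension(a, b):
    """Append zeros to the shorter vector so that both vectors have the
    same number of dimensions; the longer vector is returned unchanged."""
    n = max(len(a), len(b))
    return list(a) + [0] * (n - len(a)), list(b) + [0] * (n - len(b))

a, b = pad_to_same_dimension([4, 7, 2, 9], [4, 7])
print(a, b)  # [4, 7, 2, 9] [4, 7, 0, 0]
```

Note that padding with 0 means every extra word in the longer sentence contributes its full dictionary number to the Euclidean distance, so sentences of very different lengths will tend to score as dissimilar.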
The judgment module 30 is configured to judge, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences. The greater the similarity between two adjacent sentences, the more likely it is that both are non-noise sentences, that is, that no noise sentence exists in the pair; conversely, the lower the similarity, the more likely it is that the pair contains a noise sentence. Typically, in a meeting scene, the similarity between sentences is relatively high while the meeting is in session, whereas during a break people chat about one thing and another and the similarity between sentences is lower.
The two adjacent sentences can be defined as a first sentence and a second sentence, where the first sentence is the earlier of the two.
When a noise sentence exists in the two adjacent sentences, the determination module 40 determines, according to a preset strategy, which one of the two adjacent sentences is the noise sentence.
Optionally, the preset strategy is: determine that the first sentence of the two adjacent sentences is the noise sentence.
Optionally, the preset strategy is: determine that the second sentence of the two adjacent sentences is the noise sentence.
Optionally, the preset strategy is: calculate the similarity between the first sentence and the sentence preceding it, calculate the similarity between the second sentence and the sentence following it, and determine from these two similarities which of the two adjacent sentences is the noise sentence. Specifically, when the similarity between the first sentence and its preceding sentence is greater than the similarity between the second sentence and its following sentence, the second sentence is determined to be the noise sentence; conversely, when the former is less than or equal to the latter, the first sentence is determined to be the noise sentence. Both similarities are calculated in the same way as the calculation module 20 calculates the similarity of two adjacent sentences, which is not repeated here.
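The neighbor-comparison strategy above can be sketched as follows. This is only an illustration under stated assumptions: sentences are lists of words, `word_overlap` is a stand-in similarity measure chosen for readability rather than the similarity of the embodiments, and a missing neighbor (at the start or end of the text file) is treated as similarity 0, a case the description does not address. All names are hypothetical.

```python
def word_overlap(s1, s2):
    """Stand-in similarity (assumption): shared words / all distinct words."""
    set1, set2 = set(s1), set(s2)
    union = set1 | set2
    return len(set1 & set2) / len(union) if union else 0.0


def pick_noise_by_neighbors(sentences, i, sim=word_overlap):
    """Return the index of the noise sentence in the flagged pair (i, i + 1)."""
    # Assumption: a neighbor that does not exist counts as similarity 0.
    sim_prev = sim(sentences[i - 1], sentences[i]) if i > 0 else 0.0
    sim_next = (
        sim(sentences[i + 1], sentences[i + 2]) if i + 2 < len(sentences) else 0.0
    )
    # If the first sentence fits its predecessor better than the second fits
    # its successor, the second sentence is taken as noise; otherwise the first.
    return i + 1 if sim_prev > sim_next else i


sentences = [
    ["project", "progress"],
    ["project", "progress", "report"],
    ["lunch", "today"],
    ["weather", "nice"],
]
# The pair (1, 2) has been flagged as containing a noise sentence.
noise_index = pick_noise_by_neighbors(sentences, 1)
print(noise_index)  # 2
```

Passing the similarity function as a parameter keeps the strategy independent of how similarity is actually computed.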
The noise reduction module 50 filters the noise sentence out of the audio/video file to reduce the noise in the audio/video file. Optionally, the noise reduction module 50 looks up, in the text file, the position of the noise sentence in the audio/video file, and filters the noise sentence out of the audio/video file according to that position, thereby denoising the audio/video file. Optionally, when filtering the noise sentence out of the audio/video file, the noise reduction module 50 may also fill the corresponding position in the audio/video file with preset music; for example, the preset music may be light music.
Optionally, the noise reduction module 50 may filter the noise sentence out of the audio/video file while the recording device is recording, or after the recording device has finished recording.
With the above embodiment, speech recognition is performed on the audio/video file to convert it into a text file; the similarity between each pair of adjacent sentences in the text file is calculated, and whether a noise sentence exists in the pair is judged according to that similarity; when a noise sentence exists in the pair, one of the two sentences is determined to be the noise sentence according to a preset strategy, and the noise sentence is filtered out of the audio/video file. Because the audio/video file is first converted into a text file, the noise sentences are determined from the similarity of the sentences in the text file, and the noise sentences are then filtered out of the audio/video file, the noise sentences in the audio/video file can be identified more objectively and without being affected by the surrounding environment, which can greatly improve the accuracy of noise removal.
Referring to Fig. 3, Fig. 3 is a module diagram of the second embodiment of the denoising device of the present invention.
Based on the first embodiment of the denoising device described above, the second embodiment differs from the first embodiment in that the denoising device further includes a word segmentation module 60, configured to segment each sentence in the text file to obtain the words of each sentence; and the calculation module 20 includes:
An acquisition unit 21, configured to obtain, according to a number dictionary, the numbers corresponding to the words of each of the two adjacent sentences;
An establishing unit 22, configured to establish a vector model for each of the two adjacent sentences according to the numbers corresponding to its words;
A first calculation unit 23, configured to calculate the Euclidean distance between the two adjacent sentences according to their vector models;
A second calculation unit 24, configured to obtain the similarity between the two adjacent sentences according to the Euclidean distance between them.
The word segmentation module 60 can segment each sentence in the text file according to a preset segmentation dictionary to obtain the words of each sentence. For example, segmenting the sentence "today's topic of discussion is the problem of project progress" yields the words: today, discussion, topic, is, about, project, progress, problem, and a particle that occurs twice, 10 words in total. The words obtained by segmenting one sentence may repeat; in the segmentation result above, for example, the particle appears twice.
Optionally, the word segmentation module 60 splits each sentence in the text file and obtains all segmentation modes of each sentence (for example, one sentence has 2 segmentation modes and another has 5), calculates the sentence weight of each segmentation mode of each sentence, compares the sentence weights of the segmentation modes, selects one segmentation mode for each sentence according to a preset selection strategy, and segments the sentence according to the selected mode to obtain the segmentation result. For example, in one embodiment, a sentence has 5 segmentation modes; the sentence weight obtained when the sentence is segmented with each of the 5 modes is calculated, the mode with the largest sentence weight is selected, and the sentence is segmented according to the selected mode. Different sentences may use different segmentation modes.
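Selecting the segmentation mode with the largest sentence weight can be sketched as below. The description does not define how a sentence weight is computed, so the weight used here (the sum of hypothetical per-word scores) is purely an assumption for illustration, as are the names `select_segmentation`, `sentence_weight`, and `word_scores`.

```python
def select_segmentation(candidates, weight):
    """Return the candidate segmentation with the largest sentence weight."""
    return max(candidates, key=weight)


# Hypothetical per-word scores; words missing from the table score 0.
word_scores = {"project": 3, "progress": 3, "pro": 1}


def sentence_weight(words):
    """Assumed weight: sum of the scores of the words in the segmentation."""
    return sum(word_scores.get(word, 0) for word in words)


# Two hypothetical segmentation modes of the same sentence.
candidates = [
    ["pro", "ject", "progress"],
    ["project", "progress"],
]
best = select_segmentation(candidates, sentence_weight)
print(best)  # ['project', 'progress']
```

Any other weight definition can be substituted by passing a different function as `weight`.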
The number dictionary records the correspondence between words and numbers: each word corresponds to one number, and a given number corresponds to only one word, i.e., the same number always denotes the same word.
The acquisition unit 21 obtains the numbers corresponding to the words of the two adjacent sentences according to the number dictionary; the establishing unit 22 establishes a vector model for each of the two adjacent sentences according to the numbers corresponding to its words. Typically, if a sentence contains N words after segmentation, its vector model has N dimensions; for example, if a sentence contains 5 words (some of which may be identical), its vector model has five dimensions. If a sentence is "Have you had a meal?" and its words are "you", "have a meal", and two particles, its vector model has four dimensions. According to the number dictionary, the number corresponding to the word "you" is 110, the number corresponding to the word "have a meal" is 98, and the numbers corresponding to the two particles are 150 and 90, so the vector model of the sentence is c = (110, 98, 150, 90).
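Building a vector model from the number dictionary can be sketched as follows, assuming the dictionary is a plain Python mapping from word to number. The keys `"<particle-1>"` and `"<particle-2>"` are placeholders for the two particles of the example sentence, and the function names are hypothetical.

```python
# Number dictionary for the example sentence; the particle keys are placeholders.
number_dictionary = {
    "you": 110,
    "have a meal": 98,
    "<particle-1>": 150,
    "<particle-2>": 90,
}


def is_one_to_one(dictionary):
    """Check that no two words share the same number."""
    numbers = list(dictionary.values())
    return len(numbers) == len(set(numbers))


def build_vector(words, dictionary):
    """Map each word of a segmented sentence to its number, keeping word order."""
    return [dictionary[word] for word in words]


words = ["you", "have a meal", "<particle-1>", "<particle-2>"]
c = build_vector(words, number_dictionary)
print(c)  # [110, 98, 150, 90]
```

In this sketch a word missing from the dictionary raises `KeyError`; how unknown words should be handled is not specified in the description.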
Optionally, the number dictionary can be preset and shared by all audio/video files; the number dictionary records the number corresponding to each word.
Optionally, the number dictionary is generated from the audio/video file. Specifically, the words of all sentences in the audio/video file are collected, each word is numbered according to a number input by the user, and the number dictionary is generated. For example, in one embodiment, the sentences in the audio/video file contain 10,000 distinct words; the user numbers these 10,000 words as required, and each word receives a different number.
The value of each component in the vector model of a sentence is the number of the word corresponding to that component. For example, if the vector model of a sentence is c = (110, 98, 150, 90), the value of the first component of the sentence is 110, and the word of the first component is "you".
The first calculation unit 23 calculates the Euclidean distance between the two adjacent sentences, specifically by the following formula:
$$D = \sqrt{\sum_{i=1}^{n} (x_{i1} - x_{i2})^{2}}$$
where D is the Euclidean distance between the two adjacent sentences, n is the dimension of the vector models of the two sentences, $x_{i1}$ denotes the i-th component of the vector model of one of the two adjacent sentences, and $x_{i2}$ denotes the i-th component of the vector model of the other sentence.
The second calculation unit 24 calculates the similarity between the two adjacent sentences, specifically by the following formula:
$$\mathrm{Sim} = \frac{1}{1 + D}$$
where Sim denotes the similarity of the two adjacent sentences and D denotes the Euclidean distance between the two adjacent sentences.
As the similarity formula shows, the smaller the Euclidean distance between two adjacent sentences, the higher the similarity between them; conversely, the larger the Euclidean distance, the lower the similarity.
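The distance and similarity calculations can be sketched in a few lines. This is an illustration only, assuming the two vector models already have the same dimension; the function names and the second sample vector are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance D between two vector models of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vector models must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(a, b):
    """Similarity Sim = 1 / (1 + D)."""
    return 1.0 / (1.0 + euclidean_distance(a, b))


c = [110, 98, 150, 90]
d = [110, 98, 153, 94]  # hypothetical vector differing in two components

print(euclidean_distance(c, d))  # 5.0, since sqrt(3**2 + 4**2) = 5
print(similarity(c, c))          # 1.0, identical vectors have distance 0
```

Because D is never negative, Sim always lies in the interval (0, 1], and it equals 1 only when the two vector models are identical.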
The word segmentation module 60 segments each sentence in the text file; the acquisition unit 21 obtains, according to the number dictionary, the numbers corresponding to the words of the two adjacent sentences; the establishing unit 22 establishes a vector model for each of the two adjacent sentences; the first calculation unit 23 calculates the Euclidean distance between the two adjacent sentences according to their vector models; and the second calculation unit 24 obtains the similarity between the two adjacent sentences according to the Euclidean distance between them. In this way the similarity between two adjacent sentences in the text file can be calculated more accurately, so that whether a noise sentence exists in the two adjacent sentences can be determined accurately, improving the accuracy of noise removal.
Referring to Fig. 4, Fig. 4 is a module diagram of the third embodiment of the denoising device of the present invention.
Based on the first embodiment of the denoising device described above, the third embodiment differs from the first embodiment in that the judgment module 30 includes:
A judging unit 31, configured to judge whether the similarity between the two adjacent sentences is less than a preset similarity threshold;
A first determination unit 32, configured to determine that a noise sentence exists in the two adjacent sentences when the similarity between the two adjacent sentences is less than the preset similarity threshold.
The similarity threshold can be preset as needed. The judging unit 31 judges whether the similarity between two adjacent sentences is less than the preset similarity threshold, in order to determine whether a noise sentence exists in the two adjacent sentences.
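The threshold comparison can be sketched as below, assuming the similarities of all adjacent pairs have already been computed and stored in order. The threshold value 0.3 and the sample similarities are hypothetical; the description leaves the threshold to be set as needed.

```python
def flag_noisy_pairs(similarities, threshold):
    """Return the indices i for which the pair (i, i + 1) contains a noise sentence.

    similarities[i] is the similarity between sentence i and sentence i + 1.
    """
    return [i for i, sim in enumerate(similarities) if sim < threshold]


# Hypothetical similarities for six sentences (five adjacent pairs).
similarities = [0.80, 0.75, 0.10, 0.05, 0.70]
flagged = flag_noisy_pairs(similarities, threshold=0.3)
print(flagged)  # [2, 3]
```

A similarity exactly equal to the threshold is not flagged, matching the "less than" comparison performed by the judging unit 31.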
In this embodiment, when the judgment module 30 judges, according to the similarity between two adjacent sentences, whether a noise sentence exists in the two adjacent sentences, the judging unit 31 in the judgment module 30 compares the similarity between the two adjacent sentences with the preset similarity threshold, and the first determination unit 32 determines, according to the result from the judging unit 31, whether a noise sentence exists in the two adjacent sentences. Whether a noise sentence exists in the audio/video file can thus be identified more objectively, improving the accuracy of noise removal.
Referring to Fig. 5, Fig. 5 is a module diagram of the fourth embodiment of the denoising device of the present invention.
Based on the first embodiment of the denoising device described above, the fourth embodiment differs from the first embodiment in that the determination module 40 includes:
A third calculation unit 41, configured to, when a noise sentence exists in the two adjacent sentences, calculate the similarity between the first sentence of the two adjacent sentences and each of a preset number of sentences at the beginning of the text file (counted from the first sentence of the text file), and calculate the similarity between the second sentence of the two adjacent sentences and each of the same preset number of sentences;
A second determination unit 42, configured to determine that the first sentence or the second sentence of the two adjacent sentences is the noise sentence, according to the similarities between the first sentence and the preset number of sentences at the beginning of the text file and the similarities between the second sentence and the same preset number of sentences.
The preset number can be set as needed; typically, the preset number is 20.
The third calculation unit 41 calculates the similarity between the first sentence of the two adjacent sentences and each of the preset number of sentences at the beginning of the text file, obtaining multiple similarities. For example, when the preset number is 20, the similarity between the first sentence and each of the first 20 sentences of the text file is calculated in turn, yielding 20 similarities.
The third calculation unit 41 likewise calculates the similarity between the second sentence of the two adjacent sentences and each of the preset number of sentences at the beginning of the text file, obtaining multiple similarities. For example, when the preset number is 20, the similarity between the second sentence and each of the first 20 sentences of the text file is calculated in turn, yielding 20 similarities.
The similarities between the first sentence and the preset number of sentences at the beginning of the text file, and between the second sentence and the same sentences, are calculated in the same way as the calculation module 20 calculates the similarity of two adjacent sentences, which is not repeated here.
The second determination unit 42 sums the similarities between the first sentence of the two adjacent sentences and the preset number of sentences at the beginning of the text file to obtain a first similarity total, and sums the similarities between the second sentence and the same sentences to obtain a second similarity total. It then determines, according to the first similarity total and the second similarity total, that the first sentence or the second sentence is the noise sentence. Specifically, when the first similarity total is greater than the second similarity total, the second sentence of the two adjacent sentences is determined to be the noise sentence; when the first similarity total is less than or equal to the second similarity total, the first sentence of the two adjacent sentences is determined to be the noise sentence.
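The comparison of similarity totals can be sketched as follows. This is an illustration under assumptions: sentences are lists of words, `word_overlap` is a stand-in similarity measure used only to keep the example readable, and the preset number is set to 2 so that the sample stays short. All names are hypothetical.

```python
def word_overlap(s1, s2):
    """Stand-in similarity (assumption): shared words / all distinct words."""
    set1, set2 = set(s1), set(s2)
    union = set1 | set2
    return len(set1 & set2) / len(union) if union else 0.0


def pick_noise_by_totals(sentences, i, sim=word_overlap, preset_number=20):
    """Return the index of the noise sentence in the flagged pair (i, i + 1)."""
    reference = sentences[:preset_number]  # sentences at the start of the file
    first_total = sum(sim(sentences[i], ref) for ref in reference)
    second_total = sum(sim(sentences[i + 1], ref) for ref in reference)
    # The sentence that resembles the opening sentences less is taken as noise;
    # on a tie the first sentence is chosen.
    return i + 1 if first_total > second_total else i


sentences = [
    ["project", "progress"],
    ["project", "plan"],
    ["project", "risk"],
    ["lunch", "menu"],
]
# The pair (2, 3) has been flagged; compare both with the first 2 sentences.
noise_index = pick_noise_by_totals(sentences, 2, preset_number=2)
print(noise_index)  # 3
```

If a flagged sentence is itself among the opening sentences, this sketch also compares it with itself; the description does not say whether that case should be excluded.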
In this embodiment, when a noise sentence exists in the two adjacent sentences, the determination module 40 determines that the first sentence or the second sentence is the noise sentence according to the similarities between the first sentence and the preset number of sentences at the beginning of the text file and the similarities between the second sentence and the same sentences. The noise sentence in the two adjacent sentences can thus be identified more objectively, improving the accuracy of noise removal.
Referring to Fig. 6, Fig. 6 is a module diagram of the fifth embodiment of the denoising device of the present invention.
Based on the first embodiment of the denoising device described above, the fifth embodiment differs from the first embodiment in that the determination module 40 includes:
A prompt unit 43, configured to issue prompt information to the user when a noise sentence exists in the two adjacent sentences, so that the user selects, according to the prompt information, one of the two adjacent sentences as the noise sentence;
A third determination unit 44, configured to receive a selection instruction input by the user according to the prompt information, and to determine, according to the selection instruction, which one of the two adjacent sentences is the noise sentence.
The prompt unit 43 issues prompt information to the user. The prompt information includes two options: one option is to select the first sentence of the two adjacent sentences, and the other is to select the second sentence. The prompt information also displays the content of the two adjacent sentences, as shown in Fig. 7, where, for example, the first sentence is "Have you had a meal?" and the second sentence is "today's topic of discussion is the problem of project progress".
The user selects one of the two adjacent sentences as the noise sentence according to the prompt information; for example, if the user considers that the first sentence of the two adjacent sentences may be the noise sentence, the user selects the first sentence.
The third determination unit 44 receives the selection instruction input by the user according to the prompt information. If the selection instruction selects the first sentence of the two adjacent sentences, the first sentence is determined to be the noise sentence; if the selection instruction selects the second sentence, the second sentence is determined to be the noise sentence.
In this embodiment, when a noise sentence exists in the two adjacent sentences, the prompt unit 43 issues prompt information to the user, and the third determination unit 44 determines which of the two adjacent sentences is the noise sentence according to the selection instruction input by the user in response to the prompt information. The noise sentence in the two adjacent sentences is thus determined more flexibly, which improves the accuracy of noise removal and provides a better user experience.
The present invention further provides a noise-reduction method.
Referring to Fig. 8, Fig. 8 is a flow diagram of the first embodiment of the noise-reduction method of the present invention. The method includes:
S10, performing speech recognition on an audio/video file to convert the audio/video file into a text file.
The audio/video file may be an audio file recorded by a recording device; the recording device may be a voice recorder, or a mobile terminal with a recording function, such as a smartphone or a tablet computer.
During a meeting, a training session, or any other occasion that needs to be recorded, the recording device is started to record; after recording is completed, the audio/video file is generated.
The audio/video file recorded by the recording device can be obtained in a wired or wireless manner; for example, in one embodiment, it is obtained via WiFi. Optionally, speech recognition may be performed on the audio/video file while the recording device is recording, or after the recording device has finished recording.
In this step, speech recognition is performed on the audio/video file to obtain a text file; the text file includes multiple sentences and the position of each sentence in the audio/video file. Specifically, speech recognition technology is applied to the audio/video file, for example as follows: the audio/video file is divided into multiple frames according to a predetermined frame interval; speech recognition is invoked to convert the framed audio/video file into text piece by piece, yielding sentences; and the position of each sentence in the audio/video file is saved together with the corresponding text as one paragraph of the text file. The text file thus includes all the sentences in the audio/video file and the position of each sentence in the audio/video file. For example, in one embodiment, 1000 sentences are obtained after speech recognition is performed on the audio/video file, so the text file has 1000 paragraphs, each corresponding to one recognized sentence, in recognition order. In the text file, the position of each sentence in the audio/video file can be recorded at the very beginning or the very end of the sentence; for example, in one embodiment it is recorded at the very beginning, i.e., in any paragraph of the text file, the position of the sentence in the audio/video file is written first, followed by the sentence itself.
The position of each sentence in the audio/video file is its position on the time axis of the audio/video file; for example, the position of one sentence in the audio/video file is: from the 5th second to the 8th second.
In this step, the audio/video file is converted into a text file. Optionally, the text file has the same file name as the audio/video file, which makes it easy for the user to tell which audio/video file the text file corresponds to.
S20, calculating the similarity between each pair of adjacent sentences in the text file.
The similarity between each pair of adjacent sentences in the text file is calculated. Specifically, each sentence in the text file is converted into a vector model, and the similarity of two adjacent sentences is calculated from their vector models. The vector models of the two adjacent sentences have the same dimension. For example, the vector model of one sentence is expressed as $a = (x_{11}, x_{21}, x_{31}, \ldots, x_{n1})$ and the vector model of the other sentence as $b = (x_{12}, x_{22}, x_{32}, \ldots, x_{n2})$, where $x_{n1}$ denotes the n-th component of vector a, $x_{n2}$ denotes the n-th component of vector b, and both vectors have n dimensions. When the vector models of the two adjacent sentences differ in dimension, the vector model with fewer dimensions is padded so that both vector models have the same dimension; each padded dimension takes the value 0. For example, in one embodiment, the vector model of one sentence is $a = (x_{11}, x_{21}, x_{31}, \ldots, x_{n1})$ and the vector model of the other sentence is $b = (x_{12}, x_{22}, x_{32}, \ldots, x_{j2})$, where j < n. Vector model b is then modified to $b' = (x_{12}, x_{22}, x_{32}, \ldots, x_{j2}, 0, 0, \ldots, 0)$, so that the modified vector model b' and vector model a have the same number of dimensions.
S30, judging, according to the similarity between two adjacent sentences, whether a noise sentence exists in the two adjacent sentences.
The higher the similarity between two adjacent sentences, the more likely both are non-noise sentences, i.e., no noise sentence exists in the pair; conversely, the lower the similarity, the more likely the pair contains a noise sentence. Typically, in a meeting scenario, the similarity between sentences is high while the meeting is in session, whereas during a break people chat about unrelated matters and the similarity between sentences is low.
The two adjacent sentences can be defined as a first sentence and a second sentence, where the first sentence is the earlier of the two.
S40, when a noise sentence exists in the two adjacent sentences, determining, according to a preset strategy, which one of the two adjacent sentences is the noise sentence.
In this step, when a noise sentence exists in the two adjacent sentences, one of the two adjacent sentences is determined to be the noise sentence according to the preset strategy.
Optionally, the preset strategy is: determine that the first sentence of the two adjacent sentences is the noise sentence.
Optionally, the preset strategy is: determine that the second sentence of the two adjacent sentences is the noise sentence.
Optionally, the preset strategy is: calculate the similarity between the first sentence and the sentence preceding it, calculate the similarity between the second sentence and the sentence following it, and determine from these two similarities which of the two adjacent sentences is the noise sentence. Specifically, when the similarity between the first sentence and its preceding sentence is greater than the similarity between the second sentence and its following sentence, the second sentence is determined to be the noise sentence; conversely, when the former is less than or equal to the latter, the first sentence is determined to be the noise sentence. Both similarities are calculated in the same way as the similarity of two adjacent sentences is calculated in step S20, which is not repeated here.
S50, filtering the noise sentence out of the audio/video file.
In this step, the noise sentence is filtered out of the audio/video file to reduce the noise in the audio/video file. Optionally, the position of the noise sentence in the audio/video file is looked up in the text file, and the noise sentence is filtered out of the audio/video file according to that position, thereby denoising the audio/video file. Optionally, when the noise sentence is filtered out of the audio/video file, the corresponding position in the audio/video file may also be filled with preset music; for example, the preset music is light music.
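Once the time-axis positions of the noise sentences are known, the parts of the recording to keep can be computed as sketched below. This is only an illustration of the interval arithmetic, assuming positions are (start, end) pairs in seconds; it does not read or write any audio, and the function name `keep_ranges` and the sample values are hypothetical.

```python
def keep_ranges(total_duration, noise_ranges):
    """Return the (start, end) ranges that remain after cutting out noise_ranges."""
    kept = []
    cursor = 0.0  # end of the last removed range seen so far
    for start, end in sorted(noise_ranges):
        if start > cursor:
            kept.append((cursor, start))
        cursor = max(cursor, end)  # also handles overlapping noise ranges
    if cursor < total_duration:
        kept.append((cursor, total_duration))
    return kept


# Hypothetical 20-second recording with two noise sentences,
# one of them located from the 5th second to the 8th second.
noise_positions = [(5.0, 8.0), (12.0, 14.0)]
remaining = keep_ranges(20.0, noise_positions)
print(remaining)  # [(0.0, 5.0), (8.0, 12.0), (14.0, 20.0)]
```

When the removed positions are to be filled with preset music instead of being cut, the same `noise_positions` list identifies the ranges to overwrite.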
Optionally, the noise sentence may be filtered out of the audio/video file while the recording device is recording, or after the recording device has finished recording.
With the above embodiment, speech recognition is performed on the audio/video file recorded by the recording device to convert it into a text file; the similarity between each pair of adjacent sentences in the text file is calculated, and whether a noise sentence exists in the pair is judged according to that similarity; when a noise sentence exists in the pair, one of the two sentences is determined to be the noise sentence according to a preset strategy, and the noise sentence is filtered out of the audio/video file. Because the audio/video file is first converted into a text file, the noise sentences are determined from the similarity of the sentences in the text file, and the noise sentences are then filtered out of the audio/video file, the noise sentences in the audio/video file can be identified more objectively and without being affected by the surrounding environment, which can greatly improve the accuracy of noise removal.
Referring to Fig. 9, Fig. 9 is a flow diagram of the second embodiment of the noise-reduction method of the present invention.
Based on the first embodiment of the noise-reduction method described above, the second embodiment differs from the first embodiment in that, before step S20, the method further includes S60: segmenting each sentence in the text file to obtain the words of each sentence;
Step S20 includes: S21, obtaining, according to a number dictionary, the numbers corresponding to the words of each of the two adjacent sentences; S22, establishing a vector model for each of the two adjacent sentences according to the numbers corresponding to its words; S23, calculating the Euclidean distance between the two adjacent sentences according to their vector models; S24, obtaining the similarity between the two adjacent sentences according to the Euclidean distance between them.
In step S60, each sentence in the text file can be segmented according to a preset segmentation dictionary to obtain the words of each sentence. For example, segmenting the sentence "today's topic of discussion is the problem of project progress" yields the words: today, discussion, topic, is, about, project, progress, problem, and a particle that occurs twice, 10 words in total. The words obtained by segmenting one sentence may repeat; in the segmentation result above, for example, the particle appears twice.
Optionally, in step S60, each sentence in the text file is split and all segmentation modes of each sentence are obtained (for example, one sentence has 2 segmentation modes and another has 5). The sentence weight of every segmentation mode of each sentence is calculated, the sentence weights of the segmentation modes are compared, one segmentation mode is selected from all segmentation modes of each sentence according to a preset selection strategy, and the corresponding sentence is segmented according to the selected mode to obtain the segmentation result. In one embodiment, a sentence has 5 segmentation modes; the sentence weight obtained when the sentence is segmented with each of the 5 modes is calculated, the segmentation mode with the largest sentence weight is selected, and the sentence is segmented according to the selected mode. The segmentation mode may differ from sentence to sentence.
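The selection of the segmentation mode with the largest sentence weight can be sketched as follows. The embodiment does not define how the sentence weight is computed, so `toy_weight` below is a purely hypothetical weight function, and `select_segmentation` is an illustrative name rather than part of the described method.

```python
def select_segmentation(candidates, sentence_weight):
    """Return the candidate segmentation whose sentence weight is largest."""
    if not candidates:
        raise ValueError("at least one candidate segmentation is required")
    return max(candidates, key=sentence_weight)


def toy_weight(words):
    """Hypothetical weight that favours fewer, longer words."""
    return sum(len(word) ** 2 for word in words) / len(words)


# Two candidate segmentation modes of the same (hypothetical) sentence.
candidates = [
    ["project", "progress"],
    ["pro", "ject", "progress"],
]
best = select_segmentation(candidates, toy_weight)
print(best)
```

Any other preset selection strategy can be substituted by passing a different weight function.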
The number dictionary records the correspondence between words and numbers: each word corresponds to one number, and a given number can correspond to only one word, i.e. the same number always denotes the same word.
In step S21, the numbers corresponding to the words of the two adjacent sentences are obtained according to the number dictionary. In step S22, a vector model is established for each of the two adjacent sentences according to the numbers corresponding to its words. Typically, if a sentence contains N words after segmentation, the vector model corresponding to the sentence is N-dimensional; for example, if a sentence contains 5 words (some of which may be identical), its vector model is five-dimensional. If a sentence is "you have a meal" and the words of the sentence are "you, have a meal, ", the vector model of the sentence is four-dimensional. According to the number dictionary, the number corresponding to the word "you" is 110, the number corresponding to the word "have a meal" is 98, the number corresponding to the word " " is 150, and the number corresponding to the word " " is 90, so the vector model of the sentence is c = (110, 98, 150, 90).
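Steps S21 and S22 amount to a dictionary lookup per word. The sketch below is illustrative only: `sentence_to_vector` is a hypothetical name, and the keys `<word3>` and `<word4>` are placeholders for the two words whose text is not legible in the example, while the numbers 110, 98, 150 and 90 are taken from it.

```python
def sentence_to_vector(words, number_dictionary):
    """Map each word of a segmented sentence to its number, keeping word order."""
    # Assumption: every word is present in the dictionary; a missing word raises KeyError.
    return tuple(number_dictionary[word] for word in words)


# Number dictionary for the example sentence; the last two keys are placeholders.
number_dictionary = {"you": 110, "have a meal": 98, "<word3>": 150, "<word4>": 90}
words = ["you", "have a meal", "<word3>", "<word4>"]

c = sentence_to_vector(words, number_dictionary)
print(c)
```

A sentence with N words therefore yields an N-dimensional vector, and a repeated word simply repeats its number.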
Optionally, the number dictionary can be preset and shared by all audio-video files; it records the number corresponding to each word.
Optionally, the number dictionary is generated from the audio-video file. Specifically, the words of all sentences in the audio-video file are collected, each word is numbered according to the numbers input by the user, and the number dictionary is generated. In one embodiment, the sentences in the audio-video file contain 10,000 distinct words in total; the user numbers these 10,000 words as required, and each word receives a different number.
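Generating a number dictionary from the words of a file can be sketched as below. In the embodiment the numbers are input by the user; this sketch instead assigns sequential numbers in order of first appearance, which is an assumption made only to keep the example self-contained. The name `build_number_dictionary` and the sample sentences are hypothetical.

```python
def build_number_dictionary(sentences, start=1):
    """Assign a distinct number to every distinct word across all sentences."""
    number_dictionary = {}
    next_number = start
    for words in sentences:
        for word in words:
            # A word that was already numbered keeps its number.
            if word not in number_dictionary:
                number_dictionary[word] = next_number
                next_number += 1
    return number_dictionary


# Hypothetical segmented sentences from one file.
sentences = [
    ["today", "discussion", "theme"],
    ["project", "progress", "theme"],
]
number_dictionary = build_number_dictionary(sentences)
print(number_dictionary)
```

The one-word-one-number property holds by construction, since a new number is issued only for a word not seen before.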
The value of each component of a sentence's vector model is the number of the word corresponding to that component. For example, if the vector model of a sentence is c = (110, 98, 150, 90), the value of the first component is 110 and the word of the first component is "you".
In step S23, the Euclidean distance between the two adjacent sentences is calculated by the following formula:
$$D=\sqrt{\sum_{i=1}^{n}\left(x_{i1}-x_{i2}\right)^{2}}$$
where D is the Euclidean distance between the two adjacent sentences, n is the dimension of the two sentences, $x_{i1}$ is the i-th component of the vector model of one of the two adjacent sentences, and $x_{i2}$ is the i-th component of the vector model of the other sentence.
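The distance calculation of step S23 can be sketched as the standard Euclidean distance over the two vector models. The embodiment treats both sentences as having the same dimension n; padding the shorter vector with zeros when the word counts differ is an assumption of this sketch, not something stated in the embodiment, and `euclidean_distance` is a hypothetical name.

```python
import math


def euclidean_distance(x1, x2):
    """Euclidean distance between two sentence vectors."""
    # Assumption: the shorter vector is zero-padded to the common dimension n.
    n = max(len(x1), len(x2))
    a = list(x1) + [0] * (n - len(x1))
    b = list(x2) + [0] * (n - len(x2))
    # D = sqrt(sum over i of (x_i1 - x_i2)^2)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# Two four-dimensional vectors that differ in their second and third components.
d = euclidean_distance((110, 98, 150, 90), (110, 95, 154, 90))
print(d)
```

Here the component differences are 0, 3, -4 and 0, so the distance is the square root of 9 + 16.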
In step S24, the similarity between the two adjacent sentences is calculated by the following formula:
Sim = 1/(1+D), where Sim is the similarity between the two adjacent sentences and D is the Euclidean distance between them.
It can be seen from the above similarity formula that the smaller the Euclidean distance between the two adjacent sentences, the greater their similarity; conversely, the greater the Euclidean distance between the two adjacent sentences, the smaller their similarity.
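The mapping from distance to similarity, Sim = 1/(1+D), can be checked with a few values. The function name `similarity_from_distance` is hypothetical; the formula is the one given for step S24.

```python
def similarity_from_distance(distance):
    """Sim = 1 / (1 + D); D is a non-negative Euclidean distance."""
    if distance < 0:
        raise ValueError("a Euclidean distance cannot be negative")
    return 1.0 / (1.0 + distance)


# Identical vectors (D = 0) give the maximum similarity of 1;
# the similarity shrinks toward 0 as the distance grows.
for distance in (0, 1, 3, 99):
    print(distance, similarity_from_distance(distance))
```

Since D is never negative, Sim always lies in the interval (0, 1], which makes a fixed similarity threshold meaningful.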
With the above embodiment, each sentence in the text file is segmented; the numbers corresponding to the words of the two adjacent sentences are obtained according to the number dictionary; a vector model is established for each of the two adjacent sentences according to the numbers corresponding to its words; the Euclidean distance between the two adjacent sentences is calculated according to their vector models; and the similarity between the two adjacent sentences is then obtained from that Euclidean distance. The similarity between two adjacent sentences in the text file can thus be calculated more accurately, so that whether a noise sentence exists in the two adjacent sentences can be determined accurately, which improves the accuracy of noise removal.
Referring to Fig. 10, Fig. 10 is a flow diagram of the third embodiment of the noise-reduction method of the present invention.
Based on the first embodiment of the noise-reduction method described above, the third embodiment differs from the first embodiment in that step S30 includes:
S31, judging whether the similarity between the two adjacent sentences is less than a preset similarity threshold.
The similarity threshold can be preset as needed. In this step, whether the similarity between the two adjacent sentences is less than the preset similarity threshold is judged in order to determine whether a noise sentence exists in the two adjacent sentences.
S32, when the similarity between the two adjacent sentences is less than the preset similarity threshold, determining that a noise sentence exists in the two adjacent sentences.
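Steps S31 and S32 reduce to a comparison against the preset threshold, applied to every pair of adjacent sentences. In the sketch below the threshold value 0.1, the sample similarities and the name `flag_adjacent_pairs` are all hypothetical.

```python
def flag_adjacent_pairs(similarities, threshold):
    """Return the index pairs (i, i + 1) whose similarity is below the threshold.

    similarities[i] is the similarity between sentence i and sentence i + 1.
    """
    return [(i, i + 1) for i, sim in enumerate(similarities) if sim < threshold]


# Hypothetical similarities between five consecutive sentences.
similarities = [0.5, 0.02, 0.03, 0.4]
flagged = flag_adjacent_pairs(similarities, threshold=0.1)
print(flagged)
```

A pair is flagged only when its similarity is strictly less than the threshold, matching the wording of S32; which sentence of a flagged pair is the noise sentence is decided in a later step.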
With the above embodiment, when judging whether a noise sentence exists in the two adjacent sentences according to the similarity between them, the similarity between the two adjacent sentences is compared with the preset similarity threshold, and whether a noise sentence exists in the two adjacent sentences is determined according to the comparison result. Whether a noise sentence exists in the audio-video file can thus be identified more objectively, which improves the accuracy of noise removal.
Referring to Fig. 11, Fig. 11 is a flow diagram of the fourth embodiment of the noise-reduction method of the present invention.
Based on the first embodiment of the noise-reduction method described above, the fourth embodiment differs from the first embodiment in that step S40 includes:
S41, when a noise sentence exists in the two adjacent sentences, calculating the similarity between the first sentence of the two adjacent sentences and each of a predetermined number of sentences in the text file, starting from the first sentence of the text file, and calculating the similarity between the second sentence of the two adjacent sentences and each of the predetermined number of sentences in the text file, starting from the first sentence of the text file.
The predetermined number can be set as needed; typically, the predetermined number is 20.
In this step, the similarity between the first sentence of the two adjacent sentences and each of the predetermined number of sentences in the text file, starting from the first sentence of the text file, is calculated to obtain multiple similarities. For example, when the predetermined number is 20, the similarity between the first sentence of the two adjacent sentences and each of the 20 sentences starting from the first sentence of the text file is calculated in turn, giving 20 similarities.
In this step, the similarity between the second sentence of the two adjacent sentences and each of the predetermined number of sentences in the text file, starting from the first sentence of the text file, is likewise calculated to obtain multiple similarities. For example, when the predetermined number is 20, the similarity between the second sentence of the two adjacent sentences and each of the 20 sentences starting from the first sentence of the text file is calculated in turn, giving 20 similarities.
When calculating the similarity between the first sentence of the two adjacent sentences and the predetermined number of sentences starting from the first sentence of the text file, and the similarity between the second sentence of the two adjacent sentences and the predetermined number of sentences starting from the first sentence of the text file, the calculation is the same as the calculation of the similarity between two adjacent sentences in step S20 and is not repeated here.
S42, determining that the first sentence or the second sentence of the two adjacent sentences is the noise sentence according to the similarities between the first sentence of the two adjacent sentences and the predetermined number of sentences starting from the first sentence of the text file, and the similarities between the second sentence of the two adjacent sentences and the predetermined number of sentences starting from the first sentence of the text file.
The similarities between the first sentence of the two adjacent sentences and the predetermined number of sentences starting from the first sentence of the text file are summed to obtain a first similarity total; the similarities between the second sentence of the two adjacent sentences and the predetermined number of sentences starting from the first sentence of the text file are summed to obtain a second similarity total. Whether the first sentence or the second sentence of the two adjacent sentences is the noise sentence is determined from the first similarity total and the second similarity total. Specifically, when the first similarity total is greater than the second similarity total, the second sentence of the two adjacent sentences is determined to be the noise sentence; when the first similarity total is less than or equal to the second similarity total, the first sentence of the two adjacent sentences is determined to be the noise sentence.
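The comparison of the two similarity totals in steps S41 and S42 can be sketched as follows. The similarity is computed as 1/(1+D) with D the Euclidean distance, as in the second embodiment; zero-padding vectors of unequal length is an assumption of this sketch, and the function names and sample vectors are hypothetical.

```python
import math


def similarity(v1, v2):
    """Sim = 1 / (1 + D), with D the Euclidean distance (shorter vector zero-padded)."""
    n = max(len(v1), len(v2))
    a = list(v1) + [0] * (n - len(v1))
    b = list(v2) + [0] * (n - len(v2))
    distance = math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
    return 1.0 / (1.0 + distance)


def pick_noise_sentence(first, second, file_vectors, predetermined_number=20):
    """Return "first" or "second" to indicate which adjacent sentence is the noise sentence."""
    # Reference sentences: the predetermined number of sentences from the start of the file.
    reference = file_vectors[:predetermined_number]
    first_total = sum(similarity(first, r) for r in reference)
    second_total = sum(similarity(second, r) for r in reference)
    # The sentence that is less similar to the reference sentences is the noise sentence;
    # on a tie the first sentence is chosen, as in S42.
    return "second" if first_total > second_total else "first"


# Hypothetical vector models of the sentences of a file, in order.
file_vectors = [(1, 2), (1, 3), (2, 2), (50, 60)]
result = pick_noise_sentence(file_vectors[2], file_vectors[3], file_vectors, predetermined_number=2)
print(result)
```

In this example the vector (2, 2) lies close to the opening sentences while (50, 60) does not, so the second sentence of the pair is reported as the noise sentence.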
With the above embodiment, when a noise sentence exists in the two adjacent sentences, whether the first sentence or the second sentence of the two adjacent sentences is the noise sentence is determined according to the similarities between the first sentence and the predetermined number of sentences starting from the first sentence of the text file, and the similarities between the second sentence and the predetermined number of sentences starting from the first sentence of the text file. The noise sentence of the two adjacent sentences can thus be identified more objectively, which improves the accuracy of noise removal.
Referring to Fig. 12, Fig. 12 is a flow diagram of the fifth embodiment of the noise-reduction method of the present invention.
Based on the first embodiment of the noise-reduction method described above, the fifth embodiment differs from the first embodiment in that step S40 includes:
S43, when a noise sentence exists in the two adjacent sentences, issuing prompt information to the user, so that the user can select one of the two adjacent sentences as the noise sentence according to the prompt information.
In this step, prompt information is issued to the user. The prompt information contains two options: one option selects the first sentence of the two adjacent sentences, and the other option selects the second sentence of the two adjacent sentences. The prompt information shows the specific content of the two adjacent sentences, as shown in Fig. 7, for example when the first sentence is "you have had a meal" and the second sentence is "the problem of today, main topic of discussion was about project process".
According to the prompt information, the user selects one of the two adjacent sentences as the noise sentence; for example, if the user feels that the first sentence of the two adjacent sentences may be a noise sentence, the user selects the first sentence.
S44, receiving the selection instruction input by the user according to the prompt information, and determining, according to the selection instruction, that one of the two adjacent sentences is the noise sentence.
In this step, the selection instruction input by the user according to the prompt information is received. If the selection instruction selects the first sentence of the two adjacent sentences, the first sentence of the two adjacent sentences is determined to be the noise sentence; if the selection instruction selects the second sentence of the two adjacent sentences, the second sentence of the two adjacent sentences is determined to be the noise sentence.
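The handling of the selection instruction in step S44 can be sketched as a simple mapping from the chosen option to a sentence. How the prompt is displayed and how the instruction is received are not modelled; encoding the two options as the integers 1 and 2, and the name `resolve_selection`, are assumptions of this sketch.

```python
def resolve_selection(selection, first_sentence, second_sentence):
    """Return the sentence that the user's selection instruction marks as noise."""
    # Assumption: option 1 selects the first sentence, option 2 the second.
    if selection == 1:
        return first_sentence
    if selection == 2:
        return second_sentence
    raise ValueError("the selection instruction must be 1 or 2")


first_sentence = "you have had a meal"
second_sentence = "the problem of today, main topic of discussion was about project process"

noise = resolve_selection(1, first_sentence, second_sentence)
print(noise)
```

Rejecting any other value keeps an invalid instruction from silently marking a sentence as noise.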
With the above embodiment, when a noise sentence exists in the two adjacent sentences, prompt information is issued to the user, and the noise sentence of the two adjacent sentences is determined according to the selection instruction input by the user based on the prompt information. The noise sentence of the two adjacent sentences is thus determined more flexibly, which improves the accuracy of noise removal and gives a better user experience.
From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the method described in each embodiment of the present invention.
The above is only a preferred embodiment of the present invention and is not intended to limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Claims (8)
1. A denoising device, characterized in that the denoising device comprises:
a conversion module, configured to perform speech recognition on an audio-video file and convert the audio-video file into a text file;
a computing module, configured to calculate the similarity between each two adjacent sentences in the text file;
a judgment module, configured to judge, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences;
a determining module, configured to determine, according to a preset strategy, that one sentence of the two adjacent sentences is the noise sentence when a noise sentence exists in the two adjacent sentences;
a noise reduction module, configured to filter the noise sentence out of the audio-video file;
wherein the computing module is further configured to convert each sentence in the text file into a vector model and to calculate the similarity of the two adjacent sentences according to the vector models of the two adjacent sentences;
the determining module comprises:
a third computing unit, configured to, when a noise sentence exists in the two adjacent sentences, calculate the similarity between the first sentence of the two adjacent sentences and each of a predetermined number of sentences in the text file, starting from the first sentence of the text file, and calculate the similarity between the second sentence of the two adjacent sentences and each of the predetermined number of sentences in the text file, starting from the first sentence of the text file;
a second determination unit, configured to determine that the first sentence or the second sentence of the two adjacent sentences is the noise sentence according to the similarities between the first sentence of the two adjacent sentences and the predetermined number of sentences starting from the first sentence of the text file, and the similarities between the second sentence of the two adjacent sentences and the predetermined number of sentences starting from the first sentence of the text file.
2. The denoising device according to claim 1, characterized in that the denoising device further comprises: a word segmentation module, configured to segment each sentence in the text file to obtain the words of each sentence;
the computing module comprises:
an acquiring unit, configured to obtain the numbers corresponding to the words of the two adjacent sentences according to a number dictionary;
an establishing unit, configured to establish a vector model for each of the two adjacent sentences according to the numbers corresponding to its words;
a first computing unit, configured to calculate the Euclidean distance between the two adjacent sentences according to the vector models of the two adjacent sentences;
a second computing unit, configured to obtain the similarity between the two adjacent sentences according to the Euclidean distance between the two adjacent sentences.
3. The denoising device according to claim 2, characterized in that the similarity between the two adjacent sentences is calculated by the following formula:
Sim = 1/(1+D), where Sim is the similarity between the two adjacent sentences and D is the Euclidean distance between the two adjacent sentences.
4. The denoising device according to claim 1, characterized in that the judgment module comprises:
a judging unit, configured to judge whether the similarity between the two adjacent sentences is less than a preset similarity threshold;
a first determination unit, configured to determine that a noise sentence exists in the two adjacent sentences when the similarity between the two adjacent sentences is less than the preset similarity threshold.
5. A noise-reduction method, characterized in that the noise-reduction method comprises:
performing speech recognition on an audio-video file and converting the audio-video file into a text file;
calculating the similarity between each two adjacent sentences in the text file, and judging, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences;
when a noise sentence exists in the two adjacent sentences, determining, according to a preset strategy, that one sentence of the two adjacent sentences is the noise sentence, and filtering the noise sentence out of the audio-video file;
wherein the step of calculating the similarity between each two adjacent sentences in the text file comprises:
converting each sentence in the text file into a vector model, and calculating the similarity of the two adjacent sentences according to the vector models of the two adjacent sentences;
wherein the step of determining, according to a preset strategy, that one sentence of the two adjacent sentences is the noise sentence when a noise sentence exists in the two adjacent sentences comprises:
when a noise sentence exists in the two adjacent sentences, calculating the similarity between the first sentence of the two adjacent sentences and each of a predetermined number of sentences in the text file, starting from the first sentence of the text file, and calculating the similarity between the second sentence of the two adjacent sentences and each of the predetermined number of sentences in the text file, starting from the first sentence of the text file;
determining that the first sentence or the second sentence of the two adjacent sentences is the noise sentence according to the similarities between the first sentence of the two adjacent sentences and the predetermined number of sentences starting from the first sentence of the text file, and the similarities between the second sentence of the two adjacent sentences and the predetermined number of sentences starting from the first sentence of the text file.
6. The noise-reduction method according to claim 5, characterized in that, before the step of calculating the similarity between each two adjacent sentences in the text file and judging, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences, the noise-reduction method comprises: segmenting each sentence in the text file to obtain the words of each sentence;
the step of calculating the similarity between each two adjacent sentences in the text file comprises:
obtaining the numbers corresponding to the words of the two adjacent sentences according to a number dictionary;
establishing a vector model for each of the two adjacent sentences according to the numbers corresponding to its words;
calculating the Euclidean distance between the two adjacent sentences according to the vector models of the two adjacent sentences;
obtaining the similarity between the two adjacent sentences according to the Euclidean distance between the two adjacent sentences.
7. The noise-reduction method according to claim 6, characterized in that the similarity between the two adjacent sentences is calculated by the following formula:
Sim = 1/(1+D), where Sim is the similarity between the two adjacent sentences and D is the Euclidean distance between the two adjacent sentences.
8. The noise-reduction method according to claim 5, characterized in that the step of judging, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences comprises:
judging whether the similarity between the two adjacent sentences is less than a preset similarity threshold;
when the similarity between the two adjacent sentences is less than the preset similarity threshold, determining that a noise sentence exists in the two adjacent sentences.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610370200.5A CN106067302B (en) | 2016-05-27 | 2016-05-27 | Denoising device and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610370200.5A CN106067302B (en) | 2016-05-27 | 2016-05-27 | Denoising device and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106067302A CN106067302A (en) | 2016-11-02 |
CN106067302B true CN106067302B (en) | 2019-06-25 |
Family
ID=57420247
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610370200.5A Active CN106067302B (en) | 2016-05-27 | 2016-05-27 | Denoising device and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106067302B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110909021A (en) * | 2018-09-12 | 2020-03-24 | 北京奇虎科技有限公司 | Construction method and device of query rewriting model and application thereof |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1748249A (en) * | 2003-02-12 | 2006-03-15 | 松下电器产业株式会社 | Intermediary for speech processing in network environments |
JP2012128188A (en) * | 2010-12-15 | 2012-07-05 | Nippon Hoso Kyokai <Nhk> | Text correction device and program |
CN103369122A (en) * | 2012-03-31 | 2013-10-23 | 盛乐信息技术(上海)有限公司 | Voice input method and system |
CN103838789A (en) * | 2012-11-27 | 2014-06-04 | 大连灵动科技发展有限公司 | Text similarity computing method |
CN103956162A (en) * | 2014-04-04 | 2014-07-30 | 上海元趣信息技术有限公司 | Voice recognition method and device oriented towards child |
CN104462327A (en) * | 2014-12-02 | 2015-03-25 | 百度在线网络技术(北京)有限公司 | Computing method, search processing method, computing device and search processing device for sentence similarity |
CN104751846A (en) * | 2015-03-20 | 2015-07-01 | 努比亚技术有限公司 | Method and device for converting voice into text |
CN105161096A (en) * | 2015-09-22 | 2015-12-16 | 百度在线网络技术(北京)有限公司 | Speech recognition processing method and device based on garbage models |
Also Published As
Publication number | Publication date |
---|---|
CN106067302A (en) | 2016-11-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105549819B (en) | The display methods and device of background application information | |
CN105915673B (en) | A kind of method and mobile terminal of special video effect switching | |
CN106027905B (en) | A kind of method and mobile terminal for sky focusing | |
CN108353103A (en) | Subscriber terminal equipment and its method for recommendation response message | |
CN105635452B (en) | Mobile terminal and its identification of contacts method | |
CN105959554B (en) | Video capture device and method | |
CN105591440B (en) | Mobile terminal charging control device and method | |
CN106571136A (en) | Voice output device and method | |
CN106409286A (en) | Method and device for implementing audio processing | |
CN106851451B (en) | A kind of earpiece volume control method and device | |
CN106534422B (en) | A kind of loudspeaker assembly, speaker and mobile terminal | |
KR20180109499A (en) | Method and apparatus for providng response to user's voice input | |
CN105681894A (en) | Device and method for displaying video file | |
CN106612396A (en) | Photographing device, photographing terminal and photographing method | |
CN113033245A (en) | Function adjusting method and device, storage medium and electronic equipment | |
CN111263009B (en) | Quality inspection method, device, equipment and medium for telephone recording | |
CN106471493B (en) | Method and apparatus for managing data | |
CN106686232A (en) | Method for optimizing control interfaces and mobile terminal | |
CN105654974B (en) | Multimedia playing apparatus and method | |
CN106713656A (en) | Photographing method and mobile terminal | |
CN106527685A (en) | Control method and device for terminal application | |
CN106067302B (en) | Denoising device and method | |
CN106095744B (en) | Irregular control icons processing unit and method | |
CN114360546A (en) | Electronic equipment and awakening method thereof | |
CN106021129B (en) | A kind of method of terminal and terminal cleaning caching |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||