CN106067302B - Denoising device and method - Google Patents


Info

Publication number
CN106067302B
Authority
CN
China
Prior art keywords
sentence
neighboring
similarity
noise
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610370200.5A
Other languages
Chinese (zh)
Other versions
CN106067302A (en
Inventor
王荣洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nubia Technology Co Ltd
Original Assignee
Nubia Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nubia Technology Co Ltd
Priority to CN201610370200.5A
Publication of CN106067302A
Application granted
Publication of CN106067302B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/84 Detection of presence or absence of voice signals for discriminating voice from noise
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems


Abstract

The invention discloses a denoising device, comprising: a conversion module, configured to perform speech recognition on an audio/video file and convert the audio/video file into a text file; a computing module, configured to calculate the similarity between each pair of adjacent sentences in the text file; a judgment module, configured to judge, according to the similarity between two adjacent sentences, whether a noise sentence exists in the two adjacent sentences; a determining module, configured to determine, according to a preset strategy, that one of the two adjacent sentences is a noise sentence when a noise sentence exists in the two adjacent sentences; and a noise reduction module, configured to filter the noise sentence out of the audio/video file. The invention also discloses a denoising method. With the present invention, noise sentences in an audio/video file can be identified more objectively and independently of the surrounding environment, which greatly improves the accuracy of noise removal.

Description

Denoising device and method
Technical field
The present invention relates to the technical field of audio processing, and in particular to a denoising device and method.
Background
With the development of mobile communication technology and the continuous improvement of living standards, people often need to use recording devices on various occasions, such as interviews, meetings, and training sessions, to record live sound and generate an audio/video file. However, because recording scenes are complex and changeable, the quality and content of a recording are affected by changes in the surrounding environment. For example, when recording a meeting, the user starts the recording device and stops it only after the meeting ends, so the recording also contains the breaks in the meeting. The audio/video file recorded by the recording device therefore needs to be denoised to remove the unimportant sound.
In the prior art, an audio/video file is generally denoised according to the recording environment: for example, sound during a meeting break is relatively noisy, whereas sound during the meeting itself is relatively uniform. The drawback of this approach is that it depends too heavily on the surrounding environment, which leads to low denoising accuracy; for example, very noisy sound can also occur during the meeting itself.
Summary of the invention
The main object of the present invention is to provide a denoising device and method, intended to solve the technical problem in the prior art that denoising a recorded audio/video file according to the recording environment yields low denoising accuracy.
To achieve the above object, the present invention provides a denoising device, the denoising device comprising:
A conversion module, configured to perform speech recognition on an audio/video file and convert the audio/video file into a text file;
A computing module, configured to calculate the similarity between each pair of adjacent sentences in the text file;
A judgment module, configured to judge, according to the similarity between two adjacent sentences, whether a noise sentence exists in the two adjacent sentences;
A determining module, configured to determine, according to a preset strategy, that one of the two adjacent sentences is a noise sentence when a noise sentence exists in the two adjacent sentences;
A noise reduction module, configured to filter the noise sentence out of the audio/video file.
Optionally, the denoising device further comprises a word segmentation module, configured to segment each sentence in the text file into words, obtaining the words of each sentence.
The computing module comprises:
An acquiring unit, configured to obtain, according to a number dictionary, the numbers corresponding to the words of each of two adjacent sentences;
An establishing unit, configured to establish a vector model for each of the two adjacent sentences according to the numbers corresponding to its words;
A first computing unit, configured to calculate the Euclidean distance between the two adjacent sentences according to their vector models;
A second computing unit, configured to obtain the similarity between the two adjacent sentences according to the Euclidean distance between them.
Optionally, the similarity between two adjacent sentences is calculated by the following formula:
Sim = 1/(1 + D), where Sim denotes the similarity between the two adjacent sentences and D denotes the Euclidean distance between them.
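The chain from segmented words to a similarity value can be sketched in Python as follows. This is only an illustration under stated assumptions: the contents of `number_dict` and the function names are hypothetical, and the sketch assumes each sentence vector simply lists the dictionary numbers of its words in order.

```python
import math

def sentence_vector(words, number_dict):
    """Map each word to its number in a (hypothetical) number dictionary."""
    return [float(number_dict[w]) for w in words]

def euclidean_distance(a, b):
    """Euclidean distance D between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def similarity(a, b):
    """Sim = 1 / (1 + D), as given in the text."""
    return 1.0 / (1.0 + euclidean_distance(a, b))

# Hypothetical dictionary: the numbering is made up for illustration.
number_dict = {"budget": 1, "meeting": 2, "report": 3, "lunch": 40}
s1 = sentence_vector(["budget", "meeting"], number_dict)
s2 = sentence_vector(["budget", "report"], number_dict)
s3 = sentence_vector(["lunch", "lunch"], number_dict)

print(similarity(s1, s2))  # D = 1, so Sim = 0.5
print(similarity(s1, s3))  # much larger D, so Sim is close to 0
```

Because D is non-negative, Sim always lies in (0, 1], and identical vectors give Sim = 1.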
Optionally, the judgment module comprises:
A judging unit, configured to judge whether the similarity between the two adjacent sentences is less than a preset similarity threshold;
A first determination unit, configured to determine that a noise sentence exists in the two adjacent sentences when the similarity between them is less than the preset similarity threshold.
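The threshold test can be sketched as below. The threshold value and the example similarities are made up for illustration; the text only states that the threshold is preset.

```python
def flag_noisy_pairs(similarities, threshold):
    """Return each index i whose pair (sentence i, sentence i + 1) has a
    similarity strictly below the threshold, i.e. is judged to contain
    a noise sentence."""
    return [i for i, sim in enumerate(similarities) if sim < threshold]

# similarities[i] is the similarity between sentence i and sentence i + 1.
sims = [0.8, 0.75, 0.1, 0.12, 0.9]
flagged = flag_noisy_pairs(sims, threshold=0.3)
print(flagged)  # [2, 3]
```

The comparison is strict ("less than"), so a pair whose similarity equals the threshold is not flagged.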
Optionally, the determining module comprises:
A third computing unit, configured to calculate, when a noise sentence exists in the two adjacent sentences, the similarity between the first sentence of the two adjacent sentences and a predetermined number of sentences counted from the first sentence of the text file, and the similarity between the second sentence of the two adjacent sentences and the same predetermined number of sentences counted from the first sentence of the text file;
A second determination unit, configured to determine whether the first sentence or the second sentence of the two adjacent sentences is the noise sentence, according to those two similarities.
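One possible reading of this strategy is sketched below. The text here does not say how the similarity to several reference sentences is aggregated or which outcome marks the noise sentence, so the sketch assumes the mean similarity is used and that the sentence less similar to the opening sentences is the noise sentence. Both choices, and all names, are assumptions.

```python
import math

def similarity(a, b):
    """Sim = 1 / (1 + D) for two vectors of equal dimension."""
    d = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + d)

def pick_noise_by_reference(first, second, reference):
    """Compare both sentences of a flagged pair with the reference
    sentences (the first N sentences of the text file) and return
    "first" or "second" for the one assumed to be noise.

    Assumptions: mean similarity is the aggregate, the lower mean marks
    the noise sentence, and a tie is resolved as "second"."""
    def mean_sim(v):
        return sum(similarity(v, r) for r in reference) / len(reference)
    return "first" if mean_sim(first) < mean_sim(second) else "second"

reference = [[1, 2], [1, 3], [2, 2]]   # vectors of the opening sentences
on_topic = [2, 3]
off_topic = [40, 41]
print(pick_noise_by_reference(on_topic, off_topic, reference))  # second
```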
In addition, to achieve the above object, the present invention also provides a denoising method, the denoising method comprising:
Performing speech recognition on an audio/video file and converting the audio/video file into a text file;
Calculating the similarity between each pair of adjacent sentences in the text file, and judging, according to the similarity between two adjacent sentences, whether a noise sentence exists in the two adjacent sentences;
When a noise sentence exists in the two adjacent sentences, determining according to a preset strategy that one of the two adjacent sentences is the noise sentence, and filtering the noise sentence out of the audio/video file.
Optionally, before the step of calculating the similarity between each pair of adjacent sentences in the text file and judging, according to that similarity, whether a noise sentence exists in the two adjacent sentences, the denoising method comprises: segmenting each sentence in the text file into words, obtaining the words of each sentence;
The step of calculating the similarity between each pair of adjacent sentences in the text file comprises:
Obtaining, according to a number dictionary, the numbers corresponding to the words of each of two adjacent sentences;
Establishing a vector model for each of the two adjacent sentences according to the numbers corresponding to its words;
Calculating the Euclidean distance between the two adjacent sentences according to their vector models;
Obtaining the similarity between the two adjacent sentences according to the Euclidean distance between them.
Optionally, the similarity between two adjacent sentences is calculated by the following formula:
Sim = 1/(1 + D), where Sim denotes the similarity between the two adjacent sentences and D denotes the Euclidean distance between them.
Optionally, the step of judging, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences comprises:
Judging whether the similarity between the two adjacent sentences is less than a preset similarity threshold;
When the similarity between the two adjacent sentences is less than the preset similarity threshold, determining that a noise sentence exists in the two adjacent sentences.
Optionally, the step of determining, according to a preset strategy, that one of the two adjacent sentences is a noise sentence when a noise sentence exists in the two adjacent sentences comprises:
When a noise sentence exists in the two adjacent sentences, calculating the similarity between the first sentence of the two adjacent sentences and a predetermined number of sentences counted from the first sentence of the text file, and calculating the similarity between the second sentence of the two adjacent sentences and the same predetermined number of sentences counted from the first sentence of the text file;
Determining whether the first sentence or the second sentence of the two adjacent sentences is the noise sentence, according to those two similarities.
The denoising device and method of the present invention perform speech recognition on an audio/video file and convert it into a text file; calculate the similarity between each pair of adjacent sentences in the text file and judge, according to that similarity, whether a noise sentence exists in the two adjacent sentences; and, when a noise sentence exists, determine according to a preset strategy that one of the two adjacent sentences is the noise sentence and filter it out of the audio/video file. Because the audio/video file is first converted into a text file, the noise sentences are determined from the similarity between sentences in the text file, and only then filtered out of the audio/video file, noise sentences can be identified more objectively and independently of the surrounding environment, which greatly improves the accuracy of noise removal.
Brief description of the drawings
Fig. 1 is a schematic diagram of the hardware structure of an optional mobile terminal for implementing the embodiments of the present invention;
Fig. 2 is the module diagram of the first embodiment of denoising device of the present invention;
Fig. 3 is the module diagram of the second embodiment of denoising device of the present invention;
Fig. 4 is the module diagram of the 3rd embodiment of denoising device of the present invention;
Fig. 5 is the module diagram of the fourth embodiment of denoising device of the present invention;
Fig. 6 is the module diagram of the 5th embodiment of denoising device of the present invention;
Fig. 7 is the schematic diagram of the prompt information in denoising device of the present invention;
Fig. 8 is the flow diagram of the first embodiment of noise-reduction method of the present invention;
Fig. 9 is the flow diagram of the second embodiment of noise-reduction method of the present invention;
Fig. 10 is the flow diagram of the 3rd embodiment of noise-reduction method of the present invention;
Fig. 11 is the flow diagram of the fourth embodiment of noise-reduction method of the present invention;
Fig. 12 is the flow diagram of the 5th embodiment of noise-reduction method of the present invention.
The realization of the object, the functional features, and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Specific embodiments
It should be understood that the specific embodiments described herein merely illustrate the present invention and are not intended to limit it.
A mobile terminal implementing the embodiments of the present invention will now be described with reference to the accompanying drawings. In the following description, suffixes such as "module", "component", or "unit" used to denote elements are used only to facilitate the description of the present invention and have no specific meaning in themselves. Therefore, "module" and "component" may be used interchangeably.
A mobile terminal may be implemented in various forms. For example, the terminal described in the present invention may include mobile terminals such as a mobile phone, a smartphone, a laptop, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable media player), and a navigation device, as well as fixed terminals such as a digital TV and a desktop computer. Hereinafter, the terminal is assumed to be a mobile terminal. However, those skilled in the art will understand that, apart from elements used specifically for mobile purposes, the configuration according to the embodiments of the present invention can also be applied to fixed terminals.
Fig. 1 is a schematic diagram of the hardware structure of an optional mobile terminal for implementing the embodiments of the present invention.
The mobile terminal 100 may include a wireless communication unit 110, an A/V (audio/video) input unit 120, a user input unit 130, a sensing unit 140, an output unit 150, a memory 160, an interface unit 170, a controller 180, a power supply unit 190, and so on. Fig. 1 shows a mobile terminal with various components, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead. The elements of the mobile terminal are described in detail below. The controller 180 can control the A/V (audio/video) input unit 120 to record, generate an audio/video file, and store the audio/video file in the memory 160. The controller 180 performs speech recognition on the audio/video file, converts it into a text file, and stores the text file in the memory 160. The controller 180 calculates the similarity between two adjacent sentences in the text file and judges, according to that similarity, whether a noise sentence exists in the two adjacent sentences; when a noise sentence exists, it determines according to a preset strategy that one of the two adjacent sentences is the noise sentence and then filters the noise sentence out of the audio/video file.
The wireless communication unit 110 generally includes one or more components that allow radio communication between the mobile terminal 100 and a wireless communication system or network.
The A/V input unit 120 is used to receive audio or video signals. The user input unit 130 may generate key input data according to commands input by the user to control various operations of the mobile terminal. The user input unit 130 allows the user to input various types of information, and may include a keyboard, a dome switch, a touch pad (for example, a touch-sensitive component that detects changes in resistance, pressure, capacitance, and the like caused by contact), a scroll wheel, a joystick, and so on. In particular, when the touch pad is superimposed on the display unit 151 as a layer, a touch screen can be formed.
The sensing unit 140 detects the current state of the mobile terminal 100 (for example, whether the mobile terminal 100 is open or closed), the position of the mobile terminal 100, whether the user is in contact with the mobile terminal 100 (that is, touch input), the orientation of the mobile terminal 100, the acceleration or deceleration movement and direction of the mobile terminal 100, and so on, and generates commands or signals for controlling the operation of the mobile terminal 100. For example, when the mobile terminal 100 is implemented as a slide-type mobile phone, the sensing unit 140 can sense whether the slide-type phone is open or closed. In addition, the sensing unit 140 can detect whether the power supply unit 190 is supplying power and whether the interface unit 170 is coupled to an external device.
The interface unit 170 serves as an interface through which at least one external device can connect to the mobile terminal 100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and so on. The identification module may store various information for verifying the user's authority to use the mobile terminal 100 and may include a user identity module (UIM), a subscriber identity module (SIM), a universal subscriber identity module (USIM), and so on. In addition, the device having the identification module (hereinafter referred to as the "identification device") may take the form of a smart card, so the identification device can be connected to the mobile terminal 100 via a port or other connecting means. The interface unit 170 can be used to receive input (for example, data, power, and so on) from an external device and transfer the received input to one or more elements within the mobile terminal 100, or can be used to transfer data between the mobile terminal and an external device.
In addition, when the mobile terminal 100 is connected to an external base, the interface unit 170 may serve as a path through which power is supplied from the base to the mobile terminal 100, or as a path through which various command signals input from the base are transferred to the mobile terminal. The various command signals or power input from the base may serve as signals for recognizing whether the mobile terminal is correctly mounted on the base. The output unit 150 is configured to provide output signals (for example, audio signals, video signals, alarm signals, vibration signals, and so on) in a visual, audible, and/or tactile manner.
Output unit 150 may include display unit 151 etc..
The display unit 151 may display information processed in the mobile terminal 100. For example, when the mobile terminal 100 is in a phone call mode, the display unit 151 may display a user interface (UI) or graphical user interface (GUI) related to the call or to other communication (for example, text messaging, multimedia file downloading, and so on). When the mobile terminal 100 is in a video call mode or an image capture mode, the display unit 151 may display a captured image and/or a received image, or a UI or GUI showing the video or image and related functions.
Meanwhile, when the display unit 151 and the touch pad are superimposed on each other as layers to form a touch screen, the display unit 151 can be used as both an input device and an output device. The display unit 151 may include at least one of a liquid crystal display (LCD), a thin film transistor LCD (TFT-LCD), an organic light-emitting diode (OLED) display, a flexible display, a three-dimensional (3D) display, and so on. Some of these displays may be constructed to be transparent so that the user can see through them from the outside; these may be called transparent displays, a typical example being a TOLED (transparent organic light-emitting diode) display. Depending on the particular desired embodiment, the mobile terminal 100 may include two or more display units (or other display devices); for example, the mobile terminal may include an external display unit (not shown) and an internal display unit (not shown). The touch screen can be used to detect touch input pressure as well as touch input position and touch input area.
The memory 160 may store software programs for the processing and control operations executed by the controller 180, or may temporarily store data that has been output or is to be output (for example, a phone book, messages, still images, video, and so on). Moreover, the memory 160 may store data about the vibrations and audio signals of various patterns that are output when a touch is applied to the touch screen.
The memory 160 may include at least one type of storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, and so on. Moreover, the mobile terminal 100 may cooperate with a network storage device that performs the storage function of the memory 160 over a network connection.
The controller 180 generally controls the overall operation of the mobile terminal. For example, the controller 180 performs control and processing related to voice calls, data communication, video calls, and so on. In addition, the controller 180 may include a multimedia module 181 for reproducing (or playing back) multimedia data; the multimedia module 181 may be built into the controller 180 or may be constructed separately from the controller 180. The controller 180 may perform pattern recognition processing to recognize handwriting input or drawing input performed on the touch screen as characters or images.
The power supply unit 190 receives external power or internal power under the control of the controller 180 and provides the appropriate power required to operate each element and component.
The various embodiments described herein may be implemented in a computer-readable medium using, for example, computer software, hardware, or any combination thereof. For a hardware implementation, the embodiments described herein may be implemented using at least one of an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a processor, a controller, a microcontroller, a microprocessor, and an electronic unit designed to perform the functions described herein; in some cases, such embodiments may be implemented in the controller 180. For a software implementation, embodiments such as processes or functions may be implemented with separate software modules, each of which performs at least one function or operation. The software code may be implemented by a software application (or program) written in any suitable programming language, and may be stored in the memory 160 and executed by the controller 180.
So far, the mobile terminal has been described in terms of its functions. Below, for brevity, a slide-type mobile terminal, among various types of mobile terminals such as folder-type, bar-type, swing-type, and slide-type mobile terminals, will be described as an example. The present invention can nevertheless be applied to any type of mobile terminal and is not limited to slide-type mobile terminals.
The mobile terminal 100 shown in Fig. 1 may be configured to operate with wired and wireless communication systems, as well as satellite-based communication systems, that transmit data via frames or packets.
Based on the above mobile terminal hardware structure, the embodiments of the denoising device of the present invention are proposed.
Referring to Fig. 2, Fig. 2 is a module diagram of the first embodiment of the denoising device of the present invention. The denoising device includes:
A conversion module 10, configured to perform speech recognition on an audio/video file and convert the audio/video file into a text file;
A computing module 20, configured to calculate the similarity between each pair of adjacent sentences in the text file;
A judgment module 30, configured to judge, according to the similarity between two adjacent sentences, whether a noise sentence exists in the two adjacent sentences;
A determining module 40, configured to determine, according to a preset strategy, that one of the two adjacent sentences is a noise sentence when a noise sentence exists in the two adjacent sentences;
A noise reduction module 50, configured to filter the noise sentence out of the audio/video file.
The audio/video file may be an audio file recorded by a recording device. The recording device may be a voice recorder or a mobile terminal with a recording function, such as a smartphone or tablet computer.
During a meeting, a training session, or any other occasion that needs to be recorded, the recording device is started and records; after recording is complete, an audio/video file is generated.
The conversion module 10 can obtain the audio/video file recorded by the recording device in a wired or wireless manner; in one embodiment, for example, it obtains the file over WiFi. Optionally, the conversion module 10 can perform speech recognition on the audio/video file while the recording device is still recording; alternatively, it can perform speech recognition on the audio/video file after the recording device has finished recording.
The conversion module 10 performs speech recognition on the audio/video file to obtain a text file. The text file contains multiple sentences together with the position of each sentence in the audio/video file. Specifically, the conversion module 10 applies speech recognition technology to the audio/video file. For example, the audio/video file is divided into frames according to a predetermined frame interval, speech recognition is invoked to convert the framed audio into text frame by frame, yielding sentences, and each sentence's position in the audio/video file and its corresponding text are then saved as one paragraph of the text file, so that the text file contains all the sentences in the audio/video file and the position of each one. In one embodiment, for example, speech recognition of the audio/video file yields 1000 sentences, so the text file has 1000 paragraphs, each corresponding to one recognized sentence in recognition order. Within the text file, the position of each sentence in the audio/video file can be recorded at the very beginning or the very end of the sentence. In one embodiment, for example, the position is recorded at the very beginning, so that each paragraph of the text file starts with the position of its sentence in the audio/video file, followed by the sentence itself.
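A text file laid out this way could be read back as in the sketch below. The concrete line format `[start-end] sentence` is an assumption chosen for illustration; the text only requires that each paragraph hold one sentence with its position written first or last.

```python
import re
from typing import NamedTuple

class Segment(NamedTuple):
    start: float  # start position in seconds
    end: float    # end position in seconds
    text: str     # recognized sentence

# Hypothetical paragraph format: "[start-end] sentence text"
LINE_RE = re.compile(r"^\[(\d+(?:\.\d+)?)-(\d+(?:\.\d+)?)\]\s*(.*)$")

def parse_transcript(content):
    """Parse one Segment per paragraph; lines that do not match are skipped."""
    segments = []
    for line in content.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            segments.append(
                Segment(float(m.group(1)), float(m.group(2)), m.group(3))
            )
    return segments

sample = (
    "[0-5] Welcome to the quarterly review.\n"
    "[5-8] Let us start with the budget.\n"
)
segments = parse_transcript(sample)
print(segments[1])
```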
The position of each sentence in the audio/video file is its position on the time axis of the audio/video file; for example, the position of one sentence in the audio/video file is: from the 5th second to the 8th second.
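Since each sentence carries a time-axis position, filtering a noise sentence out of the audio/video file can be thought of as dropping its time span. The sketch below only computes which spans to keep; it assumes the spans are ordered and non-overlapping, and it leaves the actual audio cutting to whatever tool is used. The function name and data are illustrative.

```python
def keep_intervals(segments, noise_indices):
    """Given ordered (start, end) spans in seconds and the indices of the
    noise sentences, return the spans to keep, merging spans that touch."""
    noise = set(noise_indices)
    kept = []
    for i, (start, end) in enumerate(segments):
        if i in noise:
            continue  # drop the span of a noise sentence
        if kept and kept[-1][1] == start:
            kept[-1] = (kept[-1][0], end)  # extend the previous span
        else:
            kept.append((start, end))
    return kept

segments = [(0, 5), (5, 8), (8, 12), (12, 15)]
result = keep_intervals(segments, noise_indices=[2])
print(result)  # [(0, 8), (12, 15)]
```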
The conversion module 10 converts the audio/video file into a text file. Optionally, the text file has the same file name as the audio/video file, which makes it easy for the user to tell which audio/video file the text file corresponds to.
The computing module 20 calculates the similarity between two adjacent sentences in the text file. Specifically, each sentence in the text file is converted into a vector model, and the similarity of two adjacent sentences is calculated from their vector models. The vector models of the two adjacent sentences have the same dimension. For example, the vector model of one sentence is expressed as a = (x11, x21, x31, ..., xn1) and the vector model of the other sentence as b = (x12, x22, x32, ..., xn2), where xn1 denotes the n-th component of vector a, xn2 denotes the n-th component of vector b, and vectors a and b both have n dimensions. When the vector models of the two adjacent sentences differ in dimension, the vector model with fewer dimensions is padded so that both have the same dimension; the values of the added dimensions are set to 0. For example, in one embodiment, the vector model of one sentence is a = (x11, x21, x31, ..., xn1) and that of the other is b = (x12, x22, x32, ..., xj2), where j < n; vector model b is then modified to b' = (x12, x22, x32, ..., xj2, 0, 0, ..., 0), so that the modified vector model b' has the same number of dimensions as vector model a.
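The zero-padding step can be sketched as follows; the function name and the example vectors are illustrative only.

```python
def pad_to_same_dimension(a, b):
    """Pad the shorter vector with trailing zeros so both vectors have
    the same number of dimensions; the inputs are not modified."""
    n = max(len(a), len(b))
    return (list(a) + [0] * (n - len(a)), list(b) + [0] * (n - len(b)))

# b has j = 2 components and a has n = 4, so b gains two zeros.
a, b = pad_to_same_dimension([3, 7, 2, 9], [3, 5])
print(a)  # [3, 7, 2, 9]
print(b)  # [3, 5, 0, 0]
```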
The judgment module 30 is configured to judge, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences. The greater the similarity between two adjacent sentences, the more likely it is that both are non-noise sentences, i.e. that no noise sentence exists in the two adjacent sentences; conversely, the lower the similarity between two adjacent sentences, the more likely it is that a noise sentence exists in them. Typically, in a meeting scenario, the similarity between sentences is high while the meeting is in progress, whereas during a break people chat about unrelated matters and the similarity between sentences is low.
The two adjacent sentences may be defined as a first sentence and a second sentence respectively, where the first sentence is the earlier of the two.
When a noise sentence exists in the two adjacent sentences, the determining module 40 determines, according to a preset strategy, that one of the two adjacent sentences is the noise sentence.
Optionally, the preset strategy is: determine that the first sentence of the two adjacent sentences is the noise sentence.
Optionally, the preset strategy is: determine that the second sentence of the two adjacent sentences is the noise sentence.
Optionally, the preset strategy is: calculate the similarity between the first sentence of the two adjacent sentences and the sentence preceding the first sentence, calculate the similarity between the second sentence of the two adjacent sentences and the sentence following the second sentence, and determine from these two similarities which of the two adjacent sentences is the noise sentence. Specifically, when the similarity between the first sentence and its preceding sentence is greater than the similarity between the second sentence and its following sentence, the second sentence of the two adjacent sentences is determined to be the noise sentence; conversely, when the similarity between the first sentence and its preceding sentence is less than or equal to the similarity between the second sentence and its following sentence, the first sentence of the two adjacent sentences is determined to be the noise sentence. These similarities are calculated in the same way as the computing module 20 calculates the similarity of two adjacent sentences, which is not repeated here.
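The third preset strategy can be sketched as below. This is a hedged illustration: the function names are hypothetical, the word-overlap toy_similarity is only a stand-in for the similarity calculation of the computing module 20, and the sketch assumes that the pair has a sentence before it and a sentence after it, a boundary case the description does not address.

```python
def pick_noise_by_neighbors(sentences, i, similarity):
    """sentences[i] is the first sentence of the pair, sentences[i + 1] the second.

    Returns the index of the sentence treated as the noise sentence.
    """
    if i < 1 or i + 2 >= len(sentences):
        raise ValueError("the pair needs one sentence before it and one after it")
    sim_first_prev = similarity(sentences[i], sentences[i - 1])
    sim_second_next = similarity(sentences[i + 1], sentences[i + 2])
    # Greater similarity on the first sentence's side: the second sentence is noise.
    # Less than or equal: the first sentence is noise.
    return i + 1 if sim_first_prev > sim_second_next else i


def toy_similarity(s, t):
    """Stand-in similarity (word overlap); not the formula used by the embodiment."""
    a, b = set(s.split()), set(t.split())
    return len(a & b) / len(a | b) if a | b else 0.0


sentences = [
    "project progress report",
    "project progress delay",
    "lunch was great",
    "the budget needs review",
]
noise_index = pick_noise_by_neighbors(sentences, 1, toy_similarity)
print(sentences[noise_index])
```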
The noise reduction module 50 filters the noise sentence out of the audio-video document to reduce the noise in the audio-video document. Optionally, the noise reduction module 50 looks up in the text file the position of the noise sentence in the audio-video document and, according to that position, filters the noise sentence out of the audio-video document, thereby denoising the audio-video document. Optionally, when filtering the noise sentence out of the audio-video document, the noise reduction module 50 may also fill the corresponding position of the noise sentence in the audio-video document with preset music; for example, the preset music may be light music.
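A minimal sketch of position-based filtering follows. It assumes, purely for illustration, that the audio is available as a list of samples with a known sample rate, that noise positions are non-overlapping (start, end) spans in seconds, and that the preset music is a short list of samples that is looped; none of these representations is specified by the embodiment.

```python
def filter_noise_segments(samples, sample_rate, noise_spans, fill=None):
    """Remove the samples inside each (start_s, end_s) span, or overwrite them with fill."""
    out = list(samples)
    # Process spans from the end so earlier indices stay valid after deletion.
    for start_s, end_s in sorted(noise_spans, reverse=True):
        lo = int(start_s * sample_rate)
        hi = min(int(end_s * sample_rate), len(out))
        if fill is None:
            del out[lo:hi]
        else:
            # Loop the preset music samples over the span of the noise sentence.
            out[lo:hi] = [fill[k % len(fill)] for k in range(hi - lo)]
    return out


# Ten one-second "samples"; the noise sentence occupies the 5th to the 8th second.
audio = list(range(10))
print(filter_noise_segments(audio, 1, [(5, 8)]))
print(filter_noise_segments(audio, 1, [(5, 8)], fill=[-1]))
```

Deleting the span shortens the recording, whereas filling keeps its original length, which matches the two options described above.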
Optionally, the noise reduction module 50 may filter the noise sentence out of the audio-video document while the recording device is still recording; the noise reduction module 50 may also filter the noise sentence out of the audio-video document after the recording device has finished recording.
With the above embodiment, speech recognition is performed on the audio-video document to convert it into a text file; the similarity between each two adjacent sentences in the text file is calculated, and whether a noise sentence exists in the two adjacent sentences is judged according to that similarity; when a noise sentence exists in the two adjacent sentences, one of them is determined to be the noise sentence according to the preset strategy, and the noise sentence is filtered out of the audio-video document. Because the audio-video document is first converted into a text file, the noise sentence is determined from the similarity of the sentences in the text file, and the noise sentence is then filtered out of the audio-video document, noise sentences in the audio-video document can be identified more objectively and without influence from the surrounding environment, which greatly improves the accuracy of noise removal.
Referring to Fig. 3, Fig. 3 is a module diagram of the second embodiment of the denoising device of the present invention.
Based on the first embodiment of the denoising device above, the second embodiment differs from the first embodiment in that the denoising device further includes: a word segmentation module 60, configured to segment each sentence in the text file to obtain the words of each sentence; and the computing module 20 includes:
an acquiring unit 21, configured to obtain, according to a number dictionary, the numbers corresponding to the words of each of the two adjacent sentences;
an establishing unit 22, configured to establish a vector model for each of the two adjacent sentences according to the numbers corresponding to the words of the two adjacent sentences;
a first computing unit 23, configured to calculate the Euclidean distance between the two adjacent sentences according to the vector models of the two adjacent sentences;
a second computing unit 24, configured to obtain the similarity between the two adjacent sentences according to the Euclidean distance between the two adjacent sentences.
The word segmentation module 60 may segment each sentence in the text file according to a preset segmentation dictionary to obtain the words of each sentence. For example, after the sentence "the topic discussed today is the issue of project progress" is segmented, the words obtained are, in order: today, discussion, [particle], theme, is, about, project, progress, [particle], problem, 10 words in total. The words obtained by segmenting one sentence may repeat; for example, the particle appears twice in the segmentation result of the above sentence.
Optionally, the word segmentation module 60 splits each sentence in the text file and obtains all segmentation modes of each sentence (for example, one sentence has 2 segmentation modes and another sentence has 5 segmentation modes), calculates the sentence weight of every segmentation mode of each sentence, compares the sentence weights of the segmentation modes, selects one segmentation mode from all segmentation modes of each sentence according to a preset selection strategy, and segments the corresponding sentence according to the selected segmentation mode to obtain the segmentation result. For example, in one embodiment, a sentence has 5 segmentation modes; the sentence weight obtained when the sentence is segmented with each of the 5 segmentation modes is calculated, the segmentation mode with the largest sentence weight is selected, and the sentence is segmented according to the selected segmentation mode. Different sentences may use different segmentation modes.
The number dictionary records the correspondence between words and numbers: each word corresponds to one number, and the same number can correspond to only one word, i.e. a given number denotes exactly one word.
The acquiring unit 21 obtains, according to the number dictionary, the numbers corresponding to the words of the two adjacent sentences; the establishing unit 22 establishes a vector model for each of the two adjacent sentences according to the numbers corresponding to the words of the two adjacent sentences. Typically, if a sentence contains N words after segmentation, the vector model of the sentence is N-dimensional; for example, if a sentence contains 5 words (some of which may be identical), the vector model of the sentence is five-dimensional. If a sentence is "have you had a meal" and the words of the sentence are "you", "have a meal" and two particles, the vector model of the sentence is four-dimensional; if, according to the number dictionary, the number of the word "you" is 110, the number of the word "have a meal" is 98, the number of the first particle is 150 and the number of the second particle is 90, then the vector model of the sentence is c = (110, 98, 150, 90).
Optionally, the number dictionary may be preset and shared by all audio-video documents; the number dictionary records the number corresponding to each word.
Optionally, the number dictionary is generated from the audio-video document. Specifically, the words of all sentences in the audio-video document are collected, and each word is then numbered according to numbers input by the user to generate the number dictionary. For example, in one embodiment, the sentences in the audio-video document contain 10,000 words with no repetition among them; the user numbers these 10,000 words as required, and every word receives a different number.
The value of each component in the vector model of a sentence is the number of the word corresponding to that component. For example, if the vector model of a sentence is c = (110, 98, 150, 90), the value of the first component of the sentence is 110, and the word of the first component is "you".
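Building a vector model from a number dictionary can be sketched as follows. The numbers 110, 98, 150 and 90 are taken from the example above; the function name build_vector and the keys "particle_1" and "particle_2" are placeholders assumed for this illustration.

```python
def build_vector(words, number_dict):
    """Map each word of a segmented sentence to its number; one component per word."""
    # A word missing from the dictionary raises KeyError in this sketch.
    return [number_dict[w] for w in words]


# Placeholder dictionary; the two particle keys stand in for the particles of the example.
number_dict = {"you": 110, "have a meal": 98, "particle_1": 150, "particle_2": 90}
words = ["you", "have a meal", "particle_1", "particle_2"]
c = build_vector(words, number_dict)
print(c)
```

Because every word contributes one component, a sentence of N words yields an N-dimensional vector model, and repeated words produce repeated component values.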
The first computing unit 23 calculates the Euclidean distance between the two adjacent sentences, specifically by the following formula:
D = sqrt((x11 - x12)^2 + (x21 - x22)^2 + ... + (xn1 - xn2)^2), where D denotes the Euclidean distance between the two adjacent sentences, n is the dimension of the vector models of the two sentences, xi1 denotes the i-th component of the vector model of one of the two adjacent sentences, and xi2 denotes the i-th component of the vector model of the other of the two adjacent sentences.
The second computing unit 24 calculates the similarity between the two adjacent sentences; specifically, the similarity between the two adjacent sentences is calculated by the following formula:
Sim = 1/(1 + D), where Sim denotes the similarity of the two adjacent sentences and D denotes the Euclidean distance between the two adjacent sentences.
It can be seen from the above similarity formula that the smaller the Euclidean distance between the two adjacent sentences, the greater the similarity between them; conversely, the greater the Euclidean distance between the two adjacent sentences, the smaller the similarity between them.
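The distance and similarity calculations can be sketched together as below. This is an illustrative sketch rather than the embodiment's implementation: the function names are assumptions, and zip_longest with a fill value of 0 is used to apply the zero padding of the first embodiment when the two vector models differ in dimension.

```python
import math
from itertools import zip_longest


def euclidean_distance(a, b):
    """Euclidean distance D between two vector models; the shorter one is zero-padded."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip_longest(a, b, fillvalue=0)))


def similarity(a, b):
    """Sim = 1 / (1 + D): identical vectors give 1, distant vectors approach 0."""
    return 1.0 / (1.0 + euclidean_distance(a, b))


print(euclidean_distance([0, 0], [3, 4]))
print(similarity([110, 98, 150, 90], [110, 98, 150, 90]))
```

Since D is never negative, Sim always lies in the interval (0, 1], and it decreases as D grows.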
The word segmentation module 60 segments each sentence in the text file; the acquiring unit 21 obtains, according to the number dictionary, the numbers corresponding to the words of the two adjacent sentences; the establishing unit 22 establishes a vector model for each of the two adjacent sentences; the first computing unit 23 calculates the Euclidean distance between the two adjacent sentences according to their vector models; and the second computing unit 24 obtains the similarity between the two adjacent sentences according to the Euclidean distance between them. In this way the similarity between two adjacent sentences in the text file can be calculated more accurately, so that whether a noise sentence exists in the two adjacent sentences can be determined accurately, which improves the accuracy of noise removal.
Referring to Fig. 4, Fig. 4 is a module diagram of the third embodiment of the denoising device of the present invention.
Based on the first embodiment of the denoising device above, the third embodiment differs from the first embodiment in that the judgment module 30 includes:
a judging unit 31, configured to judge whether the similarity between the two adjacent sentences is less than a preset similarity threshold;
a first determination unit 32, configured to determine that a noise sentence exists in the two adjacent sentences when the similarity between the two adjacent sentences is less than the preset similarity threshold.
The similarity threshold may be preset as needed. The judging unit 31 judges whether the similarity between the two adjacent sentences is less than the preset similarity threshold, so as to determine whether a noise sentence exists in the two adjacent sentences.
In this embodiment, when the judgment module 30 judges, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences, the judging unit 31 in the judgment module 30 compares the similarity between the two adjacent sentences with the preset similarity threshold, and the first determination unit 32 determines, according to the result of the judging unit 31, whether a noise sentence exists in the two adjacent sentences. Whether a noise sentence exists in the audio-video document can thus be identified more objectively, which improves the accuracy of noise removal.
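The threshold comparison can be sketched as follows. The function name and the threshold value 0.05 are assumptions chosen only for illustration; the embodiment leaves the threshold to be preset as needed.

```python
def pairs_with_noise(similarities, threshold):
    """similarities[i] is the similarity between sentence i and sentence i + 1.

    Returns the indices i of the adjacent pairs in which a noise sentence exists.
    """
    # A pair is flagged only when its similarity is strictly less than the threshold.
    return [i for i, sim in enumerate(similarities) if sim < threshold]


# Illustrative similarities for four adjacent pairs and a hypothetical threshold.
similarities = [0.50, 0.01, 0.20, 0.05]
print(pairs_with_noise(similarities, 0.05))
```

A similarity equal to the threshold is not flagged, because the condition in the embodiment is "less than".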
Referring to Fig. 5, Fig. 5 is a module diagram of the fourth embodiment of the denoising device of the present invention.
Based on the first embodiment of the denoising device above, the fourth embodiment differs from the first embodiment in that the determining module 40 includes:
a third computing unit 41, configured to, when a noise sentence exists in the two adjacent sentences, calculate the similarity between the first sentence of the two adjacent sentences and each of a preset number of sentences at the beginning of the text file, and calculate the similarity between the second sentence of the two adjacent sentences and each of the preset number of sentences at the beginning of the text file;
a second determination unit 42, configured to determine that the first sentence or the second sentence of the two adjacent sentences is the noise sentence, according to the similarities between the first sentence of the two adjacent sentences and the preset number of sentences at the beginning of the text file and the similarities between the second sentence of the two adjacent sentences and the preset number of sentences at the beginning of the text file.
The preset number may be set as needed; typically, the preset number is 20.
The third computing unit 41 calculates the similarity between the first sentence of the two adjacent sentences and each of the preset number of sentences at the beginning of the text file, obtaining multiple similarities; for example, when the preset number is 20, the similarity between the first sentence of the two adjacent sentences and each of the first 20 sentences of the text file is calculated in turn, giving 20 similarities.
The third computing unit 41 calculates the similarity between the second sentence of the two adjacent sentences and each of the preset number of sentences at the beginning of the text file, obtaining multiple similarities; for example, when the preset number is 20, the similarity between the second sentence of the two adjacent sentences and each of the first 20 sentences of the text file is calculated in turn, giving 20 similarities.
These similarities are calculated in the same way as the computing module 20 calculates the similarity of two adjacent sentences, which is not repeated here.
The second determination unit 42 sums the similarities between the first sentence of the two adjacent sentences and the preset number of sentences at the beginning of the text file to obtain a first similarity total, and sums the similarities between the second sentence of the two adjacent sentences and the preset number of sentences at the beginning of the text file to obtain a second similarity total; it then determines from the first similarity total and the second similarity total whether the first sentence or the second sentence of the two adjacent sentences is the noise sentence. Specifically, when the first similarity total is greater than the second similarity total, the second sentence of the two adjacent sentences is determined to be the noise sentence; when the first similarity total is less than or equal to the second similarity total, the first sentence of the two adjacent sentences is determined to be the noise sentence.
In this embodiment, when a noise sentence exists in the two adjacent sentences, the determining module 40 determines that the first sentence or the second sentence of the two adjacent sentences is the noise sentence according to the similarities between the first sentence and the preset number of sentences at the beginning of the text file and the similarities between the second sentence and the preset number of sentences at the beginning of the text file. The noise sentence in the two adjacent sentences can thus be identified more objectively, which improves the accuracy of noise removal.
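The comparison of similarity totals can be sketched as below. The function names are hypothetical, and for brevity each "sentence" is a single number compared with a toy similarity; in the embodiment the sentences are vector models and the similarity is the one calculated by the computing module 20.

```python
def pick_noise_by_reference(sentences, i, similarity, preset_count=20):
    """sentences[i] is the first sentence of the pair, sentences[i + 1] the second.

    Each is compared with the first preset_count sentences of the text file;
    the sentence with the smaller similarity total is treated as noise.
    """
    reference = sentences[:preset_count]
    first_total = sum(similarity(sentences[i], r) for r in reference)
    second_total = sum(similarity(sentences[i + 1], r) for r in reference)
    # Greater first total: the second sentence is noise; otherwise the first is.
    return i + 1 if first_total > second_total else i


def toy_similarity(x, y):
    """Stand-in similarity on single numbers; not the embodiment's vector formula."""
    return 1.0 / (1.0 + abs(x - y))


sentences = [1, 1, 2, 1, 9, 1]
noise_index = pick_noise_by_reference(sentences, 3, toy_similarity, preset_count=3)
print(noise_index)
```

If the pair itself lies within the first preset_count sentences, a sentence is also compared with itself in this sketch; the description does not say how that case is handled.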
Referring to Fig. 6, Fig. 6 is a module diagram of the fifth embodiment of the denoising device of the present invention.
Based on the first embodiment of the denoising device above, the fifth embodiment differs from the first embodiment in that the determining module 40 includes:
a prompt unit 43, configured to issue prompt information to the user when a noise sentence exists in the two adjacent sentences, so that the user can select, according to the prompt information, one of the two adjacent sentences as the noise sentence;
a third determination unit 44, configured to receive the selection instruction input by the user according to the prompt information, and to determine, according to the selection instruction, that one of the two adjacent sentences is the noise sentence.
The prompt unit 43 issues prompt information to the user. The prompt information includes two options: one option selects the first sentence of the two adjacent sentences, and the other option selects the second sentence of the two adjacent sentences. The prompt information shows the content of the two adjacent sentences; as shown in Fig. 7, for example, the first sentence is "have you had a meal" and the second sentence is "the topic discussed today is the issue of project progress".
The user selects, according to the prompt information, one of the two adjacent sentences as the noise sentence; for example, if the user feels that the first sentence of the two adjacent sentences may be the noise sentence, the user selects the first sentence.
The third determination unit 44 receives the selection instruction input by the user according to the prompt information. If the selection instruction selects the first sentence of the two adjacent sentences, the first sentence is determined to be the noise sentence; if the selection instruction selects the second sentence of the two adjacent sentences, the second sentence is determined to be the noise sentence.
In this embodiment, when a noise sentence exists in the two adjacent sentences, the prompt unit 43 issues prompt information to the user, and the third determination unit 44 determines, according to the selection instruction input by the user based on the prompt information, that one of the two adjacent sentences is the noise sentence. The noise sentence in the two adjacent sentences is thus determined more flexibly, which improves the accuracy of noise removal and provides a better user experience.
The present invention further provides a noise reduction method.
Referring to Fig. 8, Fig. 8 is a flow diagram of the first embodiment of the noise reduction method of the present invention. The noise reduction method includes:
S10, performing speech recognition on an audio-video document to convert the audio-video document into a text file.
The audio-video document may be an audio file recorded by a recording device; the recording device may be a voice recorder or a mobile terminal with a recording function, such as a smartphone or a tablet computer.
During a meeting, a training session or any other occasion that needs to be recorded, the recording device is started to record; after the recording is completed, the audio-video document is generated.
The audio-video document recorded by the recording device may be obtained in a wired or wireless manner; for example, in one embodiment, the audio-video document recorded by the recording device is obtained via WiFi. Optionally, speech recognition may be performed on the audio-video document while the recording device is still recording; alternatively, speech recognition may be performed on the audio-video document after the recording device has finished recording.
In this step, speech recognition is performed on the audio-video document to obtain a text file; the text file includes multiple sentences and the position of each sentence in the audio-video document. Specifically, speech recognition technology is used to recognize the audio-video document, for example: the audio-video document is divided into frames at a predetermined frame interval, the framed audio-video document is converted into text frame by frame to obtain sentences, and the position of each sentence in the audio-video document and its corresponding text are then saved as one paragraph of the text file, so that the text file includes all sentences in the audio-video document and the position of each sentence in the audio-video document. For example, in one embodiment, 1000 sentences are obtained after speech recognition is performed on the audio-video document; the text file then has 1000 paragraphs, in recognition order, each corresponding to one recognized sentence. In the text file, the position of each sentence in the audio-video document may be recorded before or after the sentence; for example, in one embodiment, the position is recorded before the sentence, i.e. every paragraph of the text file begins with the position of its sentence in the audio-video document, followed by the sentence itself.
The position of each sentence in the audio-video document is the position of that sentence on the time axis of the audio-video document; for example, the position of one sentence in the audio-video document is: from the 5th second to the 8th second.
In this step, the audio-video document is converted into a text file. Optionally, the filename of the text file is the same as the filename of the audio-video document, which makes it easy for the user to tell which audio-video document the text file corresponds to.
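Reading such a text file back can be sketched as follows. The concrete layout is an assumption made only for this illustration: each paragraph is one line of the form "start-end", a tab character, then the sentence, with positions given in seconds. The description only states that the position is recorded before or after the sentence.

```python
def parse_transcript(text):
    """Parse lines of the assumed form 'start-end<TAB>sentence' into (start, end, sentence)."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines between paragraphs
        position, sentence = line.split("\t", 1)
        start_s, end_s = (float(p) for p in position.split("-"))
        entries.append((start_s, end_s, sentence))
    return entries


# Hypothetical text file content with two recognized sentences.
transcript = "5-8\tGood morning everyone\n8-11.5\tLet us review the schedule\n"
for start_s, end_s, sentence in parse_transcript(transcript):
    print(start_s, end_s, sentence)
```

Keeping the position together with the sentence is what later allows a noise sentence found in the text file to be located on the time axis of the audio-video document.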
S20, calculating the similarity between each two adjacent sentences in the text file.
The similarity between each two adjacent sentences in the text file is calculated. Specifically, each sentence in the text file is converted into a vector model, and the similarity of two adjacent sentences is calculated from their vector models. The vector models of the two adjacent sentences have the same dimension; for example, the vector model of one sentence is expressed as a = (x11, x21, x31, ..., xn1) and the vector model of the other sentence as b = (x12, x22, x32, ..., xn2), where xn1 denotes the n-th component of vector a, xn2 denotes the n-th component of vector b, and vector a and vector b both have n dimensions. When the vector models of the two adjacent sentences differ in dimension, the vector model of the sentence with fewer dimensions is padded so that the two vector models have the same dimension; specifically, the value of each padded dimension is 0. For example, in one embodiment, the vector model of one sentence is a = (x11, x21, x31, ..., xn1) and the vector model of the other sentence is b = (x12, x22, x32, ..., xj2), where j < n; vector model b is then modified to b' = (x12, x22, x32, ..., xj2, 0, 0, ..., 0), so that the modified vector model b' and vector model a have the same number of dimensions.
S30, judging, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences.
The greater the similarity between two adjacent sentences, the more likely it is that both are non-noise sentences, i.e. that no noise sentence exists in the two adjacent sentences; conversely, the lower the similarity between two adjacent sentences, the more likely it is that a noise sentence exists in them. Typically, in a meeting scenario, the similarity between sentences is high while the meeting is in progress, whereas during a break people chat about unrelated matters and the similarity between sentences is low.
The two adjacent sentences may be defined as a first sentence and a second sentence respectively, where the first sentence is the earlier of the two.
S40, when a noise sentence exists in the two adjacent sentences, determining, according to a preset strategy, that one of the two adjacent sentences is the noise sentence.
In this step, when a noise sentence exists in the two adjacent sentences, one of the two adjacent sentences is determined to be the noise sentence according to the preset strategy.
Optionally, the preset strategy is: determine that the first sentence of the two adjacent sentences is the noise sentence.
Optionally, the preset strategy is: determine that the second sentence of the two adjacent sentences is the noise sentence.
Optionally, the preset strategy is: calculate the similarity between the first sentence of the two adjacent sentences and the sentence preceding the first sentence, calculate the similarity between the second sentence of the two adjacent sentences and the sentence following the second sentence, and determine from these two similarities which of the two adjacent sentences is the noise sentence. Specifically, when the similarity between the first sentence and its preceding sentence is greater than the similarity between the second sentence and its following sentence, the second sentence of the two adjacent sentences is determined to be the noise sentence; conversely, when the similarity between the first sentence and its preceding sentence is less than or equal to the similarity between the second sentence and its following sentence, the first sentence of the two adjacent sentences is determined to be the noise sentence. These similarities are calculated in the same way as the similarity of two adjacent sentences is calculated in step S20, which is not repeated here.
S50, filtering the noise sentence out of the audio-video document.
In this step, the noise sentence is filtered out of the audio-video document to reduce the noise in the audio-video document. Optionally, the position of the noise sentence in the audio-video document is looked up in the text file, and the noise sentence is filtered out of the audio-video document according to that position, thereby denoising the audio-video document. Optionally, when the noise sentence is filtered out of the audio-video document, the corresponding position of the noise sentence in the audio-video document may also be filled with preset music; for example, the preset music is light music.
Optionally, the noise sentence may be filtered out of the audio-video document while the recording device is still recording; the noise sentence may also be filtered out of the audio-video document after the recording device has finished recording.
With the above embodiment, speech recognition is performed on the audio-video document recorded by the recording device to convert the audio-video document into a text file; the similarity between each two adjacent sentences in the text file is calculated, and whether a noise sentence exists in the two adjacent sentences is judged according to that similarity; when a noise sentence exists in the two adjacent sentences, one of them is determined to be the noise sentence according to the preset strategy, and the noise sentence is filtered out of the audio-video document. Because the audio-video document is first converted into a text file, the noise sentence is determined from the similarity of the sentences in the text file, and the noise sentence is then filtered out of the audio-video document, noise sentences in the audio-video document can be identified more objectively and without influence from the surrounding environment, which greatly improves the accuracy of noise removal.
It is the flow diagram of the second embodiment of noise-reduction method of the present invention referring to Fig. 9, Fig. 9.
Difference based on the first embodiment of above-mentioned noise-reduction method, the second embodiment and first embodiment is, in step Before rapid S20, which further includes S60, segments to each sentence in this article this document, respectively obtains each language The word of sentence;
Step S20 includes: the corresponding number of word that S21 obtains two neighboring sentence according to number dictionary respectively; S22, the corresponding number of word according to two neighboring sentence, respectively two neighboring sentence establish vector model;S23, basis The vector model of two neighboring sentence calculates the Euclidean distance between two neighboring sentence;S24, according to two neighboring language Euclidean distance between sentence, obtains the similarity between two neighboring sentence.
In step S60, each sentence in this article this document can be segmented according to preset dictionary for word segmentation, be obtained To the word of each sentence, such as to sentence " the problem of today, main topic of discussion was about project process ", after being segmented, obtain Word successively are as follows: today, discussion, theme, be, about, project, progress, problem, totally 10 words;To a sentence Segmenting obtained word can be identical, as occur in the word segmentation result of above-mentioned sentence twice " ".
Optionally, in step S60, each sentence in cutting this article this document simultaneously obtains all points of each sentence Word mode (a such as sentence has 2 kinds of participle modes, and another sentence has 5 kinds of participle modes), calculates all participles of each sentence The sentence weight of mode, the sentence weight of more every kind of participle mode, according to preset selection strategy from the institute of each sentence Have and select a kind of participle mode in participle mode, and corresponding sentence is segmented according to the participle mode of selection, is divided Word result.As in one embodiment, a sentence, which has, segments mode in 5, then calculates separately using this in 5 participle mode to the sentence Sentence weight when being segmented, when case statement maximum weight corresponding participle mode, further according to the participle mode of the selection The sentence is segmented.The participle mode of each sentence can be different.
The number dictionary records the correspondence between words and numbers: each word corresponds to one number, and the same number can correspond to only one word, that is, the same number always denotes the same word.
In step S21, the numbers corresponding to the words of the two adjacent sentences are obtained according to the number dictionary; in step S22, a vector model is established for each of the two adjacent sentences according to those numbers. Typically, if a sentence contains N words after segmentation, its vector model is N-dimensional; for example, a sentence containing 5 words (some of which may be identical) has a five-dimensional vector model. If a sentence is "you have had a meal" and its words after segmentation are "you", "have a meal" and two particles, the vector model of the sentence is four-dimensional. If, according to the number dictionary, the number of the word "you" is 110, the number of the word "have a meal" is 98, the number of the first particle is 150 and the number of the second particle is 90, then the vector model of the sentence is c = (110, 98, 150, 90).
Optionally, the number dictionary can be preset and shared by all audio-video files; it records the number corresponding to each word.
Optionally, the number dictionary is generated from the audio-video file. Specifically, the words of all sentences in the audio-video file are collected, each word is numbered according to numbers input by the user, and the number dictionary is generated. For example, in one embodiment the sentences in the audio-video file contain 10,000 distinct words; the user numbers these 10,000 words as required, and each word receives a different number.
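Generating a number dictionary from the segmented sentences of one file might look like the sketch below. In the text the numbers are supplied by the user; here sequential integers stand in for that input, which is an assumption, as is the function name `build_number_dictionary`. The only property carried over from the text is that every distinct word receives a different number.

```python
from typing import Dict, Iterable, List


def build_number_dictionary(
    segmented_sentences: Iterable[List[str]], start: int = 1
) -> Dict[str, int]:
    """Assign a distinct number to every distinct word in the file.

    Assumption: numbers are handed out sequentially in order of first
    appearance, standing in for the user-supplied numbering in the text.
    """
    number_dict: Dict[str, int] = {}
    next_number = start
    for words in segmented_sentences:
        for word in words:
            if word not in number_dict:  # repeated words keep their first number
                number_dict[word] = next_number
                next_number += 1
    return number_dict


demo_dict = build_number_dictionary([["today", "project"], ["project", "progress"]])
```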
The value of each component in the vector model of a sentence is the number of the word at that position. For example, if the vector model of a sentence is c = (110, 98, 150, 90), the value of its first component is 110 and the word of the first component is "you".
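Steps S21 and S22 amount to a dictionary lookup per word. The sketch below reproduces the example vector c = (110, 98, 150, 90); the function name `sentence_to_vector` and the placeholder keys `particle_1` and `particle_2` are assumptions introduced for illustration.

```python
from typing import Dict, List


def sentence_to_vector(words: List[str], number_dict: Dict[str, int]) -> List[int]:
    """Build the vector model of a segmented sentence (steps S21 and S22).

    One component per word, in sentence order; a repeated word repeats its
    number. A word missing from the dictionary raises KeyError.
    """
    return [number_dict[word] for word in words]


# Hypothetical dictionary entries mirroring the numbers used in the example.
example_dict = {"you": 110, "have a meal": 98, "particle_1": 150, "particle_2": 90}
example_vector = sentence_to_vector(
    ["you", "have a meal", "particle_1", "particle_2"], example_dict
)
```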
In step S23, the Euclidean distance between the two adjacent sentences is calculated by the following formula:

$$D = \sqrt{\sum_{i=1}^{n} \left(x_{i1} - x_{i2}\right)^{2}}$$

where n is the dimension of the two sentences, $x_{i1}$ denotes the i-th component of the vector model of one of the two adjacent sentences, and $x_{i2}$ denotes the i-th component of the vector model of the other sentence.
In step S24, the similarity between the two adjacent sentences is calculated by the following formula:
Sim = 1 / (1 + D), where Sim denotes the similarity of the two adjacent sentences and D denotes the Euclidean distance between them.
It can be seen from this formula that the smaller the Euclidean distance between two adjacent sentences, the greater their similarity; conversely, the greater the Euclidean distance, the smaller the similarity.
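The distance and similarity calculations of steps S23 and S24 can be sketched as below. The text does not say how two sentences with different word counts are brought to a common dimension n; padding the shorter vector with zeros is an assumption made here only so that the sketch runs on any pair of vectors.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance D between two sentence vectors (step S23)."""
    # Assumption: the shorter vector is zero-padded to the common dimension n.
    n = max(len(a), len(b))
    padded_a = list(a) + [0] * (n - len(a))
    padded_b = list(b) + [0] * (n - len(b))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(padded_a, padded_b)))


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Sim = 1 / (1 + D) (step S24); identical vectors give 1.0."""
    return 1.0 / (1.0 + euclidean_distance(a, b))
```

Since D is never negative, Sim always lies in the interval (0, 1], and it decreases as the distance grows.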
With the above embodiment, each sentence in the text file is segmented; the numbers corresponding to the words of two adjacent sentences are obtained according to the number dictionary; a vector model is established for each of the two adjacent sentences according to those numbers; the Euclidean distance between the two adjacent sentences is calculated from their vector models; and the similarity between the two adjacent sentences is obtained from that distance. The similarity between two adjacent sentences in the text file can thus be calculated more accurately, so that whether a noise sentence exists in the two adjacent sentences is determined accurately, which improves the accuracy of noise removal.
Referring to Fig. 10, Fig. 10 is a flow diagram of the third embodiment of the noise-reduction method of the present invention.
Based on the first embodiment of the above noise-reduction method, the third embodiment differs from the first embodiment in that step S30 includes:
S31, judging whether the similarity between the two adjacent sentences is less than a preset similarity threshold.
The similarity threshold can be preset as needed. In this step, whether the similarity between the two adjacent sentences is less than the preset similarity threshold is judged, in order to determine whether a noise sentence exists in the two adjacent sentences.
S32, when the similarity between the two adjacent sentences is less than the preset similarity threshold, determining that a noise sentence exists in the two adjacent sentences.
With the above embodiment, when judging from the similarity between two adjacent sentences whether a noise sentence exists in them, the similarity is compared with a preset similarity threshold and the result of the comparison decides whether a noise sentence exists. Whether a noise sentence exists in the audio-video file can thus be identified more objectively, which improves the accuracy of noise removal.
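The threshold comparison of steps S31 and S32 can be sketched as a scan over the similarities of consecutive sentence pairs. The function name `find_noisy_pairs` and the list layout (entry i holds the similarity between sentence i and sentence i + 1) are assumptions for illustration.

```python
from typing import List, Tuple


def find_noisy_pairs(
    similarities: List[float], threshold: float
) -> List[Tuple[int, int]]:
    """Return index pairs (i, i + 1) of adjacent sentences that contain noise.

    similarities[i] is assumed to be the similarity between sentence i and
    sentence i + 1. A pair is flagged only when its similarity is strictly
    less than the preset threshold (steps S31 and S32).
    """
    return [(i, i + 1) for i, sim in enumerate(similarities) if sim < threshold]


flagged = find_noisy_pairs([0.9, 0.1, 0.5], threshold=0.3)
```

A flagged pair says only that one of its two sentences is noise; deciding which one is the task of step S40.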
Referring to Fig. 11, Fig. 11 is a flow diagram of the fourth embodiment of the noise-reduction method of the present invention.
Based on the first embodiment of the above noise-reduction method, the fourth embodiment differs from the first embodiment in that step S40 includes:
S41, when a noise sentence exists in the two adjacent sentences, calculating the similarity between the first sentence of the two adjacent sentences and each of a preset number of sentences in the text file counted from the first sentence of the text file, and calculating the similarity between the second sentence of the two adjacent sentences and each of the same preset number of sentences.
The preset number can be set as needed; typically, the preset number is 20.
In this step, the similarity between the first sentence of the two adjacent sentences and each of the preset number of sentences counted from the first sentence of the text file is calculated, yielding multiple similarities. For example, when the preset number is 20, the similarity between the first sentence of the two adjacent sentences and each of the first 20 sentences of the text file is calculated in turn, yielding 20 similarities.
Likewise, the similarity between the second sentence of the two adjacent sentences and each of the preset number of sentences counted from the first sentence of the text file is calculated, yielding multiple similarities. For example, when the preset number is 20, the similarity between the second sentence of the two adjacent sentences and each of the first 20 sentences of the text file is calculated in turn, yielding 20 similarities.
These similarities are calculated in the same way as the similarity between two adjacent sentences, which is not repeated here.
S42, determining that the first sentence or the second sentence of the two adjacent sentences is the noise sentence, according to the similarities between the first sentence and the preset number of sentences counted from the first sentence of the text file and the similarities between the second sentence and the same preset number of sentences.
The similarities between the first sentence of the two adjacent sentences and the preset number of sentences counted from the first sentence of the text file are summed to obtain a first similarity total; the similarities between the second sentence and the same preset number of sentences are summed to obtain a second similarity total. Whether the first sentence or the second sentence is the noise sentence is determined from the two totals: when the first similarity total is greater than the second similarity total, the second sentence of the two adjacent sentences is determined to be the noise sentence; when the first similarity total is less than or equal to the second similarity total, the first sentence of the two adjacent sentences is determined to be the noise sentence.
With the above embodiment, when a noise sentence exists in two adjacent sentences, the first sentence or the second sentence is determined to be the noise sentence according to the similarity of each of them to the preset number of sentences counted from the first sentence of the text file. The noise sentence in the two adjacent sentences can thus be identified more objectively, which improves the accuracy of noise removal.
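The comparison of the two similarity totals in steps S41 and S42 can be sketched as below. The similarity function is passed in as `sim` so that the sketch stays self-contained; the function name `pick_noise_sentence`, the return convention (0 for the first sentence, 1 for the second) and the toy similarity in the usage lines are assumptions.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def pick_noise_sentence(
    first: Vector,
    second: Vector,
    sentences: Sequence[Vector],
    sim: Callable[[Vector, Vector], float],
    preset_count: int = 20,
) -> int:
    """Return 0 if the first sentence is the noise sentence, 1 if the second.

    Each sentence of the pair is compared with the first `preset_count`
    sentences of the text file and its similarities are summed (S41). The
    sentence with the smaller total is the noise sentence; on a tie the
    first sentence is chosen, as in step S42.
    """
    reference = sentences[:preset_count]
    first_total = sum(sim(first, s) for s in reference)
    second_total = sum(sim(second, s) for s in reference)
    return 1 if first_total > second_total else 0


# Toy one-dimensional similarity, for demonstration only.
def toy_sim(a: Vector, b: Vector) -> float:
    return 1.0 / (1.0 + abs(a[0] - b[0]))


result = pick_noise_sentence([2], [9], [[1], [1], [2], [9]], toy_sim, preset_count=2)
```

The underlying idea is that the opening sentences of the file serve as a reference for the real subject of the recording, so the sentence that resembles them less is treated as noise.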
Referring to Fig. 12, Fig. 12 is a flow diagram of the fifth embodiment of the noise-reduction method of the present invention.
Based on the first embodiment of the above noise-reduction method, the fifth embodiment differs from the first embodiment in that step S40 includes:
S43, when a noise sentence exists in the two adjacent sentences, issuing prompt information to the user, so that the user selects, according to the prompt information, one sentence of the two adjacent sentences as the noise sentence.
In this step, prompt information containing two options is issued to the user: one option selects the first sentence of the two adjacent sentences and the other selects the second sentence. The prompt information displays the content of the two adjacent sentences, as shown in Fig. 7, where the first sentence is "you have had a meal" and the second sentence is "the topic discussed today is about the problem of project progress".
The user selects one sentence of the two adjacent sentences as the noise sentence according to the prompt information; for example, if the user considers that the first sentence may be the noise sentence, the user selects the first sentence.
S44, receiving a selection instruction input by the user according to the prompt information, and determining, according to the selection instruction, that one sentence of the two adjacent sentences is the noise sentence.
In this step, the selection instruction input by the user according to the prompt information is received. If the selection instruction selects the first sentence of the two adjacent sentences, the first sentence is determined to be the noise sentence; if the selection instruction selects the second sentence, the second sentence is determined to be the noise sentence.
With the above embodiment, when a noise sentence exists in two adjacent sentences, prompt information is issued to the user, and one of the two sentences is determined to be the noise sentence according to the selection instruction the user inputs in response. The noise sentence in the two adjacent sentences is thus determined more flexibly, which improves the accuracy of noise removal and the user experience.
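The prompt-and-select flow of steps S43 and S44 can be sketched with the user interaction passed in as a function, so that the logic can be exercised without a real interface. The function name `choose_noise_by_user`, the prompt wording and the `ask` callback are assumptions; the text only specifies that both sentences are shown and that the user picks one of two options.

```python
from typing import Callable


def choose_noise_by_user(
    first_text: str, second_text: str, ask: Callable[[str], str]
) -> int:
    """Return 0 if the user marks the first sentence as noise, 1 for the second.

    `ask` shows the prompt and returns the user's reply, for example the
    built-in `input`. The prompt is repeated until the reply is "1" or "2".
    """
    prompt = (
        "One of these adjacent sentences may be noise.\n"
        f"1: {first_text}\n"
        f"2: {second_text}\n"
        "Enter 1 or 2: "
    )
    while True:
        answer = ask(prompt).strip()
        if answer in ("1", "2"):
            return int(answer) - 1


# Scripted replies stand in for a real user: an invalid reply, then "1".
replies = iter(["x", "1"])
chosen = choose_noise_by_user(
    "you have had a meal",
    "the topic discussed today is about the problem of project progress",
    lambda prompt: next(replies),
)
```

In an interactive program, passing `input` as `ask` would display the prompt on the console.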
From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, and can of course also be implemented by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk or an optical disc) and includes instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit its scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of protection of the present invention.

Claims (8)

1. A denoising device, characterized in that the denoising device comprises:
A conversion module, configured to perform speech recognition on an audio-video file and convert the audio-video file into a text file;
A computing module, configured to calculate the similarity between each two adjacent sentences in the text file;
A judgment module, configured to judge, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences;
A determining module, configured to determine, according to a preset strategy, that one sentence of the two adjacent sentences is the noise sentence when a noise sentence exists in the two adjacent sentences;
A noise reduction module, configured to filter the noise sentence out of the audio-video file;
Wherein the computing module is further configured to convert each sentence in the text file into a vector model, and to calculate the similarity of the two adjacent sentences according to the vector models of the two adjacent sentences;
The determining module comprises:
A third computing unit, configured to, when a noise sentence exists in the two adjacent sentences, calculate the similarity between the first sentence of the two adjacent sentences and each of a preset number of sentences in the text file counted from the first sentence of the text file, and calculate the similarity between the second sentence of the two adjacent sentences and each of the preset number of sentences in the text file counted from the first sentence of the text file;
A second determination unit, configured to determine that the first sentence or the second sentence of the two adjacent sentences is the noise sentence, according to the similarities between the first sentence of the two adjacent sentences and the preset number of sentences in the text file counted from the first sentence of the text file, and the similarities between the second sentence of the two adjacent sentences and the preset number of sentences in the text file counted from the first sentence of the text file.
2. The denoising device according to claim 1, characterized in that the denoising device further comprises: a word segmentation module, configured to segment each sentence in the text file to obtain the words of each sentence;
The computing module comprises:
An acquiring unit, configured to obtain the numbers corresponding to the words of the two adjacent sentences according to a number dictionary;
An establishing unit, configured to establish a vector model for each of the two adjacent sentences according to the numbers corresponding to the words of the two adjacent sentences;
A first computing unit, configured to calculate the Euclidean distance between the two adjacent sentences according to the vector models of the two adjacent sentences;
A second computing unit, configured to obtain the similarity between the two adjacent sentences according to the Euclidean distance between the two adjacent sentences.
3. The denoising device according to claim 2, characterized in that the similarity between the two adjacent sentences is calculated by the following formula:
Sim = 1 / (1 + D), where Sim denotes the similarity of the two adjacent sentences and D denotes the Euclidean distance between the two adjacent sentences.
4. The denoising device according to claim 1, characterized in that the judgment module comprises:
A judging unit, configured to judge whether the similarity between the two adjacent sentences is less than a preset similarity threshold;
A first determination unit, configured to determine that a noise sentence exists in the two adjacent sentences when the similarity between the two adjacent sentences is less than the preset similarity threshold.
5. A noise-reduction method, characterized in that the noise-reduction method comprises:
Performing speech recognition on an audio-video file and converting the audio-video file into a text file;
Calculating the similarity between each two adjacent sentences in the text file, and judging, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences;
When a noise sentence exists in the two adjacent sentences, determining, according to a preset strategy, that one sentence of the two adjacent sentences is the noise sentence, and filtering the noise sentence out of the audio-video file;
Wherein the step of calculating the similarity between each two adjacent sentences in the text file comprises:
Converting each sentence in the text file into a vector model, and calculating the similarity of the two adjacent sentences according to the vector models of the two adjacent sentences;
Wherein the step of determining, according to a preset strategy, that one sentence of the two adjacent sentences is the noise sentence when a noise sentence exists in the two adjacent sentences comprises:
When a noise sentence exists in the two adjacent sentences, calculating the similarity between the first sentence of the two adjacent sentences and each of a preset number of sentences in the text file counted from the first sentence of the text file, and calculating the similarity between the second sentence of the two adjacent sentences and each of the preset number of sentences in the text file counted from the first sentence of the text file;
Determining that the first sentence or the second sentence of the two adjacent sentences is the noise sentence, according to the similarities between the first sentence of the two adjacent sentences and the preset number of sentences in the text file counted from the first sentence of the text file, and the similarities between the second sentence of the two adjacent sentences and the preset number of sentences in the text file counted from the first sentence of the text file.
6. The noise-reduction method according to claim 5, characterized in that, before the step of calculating the similarity between each two adjacent sentences in the text file and judging, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences, the noise-reduction method comprises: segmenting each sentence in the text file to obtain the words of each sentence;
The step of calculating the similarity between each two adjacent sentences in the text file comprises:
Obtaining the numbers corresponding to the words of the two adjacent sentences according to a number dictionary;
Establishing a vector model for each of the two adjacent sentences according to the numbers corresponding to the words of the two adjacent sentences;
Calculating the Euclidean distance between the two adjacent sentences according to the vector models of the two adjacent sentences;
Obtaining the similarity between the two adjacent sentences according to the Euclidean distance between the two adjacent sentences.
7. The noise-reduction method according to claim 6, characterized in that the similarity between the two adjacent sentences is calculated by the following formula:
Sim = 1 / (1 + D), where Sim denotes the similarity of the two adjacent sentences and D denotes the Euclidean distance between the two adjacent sentences.
8. The noise-reduction method according to claim 5, characterized in that the step of judging, according to the similarity between the two adjacent sentences, whether a noise sentence exists in the two adjacent sentences comprises:
Judging whether the similarity between the two adjacent sentences is less than a preset similarity threshold;
When the similarity between the two adjacent sentences is less than the preset similarity threshold, determining that a noise sentence exists in the two adjacent sentences.
CN201610370200.5A 2016-05-27 2016-05-27 Denoising device and method Active CN106067302B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610370200.5A CN106067302B (en) 2016-05-27 2016-05-27 Denoising device and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610370200.5A CN106067302B (en) 2016-05-27 2016-05-27 Denoising device and method

Publications (2)

Publication Number Publication Date
CN106067302A CN106067302A (en) 2016-11-02
CN106067302B true CN106067302B (en) 2019-06-25

Family

ID=57420247

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610370200.5A Active CN106067302B (en) 2016-05-27 2016-05-27 Denoising device and method

Country Status (1)

Country Link
CN (1) CN106067302B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110909021A (en) * 2018-09-12 2020-03-24 北京奇虎科技有限公司 Construction method and device of query rewriting model and application thereof

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1748249A (en) * 2003-02-12 2006-03-15 松下电器产业株式会社 Intermediary for speech processing in network environments
JP2012128188A (en) * 2010-12-15 2012-07-05 Nippon Hoso Kyokai <Nhk> Text correction device and program
CN103369122A (en) * 2012-03-31 2013-10-23 盛乐信息技术(上海)有限公司 Voice input method and system
CN103838789A (en) * 2012-11-27 2014-06-04 大连灵动科技发展有限公司 Text similarity computing method
CN103956162A (en) * 2014-04-04 2014-07-30 上海元趣信息技术有限公司 Voice recognition method and device oriented towards child
CN104462327A (en) * 2014-12-02 2015-03-25 百度在线网络技术(北京)有限公司 Computing method, search processing method, computing device and search processing device for sentence similarity
CN104751846A (en) * 2015-03-20 2015-07-01 努比亚技术有限公司 Method and device for converting voice into text
CN105161096A (en) * 2015-09-22 2015-12-16 百度在线网络技术(北京)有限公司 Speech recognition processing method and device based on garbage models

Also Published As

Publication number Publication date
CN106067302A (en) 2016-11-02

Similar Documents

Publication Publication Date Title
CN105549819B (en) The display methods and device of background application information
CN105915673B (en) A kind of method and mobile terminal of special video effect switching
CN106027905B (en) A kind of method and mobile terminal for sky focusing
CN108353103A (en) Subscriber terminal equipment and its method for recommendation response message
CN105635452B (en) Mobile terminal and its identification of contacts method
CN105959554B (en) Video capture device and method
CN105591440B (en) Mobile terminal charging control device and method
CN106571136A (en) Voice output device and method
CN106409286A (en) Method and device for implementing audio processing
CN106851451B (en) A kind of earpiece volume control method and device
CN106534422B (en) A kind of loudspeaker assembly, speaker and mobile terminal
KR20180109499A (en) Method and apparatus for providng response to user&#39;s voice input
CN105681894A (en) Device and method for displaying video file
CN106612396A (en) Photographing device, photographing terminal and photographing method
CN113033245A (en) Function adjusting method and device, storage medium and electronic equipment
CN111263009B (en) Quality inspection method, device, equipment and medium for telephone recording
CN106471493B (en) Method and apparatus for managing data
CN106686232A (en) Method for optimizing control interfaces and mobile terminal
CN105654974B (en) Multimedia playing apparatus and method
CN106713656A (en) Photographing method and mobile terminal
CN106527685A (en) Control method and device for terminal application
CN106067302B (en) Denoising device and method
CN106095744B (en) Irregular control icons processing unit and method
CN114360546A (en) Electronic equipment and awakening method thereof
CN106021129B (en) A kind of method of terminal and terminal cleaning caching

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant