WO2018116944A1 - Audio noise detection device, digital broadcast receiving device, and audio noise detection method - Google Patents

Publication number
WO2018116944A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio signal
audio
unit
frequency
noise detection
Prior art date
Application number
PCT/JP2017/044832
Other languages
French (fr)
Japanese (ja)
Inventor
勇哉 西牟田
高木 和也
Original Assignee
三菱電機株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 三菱電機株式会社
Priority to JP2018557717A (patent JP6669277B2)
Publication of WO2018116944A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0264 Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B1/00 Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission
    • H04B1/06 Receivers
    • H04B1/10 Means associated with receiver for limiting or suppressing noise or interference
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B1/00 Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission
    • H04B1/06 Receivers
    • H04B1/16 Circuits

Definitions

  • The present invention relates to an audio noise detection apparatus that detects audio noise in a digital broadcast receiving apparatus that receives a digital broadcast while moving.
  • Digital broadcast receivers that can receive digital TV and digital radio and play them while moving have become widespread.
  • In particular, in-vehicle digital broadcast receivers for viewing in a car, and car navigation systems with a built-in digital broadcast reception function, have become popular in recent years.
  • Because the radio wave reception environment is affected by changes in the surroundings and by high-speed movement, the radio wave received by the digital broadcast receiving apparatus may be disturbed.
  • For example, the intensity of the radio wave reaching the receiving antenna from the broadcasting station may become weak.
  • Unnecessary interference waves may be included in the received radio waves due to reflection off building walls.
  • Amplitude and phase variations may appear in the received signal due to the Doppler shift.
  • Digital broadcasting is performed by applying an error correction code to data obtained by compressing video data and audio data by a compression method such as MPEG (Moving Picture Experts Group) on the transmission side.
  • The receiving side can correct errors in the received data by performing error correction with the error correction code used on the transmitting side, but if the received data contains many errors, the error correction may not complete. For this reason, when the received data is decoded with the compression method used on the transmission side, an abnormality may occur in the decoding result and part of the digital audio signal may be output as noise.
  • As a countermeasure, there is a method in which the frequency spectrum signal obtained by converting the digital audio signal into the frequency domain is divided into a plurality of bands, a noise generation section is detected for each divided band, and the audio signal in the corresponding section is corrected to reduce the influence of the noise (see, for example, Patent Document 1).
  • Patent Document 1 uses a frequency peak for noise detection, and thus has the problem that noise cannot be detected correctly when strong components are distributed over a wide range in the frequency direction.
  • The present invention has been made to solve the above problem, and its object is to provide an audio noise detection device that detects noise even in a digital audio signal having a noise component whose strong components are distributed over a wide range in the frequency direction.
  • The audio noise detection device according to the present invention includes: an audio signal input unit that inputs a digital audio signal; a section audio signal generation unit that generates a section audio signal from the digital audio signal based on a set time width; a high-frequency component extraction unit that extracts the high-frequency component of the frequency spectrum signal from the section audio signal; a feature amount calculation unit that extracts frequency values having high component values from the high-frequency component of the frequency spectrum signal and generates audio feature data based on the value obtained by multiplying the component value by the frequency value; and an audio noise detection unit that detects a noise component of the section audio signal from the audio feature data.
  • The present invention extracts frequency values having high component values from the high-frequency component of a frequency spectrum signal, generates audio feature data from the product of the component value and the frequency value, and detects a noise component. As a result, noise can be detected even in a digital audio signal having a noise component whose strong components are distributed over a wide range in the frequency direction.
  • FIG. 1 is a block diagram schematically showing the configuration of a digital broadcast receiving apparatus according to the first embodiment.
  • FIG. 2 is a block diagram schematically showing the configuration of an audio noise detection apparatus according to the first embodiment.
  • FIG. 3 is a diagram showing the relationship between the extraction section of the section audio signal generation unit and the overlap rate.
  • FIG. 4 is a block diagram schematically showing the configuration of a high frequency component extraction unit according to the first embodiment.
  • FIG. 5 is a diagram explaining the boundary line determination in a support vector machine.
  • FIG. 6 is a block diagram schematically showing the configuration of an audio signal processing unit according to the first embodiment.
  • FIG. 7 is a diagram illustrating an example of the processing of the audio signal processing unit according to the first embodiment.
  • FIG. 8 is a diagram illustrating another example of the processing of the audio signal processing unit according to the first embodiment.
  • FIG. 9 is a flowchart illustrating an example of the audio noise detection processing according to the first embodiment.
  • FIG. 10 is a block diagram schematically showing the configuration of an audio noise detection apparatus according to the second embodiment.
  • FIG. 11 is a table showing an example of the relationship between the quality information and the overlap rate.
  • FIG. 12 is a block diagram schematically showing the configuration of a digital broadcast receiving apparatus according to the third embodiment.
  • FIG. 13 is a diagram showing an example of a quality information map.
  • FIG. 1 is a block diagram schematically showing a configuration of a digital broadcast receiving device including an audio noise detection device 10 according to the present exemplary embodiment.
  • the digital broadcast receiving apparatus includes an audio noise detection device 10, a receiving unit 20, a demultiplexing unit 30, an audio decoding unit 40, an audio signal processing unit 50, and a control unit 60.
  • the receiving unit 20 receives and demodulates the selected digital broadcast radio wave.
  • the receiving unit 20 may demodulate signals received from a plurality of antennas.
  • In the digital broadcasting handled in the present embodiment, it is assumed that an audio signal is compressed, the compressed data is multiplexed with other data (multiplex processing), and the result is transmitted after digital modulation.
  • the other data is, for example, data obtained by compressing a video signal.
  • The target digital broadcasting is not limited to ISDB-T (Integrated Services Digital Broadcasting-Terrestrial), the digital television broadcasting standard adopted in Japan. DVB-T (Digital Video Broadcasting-Terrestrial), a digital television broadcasting standard in Europe, DTMB (Digital Terrestrial Multimedia Broadcast), CMMB (China Mobile Multimedia Broadcasting), DAB (Digital Audio Broadcast), a digital radio broadcast standard, or another digital broadcast standard may also be targeted.
  • the demultiplex unit 30 performs demultiplex processing (also referred to as separation processing) on the demodulated data, acquires audio compression data, and supplies it to the audio decoding unit 40.
  • the audio decoding unit 40 performs a decoding process on the audio compression data from the demultiplexing unit 30 to generate a digital audio signal.
  • the audio noise detection device 10 receives a digital audio signal from the audio decoding unit 40 and detects a noise component of the audio signal. The detection method will be described later.
  • The audio signal processing unit 50 uses the noise component information detected by the audio noise detection device 10 to correct the period of the digital audio signal from the audio decoding unit 40 that contains the noise component, and generates a digital audio signal for audio output. The correction performed by the audio signal processing unit 50 will be described later.
  • The control unit 60 controls the operations and settings of the receiving unit 20, the demultiplexing unit 30, the audio decoding unit 40, the audio signal processing unit 50, and the audio noise detection device 10. For example, it performs control by transmitting to each component the information needed for channel selection and the information specifying the data to be demultiplexed. Further, when the label set by the audio decoding unit 40 is managed, a control signal instructing association of unit audio signals is transmitted to the audio decoding unit 40, the audio signal processing unit 50, and the audio noise detection device 10.
  • FIG. 2 is a block diagram schematically showing the configuration of the audio noise detection apparatus 10 according to the present exemplary embodiment.
  • the audio noise detection device 10 includes an audio signal input unit 101, an audio feature data generation unit 102, and an audio noise detection unit 103.
  • the audio feature data generation unit 102 includes a section audio signal generation unit 1021, a high frequency component extraction unit 1022, and a feature amount calculation unit 1023.
  • the audio noise detection unit 103 includes a noise identification information storage area 1031, an audio noise detection processing unit 1032, and a detection result storage area 1033.
  • the noise identification information storage area 1031 and the detection result storage area 1033 may be configured to be stored in a common storage unit.
  • the audio signal input unit 101 inputs the digital audio signal from the audio decoding unit 40.
  • The section audio signal generation unit 1021 generates a section audio signal by extracting, based on a set time width, part of the digital audio signal input by the audio signal input unit 101.
  • The time width may be set so that it corresponds to a power of 2 in units of samples.
  • When extracting a section audio signal from the digital audio signal, the section audio signal generation unit 1021 may set the extraction section so that it overlaps the previous extraction section in the time direction. That is, the section audio signal may be extracted based on the set time width and on the section in which consecutive section audio signals overlap in the time direction.
  • FIG. 3 is a diagram illustrating the relationship between the extraction section of the section audio signal generation unit 1021 and the overlap rate.
  • FIG. 3 shows a situation in which a first section, a second section, and a third section are sequentially extracted from the input digital audio signal with a time width L and an overlap rate R_o (0 ≤ R_o ≤ 0.5).
  • The overlapping section between adjacent extraction sections has a length obtained by multiplying the time width L by the overlap rate R_o.
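  • The section extraction described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `extract_sections`, the use of a sample count for the time width L, and the choice to drop a trailing partial section are all assumptions made for the example.

```python
def extract_sections(samples, width, overlap_rate):
    """Return (start_index, section) pairs of `width` samples each.

    Consecutive sections share int(width * overlap_rate) samples, which
    corresponds to the overlapping section L * R_o in the text.
    Assumption: a trailing partial section is dropped.
    """
    if not 0.0 <= overlap_rate <= 0.5:
        raise ValueError("overlap_rate must satisfy 0 <= R_o <= 0.5")
    overlap = int(width * overlap_rate)
    hop = width - overlap  # distance between consecutive section starts
    sections = []
    start = 0
    while start + width <= len(samples):
        sections.append((start, samples[start:start + width]))
        start += hop
    return sections
```

  • With a time width of 8 samples and R_o = 0.5, each section starts 4 samples after the previous one, so half of every section is shared with its neighbor.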
  • FIG. 4 is a block diagram schematically showing the configuration of the high frequency component extraction unit 1022.
  • The high frequency component extraction unit 1022 extracts the high frequency component from the section audio signal.
  • The high frequency component extraction unit 1022 includes a high-pass filter 10221 or 10223, which removes components in the low-frequency region, and a frequency-domain conversion unit 10222, and extracts only the high frequency components from the section audio signal generated by the section audio signal generation unit 1021.
  • As a result, the high frequency component of the frequency spectrum signal, that is, the power value corresponding to each frequency (hereinafter also referred to as a component value), is acquired.
  • The high frequency component may be extracted by first removing the low frequency component from the digital audio signal and then converting it into the frequency domain, as shown in FIG. 4A, or by first converting it into the frequency domain and then removing the low frequency component, as shown in FIG. 4B.
  • The low frequency range to be removed only needs to be wide enough to suppress the components of a normal audio signal.
  • For example, the frequency region of 4,000 Hz or lower, which contains the main components, may be suppressed.
  • In this case, the high frequency component is the component in the frequency region exceeding 4,000 Hz. That is, the high frequency component is the component in a frequency region higher than the region that contains the main components of human speech.
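  • As a rough sketch of the FIG. 4B ordering (frequency conversion first, low-frequency removal second), the following uses a naive discrete Fourier transform from the standard library. The function names, the choice of a plain DFT, and the way low bins are simply discarded are assumptions for illustration; a real receiver would likely use an FFT and a proper filter design.

```python
import cmath
import math


def power_spectrum(section, sample_rate):
    """Naive DFT: return (frequency_hz, power) pairs for bins 0..n/2."""
    n = len(section)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(section))
        spectrum.append((k * sample_rate / n, abs(acc) ** 2))
    return spectrum


def high_frequency_component(section, sample_rate, cutoff_hz=4000.0):
    """Keep only bins above cutoff_hz (4,000 Hz is the example in the text)."""
    return [(f, p) for f, p in power_spectrum(section, sample_rate)
            if f > cutoff_hz]
```

  • For a 32-sample section at an assumed sampling rate of 16 kHz, the bin spacing is 500 Hz, so a strong 1,000 Hz tone is discarded while a 6,000 Hz component survives.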
  • the feature amount calculation unit 1023 calculates a feature amount from the high frequency component of the frequency spectrum signal received from the high frequency component extraction unit 1022.
  • the feature amount calculation unit 1023 extracts the top N samples (N: natural number) in descending order of the power value from the frequency spectrum signal, and acquires the extracted frequency and power value. Then, each power value is weighted with a frequency to calculate a feature amount F_noise, which is voice feature data.
  • In summary, the audio feature data generation unit 102 generates a section audio signal by extracting part of the digital audio signal based on the set time width, converts the section audio signal into a frequency spectrum signal and extracts its high frequency component, extracts frequency values having high component values from the frequency spectrum signal, and generates the feature amount F_noise of the section audio signal by multiplying each component value by its frequency value.
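  • The feature calculation can be illustrated as below. The text does not say how the N weighted values are arranged, so this sketch assumes F_noise is a vector of N products ordered by descending power; the function name `noise_feature` is hypothetical.

```python
def noise_feature(high_spectrum, n_top):
    """Compute an assumed form of F_noise from (frequency_hz, power) pairs.

    Takes the n_top bins with the largest power and weights each power
    value by its frequency. Assumption: the result is a vector with one
    product per selected bin, in descending order of power.
    """
    top = sorted(high_spectrum, key=lambda fp: fp[1], reverse=True)[:n_top]
    return [freq * power for freq, power in top]
```

  • Weighting by frequency makes strong components at higher frequencies contribute more to the feature than equally strong components just above the cutoff.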
  • From the feature amount F_noise of a section audio signal generated by the audio feature data generation unit 102, the audio noise detection unit 103 determines whether or not there is noise in that section audio signal.
  • The noise identification information storage area 1031 is an area for storing noise identification information, which is information used for noise detection. It may be a storage area of an independent storage unit or a partial storage area in a common storage unit.
  • the noise identification information is information of a discriminator that determines whether or not there is noise from the feature amount.
  • the audio noise detection processing unit 1032 determines whether there is noise or no noise from the feature amount F_noise based on the noise identification information.
  • When the discriminator is a linear discriminator, it is represented by the discriminant polynomial D of equation (2).
  • Here, F_noise is the feature quantity, A and B are the coefficients constituting the discriminant polynomial D, and M is the number of dimensions of the feature quantity.
  • the noise identification information storage area 1031 stores information on the coefficients A and B of each dimension in the discriminant polynomial D.
  • the discriminant polynomial D is determined by an algorithm for constructing a linear classifier called a support vector machine, for example.
  • a discriminant polynomial D for identifying the presence / absence of noise is determined using a feature quantity labeled with / without noise, that is, learning data.
  • FIG. 5 is a diagram for explaining the determination of the boundary line that distinguishes two classes (class 1 and class 2) from the two-dimensional feature quantity by the support vector machine.
  • the support vector machine determines the boundary based on the idea of margin maximization in order to optimally separate the two classes from the two-dimensional feature quantity.
  • Margin maximization is to maximize the margin (distance) between classes.
  • the straight line that maximizes the distance from both the point X and the point Y is defined as the discriminant polynomial D.
  • The values of feature quantity 1 and feature quantity 2 are substituted into the discriminant polynomial D of equation (2). If the result is positive, the sample is determined to be class 1, above the straight line; if the result is negative, it is determined to be class 2, below the straight line.
  • The discriminant polynomial D is determined by the support vector machine, and the coefficients A and B are stored in advance in the noise identification information storage area 1031 as noise identification information so that they can be used by the audio noise detection processing unit 1032.
  • The audio noise detection processing unit 1032 acquires the coefficients A and B, which are the noise identification information held in the noise identification information storage area 1031, evaluates the discriminant polynomial D of equation (2), and determines the presence or absence of noise from whether the result is positive or negative.
  • In the present embodiment, the two classes are separated by a linear separation/identification plane using a support vector machine.
  • However, an algorithm that constructs a nonlinear separation/identification plane may be used, or another algorithm such as a neural network may be used.
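  • The sign test on the discriminant can be sketched as follows. Equation (2) is not reproduced in this text, so the sketch assumes the usual linear form D = A_1·F_1 + ... + A_M·F_M + B, with one coefficient A_i per feature dimension and a single offset B. Which sign corresponds to "noise present" is also an assumption (positive here), and the function names are hypothetical.

```python
def discriminant(feature, coeff_a, coeff_b):
    """Evaluate an assumed linear discriminant D = sum(A_i * F_i) + B."""
    if len(feature) != len(coeff_a):
        raise ValueError("feature and coefficient dimensions must match")
    return sum(a * f for a, f in zip(coeff_a, feature)) + coeff_b


def has_noise(feature, coeff_a, coeff_b):
    """Assumption: a positive discriminant value means noise is present."""
    return discriminant(feature, coeff_a, coeff_b) > 0
```

  • In this arrangement the coefficients would be learned offline from labeled feature data and only the cheap dot product runs in the receiver.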
  • the presence / absence of noise is determined for each section audio signal.
  • the detection result is accumulated in the detection result storage area 1033 until the noise determination of all the segment audio signals obtained by dividing the input digital audio signal is completed. After accumulating the detection results for the input digital audio signal, it is output to the outside.
  • The noise detection result may be accumulated as a 1-bit signal (0 or 1) that is ON only in the sections of the section audio signal where noise occurs, or as a list of the start time and end time of each noise generation section.
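  • The two storage formats carry the same information, and one can be derived from the other. The sketch below converts per-section 1-bit flags into a list of start and end times, merging sections that touch or overlap. The function name and the parameters `width`, `hop` (distance between section starts, in samples), and `sample_rate` are assumptions for the example.

```python
def noise_intervals(flags, width, hop, sample_rate):
    """Convert per-section noise flags into merged (start_s, end_s) pairs.

    Section i is assumed to cover samples [i * hop, i * hop + width).
    Flagged sections that touch or overlap are merged into one interval.
    """
    intervals = []
    for i, flag in enumerate(flags):
        if not flag:
            continue
        start = i * hop / sample_rate
        end = (i * hop + width) / sample_rate
        if intervals and start <= intervals[-1][1]:
            # Extend the previous interval instead of starting a new one.
            intervals[-1] = (intervals[-1][0], end)
        else:
            intervals.append((start, end))
    return intervals
```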
  • Using the noise component information detected by the audio noise detection device 10, the audio signal processing unit 50 can correct the period of the digital audio signal from the audio decoding unit 40 in which a noise component was present, and can generate a digital audio signal for audio output.
  • FIG. 6 is a block diagram schematically showing the configuration of the audio signal processing unit 50.
  • The audio signal processing unit 50 includes a buffer control unit 501, a past signal storage area 502, a corrected audio signal generation unit 503, and an audio signal correction unit 504, and corrects the digital audio signal based on the noise component detected by the audio noise detection device 10.
  • The buffer control unit 501 stores the digital audio signal from the audio decoding unit 40 in the past signal storage area 502 and, when correction is to be performed based on the corresponding noise detection result, outputs the stored digital audio signal to the corrected audio signal generation unit 503.
  • The past signal storage area 502 is an area for storing the digital audio signal from the audio decoding unit 40, and may be a storage area of an independent storage unit or part of a common storage unit.
  • The corrected audio signal generation unit 503 receives the digital audio signal stored in the past signal storage area 502 and the noise component information detected by the audio noise detection device 10, and generates a corrected audio signal in which the section where the noise component was detected has been corrected.
  • the audio signal correction unit 504 receives the corrected audio signal from the corrected audio signal generation unit 503.
  • the audio signal correcting unit 504 switches the digital audio signal from the audio decoding unit 40 to the corrected audio signal and outputs the corrected audio signal in a section where noise is detected.
  • FIG. 7 is a diagram illustrating an example of processing of the audio signal processing unit.
  • the upper diagram of FIG. 7 shows a digital audio signal from the audio decoding unit 40, and shows the result of detecting that there is a noise component in the audio noise detection device 10 in the section from time ta to time tb.
  • the lower diagram of FIG. 7 shows the corrected audio signal that the corrected audio signal generation unit 503 has corrected for the section from time ta to time tb of the digital audio signal from the audio decoding unit 40.
  • A signal in which the section from time ta to time tb, where the noise was detected, is switched to a signal having no amplitude is generated as the corrected audio signal.
  • A signal without amplitude is a silence signal.
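  • The silence substitution of FIG. 7 amounts to zeroing the samples between ta and tb. A minimal sketch, assuming the section boundaries are given as sample indices and using a hypothetical function name:

```python
def mute_section(samples, start, end):
    """Return a copy of samples with [start, end) replaced by silence."""
    if not 0 <= start <= end <= len(samples):
        raise ValueError("section must lie inside the signal")
    corrected = list(samples)
    corrected[start:end] = [0] * (end - start)
    return corrected
```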
  • FIG. 8 is a diagram illustrating another example of the processing of the audio signal processing unit.
  • the upper diagram of FIG. 8 is a digital audio signal from the audio decoding unit 40, and shows a result of detecting that there is a noise component in the audio noise detection device 10 in the section from time ta to time tb.
  • the lower diagram of FIG. 8 shows a corrected audio signal that the corrected audio signal generation unit 503 has corrected for the section from time ta to time tb of the digital audio signal from the audio decoding unit 40.
  • The section from time tc to time td has the same length (the set time width) as the section from time ta to time tb, and is a section for which the audio noise detection device 10 indicated that there is no noise component.
  • The section from time tc to time td is the section immediately before the noise component occurs.
  • The digital audio signal in the section from time tc to time td, which was indicated as having no noise component, is copied and substituted for the section from time ta to time tb, where the noise was detected, and the result is generated as the corrected audio signal.
  • Repeating the noise-free section immediately before the noise component in this way reduces the sense of incongruity compared with the silence signal.
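  • The repetition of FIG. 8 can be sketched in the same style. This example assumes the noise-free section tc to td directly precedes ta (so td equals ta) and that the caller has already confirmed it is noise-free; the function name is hypothetical.

```python
def repeat_previous_section(samples, start, end):
    """Replace [start, end) with the equally long section just before it.

    Assumption: the preceding section [start - length, start) was judged
    noise-free by the detector; this function does not check that.
    """
    length = end - start
    if length < 0 or start - length < 0 or end > len(samples):
        raise ValueError("not enough preceding signal to copy from")
    corrected = list(samples)
    corrected[start:end] = samples[start - length:start]
    return corrected
```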
  • FIG. 9 is a flowchart showing an example of the audio noise detection process.
  • the audio signal input unit 101 inputs the digital audio signal from the audio decoding unit 40 (step S1).
  • the section audio signal generation unit 1021 generates a section audio signal by extracting based on the time width set for the digital audio signal input by the audio signal input unit 101 (step S2).
  • The high frequency component extraction unit 1022 extracts the high frequency component from the section audio signal (step S3).
  • The feature amount calculation unit 1023 extracts frequency values having high component values from the high frequency component of the frequency spectrum signal received from the high frequency component extraction unit 1022, and generates audio feature data from the value obtained by multiplying each component value by its frequency value (step S4).
  • From the audio feature data of the section audio signal generated by the feature amount calculation unit 1023, the audio noise detection unit 103 determines whether that section audio signal has noise or no noise, and detects the noise component of the section audio signal (step S5).
  • The audio signal processing unit 50 then uses the information on the noise component detected by the audio noise detection unit 103 to correct the period of the digital audio signal from the audio decoding unit 40 in which a noise component is present, and generates an audio signal for audio output (step S6).
  • the audio noise detection apparatus 10 can detect noise even for a signal having a noise component in which strong components are distributed over a wide range in the frequency direction. Further, by correcting a section having a noise component detected by the audio noise detection device 10, it is possible to improve the quality of the digital audio signal output from the digital broadcast receiving device without outputting the detected noise component. It becomes possible.
  • The audio noise detection device can perform noise detection using, for example, a single processor and a recording unit such as a RAM (Random Access Memory) or an HDD (Hard Disk Drive).
  • Embodiment 2. The audio noise detection apparatus according to the present embodiment further includes a quality information input unit that inputs quality information, and a parameter determination unit that changes the overlap rate used in the section audio signal generation unit based on the input quality information.
  • FIG. 10 is a block diagram schematically showing the configuration of the audio noise detection apparatus 11 according to the present embodiment.
  • The audio noise detection device 11 includes a quality information input unit 114 and a parameter determination unit 115 in addition to the audio signal input unit 101, the audio feature data generation unit 102, and the audio noise detection unit 103.
  • Description of the components already described is omitted.
  • the quality information input unit 114 inputs quality information.
  • The quality information is information related to the quality of the input digital audio signal. Examples are the radio wave intensity of the digital broadcast radio wave selected and received by the receiving unit 20, the CNR (Carrier to Noise Ratio) or SNR (Signal to Noise Ratio) estimated from information obtained at the time of demodulation, and the packet error rate.
  • the parameter determination unit 115 sets the overlap rate of the extraction section set by the section audio signal generation unit 1021 based on the input quality information.
  • the overlap ratio is set to be larger as the input quality information is a value indicating that the reception state is poor.
  • FIG. 11 is a table showing an example of the relationship between the quality information and the overlap rate.
  • In FIG. 11, a threshold is set for the input quality information, the quality information is classified into the binary values "good (G)" or "bad (B)", and an overlap rate is set for each class.
  • G indicates that the voice quality is good, and B indicates that the voice quality is bad.
  • The overlap rate is set to 0.5, that is, half of each extracted section overlaps the adjacent extracted section.
  • The classification of the quality information is not limited to binary; the number of classes may be increased and the overlap rate set in stages. The lower the voice quality indicated by the quality information, the higher the overlap rate (the longer the overlap section) used for extraction.
  • a final classification result may be determined based on a combination of results obtained by inputting a plurality of types of quality information and classified by respective threshold values, and an overlap rate may be set based on the final classification result.
  • the better the reception state the more efficiently noise can be detected by reducing the overlapping period extracted by the section audio signal generation unit 1021.
  • In addition, reducing the processor load related to noise detection has the effect that processing capacity can be allocated to the execution of other processes.
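  • The parameter determination can be pictured as a threshold followed by a table lookup. In this sketch the use of CNR as the quality measure, the 20 dB threshold, and the overlap rate of 0.0 for class G are all assumptions chosen for illustration; pairing 0.5 with class B follows the rule that poorer reception gets a larger overlap, but the exact table values of FIG. 11 are not reproduced in this text.

```python
# Hypothetical class-to-overlap table; values are illustrative assumptions.
OVERLAP_BY_CLASS = {"G": 0.0, "B": 0.5}


def classify_quality(cnr_db, threshold_db=20.0):
    """Binary classification: 'G' (good) at or above the assumed threshold."""
    return "G" if cnr_db >= threshold_db else "B"


def overlap_rate_for(cnr_db, threshold_db=20.0):
    """Poorer reception maps to a larger overlap rate, as in the text."""
    return OVERLAP_BY_CLASS[classify_quality(cnr_db, threshold_db)]
```

  • Extending the table to more classes gives the staged setting mentioned above without changing the lookup logic.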
  • Embodiment 3. In the second embodiment, the quality information input to the audio noise detection apparatus is received and estimated by the digital broadcast receiving apparatus. In the third embodiment, position information is input from outside the digital broadcast receiving apparatus, and quality information is supplied to the audio noise detection apparatus based on a quality information map in which position information and quality information are stored in association with each other.
  • FIG. 12 is a block diagram schematically showing a configuration of a digital broadcast receiving apparatus provided with the audio noise detection apparatus 11 according to the present embodiment.
  • The digital broadcast receiving apparatus includes an input unit 70 (also referred to as a position information input unit) and a quality information map storage area 80 in addition to the audio noise detection device 11, the receiving unit 20, the demultiplexing unit 30, the audio decoding unit 40, the audio signal processing unit 50, and the control unit 60.
  • Description of the components with the same reference numerals is omitted.
  • the input unit 70 acquires information regarding the environment in which radio waves are received from the outside.
  • position information is acquired from a moving body such as a car equipped with a digital broadcast receiver.
  • the position information includes GPS (Global Positioning System) information acquired by an in-vehicle device such as a car navigation system.
  • the input unit 70 receives position information from the outside and, based on the quality information map stored in the quality information map storage area 80, supplies the quality information corresponding to that position to the audio noise detection device 11.
  • the quality information map storage area 80 is an area that stores position information and quality information in association with each other; it may be a storage area of an independent storage unit or part of a storage area in a common storage unit.
  • the quality information includes the radio wave intensity measured when a moving body, such as a car equipped with a digital broadcast receiver, actually received the broadcast, the CNR or SNR estimated from information obtained at demodulation, and the packet error rate.
  • FIG. 13 is a block diagram showing an example of the quality information map stored in the quality information map storage area 80.
  • the quality information map is map information divided in a grid pattern, and has quality information at positions corresponding to the grids.
  • the quality information possessed by each grid holds, for example, the aforementioned binary values of “good (G) and bad (B)”.
  • the quality information obtained by the digital broadcast receiving apparatus while it is located in a block of the quality information map is stored in the quality information map storage area 80 as the quality information for the entire block.
  • because quality information recorded at a previously visited position can be reused, quality information can be obtained without estimating it every time.
  • the quality information of the block may be estimated from the quality information in the neighboring blocks and stored. Furthermore, when there is a difference between the result of re-estimating the reception quality in the same block and the quality information stored in the quality information map storage area 80, it may be stored again with new information.
  • the quality information map storage area 80 is described above as residing in the digital broadcast receiving apparatus. However, when the apparatus is connected to the Internet, the storage area may instead reside in the cloud, for example on an external server. The same purpose is achieved by accessing the storage area over the Internet and acquiring the quality information for the block that contains the input position information.
  • part of the audio noise detection device and the digital broadcast reception device is realized by a processing circuit.
  • the processing circuit may be dedicated hardware or a CPU that executes a program stored in a memory.
  • the functions of the receiving unit 20, the demultiplexing unit 30, the audio decoding unit 40, the audio signal processing unit 50, and the control unit 60 may be realized by separate processing circuits.
  • the functions of a plurality of parts may be realized by a single processing circuit.
  • the functions of the audio signal input unit 101, the audio feature data generation unit 102, and the audio noise detection unit 103 may be realized by separate processing circuits, or the functions of several of these parts may be realized by a single processing circuit.
  • when the processing circuit is a CPU, the functions of the parts described above are realized by software, firmware, or a combination of software and firmware.
  • Software or firmware is described as a program and stored in a memory.
  • the CPU implements the functions of the respective units by reading and executing the program stored in the memory.
  • some of the functions of the plurality of parts of the digital broadcast receiving apparatus may be realized by dedicated hardware, and the other part of the functions may be realized by software or firmware.
  • 10, 11 audio noise detection device, 20 receiving unit, 30 demultiplexing unit, 40 audio decoding unit, 50 audio signal processing unit, 60 control unit, 70 input unit, 80 quality information map storage area, 101 audio signal input unit, 102 audio feature data generation unit, 103 audio noise detection unit, 114 quality information input unit, 115 parameter determination unit, 501 buffer control unit, 502 past signal storage area, 503 corrected audio signal generation unit, 504 audio signal correction unit, 1021 section audio signal generation unit, 1022 high frequency component extraction unit, 1023 feature amount calculation unit, 1031 noise identification information storage area, 1032 audio noise detection processing unit, 1033 detection result storage area, 10221, 10223 high-pass filter, 10222 frequency domain conversion unit.
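
The grid-based quality information map described in the list above can be sketched as follows. This is only an illustration under stated assumptions: the cell size `cell_deg`, the function names, the fallback value for unknown blocks, and the overlap rates assigned to "G" and "B" are all hypothetical and are not taken from the patent.

```python
import math

def grid_cell(lat, lon, cell_deg=0.01):
    """Map a GPS position to a grid block index (cell size is an assumption)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def lookup_quality(quality_map, lat, lon, default="B", cell_deg=0.01):
    """Return the stored quality ("G" or "B") for the block containing the position.

    Unknown blocks fall back to `default`; treating them as bad is an assumption.
    """
    return quality_map.get(grid_cell(lat, lon, cell_deg), default)

def overlap_rate_for(quality):
    """Pick an overlap rate from the quality class (values are hypothetical)."""
    return 0.0 if quality == "G" else 0.5

# Store the quality observed at one position for its whole block,
# then reuse it for another position inside the same block.
quality_map = {grid_cell(35.6815, 139.7675): "G"}
rate = overlap_rate_for(lookup_quality(quality_map, 35.6818, 139.7672))
```

A dictionary keyed by block index keeps the map sparse, so only blocks that have actually been visited consume storage.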

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Noise Elimination (AREA)
  • Circuits Of Receivers In General (AREA)

Abstract

An audio noise detection device (10), which detects audio noise even in digital audio signals whose noise components have strong components distributed over a wide frequency range, is characterized by being provided with: an audio signal input unit (101) for inputting a received digital audio signal; a segmented audio signal generating unit (1021) for generating a segmented audio signal from the digital audio signal on the basis of a defined time interval; a high-frequency component extraction unit (1022) for extracting the high-frequency components of the frequency spectrum signal from the segmented audio signal; a characteristic quantity calculation unit (1023) for extracting frequency values having high component values from the high-frequency components of the frequency spectrum signal and generating audio characteristic data from values obtained by multiplying the component values by the frequency values; and an audio noise detection unit (103) for detecting noise components in the segmented audio signal from the audio characteristic data.

Description

Audio noise detection device, digital broadcast reception device, and audio noise detection method
The present invention relates to an audio noise detection device that detects audio noise in a digital broadcast receiving apparatus that receives digital broadcasts while moving.
Digital broadcast receiving apparatuses that can receive and play digital television and digital radio while moving have become widespread. Examples that have become common in recent years include in-vehicle digital broadcast receiving apparatuses for viewing inside an automobile and car navigation systems with a built-in digital broadcast reception function.
When a digital broadcast receiving apparatus is mounted on a moving body such as an automobile, the reception environment is affected by changes in the surroundings and by high-speed movement, so the received radio waves may be disturbed. For example, when the apparatus is far from the broadcasting station, the radio waves reaching the receiving antenna are weak. When the apparatus is in an urban area surrounded by high-rise buildings, reflections from building walls add unwanted interference waves to the received signal. Furthermore, when radio waves are received while moving, the Doppler shift causes amplitude fluctuations and temporal phase fluctuations in the received signal.
In digital broadcasting, the transmitting side compresses video and audio data with a compression scheme such as MPEG (Moving Picture Experts Group), applies an error correction code to the compressed data, and transmits it. The receiving side can correct errors in the received data by using the error correction code applied on the transmitting side, but when the received data contains many errors, they cannot all be corrected. Consequently, when the received data is decoded according to the compression scheme used on the transmitting side, the decoding result may be abnormal and part of the digital audio signal may be output as noise.
To address this, there is a method that reduces the influence of audio noise by dividing the frequency spectrum signal, obtained by converting the digital audio signal into the frequency domain, into a plurality of bands, detecting the noise occurrence section in each band, and correcting the audio signal in that section (see, for example, Patent Document 1).
Patent Document 1: JP 2010-249939 A (pages 6-36, FIG. 5)
However, because the method of Patent Document 1 uses frequency peaks to detect noise, it cannot detect noise correctly when strong components are distributed over a wide range in the frequency direction.
The present invention has been made to solve the above problem, and its object is to provide an audio noise detection device that detects noise even in a digital audio signal whose noise components have strong components distributed over a wide range in the frequency direction.
The audio noise detection device according to the present invention comprises: an audio signal input unit that inputs a digital audio signal; a section audio signal generation unit that generates a section audio signal from the digital audio signal based on a set time width; a high frequency component extraction unit that extracts the high frequency component of the frequency spectrum signal from the section audio signal; a feature amount calculation unit that extracts frequency values having high component values from the high frequency component of the frequency spectrum signal and generates audio feature data from the values obtained by multiplying the component values by the frequency values; and an audio noise detection unit that detects the noise component of the section audio signal from the audio feature data.
Because the present invention extracts frequency values having high component values from the high frequency component of the frequency spectrum signal and detects noise components from audio feature data generated from the product of the component values and the frequency values, it can detect noise even in a digital audio signal whose noise components have strong components distributed over a wide range in the frequency direction.
FIG. 1 is a block diagram schematically showing the configuration of the digital broadcast receiving apparatus according to Embodiment 1.
FIG. 2 is a block diagram schematically showing the configuration of the audio noise detection device according to Embodiment 1.
FIG. 3 is a diagram showing the relationship between the extraction sections of the section audio signal generation unit and the overlap rate according to Embodiment 1.
FIG. 4 is a block diagram schematically showing the configuration of the high frequency component extraction unit according to Embodiment 1.
FIG. 5 is a diagram explaining how a support vector machine determines the boundary line.
FIG. 6 is a block diagram schematically showing the configuration of the audio signal processing unit according to Embodiment 1.
FIG. 7 is a diagram showing an example of the processing of the audio signal processing unit according to Embodiment 1.
FIG. 8 is a diagram showing another example of the processing of the audio signal processing unit according to Embodiment 1.
FIG. 9 is a flowchart showing an example of the audio noise detection processing according to Embodiment 1.
FIG. 10 is a block diagram schematically showing the configuration of the audio noise detection device according to Embodiment 2.
FIG. 11 is a table showing an example of the relationship between the quality information and the overlap rate according to Embodiment 2.
FIG. 12 is a block diagram schematically showing the configuration of the digital broadcast receiving apparatus according to Embodiment 3.
FIG. 13 is a block diagram showing an example of the quality information map.
Embodiment 1
FIG. 1 is a block diagram schematically showing the configuration of a digital broadcast receiving apparatus including the audio noise detection device 10 according to the present embodiment. The digital broadcast receiving apparatus includes the audio noise detection device 10, a receiving unit 20, a demultiplexing unit 30, an audio decoding unit 40, an audio signal processing unit 50, and a control unit 60.
The receiving unit 20 receives and demodulates the radio wave of the selected digital broadcast. The receiving unit 20 may demodulate signals received from a plurality of antennas. The digital broadcast handled in the present embodiment is assumed to be transmitted by compressing an audio signal, multiplexing the compressed data with other data, and then digitally modulating the result. The other data is, for example, compressed video data. Such digital broadcasts include not only ISDB-T (Integrated Services Digital Broadcasting - Terrestrial), the digital television broadcasting standard adopted in Japan, but also DVB-T (Digital Video Broadcasting - Terrestrial), the European digital television broadcasting standard, DTMB (Digital Terrestrial Multimedia Broadcast), the Chinese digital television broadcasting standard, and CMMB (China Mobile Multimedia Broadcasting), the Chinese broadcasting standard for mobile terminals. DAB (Digital Audio Broadcast), a digital radio broadcasting standard, or any other digital broadcasting standard may also be targeted.
The demultiplexing unit 30 demultiplexes (separates) the demodulated data, obtains the compressed audio data, and supplies it to the audio decoding unit 40.
The audio decoding unit 40 decodes the compressed audio data from the demultiplexing unit 30 to generate a digital audio signal.
The audio noise detection device 10 takes the digital audio signal from the audio decoding unit 40 as input and detects the noise component of the audio signal. The detection method is described later.
The audio signal processing unit 50 uses the noise component information detected by the audio noise detection device 10 to correct the digital audio signal from the audio decoding unit 40 over the periods in which noise components were present, and generates the digital audio signal to be output as sound. The correction performed by the audio signal processing unit 50 is described later.
The control unit 60 controls the operation and settings of the receiving unit 20, the demultiplexing unit 30, the audio decoding unit 40, the audio signal processing unit 50, and the audio noise detection device 10. For example, it sends each component the information needed for channel selection and the information specifying what is to be demultiplexed. When the control unit 60 manages the labels set by the audio decoding unit 40, it also sends a control signal instructing the association of unit audio signals to the audio decoding unit 40, the audio signal processing unit 50, and the audio noise detection device 10.
FIG. 2 is a block diagram schematically showing the configuration of the audio noise detection device 10 according to the present embodiment. The audio noise detection device 10 includes an audio signal input unit 101, an audio feature data generation unit 102, and an audio noise detection unit 103. The audio feature data generation unit 102 includes a section audio signal generation unit 1021, a high frequency component extraction unit 1022, and a feature amount calculation unit 1023. The audio noise detection unit 103 includes a noise identification information storage area 1031, an audio noise detection processing unit 1032, and a detection result storage area 1033. The noise identification information storage area 1031 and the detection result storage area 1033 may be located in a common storage unit.
The audio signal input unit 101 inputs the digital audio signal from the audio decoding unit 40.
The section audio signal generation unit 1021 generates a section audio signal by extracting a portion of the digital audio signal input by the audio signal input unit 101 based on a set time width. When an FFT (Fast Fourier Transform) is used, for example, the time width may be set to a number of samples that is a power of two.
When extracting a section audio signal from the digital audio signal, the section audio signal generation unit 1021 may also set a section that overlaps the previous extraction section in the time direction. That is, the section audio signal may be extracted based on both the set time width and the section over which consecutive section audio signals overlap in time. FIG. 3 is a diagram showing the relationship between the extraction sections of the section audio signal generation unit 1021 and the overlap rate. FIG. 3 shows a first section, a second section, and a third section being extracted in order from the input digital audio signal with time width L and overlap rate R_o (0 ≦ R_o ≦ 0.5). As shown in FIG. 3, the overlap between adjacent extraction sections is the time width L multiplied by the overlap rate R_o.
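
The overlapped extraction described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `extract_sections` is hypothetical, the time width is expressed in samples, and dropping a trailing partial section is an assumption.

```python
def extract_sections(signal, width, overlap_rate=0.0):
    """Split `signal` into sections of `width` samples.

    Consecutive sections share int(width * overlap_rate) samples, which
    corresponds to the overlap L * R_o in FIG. 3. A trailing partial
    section is dropped (an assumption made for this sketch).
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if not 0.0 <= overlap_rate <= 0.5:
        raise ValueError("overlap_rate must satisfy 0 <= R_o <= 0.5")
    overlap = int(width * overlap_rate)  # samples shared with the previous section
    hop = width - overlap                # distance between section start points
    sections = []
    start = 0
    while start + width <= len(signal):
        sections.append(signal[start:start + width])
        start += hop
    return sections

# 16 samples, width 8, R_o = 0.5: sections start at samples 0, 4, and 8.
sections = extract_sections(list(range(16)), width=8, overlap_rate=0.5)
```

A larger overlap rate yields more sections per unit of audio, which is why it raises the detection workload.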
FIG. 4 is a block diagram schematically showing the configuration of the high frequency component extraction unit 1022. The high frequency component extraction unit 1022 extracts the high frequency component from the section audio signal. It has a high-pass filter 10221 or 10223, which removes components in the low frequency region, and a frequency domain conversion unit 10222, and it extracts only the high frequency component from the section audio signal generated by the section audio signal generation unit 1021. This yields the high frequency component of the frequency spectrum signal, that is, frequencies and their corresponding power values (hereinafter also called component values). The high frequency component may be extracted by first removing the low frequency component from the digital audio signal and then performing the frequency conversion, as in FIG. 4(a), or by first performing the frequency conversion and then removing the low frequency component, as in FIG. 4(b). The low frequency range to be removed only needs to suppress the components of a normal audio signal. For human speech, for example, it suffices to suppress the frequency region at or below 4,000 Hz, which contains the main components. In this case, the high frequency component is the component in the frequency region above 4,000 Hz. In other words, the high frequency component is the component in a frequency region higher than the region containing the main components of human speech.
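
As a rough sketch of the FIG. 4(b) ordering (frequency conversion first, then removal of the low band), the following computes power values only for bins above a cutoff. It uses a naive DFT from the standard library instead of an FFT so that it stays self-contained; the function name and the default cutoff parameter are illustrative assumptions.

```python
import cmath
import math

def high_band_spectrum(section, sample_rate, cutoff_hz=4000.0):
    """Return [(frequency_hz, power), ...] for DFT bins above `cutoff_hz`.

    A naive O(n^2) DFT is used for clarity; a real implementation would
    likely use an FFT on a power-of-two section length.
    """
    n = len(section)
    result = []
    for k in range(n // 2 + 1):        # non-negative frequencies only
        freq = k * sample_rate / n
        if freq <= cutoff_hz:
            continue                   # discard the band holding normal speech
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(section))
        result.append((freq, abs(coeff) ** 2))
    return result

# A 6 kHz tone sampled at 16 kHz stays in the result; a 1 kHz tone would not.
tone = [math.cos(2 * math.pi * 6 * t / 16) for t in range(16)]
band = high_band_spectrum(tone, sample_rate=16000)
```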
The feature amount calculation unit 1023 calculates a feature amount from the high frequency component of the frequency spectrum signal received from the high frequency component extraction unit 1022. It extracts the top N samples (N: a natural number) of the frequency spectrum signal in descending order of power value and obtains their frequencies and power values. It then weights each power value by its frequency to calculate the feature amount F_noise, which is the audio feature data.
An example of the feature amount calculation is as follows. Let Df be an extracted frequency and Dp its power value. As shown in equation (1) below, the feature amount F_noise is calculated as the average, over the N samples, of the power value Dp multiplied by the frequency Df. This calculation is only an example; any power value weighted by frequency may be used, and the calculation is not limited to this equation.
$$F_{noise} = \frac{1}{N} \sum_{i=1}^{N} Df_i \cdot Dp_i \qquad (1)$$
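
The calculation in equation (1) can be sketched as below. The function name `noise_feature` and the input layout, a list of (frequency, power) pairs, are assumptions for illustration. When fewer than N samples are available, this sketch averages over the samples that exist, which is also an assumption.

```python
def noise_feature(high_band, n_top):
    """Average of frequency * power over the top-N samples by power.

    `high_band` is a list of (Df, Dp) pairs taken from the high
    frequency component of the spectrum.
    """
    top = sorted(high_band, key=lambda fp: fp[1], reverse=True)[:n_top]
    if not top:
        return 0.0
    return sum(df * dp for df, dp in top) / len(top)

# Top 2 by power are (6000, 4.0) and (7000, 2.0):
# (6000 * 4.0 + 7000 * 2.0) / 2 = 19000.0
f_noise = noise_feature([(5000, 1.0), (6000, 4.0), (7000, 2.0)], n_top=2)
```

Weighting by frequency makes strong energy far above the speech band contribute more to the feature than equally strong energy just above the cutoff.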
In this way, the audio feature data generation unit 102 generates a section audio signal by extracting part of the digital audio signal based on the set time width, converts the section audio signal into a frequency spectrum signal and extracts its high frequency component, extracts frequency values having high component values from the frequency spectrum signal, and generates the feature amount F_noise of the section audio signal from the product of the component values and the frequency values.
Next, the audio noise detection unit 103 is described. The audio noise detection unit 103 has a noise identification information storage area 1031, an audio noise detection processing unit 1032, and a detection result storage area 1033. From the feature amount F_noise that the audio feature data generation unit 102 generates for a section audio signal, it determines whether that section audio signal contains noise.
The noise identification information storage area 1031 is an area that stores noise identification information, which is the information used to detect noise. It may be a storage area of an independent storage unit or part of a storage area in a common storage unit. The noise identification information is the information of a classifier that determines from the feature amount whether noise is present.
The audio noise detection processing unit 1032 determines from the feature amount F_noise, based on the noise identification information, whether noise is present.
For example, when the classifier is a linear classifier, it is represented by the discriminant polynomial D shown in equation (2) below.
$$D = \sum_{i=1}^{M} A_i \, F_{noise,i} + B \qquad (2)$$
Here, F_noise is the feature amount, A and B are the coefficients of the discriminant polynomial D, and M is the number of dimensions of the feature amount. The noise identification information storage area 1031 stores the coefficients A for each dimension and the coefficient B of the discriminant polynomial D.
The discriminant polynomial D is determined, for example, by the support vector machine, an algorithm for constructing a linear classifier. The support vector machine uses feature amounts labeled with the presence or absence of noise, that is, training data, to determine the discriminant polynomial D that identifies whether noise is present.
FIG. 5 is a diagram explaining how a support vector machine determines the boundary line that separates two-dimensional feature amounts into two classes (class 1 and class 2). To separate the two classes optimally, the support vector machine determines the boundary based on margin maximization, that is, maximizing the margin (distance) between the classes. In the example of FIG. 5, the straight line whose distance from both point X and point Y is greatest is obtained as the discriminant polynomial D. In this example, the values of feature amount 1 and feature amount 2 are substituted into the discriminant polynomial D of equation (2); if the result is positive, the sample is judged to be in class 1, above the line, and if it is negative, in class 2, below the line.
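
For reference, margin maximization for linearly separable training data is commonly written as the following optimization problem. This is the standard textbook hard-margin formulation, given here as background rather than as something stated in the patent. The symbols x_i (the feature vector of training sample i) and y_i (its class label, +1 or -1) are introduced only for this illustration, while A and B are the coefficients of the discriminant polynomial D.

```latex
\min_{A,\,B} \; \frac{1}{2}\,\lVert A \rVert^{2}
\quad \text{subject to} \quad
y_i \left( A \cdot x_i + B \right) \ge 1 \quad \text{for all training samples } i
```

Under this formulation the width of the margin between the two classes is 2 / ||A||, so minimizing ||A|| maximizes the margin.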
In the present embodiment, the discriminant polynomial D is determined by a support vector machine, and its coefficients A and B are stored in advance in the noise identification information storage area 1031 as noise identification information for use by the audio noise detection processing unit 1032.
The audio noise detection processing unit 1032 obtains the coefficients A and B, which are the noise identification information held in the noise identification information storage area 1031, evaluates the discriminant polynomial D of equation (2), and determines whether noise is present from the sign of the result.
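
A minimal sketch of this sign-based decision with a linear classifier is shown below. The function names are hypothetical, and treating a positive value of D as "noise present" is an assumption; the text states only that the sign of the result decides the outcome.

```python
def discriminant(features, coeffs_a, coeff_b):
    """Evaluate D = sum_i(A_i * F_i) + B for an M-dimensional feature vector."""
    if len(features) != len(coeffs_a):
        raise ValueError("features and coeffs_a must have the same length")
    return sum(a * f for a, f in zip(coeffs_a, features)) + coeff_b

def has_noise(features, coeffs_a, coeff_b):
    """Classify one section; D > 0 is treated as noise (an assumption)."""
    return discriminant(features, coeffs_a, coeff_b) > 0

# Coefficients would normally be read from the stored noise identification
# information; these values are made up for the example.
d = discriminant([2.0, 3.0], [1.0, -1.0], 0.5)   # 2.0 - 3.0 + 0.5 = -0.5
```

Because only the coefficients are stored, evaluation at run time costs M multiplications and additions per section, regardless of how much training data was used.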
Although classification into two classes by a linear decision boundary obtained with a support vector machine has been used as the example here, an algorithm that constructs a nonlinear decision boundary may be used, as may other algorithms such as a neural network.
The presence or absence of noise is determined for each section audio signal. To obtain a noise detection result corresponding to the input digital audio signal, detection results are accumulated in the detection result storage area 1033 until the noise determination has finished for all section audio signals into which the input digital audio signal was divided; once the detection results for the whole input digital audio signal have been accumulated, they are output to the outside. The noise detection results may be accumulated as a 1-bit signal, represented by 0 or 1, that is ON only in sections where noise occurs, or as a list of the start and end times of the noise sections.
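The two accumulation formats carry the same information, and one can be derived from the other. The following is a minimal sketch that converts per-section 0/1 flags into a list of start and end times; the function name and the parameters `hop_seconds` (time between the starts of consecutive sections) and `width_seconds` (the set time width) are hypothetical names introduced for illustration.

```python
def flags_to_intervals(flags, hop_seconds, width_seconds):
    """Merge runs of consecutive noisy sections into (start, end) times."""
    intervals = []
    run_start = None  # index of the first section in the current noisy run
    for index, flag in enumerate(flags):
        if flag and run_start is None:
            run_start = index
        elif not flag and run_start is not None:
            last = index - 1
            intervals.append((run_start * hop_seconds,
                              last * hop_seconds + width_seconds))
            run_start = None
    if run_start is not None:
        # The signal ended while a noisy run was still open.
        last = len(flags) - 1
        intervals.append((run_start * hop_seconds,
                          last * hop_seconds + width_seconds))
    return intervals
```

For non-overlapping one-second sections, the flags `[0, 1, 1, 0, 1]` yield the intervals `[(1.0, 3.0), (4.0, 5.0)]`.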
Using the information on the noise components detected by the audio noise detection device 10 as described above, the audio signal processing unit 50 corrects the periods of the digital audio signal from the audio decoding unit 40 in which a noise component was present, and can thereby generate the digital audio signal to be output as audio.
FIG. 6 is a block diagram schematically showing the configuration of the audio signal processing unit 50. The audio signal processing unit 50 has a buffer control unit 501, a past signal storage area 502, a corrected audio signal generation unit 503, and an audio signal correction unit 504, and corrects the digital audio signal on the basis of the noise components detected by the audio noise detection device 10.
The buffer control unit 501 stores the digital audio signal from the audio decoding unit 40 in the past signal storage area 502 and, when a correction is to be made on the basis of the corresponding noise detection result, outputs the stored digital audio signal to the corrected audio signal generation unit 503.
The past signal storage area 502 is an area that stores the digital audio signal from the audio decoding unit 40; it may be a storage area of an independent storage unit or part of the storage area of a shared storage unit.
The corrected audio signal generation unit 503 receives the digital audio signal stored in the past signal storage area 502 and the information on the noise components detected by the audio noise detection device 10, and generates a corrected audio signal for correcting the sections in which a noise component was detected. The audio signal correction unit 504 receives the corrected audio signal from the corrected audio signal generation unit 503. In sections where noise was detected, the audio signal correction unit 504 switches its output from the digital audio signal from the audio decoding unit 40 to the corrected audio signal, thereby correcting the digital audio signal.
FIG. 7 shows an example of the processing of the audio signal processing unit. The upper part of FIG. 7 shows the digital audio signal from the audio decoding unit 40, in which the audio noise detection device 10 has detected a noise component in the section from time ta to time tb. The lower part of FIG. 7 shows the corrected audio signal obtained when the corrected audio signal generation unit 503 corrects the section from time ta to time tb of that digital audio signal. As shown in FIG. 7, the corrected audio signal is generated by replacing the section from time ta to time tb, in which noise was detected, with a signal of zero amplitude. A signal of zero amplitude is a silent signal.
FIG. 8 shows another example of the processing of the audio signal processing unit. The upper part of FIG. 8 shows the digital audio signal from the audio decoding unit 40, in which the audio noise detection device 10 has detected a noise component in the section from time ta to time tb. The lower part of FIG. 8 shows the corrected audio signal obtained when the corrected audio signal generation unit 503 corrects the section from time ta to time tb of that digital audio signal. The section from time tc to time td has the same length (the set time width) as the section from time ta to time tb and has been indicated by the audio noise detection device 10 to contain no noise component. It is the section immediately before the noise component occurs. As shown in FIG. 8, the corrected audio signal is generated by copying the digital audio signal of the section from time tc to time td, which was indicated to contain no noise component, and substituting it for the section from time ta to time tb in which noise was detected. Particularly in a section where low-amplitude sound continues, repeating the noise-free section immediately before the noise component occurs sounds less unnatural than a silent signal.
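The two correction strategies of FIG. 7 and FIG. 8 can be sketched on a list of samples as follows. This is an illustration under assumptions, not the disclosed implementation: the function name, the sample-index interface, and the `mode` strings are hypothetical, and the caller is assumed to have already confirmed that the preceding section is noise-free before choosing the repeat mode.

```python
def correct_section(samples, start, end, mode="mute"):
    """Return a copy of samples with the noisy range [start, end) corrected.

    mode="mute":   replace the range with zero-amplitude samples (FIG. 7).
    mode="repeat": replace the range with the equally long section that
                   immediately precedes it (FIG. 8).
    """
    corrected = list(samples)
    length = end - start
    if mode == "mute":
        corrected[start:end] = [0] * length
    elif mode == "repeat":
        if start - length < 0:
            raise ValueError("not enough preceding samples to repeat")
        corrected[start:end] = samples[start - length:start]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return corrected
```

For `samples = [1, 2, 3, 4, 9, 9, 7, 8]` with noise at indices 4 and 5, muting gives `[1, 2, 3, 4, 0, 0, 7, 8]` and repeating gives `[1, 2, 3, 4, 3, 4, 7, 8]`.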
FIG. 9 is a flowchart showing an example of the audio noise detection processing. The audio signal input unit 101 inputs the digital audio signal from the audio decoding unit 40 (step S1). The section audio signal generation unit 1021 generates section audio signals by extracting them from the digital audio signal input by the audio signal input unit 101 on the basis of the set time width (step S2). The high-frequency component extraction unit 1022 extracts the high-frequency component from each section audio signal (step S3).
The feature quantity calculation unit 1023 extracts frequency values with high component values from the high-frequency component of the frequency spectrum signal received from the high-frequency component extraction unit 1022, and generates audio feature data from the values obtained by multiplying each component value by its frequency value (step S4). From the audio feature data that the feature quantity calculation unit 1023 generates for a given section audio signal, the audio noise detection unit 103 determines whether that section audio signal contains noise, and thus detects the noise component of the section audio signal (step S5). The audio signal processing unit 50 then corrects the periods of the digital audio signal from the audio decoding unit 40 in which a noise component was present, using the information on the noise components detected by the audio noise detection unit 103, and generates the digital audio signal to be output as audio (step S6).
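Steps S3 and S4 can be illustrated with a small, deliberately simplified sketch. It assumes a plain discrete Fourier transform as the frequency domain conversion, a hard cutoff in place of the high-pass filter, and a single strongest bin as the "frequency value with a high component value"; the function name and parameters are hypothetical, and the actual number of peaks and the way they are combined into feature data are not specified in this passage.

```python
import cmath
import math


def high_band_feature(section, sample_rate_hz, cutoff_hz):
    """Return (frequency, magnitude, frequency * magnitude) for the
    strongest DFT bin at or above cutoff_hz in one section audio signal."""
    n = len(section)
    best = None  # (frequency, magnitude) of the strongest bin seen so far
    for k in range(n // 2 + 1):
        freq = k * sample_rate_hz / n
        if freq < cutoff_hz:
            continue  # discard the low-frequency region
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(section))
        magnitude = abs(coeff)
        if best is None or magnitude > best[1]:
            best = (freq, magnitude)
    if best is None:
        raise ValueError("cutoff_hz is above the highest available bin")
    freq, magnitude = best
    return freq, magnitude, freq * magnitude
```

Multiplying the component value by its frequency gives more weight to strong components that sit high in the spectrum, which is the property the feature quantity is described as capturing.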
As described above, the audio noise detection device 10 according to this embodiment can detect noise even in a signal whose noise component has strong components distributed over a wide range in the frequency direction. Moreover, by correcting the sections in which the audio noise detection device 10 detected a noise component, the detected noise component is not output, and the quality of the digital audio signal output by the digital broadcast receiving device can be improved.
In addition, the audio noise detection device according to this embodiment has the advantage that noise detection can be performed using, for example, a single processor and a recording unit such as a RAM (Random Access Memory) or an HDD (Hard Disk Drive).
Embodiment 2.
The audio noise detection device according to Embodiment 2 further includes a quality information input unit that inputs quality information, and a parameter determination unit that changes the overlap rate used by the section audio signal generation unit on the basis of the input quality information.
FIG. 10 is a block diagram schematically showing the configuration of the audio noise detection device 11 according to this embodiment. In addition to the audio signal input unit 101, the audio feature data generation unit 102, and the audio noise detection unit 103, the audio noise detection device 11 further includes a quality information input unit 114 and a parameter determination unit 115. Components bearing the same reference numerals have the same configuration and operation as described above, so their description is omitted.
The quality information input unit 114 inputs quality information. Here, the quality information is information related to the quality of the input digital audio signal; examples include the field strength of the digital broadcast radio wave that the receiving unit 20 has tuned to and received, the CNR (Carrier to Noise Ratio) or SNR (Signal to Noise Ratio) estimated from information obtained during demodulation, and the packet error rate.
The parameter determination unit 115 sets, on the basis of the input quality information, the overlap rate of the extraction sections used by the section audio signal generation unit 1021. The worse the reception state indicated by the input quality information, the larger the overlap rate is set.
FIG. 11 is a table showing an example of the relationship between the quality information and the overlap rate. In the example of FIG. 11, a threshold is applied to the input quality information to classify it into one of two values, "good (G)" or "bad (B)", and the overlap rate is set accordingly. When the quality information is classified as G (audio quality: good), the overlap rate is set to 0.0, so that the signal is divided into sections without overlap. When the quality information is classified as B (audio quality: bad), the overlap rate is set to 0.5, so that half of each section overlaps the preceding extraction section and half overlaps the following one.
The classification of the quality information is of course not limited to two values; the number of classes may be increased so that the overlap rate is set in stages. In every case, the lower the audio quality indicated by the quality information, the higher the overlap rate (the longer the overlapping section) used for extraction. It is also possible to input several types of quality information, classify each with its own threshold, determine a final classification from the combination of the results, and set the overlap rate on the basis of that final classification.
As a result, the worse the reception state, the longer the overlapping period extracted by the section audio signal generation unit 1021, which reduces the number of noise occurrences that go undetected.
Conversely, the better the reception state, the shorter the overlapping period extracted by the section audio signal generation unit 1021, which allows noise to be detected efficiently. For example, when noise detection is performed using a single processor and a recording unit such as a RAM (Random Access Memory) or an HDD (Hard Disk Drive), reducing the processor load attributable to noise detection frees capacity that can be allocated to the other processes the processor executes.
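The relationship between the quality class, the overlap rate, and the resulting sections can be sketched as follows. The values 0.0 for G and 0.5 for B come from the FIG. 11 example; the function names, the sample-count interface, and the choice to drop a trailing partial section are assumptions made for this illustration.

```python
# Overlap rate per quality class, following the two-class example of FIG. 11.
OVERLAP_BY_CLASS = {"G": 0.0, "B": 0.5}


def overlap_rate(quality_class):
    """Look up the overlap rate for a quality class ("G" or "B")."""
    return OVERLAP_BY_CLASS[quality_class]


def split_sections(samples, width, rate):
    """Cut samples into sections of `width` samples.

    Consecutive sections share a fraction `rate` of their length. A trailing
    partial section is dropped here, which is a simplification.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0.0, 1.0)")
    hop = max(1, int(round(width * (1.0 - rate))))
    return [samples[i:i + width]
            for i in range(0, len(samples) - width + 1, hop)]
```

For eight samples and a width of four, class G yields two sections and class B yields three, which shows directly how a higher overlap rate increases the detection workload.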
Embodiment 3.
The quality information input to the audio noise detection device according to Embodiment 2 was quality information that the digital broadcast receiving device estimated from what it received. In Embodiment 3, position information is input from outside the digital broadcast receiving device, and quality information is supplied to the audio noise detection device on the basis of a quality information map in which position information and quality information are stored in association with each other.
FIG. 12 is a block diagram schematically showing the configuration of a digital broadcast receiving device provided with the audio noise detection device 11 according to this embodiment. In addition to the audio noise detection device 11, the receiving unit 20, the demultiplexing unit 30, the audio decoding unit 40, the audio signal processing unit 50, and the control unit 60, the digital broadcast receiving device includes an input unit (also called a position information input unit) 70 and a quality information map storage area 80. Components bearing the same reference numerals have the same configuration and operation as described above, so their description is omitted.
The input unit 70 acquires, from outside, information about the environment in which radio waves are being received. For example, it acquires position information from a mobile body, such as a car, in which the digital broadcast receiving device is installed. One example of position information is GPS (Global Positioning System) information acquired by an in-vehicle device such as a car navigation system.
The input unit 70 receives position information from outside and, based on the quality information map stored in the quality information map storage area 80, supplies the quality information corresponding to the input position information to the audio noise detection device 11.
The quality information map storage area 80 is an area that stores position information and quality information in association with each other; it may be a storage area of an independent storage unit or part of the storage area of a shared storage unit.
Examples of the quality information include the field strength measured when a mobile body, such as a car equipped with the digital broadcast receiving device, actually received the broadcast, the CNR or SNR estimated from information obtained during demodulation, and the packet error rate.
FIG. 13 is a block diagram showing an example of the quality information map stored in the quality information map storage area 80. In FIG. 13, the quality information map is map information divided into a grid, and it holds quality information for the position corresponding to each grid cell. The quality information held for each cell is, for example, one of the two values "good (G)" and "bad (B)" described above.
For example, the quality information obtained by the digital broadcast receiving device while it is located in a given block of the quality information map is stored in the quality information map storage area 80 as the quality information for that entire block. Then, even when the device receives at a different position within the same block, the quality information from the previously received position can be used, so quality information is available without estimating it continuously.
For a block in which the reception quality has not yet been estimated, the quality information of that block may be estimated from the quality information of the surrounding blocks and stored. Furthermore, if the result of re-estimating the reception quality within a block differs from the quality information stored in the quality information map storage area 80, the stored information may be overwritten with the new information.
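A grid-based quality information map of this kind can be sketched as a dictionary keyed by cell index. The class name, the use of latitude and longitude as the position, the cell size parameter, and in particular the majority vote over the eight surrounding cells (with ties resolved to "B") are all assumptions made for illustration; the text does not specify how the neighbouring blocks are combined.

```python
import math


class QualityMap:
    """Grid map from position to a quality class ("G" or "B")."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = {}  # (row, column) -> "G" or "B"

    def _cell(self, lat, lon):
        return (math.floor(lat / self.cell_size),
                math.floor(lon / self.cell_size))

    def store(self, lat, lon, quality):
        """Record (or overwrite) the quality class for the whole cell."""
        self.cells[self._cell(lat, lon)] = quality

    def lookup(self, lat, lon):
        """Return the stored class, an estimate from neighbours, or None."""
        row, col = self._cell(lat, lon)
        if (row, col) in self.cells:
            return self.cells[(row, col)]
        neighbours = [self.cells[(row + dr, col + dc)]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                      if (dr, dc) != (0, 0) and (row + dr, col + dc) in self.cells]
        if not neighbours:
            return None
        good = neighbours.count("G")
        # Assumed rule: majority vote, ties resolved conservatively to "B".
        return "G" if good > len(neighbours) - good else "B"
```

Calling `store` again for a cell that already has a value simply overwrites it, which corresponds to re-storing new information when a re-estimate differs from the stored value.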
In FIG. 12, the quality information map storage area 80 is located in the digital broadcast receiving device. When the digital broadcast receiving device is connected to the Internet, however, the quality information map storage area 80 may instead be a storage area in the cloud, such as on an external server, and the quality information for the block containing the input position may be obtained by accessing that storage area over the Internet; the intended purpose is served equally well.
In Embodiments 1 to 3 above, parts of the audio noise detection device and the digital broadcast receiving device are realized by processing circuitry. The processing circuitry may be dedicated hardware or a CPU that executes a program stored in a memory. For example, the functions of the receiving unit 20, the demultiplexing unit 30, the audio decoding unit 40, the audio signal processing unit 50, and the control unit 60 in FIG. 1 may each be realized by a separate processing circuit, or the functions of several of these parts may be realized together by a single processing circuit.
Similarly, the functions of the audio signal input unit 101, the audio feature data generation unit 102, and the audio noise detection unit 103 in FIG. 2 may each be realized by a separate processing circuit, or the functions of several of these parts may be realized together by a single processing circuit.
When the processing circuitry is a CPU, the functions of the parts listed above are realized by software, firmware, or a combination of software and firmware. The software or firmware is written as a program and stored in a memory. The CPU realizes the function of each part by reading and executing the program stored in the memory. Alternatively, some of the functions of the listed parts of the digital broadcast receiving device may be realized by dedicated hardware and others by software or firmware.
10, 11 audio noise detection device; 20 receiving unit; 30 demultiplexing unit; 40 audio decoding unit; 50 audio signal processing unit; 60 control unit; 70 input unit; 80 quality information map storage area; 101 audio signal input unit; 102 audio feature data generation unit; 103 audio noise detection unit; 114 quality information input unit; 115 parameter determination unit; 501 buffer control unit; 502 past signal storage area; 503 corrected audio signal generation unit; 504 audio signal correction unit; 1021 section audio signal generation unit; 1022 high-frequency component extraction unit; 1023 feature quantity calculation unit; 1031 noise identification information storage area; 1032 audio noise detection processing unit; 1033 detection result storage area; 10221, 10223 high-pass filter; 10222 frequency domain conversion unit.

Claims (9)

  1.  An audio noise detection device comprising:
     an audio signal input unit that inputs a digital audio signal;
     a section audio signal generation unit that generates a section audio signal from the digital audio signal based on a set time width;
     a high-frequency component extraction unit that extracts a high-frequency component of a frequency spectrum signal from the section audio signal;
     a feature quantity calculation unit that extracts, from the high-frequency component of the frequency spectrum signal, a frequency value having a high component value, and generates audio feature data from a value obtained by multiplying the component value by the frequency value; and
     an audio noise detection unit that detects a noise component of the section audio signal from the audio feature data.
  2.  The audio noise detection device according to claim 1, wherein
     the high-frequency component extraction unit further has a high-pass filter that takes the frequency spectrum signal as input and removes components in a low-frequency region, and
     the feature quantity calculation unit extracts the frequency value having a high component value from an output of the high-pass filter.
  3.  The audio noise detection device according to claim 1 or claim 2, wherein
     the section audio signal generation unit generates the section audio signal based on the set time width and on a section in which consecutive section audio signals overlap in the time direction.
  4.  The audio noise detection device according to claim 3, wherein
     the section overlapping in the time direction is set longer as the audio quality indicated by quality information of the digital audio signal is lower.
  5.  A digital broadcast receiving device comprising:
     a receiving unit that demodulates a received radio wave to generate a digital audio signal;
     the audio noise detection device according to any one of claims 1 to 3; and
     an audio signal processing unit that outputs a digital audio signal generated by correcting the digital audio signal based on the noise component detected by the audio noise detection device.
  6.  A digital broadcast receiving device comprising:
     a receiving unit that demodulates a received radio wave to generate a digital audio signal;
     the audio noise detection device according to claim 4; and
     an audio signal processing unit that outputs a digital audio signal generated by correcting the digital audio signal based on the noise component detected by the audio noise detection device,
     wherein the receiving unit generates the quality information using the field strength of the radio wave received by the receiving unit, information obtained by estimating a CN ratio from the demodulated signal, or a packet error rate obtained from the demodulated signal.
  7.  The digital broadcast receiving device according to claim 6, further comprising:
     a position information input unit that inputs position information; and
     a storage unit that stores the position information in association with the quality information at that position.
  8.  The digital broadcast receiving device according to any one of claims 5 to 7, wherein,
     for a section of the section audio signal in which the audio noise detection device has detected a noise component, the audio signal processing unit makes the correction by switching to a silent signal or to the digital audio signal of the time width immediately before the noise component occurs.
  9.  An audio noise detection method comprising:
     an audio signal input step of inputting a digital audio signal;
     a section audio signal generation step of generating a section audio signal from the digital audio signal based on a set time width;
     a high-frequency component extraction step of extracting a high-frequency component of a frequency spectrum signal from the section audio signal;
     a feature quantity calculation step of extracting, from the high-frequency component of the frequency spectrum signal, a frequency value having a high component value, and generating audio feature data from a value obtained by multiplying the component value by the frequency value; and
     an audio noise detection step of detecting a noise component of the section audio signal from the audio feature data.
PCT/JP2017/044832 2016-12-20 2017-12-14 Audio noise detection device, digital broadcast receiving device, and audio noise detection method WO2018116944A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2018557717A JP6669277B2 (en) 2016-12-20 2017-12-14 Audio noise detection device, digital broadcast receiving device, and audio noise detection method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2016-246500 2016-12-20
JP2016246500 2016-12-20

Publications (1)

Publication Number Publication Date
WO2018116944A1 true WO2018116944A1 (en) 2018-06-28

Family

ID=62626395

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/044832 WO2018116944A1 (en) 2016-12-20 2017-12-14 Audio noise detection device, digital broadcast receiving device, and audio noise detection method

Country Status (2)

Country Link
JP (1) JP6669277B2 (en)
WO (1) WO2018116944A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010046954A1 (en) * 2008-10-24 2010-04-29 三菱電機株式会社 Noise suppression device and audio decoding device
JP2010249940A (en) * 2009-04-13 2010-11-04 Sony Corp Noise reducing device and noise reduction method
JP2015108766A (en) * 2013-12-05 2015-06-11 日本電信電話株式会社 Noise suppression method, device therefor, and program

Also Published As

Publication number Publication date
JPWO2018116944A1 (en) 2019-04-11
JP6669277B2 (en) 2020-03-18

Similar Documents

Publication Publication Date Title
RU2597001C2 (en) Receiving device, receiving method and program
JP5493803B2 (en) Receiving apparatus and method, program, and receiving system
US8446854B2 (en) Signal processing apparatus, signal processing method, and reception system
JP5745959B2 (en) OFDM transmitter and receiver for wireless microphone
US8509328B2 (en) Reception apparatus, reception method, program, and reception system
JP5278173B2 (en) Receiving apparatus and method, program, and receiving system
US11502762B2 (en) Signal processing device and image display apparatus including the same
JP5754211B2 (en) Receiving device, receiving method, program, and receiving system
WO2018116944A1 (en) Audio noise detection device, digital broadcast receiving device, and audio noise detection method
US9893823B2 (en) Seamless linking of multiple audio signals
WO2012144382A1 (en) Receiver device, reception method, program, and receiving system
US8290071B2 (en) Modulating device and method, demodulating device and method, program, and recording medium
US7852241B2 (en) Demodulating apparatus, demodulating method, and computer-readable medium
JP4338323B2 (en) Digital signal receiver
JP2008301300A (en) Array antenna apparatus and control method thereof, radio receiver, integrated circuit, program, and recording medium
JP6778534B2 (en) OFDM signal transmitter and OFDM signal receiver
JP4708400B2 (en) Diversity receiving apparatus, diversity receiving method, and digital television receiving apparatus
JP2013074328A (en) Transmission device and transmission method
JP2023033974A (en) Receiving device and noise elimination method
JP5686248B2 (en) Receiving device, receiving method, and program
JP2011171947A (en) Broadcast receiver and symbol synthesizing method
JP4551432B2 (en) Diversity receiving apparatus and diversity receiving method
JP6103589B2 (en) Communication system, receiving apparatus, semiconductor device, and jitter correction method in communication system
JP2009021689A (en) Digital broadcast receiver

Legal Events

Date Code Title Description
ENP Entry into the national phase
Ref document number: 2018557717; Country of ref document: JP; Kind code of ref document: A

121 EP: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 17884092; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: DE

122 EP: PCT application non-entry in European phase
Ref document number: 17884092; Country of ref document: EP; Kind code of ref document: A1