WO2011148519A1 - Dwelling unit device for interphone system for residential complex - Google Patents
- Publication number
- WO2011148519A1 (PCT/JP2010/062581)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- unit
- packet
- call
- voice
- processing
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M9/00—Arrangements for interconnection not involving centralised switching
- H04M9/08—Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
- H04M9/082—Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers
Definitions
- The present invention relates to a dwelling unit device that is used in an intercom system for collective housing and installed in each dwelling unit of the housing complex.
- Conventionally, an intercom system for collective housing has been provided that includes a common unit device (lobby intercom) installed at the common entrance of the apartment building, a dwelling unit device installed inside each dwelling unit, and a doorphone slave unit installed at the outside entrance of each dwelling unit.
- A signal trunk line is connected to the common unit device, and each dwelling unit device is connected to a dwelling unit line branching from the signal trunk line.
- The dwelling unit device inside the dwelling unit and the doorphone slave unit at the outside entrance are connected by a slave unit connection line.
- A further dwelling unit device may be connected to each dwelling unit device by an in-house connection line.
- The dwelling unit device connected to the dwelling unit line is called a dwelling unit main unit.
- The dwelling unit device connected to the dwelling unit main unit by the in-house connection line is called a dwelling unit sub-master unit.
- An intercom system for collective housing has been described in which voice is transmitted over the signal trunk line and the dwelling unit lines by a packet transmission method, which enables calls between the dwelling unit devices of two different dwelling units.
- a call direction switching process and an echo suppression process for a hands-free call are performed.
- the common unit device and the plurality of dwelling units can be digitally communicated, and the signal trunk line and the dwelling unit line connecting the common unit device and each dwelling unit are used.
- Call processing that compensates for voice loss caused by packet loss, delay, and delay fluctuation (jitter) accompanying packet transmission is necessary.
- a conventional inexpensive device, that is, a device that transmits voice by an analog transmission method
- an analog transmission method is adopted as a voice transmission method between the dwelling unit (dwelling unit main unit) and the door phone slave unit or between the dwelling unit main unit and the dwelling unit sub-master unit.
- it is necessary to perform a call direction switching process and an echo suppression process for hands-free call (speech call), but consider the case where digital data is transmitted through the signal trunk line as described above.
- the voice loss compensation process essential for the packet transmission method is not necessary for the analog transmission method.
- An object of the present invention is to provide a dwelling unit device for a collective housing intercom system that can use a packet transmission method for voice transmission via the signal trunk line and an analog transmission method for voice transmission in the vicinity of the dwelling that does not pass through the signal trunk line, and that can improve call quality while suppressing increases in circuit complexity and cost.
- The dwelling unit device of the present invention is used in an intercom system for collective housing that includes: a common unit device installed at the common entrance of the collective housing; a dwelling unit device installed in each dwelling unit of the collective housing; a doorphone slave unit installed at the outside entrance of each dwelling unit; a signal trunk line connected to the common unit device; a dwelling unit line branching from the signal trunk line and connected to each dwelling unit device; and a slave unit connection line connecting the dwelling unit device and the doorphone slave unit.
- Call voice is transmitted between the common unit device and the dwelling unit device, and between dwelling unit devices, by the packet transmission method via the signal trunk line and the dwelling unit lines. Between the dwelling unit device and the doorphone slave unit, call voice is transmitted by the analog transmission method via the slave unit connection line.
- The dwelling unit device includes: a microphone and a speaker; a transmission processing unit that transmits voice packets containing voice data for the call and control packets containing control data for call control via the dwelling unit line and the signal trunk line; an analog signal transmission unit that transmits an analog audio signal via the slave unit connection line; a conversion processing unit that converts the analog audio signal output from the microphone into audio data and converts audio data into an analog audio signal output to the speaker; a call processing unit that performs call processing on the audio data; a doorphone call detection unit that detects a call from the doorphone slave unit; a storage unit that stores first software for call processing of voice data transmitted by the analog transmission method and second software for call processing of voice data transmitted by the packet transmission method; and a control unit that instructs the call processing unit to execute call processing.
- The control unit instructs the call processing unit to execute the first software when the doorphone call detection unit detects a call, and instructs the call processing unit to execute the second software when control data for call control is received from the common unit device or another dwelling unit device.
- Because the call processing unit executes the first software when the other party's terminal uses the analog transmission method and the second software when it uses the packet transmission method, the packet transmission method can be used for voice transmission via the signal trunk line and the analog transmission method for voice transmission in the vicinity of the dwelling that does not pass through the signal trunk line, and call quality can be improved while suppressing increases in circuit complexity and cost.
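The selection rule described above (first software for a doorphone call, second software when call-control data arrives over the trunk line) can be sketched as a small dispatch function. This is a minimal illustration only: the names `Trigger` and `select_software` and the string labels are hypothetical and do not come from the patent text.

```python
from enum import Enum


class Trigger(Enum):
    """Events that start a call (hypothetical names for illustration)."""
    DOORPHONE_CALL = "doorphone_call"    # call detected on the slave unit connection line
    CONTROL_PACKET = "control_packet"    # call-control data received via the trunk line


def select_software(trigger: Trigger) -> str:
    """Return which stored software the control unit asks the call processing unit to run."""
    if trigger is Trigger.DOORPHONE_CALL:
        # Analog path: no packet loss or jitter handling is needed.
        return "first_software"
    if trigger is Trigger.CONTROL_PACKET:
        # Packet path: jitter absorption and loss compensation are needed.
        return "second_software"
    raise ValueError(f"unknown trigger: {trigger!r}")
```

Keeping both programs in one storage unit and choosing between them at call setup is what lets a single call processing unit serve both transmission methods.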
- The second software includes an acoustic echo suppression processing program for suppressing acoustic echo generated by acoustic coupling between the microphone and the speaker, and a residual echo suppression processing program for suppressing residual echo that the acoustic echo suppression processing cannot suppress.
- Because the second software includes the acoustic echo suppression processing program and the residual echo suppression processing program, the call quality in the packet transmission method can be further improved.
- the second software includes a fluctuation absorption processing program for absorbing fluctuations in transmission delay in the transmission processing unit.
- Because the second software includes a fluctuation absorption processing program, the call quality in the packet transmission method can be further improved.
- Preferably, the dwelling unit device includes a fluctuation absorbing buffer for accumulating the voice data contained in the voice packets received by the transmission processing unit, and the fluctuation absorption processing program causes the call processing unit to perform a counting step that counts the voice data packets stored in the fluctuation absorbing buffer, at a period no longer than the packetization period of the voice packets, to calculate a packet count value, and a buffer size changing step that inserts a packet into or deletes a packet from the fluctuation absorbing buffer based on the packet count value calculated in the counting step.
- Because the call processing unit performs the buffer size changing step, which inserts or deletes packets in the fluctuation absorbing buffer based on the packet count value calculated in the counting step, call delay can be reduced and call quality can be further improved.
- Preferably, in the buffer size changing step, the fluctuation absorption processing program causes the call processing unit to calculate a representative value of the packet count value from the past history of the packet count value, to delete a packet from the fluctuation absorbing buffer when the representative value is larger than a predetermined reference value, and to insert a packet into the fluctuation absorbing buffer when the representative value is smaller than the reference value.
- prevention of packet depletion and reduction of call delay can be realized with higher accuracy.
- Preferably, the fluctuation absorption processing program causes the call processing unit to record the reception time of the latest packet and, in the counting step, to calculate the packet count value by setting the count value of the latest packet to the difference between the calculation time and that reception time divided by the packetization period, and setting the count value of every packet other than the latest packet to 1.
- Because the call processing unit calculates the packet count value with the count value of every packet other than the latest packet set to 1, the reception time needs to be recorded only for the latest packet, which saves storage capacity in the recording medium that holds the reception time.
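As a rough illustration of the counting rule above, the sketch below counts every stored packet except the latest as 1 and gives the latest packet a fractional count equal to the elapsed time since its reception divided by the packetization period. The function name, argument names, and the choice of milliseconds are assumptions; the text does not say whether the fractional part is clamped, so it is left unclamped here.

```python
def packet_count(num_packets: int, calc_time_ms: float,
                 latest_rx_time_ms: float, packet_period_ms: float) -> float:
    """Packet count value: older packets count as 1, the latest counts fractionally.

    Only the reception time of the latest packet has to be stored.
    """
    if num_packets == 0:
        return 0.0
    latest = (calc_time_ms - latest_rx_time_ms) / packet_period_ms
    return (num_packets - 1) + latest
```

For example, with three stored packets, a 20 ms packetization period, and 10 ms elapsed since the latest packet arrived, the count is 2 + 0.5.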
- Preferably, the fluctuation absorption processing program causes the call processing unit to hold the packet count values of the past N times (N is a positive integer) in the counting step and, in the buffer size changing step, to use the n-th smallest (n is a positive integer less than N) of the past N packet count values as the representative value.
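The representative-value rule and the insert/delete decision can be sketched together as follows. This is an illustrative sketch, not the patented implementation: the class name `BufferSizeController`, the returned action strings, and the choice to wait until N values have been collected are all assumptions.

```python
from collections import deque


class BufferSizeController:
    """Decide whether to insert or delete a packet from recent packet count values."""

    def __init__(self, history_len: int, n: int, reference: float):
        if not 1 <= n < history_len:
            raise ValueError("n must satisfy 1 <= n < history_len")
        self.history = deque(maxlen=history_len)  # past N packet count values
        self.n = n                                # use the n-th smallest as representative
        self.reference = reference                # predetermined reference value

    def update(self, count: float) -> str:
        """Record one packet count value and return 'insert', 'delete', or 'keep'."""
        self.history.append(count)
        if len(self.history) < self.history.maxlen:
            return "keep"  # assumption: act only once N values are available
        representative = sorted(self.history)[self.n - 1]
        if representative > self.reference:
            return "delete"  # buffer holds more than needed, so reduce delay
        if representative < self.reference:
            return "insert"  # buffer is close to running dry
        return "keep"
```

Using a low order statistic rather than the mean makes the decision follow the nearly-empty moments of the buffer, which are the ones that risk packet depletion.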
- Preferably, in the counting step, the fluctuation absorption processing program causes the call processing unit to determine whether a spike delay has occurred based on the past N packet count values and, when it determines that a spike delay has occurred, to extract the past M packet count values (M is a positive integer with M < N) from the past N packet count values; in the buffer size changing step, it causes the call processing unit to calculate, as the representative value, the m-th smallest packet count value (m is an integer less than M) among the past M packet count values extracted in the counting step.
- the representative value can be calculated while eliminating a spike delay that rarely occurs.
- Preferably, when the packet count value is zero consecutively in the counting step, the fluctuation absorption processing program causes the call processing unit to calculate, as the packet count value, a negative value whose absolute value increases as the number of consecutive zeros increases.
- Because the fluctuation absorption processing program calculates such a negative value as the packet count value, the packet count value reflects the difference between the case where the number of stored packets merely happens to be 0 at the calculation time and the case where packets cannot be received regularly. Therefore, a packet is less likely to be deleted in the latter case than in the former case.
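A minimal sketch of the consecutive-zero rule follows. The text only requires a negative value whose magnitude grows with the length of the zero run; the linear step of 1 per additional zero, and the names `ZeroRunCounter` and `adjust`, are assumptions made for illustration.

```python
class ZeroRunCounter:
    """Turn runs of zero packet counts into increasingly negative count values."""

    def __init__(self) -> None:
        self.zero_run = 0  # number of consecutive zero counts seen so far

    def adjust(self, raw_count: float) -> float:
        if raw_count > 0:
            self.zero_run = 0
            return raw_count
        self.zero_run += 1
        if self.zero_run == 1:
            return 0  # a single zero may just be a coincidence of timing
        # Assumed linear rule: second zero gives -1, third gives -2, and so on.
        return -(self.zero_run - 1)
```

A persistently empty buffer then pulls the representative value down, which makes packet deletion less likely while packets are not arriving regularly.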
- The second software includes a voice data loss compensation processing program that, when all or part of the voice data is missing, compensates for all or part of the missing voice data using voice data that is not missing.
- Because the missing part is compensated using voice data that is not missing, the call quality in the packet transmission method can be further improved.
- The dwelling unit device includes a fluctuation absorbing buffer for accumulating the voice data contained in the voice packets received by the transmission processing unit, and the fluctuation absorption processing program includes a counting step that counts the voice data packets stored in the fluctuation absorbing buffer to calculate the packet count value, and a buffer size changing step that inserts or deletes packets in the fluctuation absorbing buffer based on the packet count value calculated in the counting step.
- Preferably, when one packet is deleted from the fluctuation absorbing buffer in the buffer size changing step and there are two or more consecutive valid packets containing voice data, the call processing unit deletes one packet by overlap-adding the two consecutive valid packets located in the middle of the run. In the present invention, since the call processing unit overlap-adds two consecutive valid packets located in the middle, voice deterioration due to the packet loss concealment process can be reduced.
- Preferably, when a packet is inserted into the fluctuation absorbing buffer in the buffer size changing step and there are two consecutive valid packets, the call processing unit inserts an invalid packet containing no voice between these two valid packets. In the present invention, because the invalid packet is inserted between two consecutive valid packets, voice deterioration can be kept small.
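Deleting one packet by overlap-adding two neighbouring valid packets can be sketched as below. This is an illustration under assumptions: packets are lists of samples of equal length, `None` marks an invalid packet, the cross-fade is linear, and the function names are hypothetical. The text does not specify the window shape or how the middle pair is chosen in a run of odd length.

```python
def overlap_add(first, second):
    """Cross-fade two equal-length packets into one packet (linear weights assumed)."""
    n = len(first)
    if len(second) != n:
        raise ValueError("packets must have equal length")
    merged = []
    for i in range(n):
        w = i / (n - 1) if n > 1 else 1.0  # fade-in weight for the second packet
        merged.append((1.0 - w) * first[i] + w * second[i])
    return merged


def delete_one_packet(packets):
    """Shorten the buffer by one packet by merging a middle pair of valid packets.

    `packets` is a list where None marks an invalid packet. If no run of two or
    more valid packets exists, the buffer is returned unchanged.
    """
    i = 0
    while i < len(packets):
        if packets[i] is None:
            i += 1
            continue
        j = i
        while j < len(packets) and packets[j] is not None:
            j += 1
        if j - i >= 2:
            k = i + (j - i - 2) // 2  # first index of the pair in the middle of the run
            merged = overlap_add(packets[k], packets[k + 1])
            return packets[:k] + [merged] + packets[k + 2:]
        i = j
    return list(packets)
```

Merging in the middle of a run keeps the packets at both ends of the run intact, so the joins with neighbouring audio are not disturbed.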
- The second software includes a voice data loss detection processing program that detects loss of all or part of the voice data output from the transmission processing unit, a pitch detection processing program that detects the pitch of the voice from the voice data, and a voice data loss compensation processing program that, when the voice data loss detection processing detects a loss, compensates for the missing voice data based on the pitch detected by the pitch detection processing.
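One common way to compensate missing voice data from a detected pitch is to repeat the most recent pitch period of the received signal. The sketch below shows only that idea; it is an assumed technique chosen for illustration, since the text here does not spell out how the compensation uses the pitch. The name `conceal` and its arguments are hypothetical, and the pitch is expressed in samples.

```python
def conceal(history, pitch: int, length: int):
    """Fill `length` missing samples by cyclically repeating the last pitch period.

    history: samples received before the loss
    pitch:   detected pitch period in samples
    """
    if pitch <= 0 or pitch > len(history):
        raise ValueError("pitch must be between 1 and len(history)")
    period = history[-pitch:]  # most recent complete pitch period
    return [period[i % pitch] for i in range(length)]
```

A practical concealment would normally also smooth the joins and fade out during long losses; those refinements are omitted here.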
- The pitch detection processing program includes a process of setting, as a reference signal, the portion of the audio signal spanning a time width from the current time back into the past, and a process of detecting the pitch of the audio signal by sliding the reference signal from the current time toward the past relative to the audio signal and obtaining the correlation between the reference signal and the audio signal.
- Preferably, the pitch detection processing program causes the call processing unit to increase the time width of the reference signal as the slide amount of the reference signal increases.
- Because the time width of the reference signal increases as the slide amount of the reference signal increases, the pitch of the audio signal immediately before the time at which the loss occurred can be detected accurately.
- Preferably, the pitch detection processing program causes the call processing unit to set the time width of the reference signal to a predetermined initial time width until the slide amount of the reference signal reaches a predetermined slide reference value. According to the present invention, even when the slide amount of the reference signal is small, a time width of at least a certain size is ensured for the reference signal, so the correlation between the reference signal and the audio signal can be obtained more accurately.
- the pitch detection processing program causes the call processing unit to perform processing for obtaining a correlation between the reference signal and the voice signal by an average amplitude difference function method.
- the correlation between the reference signal and the audio signal can be obtained with high accuracy with a relatively small amount of calculation.
- the pitch detection processing program causes the call processing unit to perform a process of obtaining a correlation between the reference signal and the voice signal using an average amplitude difference function of Expression (1).
- φ(τ) is the correlation value
- N is the time width of the reference signal
- x(j) is the reference signal
- x(j − τ) is the audio signal
- k + 1 is the starting point of the reference signal
- a represents a predetermined coefficient
- τ represents the slide amount of the reference signal.
- the correlation between the reference signal and the audio signal can be obtained with higher accuracy by using Expression (1).
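Expression (1) itself is not reproduced in this text, so the sketch below uses the textbook average magnitude difference function, the mean of |x(j) − x(j − tau)| over the reference window, rather than the patent's exact formula. The rule for widening the window (a fixed initial width up to a slide reference value, then growth proportional to the coefficient `a`) is an assumption consistent with the description above, and all function names and default values are hypothetical.

```python
import math


def window_width(tau: int, initial_width: int, slide_ref: int, a: float) -> int:
    """Reference-signal width: constant up to slide_ref, then growing with tau (assumed rule)."""
    if tau <= slide_ref:
        return initial_width
    return initial_width + int(a * (tau - slide_ref))


def amdf(x, tau: int, width: int) -> float:
    """Average magnitude difference between the latest `width` samples and the
    same samples shifted `tau` into the past. Smaller means more similar."""
    start = len(x) - width
    if start - tau < 0:
        raise ValueError("not enough signal history for this tau and width")
    total = sum(abs(x[j] - x[j - tau]) for j in range(start, len(x)))
    return total / width


def detect_pitch(x, tau_min: int, tau_max: int,
                 initial_width: int = 32, slide_ref: int = 30, a: float = 1.0) -> int:
    """Return the slide amount (in samples) that minimises the AMDF."""
    best_tau, best_val = tau_min, math.inf
    for tau in range(tau_min, tau_max + 1):
        val = amdf(x, tau, window_width(tau, initial_width, slide_ref, a))
        if val < best_val:
            best_tau, best_val = tau, val
    return best_tau
```

The AMDF needs only subtractions and absolute values, which is why it is cheaper than a multiplication-based autocorrelation. Restricting `tau_min` and `tau_max` to the plausible pitch range also avoids matching at multiples of the true period.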
- The second software includes a voice data loss detection processing program that detects loss of all or part of the voice data output from the transmission processing unit, a pitch detection processing program that detects the pitch of the voice from the voice data, a voice data loss compensation processing program that compensates for missing voice data based on the pitch detected by the pitch detection processing when the voice data loss detection processing detects a loss, and a speech speed conversion processing program that expands or compresses the voice data using the pitch detected by the pitch detection processing.
- Because the voice data loss compensation processing program and the speech speed conversion processing program share the pitch detection processing program, less memory is consumed for loading programs than in a configuration in which each of them is equipped with its own pitch detection processing program.
- The pitch detection process counts a predetermined detection cycle and repeatedly detects the pitch in synchronization with the detection cycle. When the voice data loss detection process detects a loss of voice data, the pitch is detected at the time the loss is detected and the detection cycle is restarted from that time. In the present invention, it is possible to maintain the quality of the voice after the voice data loss compensation process.
- the pitch detection process detects only a pitch in a predetermined frequency range. In the present invention, since the pitch detection in an unnecessary frequency range is not performed, the processing load can be reduced.
- the speech speed conversion process detects a voice section of the voice data and converts only the voice data of the voice section.
- Since the speech speed conversion process is not performed in sections other than the voice section (for example, a silent section), the processing load of the speech speed conversion process can be reduced.
- the audio data loss detection processing is performed in synchronization with a first time interval obtained by dividing a time length of the audio data for one packet by a positive integer and the input timing of the audio data. It is preferable that the pitch detection process detects the pitch in synchronization with the detection period obtained by multiplying the first time interval by a positive integer and the first time interval. In the present invention, the pitch detection process detects the pitch in synchronization with the detection period obtained by multiplying the first time interval by a positive integer and the first time interval. There is an advantage that the control becomes simple.
- Preferably, when speech speed conversion is performed at a time when the speech data loss detection process detects a loss of speech data, the speech speed conversion process uses the pitch detected by the pitch detection process immediately before. According to the present invention, deterioration in voice quality due to the speech speed conversion process can be suppressed.
- Preferably, when speech speed conversion is performed at a time when the speech data loss detection process detects a loss of speech data, the speech speed conversion process uses the pitch that the pitch detection process detects from the speech data compensated by the speech data loss compensation process. In the present invention, even when the speech speed conversion process starts while voice data is missing, the pitch detection process only needs to run at a constant detection cycle, which has the advantage of simplifying control.
- the pitch detection process discriminates between a voice section and a non-voice section of the voice data, and makes the detection period in the non-voice section longer than the detection period in the voice section.
- Since pitch detection is performed with a relatively short detection period in the voice section, the quality of the speech speed conversion processing is ensured, and since pitch detection is performed with a relatively long detection period in the non-voice section, the processing load can be reduced.
- the second software includes a voice switch processing program that reduces a loop gain of a closed loop formed by an acoustic echo path generated by acoustic coupling between the microphone and a speaker and suppresses howling.
- Preferably, the voice switch processing program causes the call processing unit to: estimate the feedback gain of the acoustic echo path; based on the estimated feedback gain, calculate the sum of a reception-side attenuation, which attenuates the received voice data from the transmission processing unit, and a transmission-side attenuation, which attenuates the transmitted voice data input to the transmission processing unit; monitor the transmitted and received voice data to estimate the call state; determine the distribution between the transmission-side attenuation and the reception-side attenuation according to the estimated call state and the calculated sum; and reduce the sum according to the amount by which the estimated feedback gain decreases.
- Because the call processing unit determines the distribution between the transmission-side attenuation and the reception-side attenuation according to the estimated call state and the calculated sum, and reduces the sum according to the amount by which the estimated feedback gain decreases, the call quality in the packet transmission method can be further improved.
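The two decisions made by the voice switch, how much total attenuation to apply and how to split it between the two directions, can be sketched as follows. The formula for the total (estimated feedback gain plus a safety margin, floored at zero), the 6 dB margin, the state names, and the even split in the idle state are all assumptions made for illustration; the text only states that the total follows the feedback gain estimate and that the split follows the call state.

```python
def total_attenuation_db(feedback_gain_db: float, margin_db: float = 6.0) -> float:
    """Total loss to insert so that the loop gain stays below 0 dB (assumed rule).

    The total shrinks as the estimated feedback gain shrinks.
    """
    return max(0.0, feedback_gain_db + margin_db)


def distribute(total_db: float, state: str):
    """Split the total into (transmission-side, reception-side) attenuation in dB."""
    if state == "receiving":
        return (total_db, 0.0)   # far end is talking: attenuate the microphone path
    if state == "transmitting":
        return (0.0, total_db)   # near end is talking: attenuate the speaker path
    return (total_db / 2.0, total_db / 2.0)  # idle or unclear: split evenly
```

Lowering the total when the echo path gain is small reduces the audible switching between directions during hands-free conversation.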
- the power supply device includes an extension connection line to which a communication device installed in a house is connected, and an extension analog signal transmission unit that transmits an analog voice signal through the extension connection line. It is preferable that the voice data processed by executing the first software in the call processing unit is transmitted from the extension analog signal transmission unit to the call device via the extension connection line. According to the present invention, an extension call can be made with the call device by an analog transmission method.
- the first software detects a pitch of a voice from a digital voice signal obtained by A / D converting the analog voice signal and uses the pitch for the digital voice. It is preferable to include a speech speed conversion processing program for expanding or compressing a signal. In the present invention, since the first software includes a program for converting the speech speed, the speech speed of the voice uttered by the other party can be made faster or slower even in a call using the analog transmission method.
- FIG. 4A is a block diagram for explaining an operation during an intercom call with the door phone slave unit according to Embodiment 1 of the present invention.
- FIG. 4B is a block diagram for explaining an operation during an extension call with the sub-master unit according to Embodiment 1 of the present invention.
- FIG. 5A is a block diagram for explaining an operation during an interphone call with the lobby interphone according to Embodiment 1 of the present invention.
- FIG. 5B is a block diagram for explaining an operation during an interphone call with the management room device according to Embodiment 1 of the present invention.
- FIG. 5C is a block diagram for explaining an operation during an interphone call with another dwelling unit according to Embodiment 1 of the present invention.
- FIG. 5D is a lobby interphone or management room device according to Embodiment 1 of the present invention.
- FIG. 6 is a block diagram for explaining an operation during an interphone call between the mobile phone and the sub-master.
- It is a figure explaining the processing of the template setting unit and the pitch detection unit of Embodiment 1 of the present invention.
- the graph of the correlation value of Embodiment 1 of the present invention is shown. It is a flowchart which shows the audio
- FIG. 23A is a schematic diagram showing processing at the time of packet insertion by the buffer size changing unit
- FIG. 23B is a schematic diagram showing processing at the time of packet deletion by the buffer size changing unit.
- FIGS. 30A and 30B are explanatory diagrams of processing in which the buffer size changing unit deletes one invalid packet.
- FIGS. 31A and 31B are explanatory diagrams of processing in which the buffer size changing unit inserts one packet by overlap addition.
- FIGS. 32A and 32B are diagrams for explaining processing when five packets are inserted into the jitter buffer at one time.
- FIGS. 33A, 33B, and 33C are diagrams for explaining processing when a valid packet corresponding to a deleted invalid packet is received after the invalid packet is deleted.
- FIGS. 34A and 34B are diagrams illustrating processing when the buffer size changing unit inserts a concealed packet in place of an invalid packet into the jitter buffer. It is the flowchart which showed the deletion process by the buffer size change part.
- Embodiment 1 of the present invention will be described in detail with reference to FIGS. First, an intercom system for an apartment house including a dwelling unit according to the present invention will be described.
- The intercom system for an apartment house in this embodiment includes a common unit device (lobby interphone) LI installed at the common entrance (lobby) of the apartment house, a dwelling unit A installed in each dwelling unit of the apartment house (only one is shown), a door phone slave unit B installed at the entrance of each dwelling unit, a signal trunk line Ls connected to the lobby interphone LI, a dwelling unit line Ld branching from the signal trunk line Ls and connected to the dwelling unit A, and a slave unit connection line Lb connecting the dwelling unit A and the door phone slave unit B.
- The system also includes a control device CT connected to the dwelling unit A and the lobby interphone LI via the signal trunk line Ls and the dwelling unit line Ld, and a management room device X that exchanges voice information and the like with the lobby interphone LI and each dwelling unit A.
- Communication devices (sub-master units) C are installed in the dwelling unit, and the dwelling unit (main unit) A and the sub-master unit C are connected by the extension connection line Lc.
- The door phone slave unit B includes a microphone and a speaker, a call button that accepts a visitor's call operation, and a communication unit that transmits a call signal and transmits and receives voice signals (analog transmission) to and from the dwelling unit A via the slave unit connection line Lb.
- a visitor image captured by the camera is analog-transmitted from the doorphone slave unit B to the dwelling unit A via the slave unit connection line Lb.
- the dwelling unit A transfers the video transmitted from the door phone slave unit B to the sub-master unit C via the extension connection line Lc.
- In the dwelling unit A and the sub-master unit C, the video transmitted from the doorphone slave unit B is displayed on the monitor (display unit 3). If the response button of the dwelling unit A is pressed, a call can be made between the dwelling unit A and the doorphone slave unit B, and if the response button of the sub-master unit C is pressed, a call can be made between the sub-master unit C and the doorphone slave unit B.
- The sub-master unit C includes a microphone and a speaker, a call button for receiving an extension call operation, and a communication unit that transmits a call signal and transmits and receives audio signals (analog transmission) via the extension connection line Lc.
- The lobby interphone LI includes an imaging device that captures the image of the visitor, a microphone and a speaker, a numeric keypad or touch panel with which the visitor enters the dwelling unit number of the residence being visited, a transmission unit that packet-transmits voice information and video information via the signal trunk line Ls, and the like.
- When the numeric keypad or touch panel is operated and the lobby interphone LI accepts input of the dwelling unit number of any dwelling unit, it transmits a packet storing the dwelling unit number in its data field and a packet storing the visitor video (video information) captured by the imaging device in its data field from the transmission unit to the address of the control device CT via the signal trunk line Ls (packet transmission).
- the management room device X includes a microphone and a speaker, a numeric keypad or a touch panel for an administrator to input a dwelling unit number of a contact destination, a transmission unit for transmitting voice information through the signal trunk line Ls, and the like.
- The management room device X transmits a packet storing the dwelling unit number in its data field from the transmission unit to the address of the control device CT via the signal trunk line Ls.
- The control device CT stores the correspondence between the address assigned to the dwelling unit A of each dwelling unit and the dwelling unit number of that dwelling unit. It converts the dwelling unit number stored in the data field of a packet received from the lobby interphone LI or the management room device X into an address by looking it up in this correspondence, and sends to the signal trunk line Ls a packet that stores the address in the destination address field and a call command notifying the call from the lobby interphone LI or the management room device X in the data field, together with the packet storing the video information in the data field.
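The lookup performed by the control device CT can be sketched as a table from dwelling unit numbers to device addresses. Everything concrete below is hypothetical and for illustration only: the `Packet` fields, the address strings, the dictionary keys, and the function name are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    destination: str   # destination address field
    data: dict         # data field


# Hypothetical correspondence between dwelling unit numbers and device addresses.
ADDRESS_TABLE = {"101": "addr-0x11", "102": "addr-0x12"}


def build_call_packet(request: Packet, source: str, table: dict) -> Packet:
    """Translate the dwelling unit number in a request into a call packet.

    request: packet from the lobby interphone or management room device,
             carrying the dwelling unit number in its data field
    source:  label of the caller, stored with the call command
    """
    unit_number = request.data["unit_number"]
    address = table[unit_number]  # raises KeyError for an unknown unit number
    return Packet(destination=address, data={"command": "call", "from": source})
```

Keeping the table in the control device means the lobby interphone and the management room device only need to know the address of CT, not the address of every dwelling unit.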
- Since the lobby interphone LI, the management room device X, and the control device CT described above are conventionally known, detailed illustration and description of them are omitted.
- The dwelling unit A includes a control unit 1, a microphone 2a and a speaker 2b, a call processing unit 2, a display unit 3, a video processing unit 4, a storage unit 5, a call detection unit 6, a transmission processing unit 7, a secondary communication processing unit 8, an analog signal transmission unit 9, a first conversion processing unit 10, a second conversion processing unit 11, a first switching unit 12, a second switching unit 13, a third switching unit 14, and the like.
- The analog voice signal (transmission voice signal) output from the microphone 2a is amplified by the amplifier AMP1, converted into a digital voice signal (transmission voice data) by the A/D converter 10a of the first conversion processing unit 10, and input to the call processing unit 2.
- The digital voice signal (received voice signal) after call processing by the call processing unit 2 is converted into an analog received voice signal by the D/A converter 10b of the first conversion processing unit 10, then amplified by the amplifier AMP2 and output to the speaker 2b.
- In the case of a door phone call or an extension call, described later, the digital transmission voice signal (transmission voice data) processed by the call processing unit 2 is converted into an analog transmission voice signal by the D/A converter 11a of the second conversion processing unit 11.
- the digital transmitted voice signal after the call processing by the call processing unit 2 is directly output to the transmission processing unit 7.
- the analog received voice signal output from the analog signal transmission unit 9 is amplified by the amplifier AMP4, converted into a digital received voice signal (received voice data) by the A/D converter 11b of the second conversion processing unit 11, and input to the call processing unit 2.
- the digital received voice signal output from the transmission processing unit 7 is directly input to the call processing unit 2.
- the analog signal transmission unit 9 is composed of a conventionally known 2-wire / 4-wire converter (hybrid transformer).
- the first switching unit 12 is connected to the two-wire side of the analog signal transmission unit 9.
- the first switching unit 12 selectively switches between a state in which the two-wire side of the analog signal transmission unit 9 is connected to the slave unit connection line Lb and a state in which the analog signal transmission unit 9 is connected to the second switching unit 13.
- the second switching unit 13 selectively switches the first switching unit 12 between a state where it is connected to the extension connection line Lc and a state where it is not connected.
- the third switching unit 14 selectively switches between a state in which the slave unit connection line Lb and the extension connection line Lc are connected and a state in which they are not connected. Note that the switching of the first to third switching units 12, 13, and 14 is all controlled by the control unit 1.
- the control unit 1 has a microcomputer as a main component and controls the entire dwelling unit A including the switching control.
- the display unit 3 includes a display device such as a liquid crystal display, a driver circuit that drives the display device, a touch panel as an input device, and the like.
- the video processing unit 4 performs signal processing on the video signal received from the transmission processing unit 7 and displays the video on the display unit 3. Specifically, a video (still image or moving image) of a visitor packet-transmitted from the lobby interphone LI is displayed on the display unit 3.
- the call processing unit 2 includes a microprocessor, an ASIC (Application Specific Integrated Circuit), or a DSP (Digital Signal Processor), performs the various controls and calculations needed for call processing, and applies various kinds of signal processing (call processing) to the voice data (transmission voice data and received voice data).
- the storage unit 5 includes an electrically rewritable nonvolatile semiconductor memory (flash memory or the like), and stores first software and second software.
- the first software is composed of a collection of a plurality of programs for performing various call processing on the audio signal transmitted by the analog signal transmission unit 9 by the analog transmission method.
- the second software is composed of a collection of a plurality of programs for performing various call processing on the audio signal transmitted by the packet transmission method by the transmission processing unit 7. Details of each program will be described later.
- the transmission processing unit 7 performs packet transmission with the control device CT and other dwelling units A via the signal trunk line Ls (including the dwelling unit line Ld, the same applies hereinafter).
- the transmission processing unit 7 divides the control signal (control data) created by the control unit 1 to create packets (control packets), and likewise divides the transmission voice signal (transmission voice data) created by the call processing unit 2 to create packets (voice packets). Further, the transmission processing unit 7 encodes the control packets and the voice packets, converts (modulates) the encoded bit string into an electric signal, and sends the electric signal to the signal trunk line Ls.
- the transmission processing unit 7 converts (demodulates) an electric signal flowing through the signal trunk line Ls into a bit string and decodes a packet (voice packet, control packet, video packet) from the demodulated bit string.
- the transmission processing unit 7 discards the decoded packet if its address does not match its own address (the address of the dwelling unit A). If the address matches, it routes the data contained in the data field of the packet: video data (video signal) is output to the video processing unit 4, control data (control signal) is output to the control unit 1, and audio data (voice signal) is output to the call processing unit 2.
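The address check and routing on the receiving side might look like the sketch below. The dictionary-based packet layout and the `kind` field are hypothetical; the specification only states that the data field carries video, control, or audio data.

```python
def dispatch(packet, own_address, handlers):
    """Route a decoded packet to the matching unit; return True if it was delivered.

    `packet` is assumed to be a dict with "dest", "kind" and "data" keys.
    `handlers` maps "video", "control" and "voice" to callables standing in for
    the video processing unit 4, the control unit 1 and the call processing unit 2.
    """
    if packet["dest"] != own_address:
        return False  # not addressed to this dwelling unit: discard
    handler = handlers.get(packet["kind"])
    if handler is None:
        return False  # unknown payload type: dropped (assumed behavior)
    handler(packet["data"])
    return True
```

Keeping the routing table in a dictionary makes it easy to see that exactly one unit receives each accepted packet.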
- the secondary master communication processing unit 8 encodes and frequency-modulates the control data for the secondary master created by the control unit 1 and transmits it to the secondary master C via the extension connection line Lc. Control data obtained by frequency-demodulating and decoding a control signal transmitted from the sub-master unit C via Lc is passed to the control unit 1.
- the door phone call between the dwelling unit A and the door phone slave unit B will be described.
- when a visitor operates the call button of the door phone slave unit B, a call signal is transmitted from the door phone slave unit B via the slave unit connection line Lb.
- the call detection unit 6 that has detected the call signal outputs a call detection signal to the control unit 1.
- the control unit 1 sounds a ringing tone from the speaker 2b.
- if the door phone slave unit B is equipped with a camera, the camera is activated to image the visitor, and the captured video is transmitted from the door phone slave unit B via the slave unit connection line Lb.
- the video transmitted through the slave unit connection line Lb is displayed on the display unit 3 by the video processing unit 4.
- the control unit 1 controls the first switching unit 12 so that the two-wire side of the analog signal transmission unit 9 is connected to the slave unit connection line Lb, and switches the third switching unit 14 to the disconnected state.
- the call processing unit 2 executes the first software to perform the call processing, so that a resident of the dwelling unit and a visitor can make a door phone call using the dwelling unit A and the door phone slave unit B.
- the control unit 1 that has received the call detection signal causes the secondary master communication processing unit 8 to transmit the door phone call control signal, and switches the third switching unit 14 to the connected state so that the slave unit connection line Lb is connected to the extension connection line Lc.
- the video is thereby transmitted to the secondary master unit C via the extension connection line Lc.
- in the secondary master unit C, a ringing tone is generated from the speaker and the video of the visitor is displayed on the monitor. When the resident who has heard the ringing tone checks the video of the visitor on the monitor and operates the response button of the secondary master unit C, a door phone response control signal is transmitted from the secondary master unit C to the dwelling unit A via the extension connection line Lc.
- the door phone response control signal (control data) is output from the secondary master communication processing unit 8 to the control unit 1, and the control unit 1 that has received the control data keeps the third switching unit 14 in its current connected state.
- a resident of the dwelling unit and a visitor can make a doorphone call using the sub-master C and the doorphone slave unit B.
- the call processing unit 2 of the dwelling unit A does not perform any call processing.
- an extension call between the dwelling unit A and the secondary master unit C will be described.
- a control signal for extension call is transmitted from the secondary master unit C via the extension connection line Lc.
- an extension call control signal (control data) is output from the secondary master communication processing unit 8 to the control unit 1.
- upon receiving the extension call control data, the control unit 1 causes the speaker 2b to ring.
- the control unit 1 controls the first switching unit 12 so that the two-wire side of the analog signal transmission unit 9 is connected to the second switching unit 13, and controls the second switching unit 13 so that the first switching unit 12 is connected to the extension connection line Lc.
- the control unit 1 instructs the call processing unit 2 to load and execute the first software stored in the storage unit 5. Then, as shown in FIG. 4B, the call processing unit 2 executes the first software to perform the call processing, so that residents in the same dwelling unit can make an extension call using the dwelling unit A and the secondary master unit C.
- an extension call control signal transmitted from one secondary master unit C is received not only by the dwelling unit A but also by the other secondary master unit C.
- when the response button is operated on the other secondary master unit C that has received the control signal, a communication path is formed between the two secondary master units C and C via the extension connection line Lc, and residents of the same dwelling unit can make an extension call using the two secondary master units C and C.
- the first software includes a voice switch processing program for switching the call direction, an acoustic echo canceller processing program for suppressing acoustic echo, a line echo canceller processing program for suppressing line echo, and a speech speed conversion processing program for slowing down or speeding up the speed (speech speed) of the other party's voice output from the speaker 2b.
- the call processing unit 2 executing the first software includes a voice switch VS, an acoustic side echo canceller EC1, a line side echo canceller EC2, and a speech rate conversion processing unit SE as shown in FIG.
- the voice switch VS, the acoustic side echo canceller EC1, the line side echo canceller EC2, and the speech rate conversion processing unit SE are realized by a signal processing circuit such as a DSP constituting the call processing unit 2 executing the voice switch processing program, the acoustic echo canceller processing program, the line echo canceller processing program, and the speech speed conversion processing program, respectively.
- the first and second conversion processing units 10 and 11 are not shown.
- the acoustic side echo canceller EC1 has a conventionally known configuration comprising an adaptive filter ADF1 and a subtractor SUB1. The adaptive filter ADF1 adaptively identifies the impulse response of the feedback path (acoustic echo path) H_AC formed by the acoustic coupling between the speaker 2b and the microphone 2a.
- the echo component (acoustic echo) estimated from the reference signal (the output signal to the first conversion processing unit 10) is subtracted by the subtractor SUB1 from the input signal from the first conversion processing unit 10 (the transmission voice signal), thereby suppressing the echo component.
- the line side echo canceller EC2 also has a conventionally known configuration comprising an adaptive filter ADF2 and a subtractor SUB2. The adaptive filter ADF2 adaptively identifies the impulse response of the feedback path (line echo path) H_LIN that arises from the impedance mismatch between the analog signal transmission unit 9 and the transmission path (slave unit connection line Lb or extension connection line Lc).
- the echo component (line echo) estimated from the reference signal (the output signal to the second conversion processing unit 11, that is, the transmission voice signal) is subtracted from the received voice signal by the subtractor SUB2, thereby suppressing the echo component.
- a voice switch VS is provided between the acoustic echo canceller EC1 and the line echo canceller EC2.
- the voice switch VS includes a transmitting side attenuator 100 for attenuating the transmission voice signal, a receiving side attenuator 101 for attenuating the received voice signal, and an insertion loss amount control unit 102 for controlling the attenuation amounts (insertion loss amounts) of the transmitting side and receiving side attenuators 100 and 101.
- the insertion loss amount control unit 102 includes a total loss amount calculation unit 103 and an insertion loss amount distribution processing unit 104.
- the total loss amount calculation unit 103 estimates two feedback gains. The first is the acoustic side feedback gain α of the path returning from the output point Rout of the receiving side attenuator 101 to the input point Tin of the transmitting side attenuator 100 via the acoustic echo path H_AC (hereinafter referred to as the "acoustic side feedback path"). The second is the line side feedback gain β of the path returning from the output point Tout of the transmitting side attenuator 100 to the input point Rin of the receiving side attenuator 101 via the line echo path H_LIN (hereinafter referred to as the "line side feedback path").
- based on the estimated values α′ and β′ of the feedback gains α and β on the acoustic side and the line side, the total loss amount calculation unit 103 calculates the total amount of loss to be inserted into the closed loop, that is, the sum of the insertion loss amounts of the transmitting side attenuator 100 and the receiving side attenuator 101.
- the insertion loss amount distribution processing unit 104 monitors the transmission voice signal and the reception voice signal to estimate the call state, and according to the estimation result and the calculated value of the total loss amount calculation unit 103, the transmission side attenuator 100 and The distribution of each attenuation amount (insertion loss amount) of the receiving side attenuator 101 is determined.
- the total loss amount calculation unit 103 estimates the short-time time-average power of the input signal of the transmitting side attenuator 100 (transmission voice signal) and of the output signal of the receiving side attenuator 101 (received voice signal), using a rectifier smoother, a low-pass filter, or the like. It obtains the minimum value of the estimated time-average power of the output signal of the receiving side attenuator 101 within the maximum delay time assumed for the acoustic side feedback path H_AC, and takes the estimated time-average power of the input signal of the transmitting side attenuator 100 divided by this minimum value as the estimated value α′ of the acoustic side feedback gain α.
- similarly, the total loss amount calculation unit 103 estimates the short-time time-average power of the input signal of the receiving side attenuator 101 (received voice signal) and of the output signal of the transmitting side attenuator 100 (transmission voice signal), using a rectifier smoother, a low-pass filter, or the like. It obtains the minimum value of the estimated time-average power of the output signal of the transmitting side attenuator 100 within the maximum delay time assumed for the line side feedback path H_LIN, and takes the estimated time-average power of the input signal of the receiving side attenuator 101 divided by this minimum value as the estimated value β′ of the line side feedback gain β.
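The estimation described above (short-time level of both signals, minimum of the reference level over the assumed maximum delay, then a ratio) can be sketched as follows. This is only an illustration under stated assumptions: the one-pole smoother stands in for the "rectifier smoother, low-pass filter, or the like", and the class names, the smoothing constant, and the small floor that avoids division by zero are all hypothetical.

```python
from collections import deque


class ShortTimeLevel:
    """Rectify (absolute value) and smooth with a one-pole low-pass filter."""

    def __init__(self, smoothing=0.01):
        self.smoothing = smoothing
        self.value = 0.0

    def update(self, sample):
        self.value += self.smoothing * (abs(sample) - self.value)
        return self.value


class FeedbackGainEstimator:
    """Estimate a feedback gain as returned level / min(reference level in window)."""

    def __init__(self, max_delay_samples, smoothing=0.01, floor=1e-12):
        self.reference = ShortTimeLevel(smoothing)
        self.returned = ShortTimeLevel(smoothing)
        # Reference levels observed within the assumed maximum path delay.
        self.history = deque(maxlen=max_delay_samples)
        self.floor = floor

    def update(self, reference_sample, returned_sample):
        self.history.append(self.reference.update(reference_sample))
        returned_level = self.returned.update(returned_sample)
        return returned_level / max(min(self.history), self.floor)
```

For the acoustic side, `reference_sample` would be the output of the receiving side attenuator 101 and `returned_sample` the input of the transmitting side attenuator 100; for the line side the roles of the two attenuators are swapped.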
- the total loss amount calculation unit 103 calculates the total loss amount Lt necessary to obtain a desired gain margin MG from the estimated values ⁇ ′ and ⁇ ′ of the acoustic side feedback gain ⁇ and the line side feedback gain ⁇ , The value Lt is output to the insertion loss amount distribution processing unit 104.
- the insertion loss amount distribution processing unit 104 monitors the input and output signals of the transmitting side attenuator 100 and of the receiving side attenuator 101, and determines the call state (receiving state, transmitting state, etc.) from the power levels of these signals and from information such as the presence or absence of speech. It then adjusts the attenuation amounts (insertion loss amounts) of the attenuators 100 and 101 so that the total loss amount Lt is distributed between the transmitting side attenuator 100 and the receiving side attenuator 101 at a ratio according to the determined call state.
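A minimal sketch of the distribution step follows. The specification does not give the ratios, so the table below is purely an assumption: it follows the common voice switch convention of placing the loss on the transmitting side while receiving, and on the receiving side while transmitting. The state names and function names are also hypothetical.

```python
# Share of the total loss given to the transmitting side attenuator 100 (assumed values).
TX_SHARE = {"receiving": 1.0, "transmitting": 0.0, "neutral": 0.5}


def distribute_loss(total_loss_db, call_state):
    """Split the total loss Lt [dB] into (transmit side loss, receive side loss)."""
    tx_loss_db = total_loss_db * TX_SHARE[call_state]
    rx_loss_db = total_loss_db - tx_loss_db  # the two parts always sum to Lt
    return tx_loss_db, rx_loss_db


def db_to_gain(loss_db):
    """Convert an insertion loss in dB into a linear amplitude multiplier."""
    return 10.0 ** (-loss_db / 20.0)
```

Because the receive side loss is computed as the remainder, the closed loop always sees the full total loss regardless of the call state.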
- the total loss amount calculation unit 103 has two operation modes: an update mode, in which the total loss amount to be inserted into the closed loop is calculated from the estimated values α′ and β′ of the feedback gains α and β as described above and adaptively updated, and a fixed mode, in which the total loss amount is fixed to a predetermined initial value.
- the total loss amount calculation unit 103 operates in the fixed mode during the period from the start of the call with the other party's call terminal until the echo cancellers EC1 and EC2 on the acoustic side and the line side have sufficiently converged, and operates in the update mode in the period after the echo cancellers EC1 and EC2 have sufficiently converged.
- after the start of a call, the total loss amount calculation unit 103 considers the echo cancellers EC1 and EC2 on the acoustic side and the line side to have sufficiently converged when both estimated values α′ and β′ of the acoustic side feedback gain α and the line side feedback gain β remain, continuously for a predetermined time (several hundred milliseconds), below a predetermined threshold ε (for example, a value 10 dB to 15 dB smaller than the estimated values α′ and β′ at the start of the call).
- the operation mode is then switched from the fixed mode to the update mode, in which the total loss amount is adaptively updated based on the estimated values α′ and β′.
- the initial value of the total loss amount in the fixed mode is set to a value sufficiently larger than the total loss amount updated as needed in the update mode.
- while the total loss amount calculation unit 103 operates in the fixed mode, the initial total loss amount is inserted into the closed loop, so unpleasant echoes (acoustic echoes and line echoes) and howling are suppressed and a stable half-duplex call is realized. Once the echo cancellers EC1 and EC2 on the acoustic side and the line side have sufficiently converged after the start of the call, the operation mode of the total loss amount calculation unit 103 is switched from the fixed mode to the update mode, and the total loss amount inserted into the closed loop decreases to a value sufficiently lower than the initial value, so two-way simultaneous calls can be realized.
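The switch from the fixed mode to the update mode can be sketched as a small state machine. The sketch assumes the gain estimates are supplied in dB, that the thresholds are set a fixed number of dB below the start-of-call estimates, and that the mode never returns to fixed during a call; none of these details are confirmed by the specification, and the names are hypothetical.

```python
class ModeController:
    FIXED = "fixed"
    UPDATE = "update"

    def __init__(self, alpha_start_db, beta_start_db, drop_db=10.0, hold_samples=3):
        # Thresholds: drop_db below the estimates observed at the start of the call.
        self.alpha_threshold_db = alpha_start_db - drop_db
        self.beta_threshold_db = beta_start_db - drop_db
        self.hold_samples = hold_samples  # stands in for "several hundred milliseconds"
        self.count = 0
        self.mode = self.FIXED

    def step(self, alpha_db, beta_db):
        """Feed one pair of estimates; return the operation mode after this sample."""
        if self.mode == self.UPDATE:
            return self.mode  # assumed one-way transition
        if alpha_db < self.alpha_threshold_db and beta_db < self.beta_threshold_db:
            self.count += 1
        else:
            self.count = 0  # the condition must hold continuously
        if self.count >= self.hold_samples:
            self.mode = self.UPDATE
        return self.mode
```

Resetting the counter whenever either estimate rises above its threshold is what enforces the "continuously for a predetermined time" condition.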
- from the time of the transition from the fixed mode to the update mode, the total loss amount calculation unit 103 estimates the acoustic side feedback gain α and the line side feedback gain β at a predetermined sampling period to obtain the estimated values α′(n) and β′(n) (step 1). From the product of these two estimated values α′(n) and β′(n) and the gain margin MG, it calculates the desired total loss amount Lr(n) required to keep the gain margin of the closed loop at MG [dB], using the following equation (step 2).
- $$L_r(n) = 20\log_{10}\left|\alpha'(n)\cdot\beta'(n)\right| + MG$$
- here α′(n), β′(n), and Lr(n) denote the estimated values of the feedback gains and the desired total loss amount calculated at the n-th sampling after the transition to the update mode.
- the total loss amount calculation unit 103 compares the n-th desired total loss amount Lr(n) calculated from the above equation with the previous ((n−1)-th) total loss amount Lt(n−1), that is, the total loss amount from the previous processing.
- by limiting each increase or decrease of the total loss amount to a small step Δi or Δd, the total loss amount calculation unit 103 avoids audible discomfort even when the acoustic side feedback gain α and the line side feedback gain β change drastically, for example immediately after the start of a call with the other party's call terminal (door phone slave unit B or secondary master unit C), while the echo cancellers EC1 and EC2 on the acoustic side and the line side are still actively updating their coefficients toward convergence.
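The update-mode calculation can be illustrated with the sketch below. It assumes the desired total loss has the form 20·log10|α′·β′| + MG, which is a reconstruction from the surrounding description, and that the total loss moves toward the desired value by at most one small step per sample. The step sizes (here `step_up_db` and `step_down_db`), the clamping at the desired value, and the non-negative floor are assumptions added for the example.

```python
import math


def desired_total_loss_db(alpha_est, beta_est, margin_db, floor=1e-12):
    """Desired total loss Lr(n) [dB] that keeps a gain margin of margin_db."""
    loop_gain = max(abs(alpha_est * beta_est), floor)  # floor avoids log10(0)
    return 20.0 * math.log10(loop_gain) + margin_db


def next_total_loss_db(previous_db, desired_db, step_up_db, step_down_db):
    """Move Lt(n-1) toward Lr(n) by at most one small step; never go below 0 dB."""
    if desired_db > previous_db:
        updated = min(previous_db + step_up_db, desired_db)
    elif desired_db < previous_db:
        updated = max(previous_db - step_down_db, desired_db)
    else:
        updated = previous_db
    return max(updated, 0.0)  # assumption: the voice switch never inserts gain
```

With a large fixed-mode initial value as `previous_db`, repeated calls walk the total loss down gradually instead of jumping, which is the behavior the description attributes to the small step sizes.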
- the speech rate conversion processing unit SE converts the speech rate of the original speech by expanding or compressing the speech (received speech).
- based on the conventionally known speech speed conversion algorithm called PICOLA (Pointer Interval Controlled OverLap and Add), the speech speed is converted (made faster or slower) by inserting or deleting waveforms in pitch units.
- "pitch" is the pitch of the voice determined by the vibration period of the vocal cords. If the vibration period of the vocal cords is short, the voice is high, and if the vibration period is long, the voice is low.
- when the speech speed conversion processing unit SE performs the speech speed conversion process during a door phone call with the door phone slave unit B or an extension call with the secondary master unit C, the speech speed of the other party's voice output from the speaker 2b of the dwelling unit A can be made faster or slower than the speed at which the other party actually spoke.
- the intercom call between the dwelling unit A and the lobby intercom LI will be described.
- when a dwelling unit number is entered at the lobby intercom LI, a packet storing the dwelling unit number in the data field and a packet storing the video of the visitor captured by the imaging device (video data) in the data field are transmitted (packet transmission) from the transmission unit to the address of the control device CT via the signal trunk line Ls.
- the control device CT sends a packet storing a call command for notifying a call from the lobby intercom LI in the data field and a packet storing the video data in the data field to the signal trunk line Ls.
- when the transmission processing unit 7 receives the packets via the dwelling unit line Ld, it outputs the call command (control signal) stored in the data field of the packet to the control unit 1, and outputs the video data stored in the data field to the video processing unit 4.
- when the control unit 1 receives the call command, the control unit 1 causes the speaker 2b to ring.
- the video processing unit 4 processes the video signal received from the transmission processing unit 7 and causes the display unit 3 to display the video of the visitor.
- when the resident who has heard the ringing tone checks the video of the visitor displayed on the display unit 3 of the dwelling unit A and then operates the response button, the control unit 1 instructs the call processing unit 2 to load and execute the second software stored in the storage unit 5. Then, as shown in FIG. 5A, the call processing unit 2 executes the second software to perform the call processing, so that the resident of the dwelling unit and the visitor can make an intercom call using the dwelling unit A and the lobby intercom LI.
- as shown on the left side of the same figure, the lobby intercom LI has almost the same configuration as the dwelling unit A on the right side of FIG. 5A, except that it has no speech speed conversion processing unit SE. Components having the same functions as those of the dwelling unit A are given the same reference numerals.
- when the manager operates the numeric keypad or the touch panel of the management room device X and the dwelling unit number of any dwelling unit is entered, the management room device X transmits (by packet transmission) a packet storing the dwelling unit number in the data field from the transmission unit to the address of the control device CT via the signal trunk line Ls.
- the control device CT sends a packet storing a call command for notifying a call from the management room device X in the data field to the signal trunk line Ls.
- when the transmission processing unit 7 receives the packet via the dwelling unit line Ld, it outputs the call command (control signal) stored in the data field of the packet to the control unit 1.
- the control unit 1 that has received the call command causes the speaker 2b to ring.
- the control unit 1 instructs the call processing unit 2 to load and execute the second software stored in the storage unit 5. Then, as shown in FIG. 5B, the call processing unit 2 executes the second software to perform the call processing, so that the resident of the dwelling unit and the manager can make an intercom call using the dwelling unit A and the management room device X.
- the management room device X has substantially the same configuration as the dwelling unit A shown on the right side of the same figure, so components having the same functions are given the same reference numerals.
- when the secondary master unit C responds to a call from the lobby intercom LI or the management room device X, the call processing unit 2 of the dwelling unit A executes the second software as shown in the figure, so that the resident of the dwelling unit and the visitor or manager can make an intercom call using the secondary master unit C and the lobby intercom LI or the management room device X.
- the intercom call between the dwelling units A installed in different dwelling units will be described.
- when the resident operates the numeric keypad of the dwelling unit A and the dwelling unit number of another dwelling unit is entered, the dwelling unit A transmits (by packet transmission) a packet storing the dwelling unit number in the data field from the transmission unit to the address of the control device CT via the signal trunk line Ls.
- the control device CT sends a packet storing a call command for notifying the call from the dwelling unit A in the data field to the signal trunk line Ls.
- the call command (control signal) stored in the data field of the packet is output to the control unit 1.
- the control unit 1 causes the speaker 2b to ring.
- the control unit 1 instructs the call processing unit 2 to load and execute the second software stored in the storage unit 5. Then, as shown in FIG. 5C, the call processing unit 2 in the dwelling unit A of each dwelling unit executes the second software to perform call processing, so that residents in different dwelling units can make intercom calls using their dwelling units A.
- the second software includes a voice switch processing program for switching the call direction, an acoustic echo canceller processing program for suppressing acoustic echo, an echo suppressor processing program for suppressing residual echo, a voice data loss compensation processing program for compensating for the loss of voice data caused by packet loss associated with packet transmission, a fluctuation absorption processing program for absorbing the delay and fluctuation (jitter) associated with packet transmission, and a speech speed conversion processing program for slowing down or speeding up the speed (speech speed) of the other party's voice output from the speaker 2b.
- the call processing unit 2 executing the second software includes a voice switch VS, an acoustic side echo canceller EC1, an echo suppressor ES, a speech speed conversion processing unit SE, a voice data loss compensation processing unit VC, and a fluctuation absorption processing unit JA.
- the voice switch VS, the acoustic side echo canceller EC1, the echo suppressor ES, the speech speed conversion processing unit SE, the voice data loss compensation processing unit VC, and the fluctuation absorption processing unit JA are realized by a signal processing circuit such as a DSP constituting the call processing unit 2 executing the respective programs.
- the voice switch VS has the same configuration as the voice switch VS when the first software is executed, and therefore detailed illustration of the configuration is omitted.
- the voice switch VS in the second software differs from the voice switch VS in the first software in that the total loss amount calculated by the total loss amount calculation unit 103 is reduced according to the amount by which the estimated value α′ of the acoustic side feedback gain α has decreased.
- with the first software, the total loss amount calculation unit 103 needs to calculate the total loss amount taking into account two types of feedback gain, the acoustic side feedback gain α and the line side feedback gain β.
- in the packet transmission system, no line side feedback path is formed, so there is no need to consider the line side feedback gain β. Therefore, in the voice switch VS in the second software, reducing the total loss amount calculated by the total loss amount calculation unit 103 according to the amount by which the estimated value α′ of the acoustic side feedback gain α has decreased, as described above, allows a two-way simultaneous call to be realized more reliably.
- the echo suppressor ES is provided between the transmission processing unit 7 and the voice switch VS in the signal path of the transmission voice signal, and attenuates residual echo (acoustic echo that could not be suppressed by the acoustic side echo canceller EC1; the same applies hereinafter). In the packet transmission system, which divides voice data into packets for transmission, the transmission delay is longer than in the analog transmission system, and a residual echo that cannot be suppressed by the acoustic side echo canceller EC1 occurs, so the amount of echo suppression must be increased by the echo suppressor ES. Note that the echo suppressor ES needs to attenuate the residual echo effectively while leaving the voice signal that should be transmitted (the transmission voice signal) unattenuated.
- the echo suppressor ES attenuates the transmission voice signal in conjunction with the voice switch VS, and specifically operates as shown in the flowchart of FIG. That is, the echo suppressor ES constantly monitors the state of the voice switch VS (the call state, receiving or transmitting, estimated by the insertion loss amount distribution processing unit 104) (step 1). When the voice switch VS is in the receiving state, the echo suppressor ES assumes that there is no transmission voice signal to be sent on the signal path and attenuates the input signal by multiplying it by an attenuation coefficient (step 2).
- when the voice switch VS is not in the receiving state, the echo suppressor ES determines that there is no residual echo to be cancelled or that there is a transmission voice signal to be transmitted, and outputs the input signal as it is, without multiplying it by the attenuation coefficient (step 3).
- the residual echo generated in the signal path of the transmission voice signal because of the transmission delay can thus be attenuated by the echo suppressor ES, so two-way simultaneous calls can be reliably realized even in the packet transmission method.
- if the echo suppressor ES attenuated the transmission voice signal when the voice switch VS is not in the receiving state, for example in the transmitting state, the voice of the near-end speaker (the resident talking on the dwelling unit A) could be attenuated inadvertently, causing the volume of the near-end speaker heard on the other party's call device to fluctuate.
- because the echo suppressor ES attenuates the input signal only when the voice switch VS is in the receiving state and does not attenuate it otherwise, only the unpleasant echo (residual echo) during a call is attenuated, without causing such volume fluctuation.
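The gating behavior of the echo suppressor ES reduces to a few lines. In this sketch the state name `"receiving"`, the function name, and the default attenuation coefficient are illustrative assumptions; the specification does not give a value for the coefficient.

```python
def echo_suppress(samples, voice_switch_state, attenuation=0.1):
    """Attenuate the transmit path only while the voice switch is in the receiving state."""
    if voice_switch_state == "receiving":
        # Step 2: no near-end speech is expected, so anything present is residual echo.
        return [s * attenuation for s in samples]
    # Step 3: pass the transmission voice signal through unchanged.
    return list(samples)
```

For example, `echo_suppress(frame, "transmitting")` returns the frame untouched, so the near-end speaker's volume is not modulated by the suppressor.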
- the speech speed conversion processing unit SE is realized by executing the same program as the speech speed conversion processing program included in the first software, and thus the description thereof is omitted.
- FIG. 9 is a waveform diagram of an audio signal for explaining the basic principle of audio data loss compensation processing (hereinafter abbreviated as “compensation processing”).
- the vertical axis indicates the intensity of the received voice signal input from the transmission processing unit 7 to the call processing unit 2, and the horizontal axis indicates time.
- When a packet loss occurs, the voice data loss compensation processing unit VC sets the received voice signal of a predetermined period immediately before the packet loss as a reference signal (template).
- The template is slid toward the past from the time of the packet loss relative to the received voice signal, the correlation between the template and the received voice signal is calculated, and the basic period (pitch) of the received voice signal immediately before the packet loss is detected.
- The received voice signal for one pitch is then extracted going back into the past and applied repeatedly to the loss period (the period in which voice data is missing), so that the loss period is compensated by the received voice signal for one pitch.
- This works because, for example, when the speaker utters the sound “A”, that sound is divided into segments of about 20 msec (packetization), each carried in one voice packet, so the one-pitch received voice signal immediately before the packet loss is likely to repeat during the loss period.
- The voice data loss compensation processing unit VC includes a delay fluctuation absorbing buffer (jitter buffer) 20, a timer 21, a packet loss detection unit 22, a detection processing unit 23, and a compensation processing unit 24. Each of these units is realized by the DSP of the call processing unit 2 executing a voice data loss compensation processing program.
- The transmission processing unit 7 outputs the received voice signal (received voice data) to the jitter buffer 20 in chronological order according to the sequence number.
- the voice packet header includes a time stamp in addition to the sequence number.
- the sequence number indicates the transmission order of the voice packets, and the time stamp indicates the relative position of the voice signal in the original voice waveform.
- the jitter buffer 20 temporarily holds the received voice data output from the transmission processing unit 7, delays it for a predetermined time, and outputs it to the detection processing unit 23 to absorb the delay fluctuation of the voice packet.
- the timer 21 is used when the packet loss detection unit 22 detects a packet loss.
- The packet loss detection unit 22 starts the timer 21 when the jitter buffer 20 outputs received voice data to the detection processing unit 23. If the time measured by the timer 21 exceeds a predetermined time, beyond which a packet loss is assumed to have occurred, before the jitter buffer 20 outputs the next received voice data, the packet loss detection unit 22 determines that a packet loss has occurred.
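A minimal sketch of this timer-based check follows. The class name, method names, and the 30 msec threshold used in the usage note are assumptions for illustration; the description only says that the threshold is a predetermined time.

```python
class PacketLossDetector:
    """Flags a packet loss when the jitter buffer produces no output for too long."""

    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms   # assumed threshold, e.g. a bit over one packet period
        self.last_output_ms = None     # time of the most recent jitter-buffer output

    def on_buffer_output(self, now_ms):
        """Restart the timer each time the jitter buffer outputs voice data."""
        self.last_output_ms = now_ms

    def loss_detected(self, now_ms):
        """Return True if the timer has run past the threshold without a new output."""
        if self.last_output_ms is None:
            return False               # nothing has been output yet, so nothing can be lost
        return (now_ms - self.last_output_ms) > self.timeout_ms
```

With a 20 msec packetization period, a threshold of about 30 msec would tolerate ordinary scheduling slack while still flagging a missing packet before the following one is due.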
- When a packet loss is detected by the packet loss detection unit 22, the detection processing unit 23 performs basic period (pitch) detection processing on the received voice data output from the jitter buffer 20; when no packet loss is detected, it performs no processing on the received voice data. The detection processing unit 23 holds received voice data for a certain past period.
- the detection processing unit 23 includes a template setting unit 23a and a pitch detection unit 23b.
- When a packet loss has occurred, the template setting unit 23a sets, as a template, received voice data of a predetermined time width extending from the loss occurrence time into the past.
- The template setting unit 23a increases the time width of the template as the pitch detection unit 23b increases the slide amount of the template.
- The pitch detection unit 23b slides the template set by the template setting unit 23a toward the past from the loss occurrence time relative to the received voice data, obtains the cross-correlation between the template and the received voice data, and detects the pitch of the received voice signal immediately before the loss occurrence time from the slide amount at which the strongest correlation peak appears.
- FIG. 10 is a waveform diagram of a received voice signal for explaining the processing of the template setting unit 23a and the pitch detection unit 23b.
- The vertical axis in FIG. 10 shows the intensity of the received voice signal.
- the horizontal axis shows time by the number of samples.
- The template TJ shown in FIG. 10 is the template used in the conventional compensation process.
- In the conventional process, the received voice signal for a predetermined period going back from the loss occurrence time RT is set as the template TJ. The template TJ is then slid toward the past from the loss occurrence time RT relative to the received voice signal, the cross-correlation between the received voice signal and the template TJ is obtained, and the pitch of the received voice signal is detected from the slide amount of the template TJ at which the strongest correlation peak is obtained.
- FIG. 11 is a graph showing the calculation result of the correlation value between the template TJ and the received voice signal when the conventional template TJ is used.
- The correlation value is calculated using the conventionally known average magnitude difference function (AMDF).
- the vertical axis indicates the correlation value
- the horizontal axis indicates the time when the loss occurrence time RT is 0 as the number of samples.
- Because FIG. 11 shows correlation values computed by AMDF, a smaller value indicates a stronger correlation between the received voice signal and the template TJ.
- A downward-convex correlation peak PK1 appears at 37 samples, followed by a downward-convex correlation peak PK2 at 47 samples; thereafter, downward-convex correlation peaks appear repeatedly at a period of approximately 37 samples.
- The correlation peak PK1 is smaller than the correlation peak PK2. Therefore, in the conventional method, 37 samples is detected as the pitch of the received voice signal.
- However, the pitch of the received voice signal immediately before the loss occurrence time RT is 47 samples. The conventional method therefore does not accurately detect the pitch of the received voice signal immediately before the loss occurrence time RT.
- The reason is considered to be as follows. The time width of the template TJ is much larger than 47 samples, and the template TJ contains only one period of the received voice signal with the 47-sample pitch that is to be detected, but three periods of the received voice signal with the 37-sample pitch that is not to be detected. A strong correlation peak therefore appears at 37 samples.
- As a result, the pitch of 47 samples cannot be detected.
- the time width of the template TM is increased as the slide amount of the template TM is increased as shown in FIG.
- When the template TM has been slid to some extent, as with the template TM shown in the third row of FIG. 10, the template contains only the received voice signal with the 47-sample pitch that is to be detected.
- The template TM in the fourth row of FIG. 10 contains a received voice signal with a pitch of 37 samples in addition to the received voice signal with a pitch of 47 samples. Therefore, the correlation between the third-row template TM and the received voice signal is stronger than the correlation between the fourth-row template TM and the received voice signal, and the pitch of the received voice signal immediately before the loss occurrence time RT can be detected with high accuracy.
- The pitch detection unit 23b adopts, for example, the AMDF shown in equation (1) as the correlation calculation.

$$\phi(\tau) = \frac{1}{N} \sum_{j=k+1}^{k+N} \left| x(j) - x(j-\tau) \right|, \qquad N = a\,\tau \qquad (1)$$

- Here, φ(τ) is the correlation value, N is the time width of the template TM, x(j) is the template TM, x(j−τ) is the received voice signal, k + 1 is the starting point of the template TM, a is a constant determined in advance, τ is the slide amount of the template TM, and j is the sampling number of each sampling point of the received voice signal.
- the template setting unit 23a sets the time width of the template TM to a predetermined initial time width until the slide amount of the template TM reaches a predetermined slide reference value.
- Because the time width of the template TM is held at the initial time width in this range, the time width of the template TM stays at or above a certain size even when the slide amount is small.
- As a result, the correlation between the template TM and the received voice signal (input signal) can be obtained with higher accuracy.
- Although the time width of the template TM is held at the initial time width until the slide amount of the template TM reaches the slide reference value, the amount of calculation can be reduced by making the initial time width relatively short.
- As the initial time width, it is preferable to adopt the assumed minimum value of the pitch of the received voice signal.
- As the slide reference value, the initial time width may be adopted, for example.
- FIG. 12 is a diagram for explaining processing of the template setting unit 23a and the pitch detection unit 23b.
- Each point on the straight line shown in FIG. 12 indicates a sampling point of the received voice signal.
- the rightmost sampling point indicates a loss occurrence time RT, and each sampling point indicates a past sampling point toward the left.
- the loss occurrence time RT is set as the 0th sampling point.
- the pitch of the received voice signal is about 3 msec in a short case, and if the sampling frequency is 8 kHz, it corresponds to 24 samples. Therefore, the initial time width may be 24 samples, for example.
- The template setting unit 23a sets the received voice signals x(k+1) to x(k+4) as the template TM0.
- The pitch detection unit 23b calculates a correlation value φ(0) between the template TM0 and the received voice signal x(j−0) using equation (1).
- In this case, the template TM0 is applied to the voice signals x(k+1) to x(k+4).
- When the slide amount is 1, the template TM0 is applied to the voice signals x(k) to x(k+3).
- When the slide amount is 5, the template setting unit 23a sets the voice signals x(k+1) to x(k+5) as the template TM5.
- The pitch detection unit 23b obtains a correlation value φ(5) between the template TM5 and the voice signal x(j−5) using equation (1).
- In this case, the template TM5 is applied to the voice signals x(k−4) to x(k).
- The pitch detection unit 23b repeats the above processing until the slide amount τ reaches the maximum slide amount τmax, obtaining φ(τ) for each slide amount. As a result, the time width of the template TM is increased as the slide amount increases.
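The search described above can be sketched as follows. The sketch assumes that equation (1) is a normalized AMDF, that is, the mean absolute difference between the template and the shifted signal. It also assumes that the template width equals the initial width until the slide amount passes the slide reference value (taken here to be the initial width), and is a times the slide amount afterwards. Picking correlation peaks is replaced by a plain minimum search starting at an assumed smallest lag `min_lag`. All function and parameter names are illustrative.

```python
def amdf(x, lag, width):
    """Mean absolute difference between the newest `width` samples of x
    and the same window shifted `lag` samples into the past."""
    end = len(x)
    total = 0.0
    for j in range(end - width, end):
        total += abs(x[j] - x[j - lag])
    return total / width


def detect_pitch(x, min_lag, max_lag, initial_width, a=1.0):
    """Return the lag (in samples) with the smallest AMDF value.

    x holds the samples received before the loss; x[-1] is the newest one.
    The template grows with the lag once the lag exceeds `initial_width`.
    """
    best_lag, best_value = None, float("inf")
    for lag in range(min_lag, max_lag + 1):
        # Assumed rule: fixed initial width for small lags, a * lag afterwards.
        width = initial_width if lag <= initial_width else int(round(a * lag))
        if width + lag > len(x):
            break  # not enough history to place the shifted window
        value = amdf(x, lag, width)
        if value < best_value:  # strict comparison keeps the shortest matching lag
            best_lag, best_value = lag, value
    return best_lag
```

For a test signal that repeats every 5 samples, such as `[0, 3, 1, 4, 2] * 8`, calling `detect_pitch(x, min_lag=2, max_lag=12, initial_width=4)` returns 5. The lag of 10 also gives a zero difference, but the strict comparison keeps the shorter period.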
- FIG. 13 shows a graph of the correlation value φ(τ) obtained for the received voice signal shown in FIG. 10 using the method according to the present embodiment.
- The vertical axis indicates the correlation value φ(τ).
- The horizontal axis indicates time in terms of the number of samples.
- The correlation value φ(τ) is calculated by AMDF. Therefore, as in FIG. 11, a correlation peak with a lower correlation value indicates a stronger correlation between the received voice signal and the template TM.
- The correlation peak PK1, obtained when the template TM is shifted by 47 samples, is the smallest.
- The pitch detection unit 23b therefore detects 47 samples, the slide amount at which the minimum correlation peak PK1 appears, as the pitch of the received voice signal immediately before the loss occurrence time RT. The pitch detection unit 23b can thus detect the 47-sample pitch of the received voice signal immediately before the loss occurrence time RT.
- The compensation processing unit 24 extracts, going back from the loss occurrence time RT, the received voice signal for one pitch detected by the pitch detection unit 23b, and uses the extracted received voice signal to compensate for the loss period in which the packet loss occurred.
- For example, when the received voice signal shown in FIG. 10 is input and the pitch detection unit 23b detects 47 samples as the pitch, the compensation processing unit 24 extracts the 47 samples of received voice signal preceding the loss occurrence time RT and applies the extracted signal repeatedly until the end of the loss period, thereby compensating for the loss period.
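The repetition step can be illustrated with a short Python sketch. The function name `conceal_loss` and its arguments are hypothetical. Smoothing at the joins between repetitions is not described in this passage, so it is not modelled here.

```python
def conceal_loss(history, pitch, loss_length):
    """Fill a gap of `loss_length` samples by repeating the last pitch cycle.

    history     -- samples received before the loss (newest sample last)
    pitch       -- detected pitch in samples
    loss_length -- number of missing samples to synthesize
    """
    if pitch <= 0 or pitch > len(history):
        raise ValueError("pitch must be between 1 and len(history)")
    cycle = history[-pitch:]  # the one-pitch segment immediately before the loss
    # Repeat the cycle until the end of the loss period, truncating the last copy.
    return [cycle[i % pitch] for i in range(loss_length)]
```

With a pitch of 47 samples and a 20 msec loss at 8 kHz (160 samples), the cycle would be repeated a little more than three times.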
- FIG. 14 is a flowchart showing the procedure of the operation (audio data loss compensation processing) of the audio data loss compensation processing unit VC.
- the pitch detection unit 23b sets a reference sampling point k so that k + 1 becomes the starting point of the template TM, and assigns a sampling number to each sampling point (step S4).
- the pitch detection unit 23b calculates a correlation value between the template TM and the received voice signal using the equation (1) (step S5).
- In step S7, if τ ≥ the slide reference value, the pitch detection unit 23b advances the process to step S8; if τ < the slide reference value, the process returns to step S5.
- Through the loop closed by step S7, the template TM having the initial time width is slid toward the past relative to the received voice signal until the slide amount reaches the slide reference value.
- In step S8, if τ < τmax, the process returns to step S3, and the processes of steps S3 to S8 are repeated until τ ≥ τmax. Thereby, the time width of the template TM is increased as the slide amount τ increases.
- In step S8, when τ ≥ τmax, the pitch detection unit 23b detects correlation peaks from the correlation values calculated in step S5, identifies the slide amount of the correlation peak at which the correlation between the template TM and the received voice signal is strongest, and detects the pitch from the identified slide amount (step S9).
- Here, the correlation peak with the minimum correlation value indicates the strongest correlation between the template TM and the received voice signal.
- the pitch detection unit 23b may calculate the pitch by multiplying the specified slide amount by the sampling period of the audio signal.
- The compensation processing unit 24 extracts the received voice signal according to the pitch detected in step S9, and compensates the loss period using the extracted received voice signal (step S10).
- The constant a is set in the range 1 ≤ a ≤ 2 until the slide amount of the template TM exceeds a predetermined change reference value.
- The value of a may then be decreased gradually so as to approach 1 as the slide amount approaches the maximum slide amount (τmax).
- As the change reference value, the slide reference value described above can be adopted, for example.
- In this way, when the slide amount is small, the time width of the template TM can be set larger than the slide amount, and when the slide amount is large, the time width of the template TM can be set to a value close to the slide amount. This prevents the accuracy of the correlation calculation from dropping because the time width of the template TM has become too small at small slide amounts.
- the received voice signal having a time width from the packet loss occurrence time point RT to the past is set as the template TM. Then, the set template TM is slid toward the past from the present time with respect to the received voice signal. Then, the correlation between the template TM and the received voice signal is obtained, and the pitch of the received voice signal is detected.
- The time width of the template TM increases as the slide amount increases. Therefore, at a relatively early stage where the slide amount is small, there is a point at which the template TM consists of approximately one pitch of the received voice signal immediately before the current time. At this point, a strong correlation peak appears between the template TM and the received voice signal. When the slide amount increases further, the time width of the template TM increases accordingly and the template TM comes to include a plurality of frequency components, so no correlation peak stronger than the one obtained at that point can be obtained. The pitch of the received voice signal immediately before the current time can therefore be detected accurately.
- The fluctuation absorption processing unit JA includes a jitter buffer 30, a counting unit 31, a buffer size changing unit 32, a reception time recording unit 33, a reference value storage unit 34, a concealment processing unit 35, an output unit 36, and an observation history holding unit 37.
- these units are realized by the DSP of the call processing unit 2 executing a fluctuation absorbing processing program in the second software.
- the jitter buffer 30 is shared with the jitter buffer 20 of the audio data loss compensation processing unit VC.
- the reception time recording unit 33 records the time (time stamp) when the transmission processing unit 7 receives the voice packet (received voice packet) in association with the sequence number of the received packet.
- the jitter buffer 30 is configured by, for example, a ring buffer, and accumulates packets received by the transmission processing unit 7 in chronological order. As a result, fluctuations in the transmission delay of the voice packet transmitted via the signal trunk line Ls are absorbed. As the size of the jitter buffer 30, a size larger than a reference value described later is adopted.
- the counting unit 31 calculates a packet count value by counting the number of accumulated packets accumulated in the jitter buffer 30 at a predetermined period (count period) that is equal to or less than a period in which voice is packetized (packetization period).
- the packet count value calculated by the count unit 31 is held in the observation history holding unit 37.
- the observation history holding unit 37 is composed of, for example, a volatile semiconductor memory, and holds the packet count value of the past N (N is a positive integer) calculated by the counting unit 31.
- FIG. 16 is an explanatory diagram of packet count value calculation processing by the count unit 31. As shown in FIG. 16, the count unit 31 calculates a packet count value at the count cycle Tb.
- The counting unit 31 sets the count value to ΔT/Ta for each packet PS received within the packetization period Ta before the calculation time Tk, which is the timing at which the packet count value is calculated, and sets the count value to 1 for each packet PL received earlier than the packetization period Ta before the calculation time Tk. Here ΔT is the difference between the calculation time Tk and the reception time of the packet. The count value of a packet PS therefore becomes smaller as its reception time approaches the calculation time Tk and ΔT decreases.
- For the packets PS, the reception time is used in calculating the packet count value, so the reception time must be held.
- For the packets PL, the reception time is not needed for calculating the packet count value, so it need not be recorded.
- At the next calculation time Tk+1, the counting unit 31 can still acquire the reception times of the packets received within the packetization period Ta before that time. In this way, the capacity of the reception time recording unit 33 can be saved.
- The buffer size changing unit 32 reads the past N packet count values calculated by the counting unit 31 from the observation history holding unit 37, and takes the nth smallest of the N packet count values as the representative value of the packet count value. If the representative value is larger than a predetermined reference value, packets stored in the jitter buffer 30 are deleted; if the representative value is smaller than the reference value, packets are inserted into the jitter buffer 30. The reference value is stored in the reference value storage unit 34.
- When the representative value is smaller than the reference value, the buffer size changing unit 32 may insert packets into the jitter buffer 30 so that the representative value becomes not less than the reference value and less than the reference value + 1. For example, when the representative value is 2.1 and the reference value is 4, two packets are inserted into the jitter buffer 30 so that the representative value becomes 4.1. Likewise, when the representative value is larger than the reference value, the buffer size changing unit 32 may delete packets from the jitter buffer 30 so that the representative value becomes not less than the reference value and less than the reference value + 1. For example, when the representative value is 4.2 and the reference value is 2, two packets are deleted from the jitter buffer 30 so that the representative value becomes 2.2.
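The two numerical examples above follow from a simple rule, sketched here in Python. The function names are hypothetical, and the use of floor and ceiling is an inference from the stated target range (not less than the reference value and less than the reference value + 1), not wording from the description.

```python
import math


def representative_value(count_values, n):
    """Return the nth smallest of the stored packet count values (n starts at 1)."""
    return sorted(count_values)[n - 1]


def buffer_adjustment(representative, reference):
    """Number of packets to insert (positive) or delete (negative) so that
    reference <= representative < reference + 1 afterwards."""
    if representative >= reference + 1:
        return -math.floor(representative - reference)  # too many packets waiting
    if representative < reference:
        return math.ceil(reference - representative)    # buffer running low
    return 0                                            # already inside the target range
```

Here `buffer_adjustment(2.1, 4)` gives 2 (insert two packets, reaching 4.1) and `buffer_adjustment(4.2, 2)` gives -2 (delete two packets, reaching 2.2), matching the examples in the text.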
- n it is preferable to adopt a value rounded to an integer value by N ⁇ ⁇ .
- As the reference value, a value determined in advance from the call delay time that the intercom system for collective housing allows in an interphone call (a call using the packet transmission method) is adopted. If the number of packets stored in the jitter buffer 30 is larger than the reference value, more packets wait for output in the jitter buffer 30, and a call delay occurs. Therefore, as described above, when the representative value, which is the nth smallest packet count value, is larger than the reference value, call delay can be prevented by deleting packets from the jitter buffer 30.
- Conversely, when the representative value is smaller than the reference value, packets are inserted into the jitter buffer 30.
- The concealment processing unit 35 performs packet loss concealment processing on invalid packets (packets that do not include voice; the same applies hereinafter) inserted into the jitter buffer 30, and also performs it when the jitter buffer 30 runs out of packets.
- As the packet loss concealment processing, the following technique may be adopted, for example: the pitch of the received voice signal is detected from the received voice signal preceding the invalid packet, the voice waveform of the one-pitch section at the end of the valid packet (a packet that includes voice; the same applies hereinafter) immediately before the invalid packet is taken out, and a voice waveform obtained by repeating this waveform for the packetization period (for example, 20 msec) is generated as the received voice signal of the invalid packet.
- For the pitch detection, the same method as the pitch detection processing in the voice data loss compensation processing described above may be employed.
- When the number of packets stored in the jitter buffer 30 exceeds the reference value, the output unit 36 reads packets (received voice data) from the jitter buffer 30 in chronological order in synchronization with the packetization period Ta, and outputs them to the signal path of the received voice signal.
- For an invalid packet, or when the jitter buffer 30 is depleted, the output unit 36 causes the concealment processing unit 35 to execute the packet loss concealment processing and outputs the voice data resulting from that processing.
- the observation history holding unit 37 is configured by, for example, a non-volatile semiconductor memory, and holds the packet count value of the past N times calculated by the counting unit 31.
- FIG. 17 is a diagram for explaining the role of the jitter buffer 30.
- a packet including a received voice signal is transmitted from the other party's call terminal (lobby interphone LI, management room device X, or other dwelling unit) at a packetization period (20 msec in the illustrated example).
- FIG. 17 shows a situation in which 8 packets with numbers 1 to 8 (sequence numbers) are transmitted at intervals of 20 msec.
- the packet transmitted from the other party's call terminal is received by the dwelling unit A via the signal trunk line Ls.
- The time taken for each voice packet, transmitted from the other party's call terminal at the packetization period, to reach the dwelling unit A (the transmission delay) differs greatly from packet to packet, and so-called transmission delay fluctuation occurs. The intervals at which the dwelling unit A receives voice packets are therefore unequal.
- The jitter buffer 30 is provided to absorb this transmission delay fluctuation.
- the buffer size of the jitter buffer 30 is three packets.
- The output unit 36 starts output by performing decoding and D/A conversion on the first packet at time T1, when the delay time Td has elapsed since the first packet was received.
- At time T2, which is 20 msec after time T1 and is the output time of the second packet, the jitter buffer 30 holds the second packet. Therefore, the output unit 36 can output the second packet at time T2.
- Because the third packet has an extremely large transmission delay, it has not reached the dwelling unit A at time T3, and the jitter buffer 30 is depleted. For this reason, the output unit 36 cannot output the third packet at time T3, and sound loss (voice data loss) occurs.
- the third to seventh packets reach the dwelling unit A continuously in a short time after the congestion is eliminated.
- The jitter buffer 30 then holds the fifth and sixth packets. Because the jitter buffer 30 still has free space, the seventh packet is stored in the jitter buffer 30 rather than discarded, and is output from the output unit 36 at time T7.
- If the buffer size of the jitter buffer 30 is fixed, it must be set sufficiently larger than the assumed transmission delay fluctuation. If the buffer size of the jitter buffer 30 and the delay time Td are both made sufficiently long, sound loss can be prevented, but a long delay time Td increases the number of packets waiting for output in the jitter buffer 30, and a call delay occurs.
- FIG. 18 shows an example of a transmission delay characteristic graph showing the relationship between the transmission delay and the frequency of occurrence of the transmission delay.
- the vertical axis indicates the occurrence frequency
- the horizontal axis indicates the transmission delay.
- FIG. 19 is a diagram for explaining an optimum buffer size of the jitter buffer 30.
- dmin represents the minimum transmission delay
- dmax represents the maximum transmission delay.
- the transmission delay of the (k-1) th packet is dmin
- the transmission delay of the kth packet is d
- the transmission delay of the (k + 1) th packet is dmax.
- The optimum output waiting time applied by the output unit 36 is as follows. i) A packet that arrives with delay dmax is output immediately. ii) A packet that arrives with delay dmin is output after waiting dmax − dmin. iii) A packet that arrives with delay d is output after waiting dmax − d.
- Accordingly, the buffer size buf of the jitter buffer 30 may be set so that buf ≥ dmax − dmin.
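The waiting rule and the resulting lower bound on the buffer size can be checked with a few lines of Python. The function name `playout_plan` and the example delays are made up for illustration; delays are given in milliseconds.

```python
def playout_plan(delays_ms):
    """Return (minimum buffer size, per-packet waiting times) for observed delays.

    Each packet waits d_max - d, so every packet leaves the buffer with the
    same total delay d_max. The longest wait, d_max - d_min, is the smallest
    buffer size (expressed as time) that can hold the earliest arrivals.
    """
    d_max = max(delays_ms)
    d_min = min(delays_ms)
    waits = [d_max - d for d in delays_ms]
    return d_max - d_min, waits
```

For delays of 20, 50, and 80 msec the waits are 60, 30, and 0 msec, and the buffer must cover at least 60 msec.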
- If dmax of the transmission delay characteristic becomes extremely large, that is, if the tail at the right end of the transmission delay characteristic graph becomes extremely long, the buffer size buf increases.
- Because the frequency of occurrence decreases as the transmission delay increases, observing the true dmax would require observing the transmission delay of a huge number of packets. For this reason, in the graph of FIG. 18, the value obtained by cutting off the top few percent of the transmission delay distribution, rather than the true dmax, is regarded as dmax. In this case, packet depletion occurs whenever a transmission delay exceeding the value regarded as dmax occurs.
- FIG. 20 is a flowchart showing the fluctuation absorption processing of the fluctuation absorption processing unit JA.
- The counting unit 31 determines whether the count period Tb has elapsed since the packet count value was last calculated, that is, whether the packet count value calculation timing has come (step S1). If the counting unit 31 determines that the calculation timing has come (YES in step S1), it counts the number of packets currently accumulated in the jitter buffer 30 (step S2). If it determines that the calculation timing has not come (NO in step S1), the counting unit 31 returns the process to step S1.
- the count unit 31 executes a packet count value calculation process to calculate a packet count value (step S3).
- FIG. 21 is a flowchart showing details of packet count value calculation processing.
- the count unit 31 specifies the current time as the packet count value calculation time (step S21).
- If the control unit 1 of the dwelling unit A has a clock function, the calculation time can be specified using the clock function.
- The counting unit 31 specifies, among the packets stored in the jitter buffer 30, the reception time of each packet received within the packetization period Ta before the calculation time Tk, as shown in FIG. 16 (step S22). Here, the counting unit 31 specifies the reception time of each packet from the reception time recorded in the reception time recording unit 33 in association with the packet's sequence number.
- The counting unit 31 calculates the difference ΔT between the calculation time Tk and the reception time for each packet received within the packetization period Ta before the calculation time Tk (step S23).
- The counting unit 31 calculates ΔT/Ta for each of these packets and sets ΔT/Ta as the count value of the packet (step S24).
- The counting unit 31 sets the count value to 1 for each packet, among the packets stored in the jitter buffer 30, that was received earlier than the packetization period Ta before the calculation time Tk (step S25).
- The counting unit 31 calculates the packet count value by summing the count values set in steps S24 and S25 over the packets stored in the jitter buffer 30 (step S26). For example, suppose that one packet was received earlier than the packetization period Ta before the calculation time Tk, and that two packets were received within the packetization period Ta before the calculation time Tk, at reception times Ti and Tj. The packet count value is then 1 + (Tk − Ti)/Ta + (Tk − Tj)/Ta.
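Steps S22 to S26 amount to a weighted count, sketched below. The function name and argument names are hypothetical; times can be in any unit as long as all three arguments use the same one.

```python
def packet_count_value(reception_times, calc_time, packetization_period):
    """Weighted number of packets held in the jitter buffer at `calc_time`.

    A packet that has been held for at least one packetization period counts
    as 1; a more recent packet counts as the fraction (calc_time - reception
    time) / packetization_period.
    """
    total = 0.0
    for received in reception_times:
        age = calc_time - received
        if age >= packetization_period:
            total += 1.0                          # step S25: older packet
        else:
            total += age / packetization_period   # steps S23 and S24: recent packet
    return total
```

With Ta = 20 msec, Tk = 100 msec, and packets received at 70, 90, and 95 msec, the value is 1 + 10/20 + 5/20 = 1.75.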
- the counting unit 31 deletes the reception time from the reception time recording unit 33 for packets received in the past and before Ta-Tb from the calculation time Tk (step S27).
- the counting unit 31 causes the observation history holding unit 37 to hold the packet count value at the calculation time Tk. In this case, the count unit 31 deletes the oldest packet count value from the observation history holding unit 37 so that the number of packet count values held in the observation history holding unit 37 is N.
- the buffer size changing unit 32 specifies the nth smallest packet count value among the N packet count values stored in the observation history holding unit 37 as a representative value (step S5).
- FIG. 22 is a schematic diagram showing the relationship between the packet count value and the calculation time of the packet count value.
- the vertical axis shows the packet count value
- the horizontal axis shows the calculation time of the packet count value.
- The buffer size changing unit 32 compares the representative value with the reference value. If representative value ≥ reference value + 1 (YES in step S6), it deletes from the jitter buffer 30 the number of packets that makes the representative value not less than the reference value and less than the reference value + 1 (step S7).
- The buffer size changing unit 32 then subtracts the number of packets deleted in step S7 from each of the N packet count values held in the observation history holding unit 37, thereby updating the N packet count values and the observation history (step S8). For example, if one packet was deleted, 1 is subtracted from all N packet count values. The deletion of packets from the jitter buffer 30 is thereby reflected in the observation history.
- step S6 when the representative value is less than the reference value +1 (NO in step S6) and the representative value is equal to or larger than the reference value (NO in step S9), the buffer size changing unit 32 is configured to use the jitter buffer 30. The packet is not deleted or inserted in step S10.
- the buffer size changing unit 32 inserts into the jitter buffer 30 a number of packets whose representative value is greater than or equal to the reference value and less than the reference value + 1 (step S11). ).
- the buffer size changing unit 32 adds the number of packets inserted in step S11 to each of the N packet count values held in the observation history holding unit 37, and updates the N packet count values. Then, the observation history is updated (step S12). For example, if the number of inserted packets is 1, 1 is added to all N packet count values. Thereby, the fact that the packet is inserted into the jitter buffer 30 is reflected in the observation history.
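The decision in steps S5 to S12 can be sketched as follows. This is only an illustration under assumptions: the function name `plan_adjustment`, the list-based observation history, and the use of floor and ceiling to find the number of packets are hypothetical and are not taken from the source.

```python
import math


def plan_adjustment(history, n, reference):
    """Return (representative, delta, updated_history).

    delta < 0 means packets to delete, delta > 0 packets to insert.
    The representative value is the n-th smallest of the held counts.
    """
    representative = sorted(history)[n - 1]
    if representative >= reference + 1:
        # Delete enough packets to land in [reference, reference + 1).
        delta = -math.floor(representative - reference)
    elif representative >= reference:
        delta = 0  # already in range: neither delete nor insert
    else:
        # Insert enough packets to land in [reference, reference + 1).
        delta = math.ceil(reference - representative)
    # Reflect the change in every held count value.
    return representative, delta, [v + delta for v in history]


print(plan_adjustment([5.5, 6.0, 5.25, 7.0], n=2, reference=3))
```

Adding the same delta to all N held values keeps the history comparable with counts taken after the buffer was resized.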
- After step S8, the process returns to step S1, and when the next packet count value calculation time arrives, the processing from step S2 onward is executed.
- FIG. 23A is a schematic diagram showing processing at the time of packet insertion by the buffer size changing unit 32
- FIG. 23B is a schematic diagram showing processing at the time of packet deletion by the buffer size changing unit 32.
- the buffer size changing unit 32 inserts an invalid packet between the fourth packet and the fifth packet, which are valid packets.
- The buffer size changing unit 32 overlaps the fourth packet and the fifth packet, which are valid packets, so that two packet lengths become one packet length, thereby deleting one packet.
- The packet count value is calculated from the number of packets stored in the jitter buffer 30, and the nth smallest of the past N packet count values is specified as the representative value. If the specified representative value is larger than the reference value, packets are deleted from the jitter buffer 30. Therefore, when the past history of the packet count value shows that the number of packets stored in the jitter buffer 30 tends to exceed the reference value, so that output delay occurs, packets are deleted from the jitter buffer 30 and the output delay is reduced.
- Conversely, if the representative value is smaller than the reference value, packets are inserted into the jitter buffer 30. Therefore, packet depletion can be prevented.
- The count unit 31 sets the count value of the latest packet to ΔT / Ta, where ΔT is the difference between the calculation time Tk and the reception time of the latest packet, sets the count value to 1 for the other packets, and calculates the packet count value from these count values.
- When packets received within the packetization period Ta before the calculation time Tk are stored in the jitter buffer 30, the count unit 31 identifies, among those packets, the packet PS having the latest reception time, and sets the count value of the latest packet PS to ΔT / Ta.
- the count unit 31 uniformly sets the count value to 1 for the packets PL1 and PL2 other than the latest packet PS among the packets stored in the jitter buffer 30.
- After the packet count value calculation process is completed, the reception time recorded in the reception time recording unit 33 is deleted.
- step S31, S33, S34, and S36 in FIG. 25 are the same as steps S21, S23, S24, and S26 in FIG.
- The count unit 31 specifies the reception time of the latest packet among the packets in the jitter buffer 30 that were received within the packetization period Ta before the calculation time Tk. Further, the count unit 31 uniformly sets the count value to 1 for the packets other than the latest packet (step S35).
- In step S37, the count unit 31 deletes the reception time of the latest packet from the reception time recording unit 33.
- When the packet count value is calculated by the above-described method, the reception time needs to be recorded only for the latest packet, so the capacity of the reception time recording unit 33 can be further saved.
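A sketch of this simplified count, which needs only the reception time of the latest packet, might look like the following. The function name and arguments are hypothetical, and it is assumed that the latest received packet is still in the jitter buffer when the count is taken.

```python
def packet_count_latest_only(num_stored, latest_rx_time, tk, ta):
    """Count stored packets using only the latest reception time.

    If the latest packet arrived within ta before tk it counts as
    (tk - latest_rx_time) / ta; every other stored packet counts as 1.
    """
    if num_stored == 0:
        return 0.0
    delta_t = tk - latest_rx_time
    if delta_t < ta:
        return (num_stored - 1) + delta_t / ta
    return float(num_stored)


print(packet_count_latest_only(3, latest_rx_time=90, tk=100, ta=20))
```

Compared with recording a reception time per packet, this keeps a single timestamp at the cost of treating every older packet as a full count of 1.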
- The fluctuation absorption processing unit JA preferably determines whether or not a spike delay has occurred and, if a spike delay has occurred, shortens the window width of the past packet count values to be referred to and calculates the representative value from the packet count values within the shortened window width.
- The count unit 31 stores each calculated packet count value in the observation history holding unit 37 in association with an index indicating its time-series order. Specifically, since the observation history holding unit 37 holds the packet count values of the past N times, the count unit 31 assigns the index N to the latest packet count value and the index 1 to the oldest packet count value, so that the index increases as the calculation time becomes more recent.
- The count unit 31 determines the presence or absence of a spike delay based on the past N packet count values held in the observation history holding unit 37 and, when it determines that a spike delay has occurred, extracts the packet count values of the past M (M < N) times from the packet count values of the past N times.
- the counting unit 31 determines the presence or absence of a spike delay as follows.
- FIG. 26 is a graph for explaining the determination processing for the presence or absence of spike delay.
- the vertical axis indicates the packet count value
- the horizontal axis indicates the index.
- N = 100.
- the count unit 31 specifies a packet count value that is equal to or less than the reference value.
- the packet count values at points PP1 to PP6 are below the reference value.
- the count unit 31 specifies the smallest index, that is, the oldest point, and the largest index, that is, the latest point among packet count values equal to or less than the reference value.
- the counting unit 31 specifies the points PP1 and PP6.
- The count unit 31 obtains the difference ΔI between the minimum index and the maximum index.
- The count unit 31 determines that a spike delay has occurred if the difference ΔI is smaller than a predetermined threshold, and determines that no spike delay has occurred if the difference ΔI is larger than the threshold.
- FIG. 27 is a graph showing the relationship between the packet count value and the index when spike delay occurs.
- the vertical axis represents the packet count value
- the horizontal axis represents the index.
- the packet count values at points PP1 to PP5 are equal to or less than the reference value.
- the point PP1 has the smallest index
- the point PP5 has the largest index.
- The difference ΔI between the index of the point PP1 and the index of the point PP5 is smaller than the threshold value. Therefore, the count unit 31 determines that a spike delay has occurred.
- When the count unit 31 determines that a spike delay has occurred as shown in FIG. 27, it extracts the past M packet count values counted back from the calculation time Tk.
- the buffer size changing unit 32 calculates the m-th smallest packet count value among the past M packet count values as a representative value. Thereafter, the buffer size changing unit 32 compares the representative value with the reference value, and inserts or deletes the packet in the jitter buffer 30.
- As m, a value obtained by rounding the product of M and a predetermined coefficient to an integer can be adopted.
- Thus, when a spike delay occurs, the window width of the past packet count values to be referred to is narrowed before packets are inserted into or deleted from the jitter buffer 30. Therefore, the representative value can be calculated in a manner that excludes the influence of rarely occurring spike delays.
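The spike-delay test and the shortened window can be sketched as below. Everything named here is an assumption for illustration: `representative_value`, the `ratio` coefficient used to derive m from M, the handling of a history with no value at or below the reference (treated as no spike), and the half-up rounding.

```python
def representative_value(history, reference, n, window_m, ratio, threshold):
    """Return (spike_detected, representative).

    history is ordered oldest first, so position i (1-based) is the index.
    A spike is assumed when all counts at or below the reference lie
    within fewer than `threshold` index steps of each other.
    """
    low = [i for i, v in enumerate(history, start=1) if v <= reference]
    spike = bool(low) and (max(low) - min(low)) < threshold
    if spike:
        window = history[-window_m:]                # past M values only
        m = max(1, int(window_m * ratio + 0.5))     # round M * ratio
        return True, sorted(window)[m - 1]
    return False, sorted(history)[n - 1]


burst = [5, 5, 5, 5, 1, 1, 5, 5, 5, 5]
print(representative_value(burst, reference=3, n=2,
                           window_m=3, ratio=0.34, threshold=4))
```

In the burst example the two low values sit next to each other, so the short window is used and the low values no longer drive the representative value.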
- When the number of stored packets is 0 at consecutive calculation times, it is preferable to calculate the packet count value as follows.
- When the number of stored packets is 0 consecutively, the count unit 31 calculates, as the packet count value, a negative value whose absolute value increases as the number of consecutive times increases.
- FIG. 28A and 28B are diagrams for explaining the processing of the counting unit 31.
- In FIG. 28A, packets are received immediately after the packet count value calculation times Tk-4, Tk-3, Tk-2, and Tk-1 in each section of the count cycle Tb.
- The output unit 36 reads the packet (received voice data) from the jitter buffer 30 in each section before the next packet count value calculation time Tk-3, Tk-2, Tk-1, or Tk arrives. For example, a packet received immediately after the calculation time Tk-4 is read out before the next calculation time Tk-3 arrives. Therefore, at each calculation time Tk-4, Tk-3, Tk-2, Tk-1, and Tk, the number of stored packets in the jitter buffer 30 is zero, and the count unit 31 calculates the packet count value as 0 at each of these calculation times.
- Between FIGS. 28A and 28B, however, the situation of the signal trunk line Ls is greatly different. That is, in FIG. 28A, packets periodically reach the dwelling unit A, and the output unit 36 can output them continuously. In FIG. 28B, however, packets do not reach the dwelling unit A periodically, and therefore the output unit 36 cannot output continuously.
- The count unit 31 therefore performs the following processing. First, the difference between the calculation time (current time) and the latest packet reception time is compared with the count cycle Tb. If the difference is smaller than the count cycle Tb, it is determined that the situation is that of FIG. 28A. On the other hand, if the difference is greater than the count cycle Tb, it is determined that no packet has been received since the previous calculation time, that is, the situation is that of FIG. 28B, and the following processing is performed. As shown in FIG. 28B, when the number of stored packets is 0 at the calculation time Tk-3 and is again 0 at the calculation time Tk-2, the number of consecutive times is one. In this case, the count unit 31 calculates 0 as the packet count value at the calculation time Tk-2.
- The count unit 31 calculates -1, which is the value obtained by subtracting 1 from the number of consecutive times, 2, and multiplying the result by -1, as the packet count value at the calculation time Tk-1.
- The count unit 31 calculates -2, which is the value obtained by subtracting 1 from the number of consecutive times, 3, and multiplying the result by -1, as the packet count value at the calculation time Tk. In general, the count unit 31 calculates (number of consecutive times - 1) × (-1) as the packet count value.
- In this way, the packet count value can be calculated so as to distinguish the case where packets are received periodically and the number of stored packets merely happens to be zero at the calculation time, as shown in FIG. 28A, from the case where packets are not received, as shown in FIG. 28B. Therefore, in the case of FIG. 28B, packets are less likely to be deleted from the jitter buffer 30 than in the case of FIG. 28A.
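The empty-buffer handling can be sketched as follows. The class name `StarvationCounter` and its method are hypothetical; the sketch covers only calculation times at which the jitter buffer is empty, and it treats a difference exactly equal to Tb as starvation, which the source does not specify.

```python
class StarvationCounter:
    """Track consecutive empty-buffer calculation times with no arrivals."""

    def __init__(self, tb):
        self.tb = tb            # count cycle Tb
        self.consecutive = 0    # consecutive starved calculation times

    def on_empty(self, now, latest_rx_time):
        """Return the packet count value for an empty jitter buffer."""
        if now - latest_rx_time < self.tb:
            # A packet arrived during the last cycle and was already
            # played out (the FIG. 28A situation): count is plain 0.
            self.consecutive = 0
            return 0
        # No arrival since the previous calculation time (FIG. 28B).
        self.consecutive += 1
        return -(self.consecutive - 1)


counter = StarvationCounter(tb=20)
# Last packet arrived at t = 1; calculation times are 20, 40, 60, 80.
print([counter.on_empty(t, latest_rx_time=1) for t in (20, 40, 60, 80)])
```

The increasingly negative values pull the representative value down, which makes deletion from the jitter buffer less likely while packets are not arriving.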
- When the buffer size changing unit 32 deletes one packet from the jitter buffer 30 and there are two or more consecutive valid packets containing voice, it overlap-adds two consecutive valid packets located in the middle of that run of valid packets, thereby deleting one packet.
- FIGS. 29A, 29B, and 29C are explanatory diagrams of processing in which the buffer size changing unit 32 deletes one packet by overlap addition. FIG. 29A shows the jitter buffer 30 before deletion, and FIG. 29B shows the jitter buffer 30 after deletion.
- the read pointer RP indicates the start address of the jitter buffer 30 having a ring buffer structure
- the write pointer WP indicates the end address of the jitter buffer 30.
- Each box indicates one packet, and the number in each box indicates the time-series order of the packet.
- A white box indicates an invalid packet.
- A gray box indicates a valid packet.
- the packets are combined into one packet by addition, and one packet is deleted.
- One packet can be deleted by overlap addition wherever valid packets are consecutive, but performing the overlap addition in a section with many consecutive valid packets reduces the voice deterioration that arises when packet loss concealment processing is performed.
- As the overlap addition, overlap addition using triangular window functions RF1 and RF2 can be adopted, as shown in FIG. 29C.
- The buffer size changing unit 32 applies window function processing using the triangular window function RF1 to the audio signal of the fifth packet and window function processing using the triangular window function RF2 to the audio signal of the sixth packet, adds the two windowed audio signals to generate one audio signal, and packetizes the result into one packet, thereby performing the overlap addition.
- As the triangular window function RF1, a linear function that has a time width of 20 msec, a maximum value of 1, and a minimum value of 0, and whose value decreases as time passes, can be adopted.
- As the triangular window function RF2, a linear function that has a time width of 20 msec, a maximum value of 1, and a minimum value of 0, and whose value increases as time passes, can be adopted.
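A minimal sketch of this overlap addition is shown below, assuming that each packet is a list of equally many audio samples covering the 20 msec window. The function name `overlap_add` and the sample representation are assumptions for illustration.

```python
def overlap_add(first_packet, second_packet):
    """Merge two equal-length packets into one packet of the same length.

    The first packet is weighted by a ramp falling from 1 to 0 (RF1),
    the second by a ramp rising from 0 to 1 (RF2), and the two are summed.
    """
    n = len(first_packet)
    if n != len(second_packet) or n < 2:
        raise ValueError("packets must have the same length of at least 2")
    merged = []
    for i in range(n):
        rf2 = i / (n - 1)   # rising window
        rf1 = 1.0 - rf2     # falling window
        merged.append(rf1 * first_packet[i] + rf2 * second_packet[i])
    return merged


print(overlap_add([1, 1, 1], [0, 0, 0]))
```

Because the two windows sum to 1 at every sample, a signal that is identical in both packets passes through unchanged, and the merged packet starts like the first packet and ends like the second.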
- the buffer size changing unit 32 deletes the invalid packet if there is an invalid packet inserted in the past.
- FIG. 30A and 30B are explanatory diagrams of processing in which the buffer size changing unit 32 deletes one invalid packet.
- FIG. 30A shows the jitter buffer 30 before deletion
- FIG. 30B shows the jitter buffer 30 after deletion.
- the third and fourth packets are invalid packets. Therefore, the buffer size changing unit 32 deletes one packet by deleting either the third or the fourth packet.
- The buffer size changing unit 32 may preferentially extract invalid packets located in a continuous area, randomly select one invalid packet from the extracted invalid packets, and delete it.
- the buffer size changing unit 32 inserts an invalid packet between these two valid packets if there are two consecutive valid packets.
- FIG. 31A and 31B are explanatory diagrams of processing in which the buffer size changing unit 32 inserts one packet.
- FIG. 31A shows the jitter buffer 30 before insertion
- FIG. 31B shows the jitter buffer 30 after insertion.
- one invalid packet is inserted between the fifth valid packet and the sixth valid packet. This is because inserting one invalid packet between the fifth valid packet and the sixth valid packet increases the number of consecutive valid packets.
- the buffer size changing unit 32 inserts invalid packets in the middle of a section where the number of consecutive valid packets is large.
- the buffer size changing unit 32 has a predetermined upper limit value for the number of packets that can be inserted or deleted at a time.
- FIGS. 32A and 32B are diagrams for explaining processing when five packets are inserted into the jitter buffer 30 at once. FIG. 32A shows the jitter buffer 30 before insertion, and FIG. 32B shows the jitter buffer 30 after insertion.
- In FIGS. 32A and 32B, five invalid packets are inserted between the first valid packet and the second valid packet. In this case, since the invalid packets are consecutive, voice deterioration may increase. Therefore, an upper limit is set for the number of invalid packets inserted.
- “at once” refers to one process executed when the above-described count cycle Tb has been reached.
- If the upper limit value is set to 3 in FIG. 32A, only three invalid packets are inserted even if five invalid packets need to be inserted.
- When, after deleting an invalid packet, a valid packet corresponding to the deleted invalid packet is received, the buffer size changing unit 32 replaces another invalid packet with the received valid packet.
- FIG. 33A, 33B, and 33C are diagrams for explaining processing when a valid packet corresponding to a deleted invalid packet is received after deleting the invalid packet.
- FIG. 33A shows the jitter buffer 30 before deletion
- FIG. 33B shows the jitter buffer 30 after deletion
- FIG. 33C shows the jitter buffer 30 after replacement.
- the third invalid packet has been deleted. Thereafter, as shown in FIG. 33C, the third valid packet corresponding to the third invalid packet is received.
- the buffer size changing unit 32 replaces the fourth invalid packet with the received third valid packet. As a result, the third valid packet can be restored, and voice deterioration can be reduced.
- the buffer size changing unit 32 determines whether or not invalid packets corresponding to the accumulated packet are accumulated in the jitter buffer 30. Then, if the corresponding invalid packet is accumulated in the jitter buffer 30, the buffer size changing unit 32 determines whether the invalid packet is stored next to the invalid packet, and the invalid packet is stored. If it is, the next invalid packet is deleted, and the received valid packet is inserted into the deleted location, so that the next invalid packet and the received valid packet are exchanged.
- the buffer size changing unit 32 may determine that a valid packet corresponding to an invalid packet has been received when a packet having the same sequence number as that of the invalid packet is accumulated in the jitter buffer 30.
- Instead of an invalid packet, the buffer size changing unit 32 may cause the concealment processing unit 35 to execute a packet loss concealment process using the preceding valid packet, thereby generating a concealment-processed packet, and insert that packet into the jitter buffer 30.
- FIG. 34A and 34B are diagrams for explaining processing when the buffer size changing unit 32 inserts a concealed packet in place of an invalid packet into the jitter buffer 30, and FIG. 34A shows the jitter buffer 30 before insertion. FIG. 34B shows the jitter buffer 30 after insertion.
- a concealed packet is inserted between the third valid packet and the fourth valid packet.
- As a result, when the output unit 36 reads a packet (voice data) from the jitter buffer 30, the packet loss concealment process does not need to be executed, and the processing delay of the packet loss concealment process at the time of output can be reduced.
- the buffer size changing unit 32 preferably inserts an invalid packet between two consecutive packets including vowel sounds. Thereby, the voice generated by executing the packet loss concealment process on the inserted invalid packet is continuously connected to the voice included in the preceding and succeeding packets, and voice deterioration can be reduced.
- FIG. 35 is a flowchart showing the deletion process by the buffer size changing unit 32.
- In step S51, the buffer size changing unit 32 determines whether or not the number of packet deletion requests is equal to or less than a predetermined maximum packet deletion number (upper limit value). If the number of deletion requests is equal to or less than the upper limit value (YES in step S51), the deletion count value DN is set to the number of deletion requests (step S52). On the other hand, when the number of deletion requests is larger than the upper limit value (NO in step S51), the deletion count value DN is set to the upper limit value (step S53).
- The buffer size changing unit 32 determines whether or not the maximum number of consecutive valid packets is at least twice the deletion count value DN (step S55). The comparison is made against twice the deletion count value DN because deleting one packet requires overlap-adding two packets, so deleting DN packets requires 2 × DN consecutive valid packets.
- When the buffer size changing unit 32 determines that the maximum number of consecutive valid packets is at least twice the deletion count value DN (YES in step S55), it deletes DN packets by overlap addition and updates the deletion count value DN by subtracting the number of deleted packets from it (step S58).
- On the other hand, when the maximum number of consecutive valid packets is less than twice the deletion count value DN (NO in step S55), the buffer size changing unit 32 deletes as many packets as can be deleted by overlap addition, updates the deletion count value DN by subtracting the number of deleted packets from it (step S56), and returns the process to step S54.
- In step S54, if the maximum number of consecutive valid packets is 1 or less (1 or less in step S54), invalid packets are deleted, and the deletion count value DN is updated by subtracting the number of deleted packets from it (step S57).
- In step S59, the buffer size changing unit 32 determines whether or not the deletion count value DN is 0. If the deletion count value DN is 0 (YES in step S59), the process ends.
- If the deletion count value DN is not 0 in step S59 (NO in step S59) and there is a valid packet (YES in step S60), the buffer size changing unit 32 deletes the valid packet and ends the process (step S61). In this case, since the valid packet to be deleted is not continuous with other valid packets, it is simply deleted without overlap addition. On the other hand, if there is no valid packet (NO in step S60), the process ends as it is.
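The deletion flow of FIG. 35 can be sketched roughly as below. This is a simplified model under several assumptions: the jitter buffer is a list of booleans (True for a valid packet, False for an invalid one), audio payloads are not modelled, the number of packets deletable by overlap addition in a run is taken to be half the run length, and the function names are hypothetical.

```python
def longest_valid_run(buf):
    """Return (start, length) of the longest run of valid (True) packets."""
    best_start, best_len, start = 0, 0, None
    for i, valid in enumerate(buf + [False]):  # sentinel closes a last run
        if valid and start is None:
            start = i
        elif not valid and start is not None:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = None
    return best_start, best_len


def delete_packets(buf, requested, upper_limit):
    """Return (new_buffer, remaining_dn) after one deletion pass."""
    buf = list(buf)
    dn = min(requested, upper_limit)            # steps S51 to S53
    while dn > 0:
        start, run = longest_valid_run(buf)     # step S54
        if run < 2:
            while dn > 0 and False in buf:      # step S57: drop invalid
                buf.remove(False)
                dn -= 1
            break
        # Step S55: overlap-adding 2 packets yields 1, so each deletion
        # shortens the run by one packet.
        k = dn if run >= 2 * dn else run // 2   # step S58 or step S56
        del buf[start:start + k]
        dn -= k
    if dn > 0 and True in buf:                  # steps S59 to S61
        buf.remove(True)
        dn -= 1
    return buf, dn


print(delete_packets([True] * 5 + [False], requested=3, upper_limit=5))
```

In the printed example, two packets are removed from the run of five on the first pass and the third on the second pass, leaving the invalid packet untouched.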
- FIG. 36 is a flowchart showing the insertion processing by the buffer size changing unit 32.
- In step S71, the buffer size changing unit 32 determines whether or not the number of packet insertion requests is equal to or less than a predetermined maximum packet insertion number (upper limit value). If the number of insertion requests is equal to or less than the maximum number of insertions (YES in step S71), the number of insertions is set to the number of insertion requests (step S72). On the other hand, if the number of insertion requests is larger than the maximum number of insertions (NO in step S71), the number of insertions is set to the maximum number of insertions (step S73).
- Step S75 the process is terminated.
- The buffer size changing unit 32 inserts invalid packets, as many as the number of insertions, in the middle of the section of consecutive valid packets (step S76), and the process ends.
- The buffer size changing unit 32 inserts invalid packets, as many as the number of insertions, immediately after the valid packets (step S77), and the process ends.
- When one packet is deleted from the jitter buffer 30, one packet is generated by overlap-adding two packets located in the middle of a section where two or more valid packets are consecutive. Therefore, voice quality degradation can be reduced.
- packet loss concealment processing performed by the concealment processing unit 35 of the fluctuation absorption processing unit JA can be replaced by the voice data loss compensation processing by the voice data loss compensation processing unit VC described above.
- The call processing unit 2 executes the first software when the call terminal of the other party uses the analog transmission method, and executes the second software when the call terminal of the other party uses the packet transmission method. By doing so, call processing suitable for each transmission method can be selectively executed.
- Because the packet transmission method is used for voice transmission via the signal trunk line Ls and the analog transmission method is used for voice transmission in the vicinity of the dwelling unit that does not pass through the signal trunk line Ls, the call quality can be improved.
- Embodiment 2 Hereinafter, the second embodiment of the present invention will be described in detail with reference to FIGS.
- The same reference numerals are assigned to the same elements as those in the intercom system for multi-dwelling houses of Embodiment 1, and their description is omitted.
- Since both the voice data loss compensation process and the speech speed conversion process in the first embodiment described above use the pitch of the voice, a pitch detection process for detecting the pitch of the voice is required.
- If the voice data loss compensation processing program and the speech speed conversion processing program are each equipped with their own pitch detection processing program (program module), the memory for loading the programs is wasted. Therefore, in this embodiment, the pitch detection processing program for detecting the pitch of the voice is made independent of the voice data loss compensation processing program and the speech speed conversion processing program, and the pitch detected by the pitch detection processing is shared between the voice data loss compensation processing and the speech speed conversion processing. This reduces wasteful consumption of memory.
- The speech speed conversion processing unit SE of the present embodiment may instead execute processing other than speech speed conversion processing, such as voice quality conversion processing, speech segment detection processing, speech enhancement processing, speaker discrimination processing, or speech recognition processing.
- The call processing unit 2 of the present embodiment includes an acoustic echo canceller EC1, a voice switch VS, a voice data loss detection unit 15, a pitch detection unit 16, a voice data loss compensation processing unit VC, and a speech speed conversion processing unit SE.
- The audio data loss detection unit 15 detects the loss of audio data output from the transmission processing unit 7, and sets a detection flag when the audio data output from the jitter buffer of the transmission processing unit 7 is not continuous. The causes of missing audio data include packet loss, delay, and jitter (fluctuation) associated with transmission, as described in the first embodiment.
- Based on the detection flag from the audio data loss detection unit 15 and a counter inside the pitch detection unit 16, the pitch detection unit 16 detects the pitch of the voice from the audio data (either audio data that has been compensated for loss or audio data that has not been compensated for loss; the same applies hereinafter).
- As a specific method of pitch detection, for example, a method of calculating the autocorrelation of the speech while changing the frame length and estimating the frame length having the highest correlation as the pitch of the speech may be used.
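An autocorrelation search of this kind can be sketched as follows. The function name `estimate_pitch_lag`, the normalization by the number of summed terms, and the choice of lag bounds are assumptions; the lag is measured in samples, so the pitch period in seconds would be the lag divided by the sampling rate.

```python
def estimate_pitch_lag(samples, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest autocorrelation."""
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        terms = len(samples) - lag
        if terms <= 0:
            break
        # Average product of the signal with itself shifted by `lag`.
        score = sum(samples[i] * samples[i + lag] for i in range(terms)) / terms
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# A waveform that repeats every 4 samples.
signal = [0, 1, 0, -1] * 8
print(estimate_pitch_lag(signal, min_lag=2, max_lag=6))
```

Bounding the lag range limits the search to the pitch periods of interest and keeps the number of correlations small.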
- When the audio data loss detection unit 15 detects audio data loss (when the detection flag is set), the audio data loss compensation processing unit VC compensates for the audio data loss based on the pitch detected by the pitch detection unit 16.
- the audio data loss compensation processing unit VC extracts audio data for one pitch from past audio data held in the buffer and makes up for it so that the audio is not interrupted. However, if there is no missing voice data, the voice data missing compensation processing unit VC outputs the input voice data as it is without missing compensation.
- the speech rate conversion processing unit SE converts the speech rate of the original speech by expanding or compressing the speech data output from the speech data loss compensation processing unit VC.
- The speech speed is converted (made faster or slower) by inserting or deleting waveforms in units of pitch based on a conventionally known speech speed conversion algorithm called PICOLA (Pointer Interval Controlled OverLap and Add). These units are realized by causing a DSP (Digital Signal Processor) to execute a predetermined program.
- If the voice data loss compensation processing unit VC and the speech speed conversion processing unit SE individually perform pitch detection processing, the processing load increases when the voice data loss compensation processing and the speech speed conversion processing are executed simultaneously in the call processing unit 2.
- In contrast, the call processing unit 2 of the present embodiment has only one pitch detection unit 16, and both the voice data loss compensation processing unit VC and the speech speed conversion processing unit SE use the pitch detected by the common pitch detection unit 16. Because both units share the pitch detected by the pitch detection unit 16, an increase in the processing load on the DSP can be suppressed even when the voice data loss compensation processing and the speech speed conversion processing are executed simultaneously.
- The pitch detection unit 16 in the present embodiment counts a predetermined detection cycle Tx and repeatedly detects the pitch in synchronization with the detection cycle Tx. When the audio data loss detection unit 15 detects that audio data is missing, the pitch detection unit 16 detects the pitch at the detection time point t1 of the missing audio data and restarts the detection cycle Tx from the detection time point t1. That is, because the pitch detection unit 16 repeatedly detects the pitch in synchronization with a constant detection cycle Tx, the speech speed conversion processing unit SE can use a pitch that the pitch detection unit 16 detected for the speech section in which the speech speed conversion process is executed. Therefore, the quality of the speech after speech speed conversion can be maintained. It is desirable to set the detection cycle Tx to a time during which the voice can be regarded as steady, for example, about 10 milliseconds.
- In addition, when audio data is missing, the pitch detection unit 16 immediately detects the pitch regardless of the detection cycle Tx, so that the quality of the audio data loss compensation processing by the audio data loss compensation processing unit VC can be maintained.
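The timing of the pitch detection, periodic with cycle Tx and restarted whenever a loss is detected, can be sketched as follows. The function name `detection_times` and the idea of supplying the loss detection times as a list are assumptions made for illustration; times are in milliseconds.

```python
def detection_times(tx, loss_times, end):
    """List the times at which pitch detection runs up to `end`.

    Detection runs every tx; a detected loss triggers an immediate
    detection and restarts the cycle from that time point.
    """
    times = []
    pending = sorted(loss_times)
    next_periodic = 0
    i = 0
    while True:
        if i < len(pending) and pending[i] <= next_periodic:
            t = pending[i]      # loss detected before the cycle elapsed
            i += 1
        else:
            t = next_periodic   # regular detection at the cycle boundary
        if t > end:
            break
        if not times or times[-1] != t:
            times.append(t)
        next_periodic = t + tx  # restart the cycle from this detection
    return times


print(detection_times(tx=10, loss_times=[13], end=40))
```

After the loss at 13 ms, the following detections fall at 23 ms and 33 ms rather than at 20 ms and 30 ms, because the cycle is restarted from the loss detection time.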
- The pitch detection unit 16 detects only pitches in a predetermined frequency range. That is, since the frequency of the voice waveform in a normal voice call lies within a range from a hundred and several tens of hertz to several hundred hertz, detecting only pitches in that range and not performing pitch detection in unnecessary frequency ranges reduces the processing load.
- the speech speed conversion processing unit SE detects the speech section of the speech data and converts only the speech data in the speech section. That is, the processing load in the speech speed conversion process can be reduced by not performing the speech speed conversion process in a section other than the speech section (for example, a silent section).
- the voice data loss detection unit 15 and the pitch detection unit 16 perform a voice data loss detection process and a pitch detection process every ⁇ / 4 hours.
- This has the advantage that the control of the timing at which the pitch detection unit 16 executes the pitch detection process is simplified.
- the speech speed conversion processing unit SE detects that the voice data is missing. If speech speed conversion is performed using the pitch detected by the pitch detection unit 16 immediately before detection, it is possible to suppress deterioration in speech quality due to the speech speed conversion processing.
- the speech speed conversion may be performed using the pitch detected by the pitch detection unit 16 from the voice data compensated by the unit VC. In this way, even when the speech speed conversion process is started while audio data is missing, the pitch detection unit 16 only needs to execute the pitch detection process at a constant detection cycle Tx, which has the advantage that the control of the timing at which the pitch detection unit 16 executes the pitch detection process becomes simple.
- the dwelling unit A has a recording unit (not shown) that can record the audio data output from the audio data loss compensation processing unit VC.
- speech speed conversion processing is performed by the speed conversion processing section SE.
- the ease of listening is improved by performing the speech speed conversion process not only on the speech section but also on the non-speech section.
- if the speech speed conversion process is performed even for a non-speech section during a normal call, the delay due to the speech speed conversion process increases, which hinders natural conversation.
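The timing behaviour described in the items above (pitch detection repeated every detection cycle Tx, with an immediate detection and a restart of the cycle when missing audio data is detected at t1) can be sketched as follows. This is only an illustrative sketch: the function name `pitch_detection_times`, the millisecond time base, and the handling of a loss that coincides with a scheduled detection are assumptions, not details taken from the patent text.

```python
def pitch_detection_times(duration_ms, tx_ms, loss_times_ms):
    """Return the times (ms) at which pitch detection would run.

    Hypothetical helper: detection runs every tx_ms starting at 0; when a
    loss of audio data is detected at time t1, detection runs immediately
    at t1 and the cycle is restarted from t1.
    """
    # Keep only losses inside the observed window, in time order.
    pending = sorted({t for t in loss_times_ms if 0 <= t < duration_ms})
    times = []
    next_due = 0
    while True:
        if pending and pending[0] <= next_due:
            # A loss arrives before (or exactly at) the next scheduled
            # detection, so detect immediately at the loss time.
            t = pending.pop(0)
        else:
            t = next_due
        if t >= duration_ms:
            break
        times.append(t)
        # Restart the detection cycle from the time just used.
        next_due = t + tx_ms
    return times


# Tx = 10 ms as suggested in the text; one loss detected at t1 = 23 ms.
schedule = pitch_detection_times(50, 10, [23])
```

With these inputs the detection at 23 ms replaces the one that would otherwise have run at 30 ms, and later detections follow at 10 ms steps counted from 23 ms.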
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Interconnected Communication Systems, Intercoms, And Interphones (AREA)
- Telephone Function (AREA)
- Telephonic Communication Services (AREA)
Abstract
Description
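(Embodiment 1)
Hereinafter, Embodiment 1 of the present invention will be described in detail with reference to FIGS. 1 to 36. First, the intercom system for apartment houses that includes the dwelling unit according to the present invention will be described.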
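Lr(n) = 20log|α'(n)·β'(n)| + MG [dB]
Note that α'(n), β'(n), and Lr(n) denote, respectively, the estimated values of the feedback gains and the desired total loss amount calculated at the n-th sampling after the transition to the update mode. Further, the total loss amount calculation unit 103 compares the n-th desired total loss amount Lr(n) calculated from the above equation with the previous ((n-1)-th) total loss amount Lt(n-1), that is, the total loss amount that was determined in the previous processing and actually inserted. If the desired total loss amount Lr(n) calculated this time is larger than the previous total loss amount Lt(n-1), the current total loss amount is set to Lt(n) = Lt(n-1) + Δi, the previous total loss amount plus a small increment Δi [dB] (steps 3 and 4). If the desired total loss amount Lr(n) calculated this time is smaller than the previous total loss amount Lt(n-1), the current total loss amount is set to Lt(n) = Lt(n-1) - Δd, the previous total loss amount minus a small decrement Δd [dB] (steps 5 and 6).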
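As a rough illustration of the update rule just described, the sketch below computes the desired total loss Lr(n) and then moves the inserted total loss Lt(n) toward it by a small step. The function and parameter names are hypothetical, MG is treated simply as an additive term in dB because this excerpt does not define it, and holding the value when Lr(n) equals Lt(n-1) is an assumption, since the excerpt only covers the larger and smaller cases.

```python
import math


def desired_total_loss(alpha_est, beta_est, mg_db):
    """Lr(n) = 20*log10(|alpha'(n) * beta'(n)|) + MG, in dB.

    alpha_est and beta_est are the feedback gain estimates; their
    product must be non-zero for the logarithm to be defined.
    """
    return 20.0 * math.log10(abs(alpha_est * beta_est)) + mg_db


def update_total_loss(prev_loss_db, desired_db, inc_db, dec_db):
    """Move Lt(n-1) one small step toward the desired value Lr(n)."""
    if desired_db > prev_loss_db:
        return prev_loss_db + inc_db   # Lt(n) = Lt(n-1) + delta_i
    if desired_db < prev_loss_db:
        return prev_loss_db - dec_db   # Lt(n) = Lt(n-1) - delta_d
    return prev_loss_db                # assumption: hold when equal


# Example values are illustrative only.
lr = desired_total_loss(1.0, 1.0, 6.0)
lt = update_total_loss(10.0, lr, inc_db=0.5, dec_db=0.25)
```

Stepping by Δi or Δd instead of jumping straight to Lr(n) keeps the inserted loss from changing abruptly between samples.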
(Embodiment 2)
Hereinafter, Embodiment 2 of the present invention will be described in detail with reference to FIGS. 37 and 38. For the sake of clarity, elements similar to those of the intercom system for apartment houses of Embodiment 1 are assigned the same reference numerals, and their description is omitted.
(Embodiment 3)
Hereinafter, the third embodiment of the present invention will be described in detail with reference to FIGS. 39A to 42. For the sake of clarity, similar elements are assigned the same reference numerals as those for the intercom system for collective housing of the second embodiment, and description thereof is omitted.
Claims (27)
- A dwelling unit of an intercom system for apartment houses, the system having a common unit device installed at the common entrance of an apartment house, a dwelling unit installed in each dwelling of the apartment house, a doorphone slave unit installed at the outer entrance of the apartment house, a signal trunk line connected to the common unit device, a dwelling unit line branched from the signal trunk line and connected to each dwelling unit, and a slave unit connection line connecting the dwelling unit and the doorphone slave unit, in which call voice is transmitted between the common unit device and the dwelling unit, and between the dwelling units, by a packet transmission method via the signal trunk line and the dwelling unit line, and call voice is transmitted between the dwelling unit and the doorphone slave unit by an analog transmission method via the slave unit connection line, the dwelling unit comprising:
a microphone and a speaker; a transmission processing unit that transmits voice packets containing voice data for calls and control packets containing control data for call control via the dwelling unit line and the signal trunk line; an analog signal transmission unit that transmits an analog voice signal via the slave unit connection line; a first conversion processing unit that converts the analog voice signal output from the microphone into voice data, and converts voice data into an analog voice signal and outputs it to the speaker; a second conversion processing unit that converts the analog voice signal received by the analog signal transmission unit into voice data, and converts voice data into an analog voice signal and outputs it to the analog signal transmission unit; a call processing unit that performs predetermined call processing on voice data; a doorphone call detection unit that detects a call from the doorphone slave unit; a storage unit that stores first software for call processing of voice data transmitted by the analog transmission method and second software for call processing of voice data transmitted by the packet transmission method; and a control unit that instructs the call processing unit to execute call processing,
wherein the control unit instructs the call processing unit to execute the first software when the doorphone call detection unit detects the call, and instructs the call processing unit to execute the second software when control data for call control is received from the common unit device or a dwelling unit.
- The dwelling unit of the intercom system for apartment houses according to claim 1, wherein the second software includes a program for acoustic echo suppression processing that suppresses acoustic echo caused by acoustic coupling between the microphone and the speaker, and a program for residual echo suppression processing that suppresses residual echo that cannot be fully suppressed by the acoustic echo suppression processing.
- The dwelling unit of the intercom system for apartment houses according to claim 1 or 2, wherein the second software includes a program for fluctuation absorption processing that absorbs fluctuations in transmission delay in the transmission processing unit.
- The dwelling unit of the intercom system for apartment houses according to claim 3, further comprising a fluctuation absorbing buffer that accumulates the voice data contained in the voice packets received by the transmission processing unit,
wherein the fluctuation absorption processing program causes the call processing unit to perform a counting step of counting, at a period not longer than the packetization period of the voice packets, the number of packets of voice data accumulated in the fluctuation absorbing buffer to calculate a packet count value, and a buffer size changing step of inserting a packet into or deleting a packet from the fluctuation absorbing buffer based on the packet count value calculated in the counting step.
- The dwelling unit of the intercom system for apartment houses according to claim 4, wherein, in the buffer size changing step, the fluctuation absorption processing program causes the call processing unit to calculate a representative value of the packet count value based on the past history of the packet count value, to delete a packet from the fluctuation absorbing buffer when the calculated representative value is larger than a predetermined reference value, and to insert a packet into the fluctuation absorbing buffer when the representative value is smaller than the reference value.
- The dwelling unit of the intercom system for apartment houses according to claim 4 or 5, wherein the fluctuation absorption processing program causes the call processing unit to record the reception time of the latest packet and, in the counting step, to calculate the packet count value by setting the count value of the latest packet to the value obtained by dividing the difference between the calculation time, which is the timing at which the packet count value is calculated, and the reception time by the packetization period, and setting the count value of each packet other than the latest packet to 1.
- The dwelling unit of the intercom system for apartment houses according to claim 5, wherein the fluctuation absorption processing program causes the call processing unit, in the counting step, to hold the packet count values of the past N times (N is a positive integer) and, in the buffer size changing step, to use as the representative value the n-th smallest (n is a positive integer less than N) of the past N packet count values.
- The dwelling unit of the intercom system for apartment houses according to claim 5, wherein the fluctuation absorption processing program causes the call processing unit, in the counting step, to determine the presence or absence of a spike delay based on the past N packet count values and, when it determines that a spike delay has occurred, to extract the packet count values of the past M times (M is a positive integer satisfying M < N) from among the past N packet count values, and, in the buffer size changing step, to calculate as the representative value the m-th smallest (m is an integer less than M) of the past M packet count values extracted in the counting step.
- The dwelling unit of the intercom system for apartment houses according to any one of claims 4 to 8, wherein, in the counting step, when the packet count value has been zero consecutively, the fluctuation absorption processing program causes the call processing unit to calculate, as the packet count value, a negative value whose absolute value increases as the number of consecutive zeros increases.
- The dwelling unit of the intercom system for apartment houses according to any one of claims 1 to 9, wherein the second software includes a program for voice data loss compensation processing that, when all or part of the voice data contained in a voice packet received by the transmission processing unit is lost, compensates for all or part of the lost voice data using voice data that has not been lost.
- The dwelling unit of the intercom system for apartment houses according to claim 3, further comprising a fluctuation absorbing buffer that accumulates the voice data contained in the voice packets received by the transmission processing unit,
wherein the fluctuation absorption processing program causes the call processing unit to perform a counting step of counting the number of packets of voice data accumulated in the fluctuation absorbing buffer to calculate a packet count value, and a buffer size changing step of inserting a packet into or deleting a packet from the fluctuation absorbing buffer based on the packet count value calculated in the counting step, and, in the buffer size changing step, when one packet is to be deleted from the fluctuation absorbing buffer and two or more valid packets containing voice data exist consecutively, to delete a packet by overlap-adding two consecutive valid packets located in the middle of those consecutive valid packets.
- The dwelling unit of the intercom system for apartment houses according to claim 11, wherein, in the buffer size changing step, when a packet is to be inserted into the fluctuation absorbing buffer and two consecutive valid packets exist, the fluctuation absorption processing program causes the call processing unit to insert an invalid packet containing no voice between the two valid packets.
- The dwelling unit of the intercom system for apartment houses according to any one of claims 1 to 12, wherein the second software includes a program for voice data loss detection processing that detects the loss of all or part of the voice data output from the transmission processing unit, a program for pitch detection processing that detects the pitch of the voice from the voice data, and a program for voice data loss compensation processing that compensates for the lost voice data based on the pitch detected by the pitch detection processing when a loss of voice data is detected by the voice data loss detection processing,
and the pitch detection processing program causes the call processing unit to perform a process of setting, as a reference signal, a voice signal of a certain time width extending from the present time into the past, and a process of sliding the reference signal over the voice signal from the present time toward the past and obtaining the correlation between the reference signal and the voice signal, thereby detecting the pitch of the voice signal, while increasing the time width of the reference signal as the slide amount of the reference signal increases.
- The dwelling unit of the intercom system for apartment houses according to claim 13, wherein the pitch detection processing program causes the call processing unit to set the time width of the reference signal to a predetermined initial time width until the slide amount of the reference signal reaches a predetermined slide reference value.
- The dwelling unit of the intercom system for apartment houses according to claim 13 or 14, wherein the pitch detection processing program causes the call processing unit to obtain the correlation between the reference signal and the voice signal by the average amplitude difference function method.
- The dwelling unit of the intercom system for apartment houses according to claim 15, wherein the pitch detection processing program causes the call processing unit to obtain the correlation between the reference signal and the voice signal using the average amplitude difference function of Equation (1).
- The dwelling unit of the intercom system for apartment houses according to claim 3, wherein the second software includes a program for voice data loss detection processing that detects the loss of all or part of the voice data output from the transmission processing unit, a program for pitch detection processing that detects the pitch of the voice from the voice data, a program for voice data loss compensation processing that compensates for the lost voice data based on the pitch detected by the pitch detection processing when a loss of voice data is detected by the voice data loss detection processing, and a program for speech speed conversion processing that expands or compresses the voice data using the pitch detected by the pitch detection processing.
- The dwelling unit of the intercom system for apartment houses according to claim 17, wherein the pitch detection processing counts a predetermined detection cycle and repeatedly detects the pitch in synchronization with the detection cycle, and, when a loss of voice data is detected by the voice data loss detection processing, detects the pitch at the time the loss is detected and restarts counting of the detection cycle from that time.
- The dwelling unit of the intercom system for apartment houses according to claim 17 or 18, wherein the pitch detection processing detects only pitches within a predetermined frequency range.
- The dwelling unit of the intercom system for apartment houses according to claim 17, wherein the speech speed conversion processing detects the speech sections of the voice data and applies speech speed conversion only to the voice data in those speech sections.
- The dwelling unit of the intercom system for apartment houses according to claim 18, wherein the voice data loss detection processing detects the loss of voice data in synchronization with the input timing of the voice data and with a first time interval obtained by dividing the time length of one packet of the voice data by a positive integer, and the pitch detection processing detects the pitch in synchronization with the first time interval and with the detection cycle, which is a positive integer multiple of the first time interval.
- The dwelling unit of the intercom system for apartment houses according to claim 17, wherein, when speech speed conversion is performed while the voice data loss detection processing is detecting a loss of voice data, the speech speed conversion processing performs the conversion using the pitch detected by the pitch detection processing immediately before the voice data loss detection processing detected the loss.
- The dwelling unit of the intercom system for apartment houses according to claim 17, wherein, when speech speed conversion is performed while the voice data loss detection processing is detecting a loss of voice data, the speech speed conversion processing performs the conversion using the pitch detected by the pitch detection processing from the voice data compensated by the voice data loss compensation processing.
- The dwelling unit of the intercom system for apartment houses according to claim 18, wherein the pitch detection processing distinguishes between speech sections and non-speech sections of the voice data and makes the detection cycle in non-speech sections longer than the detection cycle in speech sections.
- The dwelling unit of the intercom system for apartment houses according to any one of claims 1 to 24, wherein the second software includes a program for voice switch processing that suppresses howling by reducing the loop gain of a closed loop formed by an acoustic echo path caused by acoustic coupling between the microphone and the speaker, and the voice switch processing program causes the call processing unit to estimate the feedback gain of the acoustic echo path; to calculate, based on the estimated feedback gain, the sum of a receiving-side attenuation that attenuates the received voice data output from the transmission processing unit and a transmitting-side attenuation that attenuates the transmitted voice data input to the transmission processing unit; to monitor the transmitted and received voice data to estimate the call state; to determine the distribution between the transmitting-side attenuation and the receiving-side attenuation according to the estimated call state and the calculated sum; and to reduce the sum according to the amount by which the estimated feedback gain decreases.
- The dwelling unit of the intercom system for apartment houses according to any one of claims 1 to 25, further comprising an extension connection line to which a call device installed in the dwelling is connected, and an extension analog signal transmission unit that transmits an analog voice signal via the extension connection line, wherein voice data subjected to call processing by the call processing unit executing the first software is transmitted from the extension analog signal transmission unit to the call device via the extension connection line.
- The dwelling unit of the intercom system for apartment houses according to any one of claims 1 to 26, wherein the first software includes a program for speech speed conversion processing that detects the pitch of the voice from a digital voice signal obtained by A/D conversion of the analog voice signal and expands or compresses the digital voice signal using the pitch.
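The fluctuation absorption steps recited in the claims above (the counting step, the representative value taken from the past N counts, and the buffer size changing step) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: all function names are hypothetical, times are in milliseconds, an empty buffer is simply counted as 0 (the negative count for consecutive zeros is not modelled), and returning "keep" when the representative value equals the reference value is an assumption.

```python
from collections import deque


def packet_count(num_buffered, latest_rx_ms, now_ms, period_ms):
    """Counting step: every buffered packet counts 1, except the latest,
    which counts (calculation time - reception time) / packetization period.
    """
    if num_buffered <= 0:
        return 0.0  # assumption: empty buffer counts as zero
    return (num_buffered - 1) + (now_ms - latest_rx_ms) / period_ms


def representative_value(history, n):
    """Representative value: the n-th smallest of the held count values."""
    return sorted(history)[n - 1]


def resize_action(representative, reference):
    """Buffer size changing step: decide whether to delete or insert."""
    if representative > reference:
        return "delete"   # buffer holds more than needed
    if representative < reference:
        return "insert"   # buffer is running short
    return "keep"         # assumption: no change when equal


# Hold the past N = 5 count values (illustrative numbers only).
history = deque(maxlen=5)
for count in (3.0, 1.5, 4.0, 2.25, 5.0):
    history.append(count)

rep = representative_value(history, n=2)
action = resize_action(rep, reference=2.0)
```

Taking a low-order statistic of the recent counts, instead of the latest count alone, makes the resize decision less sensitive to a single late or early packet.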
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2012517086A JP5544012B2 (en) | 2010-05-24 | 2010-07-27 | Apartment unit intercom system dwelling unit and apartment house intercom system |
CN201080067044.6A CN102918825B (en) | 2010-05-24 | 2010-07-27 | Dwelling master unit multidwelling intercom system |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2010118723 | 2010-05-24 | ||
JP2010-118723 | 2010-05-24 | ||
JP2010-129196 | 2010-06-04 | ||
JP2010129196 | 2010-06-04 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011148519A1 (en) | 2011-12-01 |
Family
ID=45003524
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/062581 WO2011148519A1 (en) | 2010-05-24 | 2010-07-27 | Dwelling unit device for interphone system for residential complex |
Country Status (4)
Country | Link |
---|---|
JP (1) | JP5544012B2 (en) |
CN (1) | CN102918825B (en) |
TW (1) | TWI442759B (en) |
WO (1) | WO2011148519A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2014110554A (en) * | 2012-12-03 | 2014-06-12 | Denso Corp | Hands-free speech apparatus |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2824917A1 (en) * | 2013-07-08 | 2015-01-14 | Fermax Design & Development, S.L.U. | Two-wire multichannel video door system |
US9947334B2 (en) * | 2014-12-12 | 2018-04-17 | Qualcomm Incorporated | Enhanced conversational communications in shared acoustic space |
JP5984029B1 (en) * | 2015-12-24 | 2016-09-06 | パナソニックIpマネジメント株式会社 | Doorphone system and communication control method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001211254A (en) * | 2000-01-27 | 2001-08-03 | Matsushita Electric Ind Co Ltd | Information terminal and information terminal system |
JP2003324525A (en) * | 2002-05-06 | 2003-11-14 | Sharp Corp | System and method for virtual multiline telephony in a home-network telephone |
JP2005109833A (en) * | 2003-09-30 | 2005-04-21 | Aiphone Co Ltd | Interphone device |
JP2008061005A (en) * | 2006-08-31 | 2008-03-13 | Aiphone Co Ltd | Apartment building intercom system |
JP2010028771A (en) * | 2008-07-24 | 2010-02-04 | Panasonic Electric Works Co Ltd | Intercom system for multiple dwelling houses |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0588041U (en) * | 1992-04-27 | 1993-11-26 | 日通工株式会社 | Home bus |
CA2327813A1 (en) * | 1999-12-07 | 2001-06-07 | Kazuo Yahiro | Information terminal and information terminal system |
FR2911598B1 (en) * | 2007-01-22 | 2009-04-17 | Soitec Silicon On Insulator | SURFACE RUGOSIFICATION METHOD |
2010
- 2010-07-27 JP JP2012517086A patent/JP5544012B2/en active Active
- 2010-07-27 WO PCT/JP2010/062581 patent/WO2011148519A1/en active Application Filing
- 2010-07-27 CN CN201080067044.6A patent/CN102918825B/en not_active Expired - Fee Related
- 2010-07-27 TW TW99124750A patent/TWI442759B/en active
Also Published As
Publication number | Publication date |
---|---|
TWI442759B (en) | 2014-06-21 |
JP5544012B2 (en) | 2014-07-09 |
CN102918825B (en) | 2014-05-07 |
CN102918825A (en) | 2013-02-06 |
JPWO2011148519A1 (en) | 2013-07-25 |
TW201143350A (en) | 2011-12-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5061853B2 (en) | Echo canceller and echo cancel program | |
US20080205632A1 (en) | Packet voice system with far-end echo cancellation | |
CN103391381A (en) | Method and device for canceling echo | |
JP5086769B2 (en) | Loudspeaker | |
JPS59193660A (en) | Conference telephone set | |
JP5544012B2 (en) | Apartment unit intercom system dwelling unit and apartment house intercom system | |
JP3385221B2 (en) | Echo canceller | |
JP5821022B2 (en) | External line transfer device for intercom system for apartment houses | |
JP5923705B2 (en) | Call signal processing device | |
US8737601B2 (en) | Echo canceller | |
JP2003051879A (en) | Speech device | |
JP5963077B2 (en) | Telephone device | |
JP4543896B2 (en) | Echo cancellation method, echo canceller, and telephone repeater | |
JP4380688B2 (en) | Telephone device | |
JP3864915B2 (en) | Loudspeaker calling system for apartment houses | |
JP4079008B2 (en) | Loudspeaker calling system for apartment houses | |
JP2007124163A (en) | Call apparatus | |
JP3903933B2 (en) | Telephone device | |
JP4346414B2 (en) | Signal processing device, computer program | |
JP2004274683A (en) | Echo canceler, echo canceling method, program, and recording medium | |
JP4655719B2 (en) | Intercom system for housing complex | |
JP3442535B2 (en) | Echo canceller | |
JP2005333586A (en) | Interphone | |
JP2004260491A (en) | Voice switching apparatus | |
JP2003324369A (en) | Battery operated calling apparatus |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | WWE | Wipo information: entry into national phase | Ref document number: 201080067044.6; Country of ref document: CN |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 10852185; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2012517086; Country of ref document: JP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 10852185; Country of ref document: EP; Kind code of ref document: A1 |