Detailed description of the embodiments
To make the foregoing objectives, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, a flow chart of the steps of embodiment one of an echo cancellation method of the present invention is shown, which may specifically include the following steps:
Step 101, the first video conference terminal determines a filter coefficient;
It should be noted that this method can be applied to view networking. View networking is an important milestone in network development: it is a real-time network that enables high-definition video transmission and moves numerous Internet applications toward high-definition, face-to-face video.
View networking uses real-time high-definition video switching technology to integrate dozens of video, voice, picture, text, communication, and data services on a single network platform, such as high-definition video conferencing, video surveillance, intelligent monitoring and analysis, emergency command, digital broadcast television, time-shifted television, online teaching, live broadcast, video on demand (VOD), television mail, personal video recording (PVR), intranet (self-managed) channels, intelligent video broadcast control, and information publishing, and delivers high-definition video playback through a television or computer.
To help those skilled in the art better understand the embodiments of the present invention, view networking is introduced first.
Some of the technologies applied in view networking are described below:
Network technology (Network Technology): the network technology innovation of view networking improves on traditional Ethernet to cope with the potentially huge video traffic on the network. Unlike pure packet switching (Packet Switching) or pure circuit switching (Circuit Switching), view networking uses packet switching to meet streaming requirements. View networking technology has the flexibility, simplicity, and low cost of packet switching while also providing the quality and security guarantees of circuit switching, achieving seamless connection of network-wide switched virtual circuits and data formats.
Switching technology (Switching Technology): view networking adopts the two advantages of Ethernet, asynchrony and packet switching, and eliminates the defects of Ethernet while remaining fully compatible with it. It provides network-wide end-to-end seamless connection, reaches user terminals directly, and directly carries IP data packets. User data requires no format conversion anywhere in the network. View networking is a more advanced form of Ethernet and a real-time switching platform; it can achieve the network-wide, large-scale, real-time transmission of high-definition video that the current Internet cannot, pushing numerous network video applications toward high definition and unification.
Server technology (Server Technology): the server technology of view networking and the unified video platform differs from that of traditional servers. Its streaming media transmission is connection-oriented, its data-processing capacity is independent of traffic and communication time, and a single network layer can carry both signaling and data. For voice and video services, streaming media processing on view networking and the unified video platform is much simpler than data processing, and efficiency is improved by more than a hundred times over traditional servers.
Storage technology (Storage Technology): to accommodate media content of very large capacity and very high traffic, the ultra-high-speed storage technology of the unified video platform uses an advanced real-time operating system. Program information in a server instruction is mapped to specific hard disk space, so media content no longer passes through the server but is delivered directly to the user terminal; the typical user waiting time is less than 0.2 seconds. Optimized sector distribution greatly reduces the mechanical seek movement of the hard disk heads. Resource consumption is only 20% of that of an IP Internet system of the same grade, yet the concurrent traffic generated is more than 3 times that of a traditional disk array, and overall efficiency is improved by more than 10 times.
Network security technology (Network Security Technology): the structural design of view networking eliminates, at the structural level, the network security problems that trouble the Internet, through means such as an independent permission check for each service and complete isolation of equipment and user data. It generally requires no antivirus programs or firewalls, blocks attacks by hackers and viruses, and provides users with a structurally worry-free secure network.
Service innovation technology (Service Innovation Technology): the unified video platform integrates services with transmission; whether for a single user, a private-network user, or an entire network, only one automatic connection is needed. A user terminal, set-top box, or PC connects directly to the unified video platform to obtain a rich variety of multimedia video services. The unified video platform replaces traditional complex application programming with a "menu-style" table configuration mode, so that complex applications can be implemented with very little code, enabling "unlimited" new service innovation.
The networking of view networking is described below:
View networking is a centrally controlled network structure. The network may be a tree network, a star network, a ring network, or another type, but on this basis a centralized control node is required in the network to control the whole network.
Fig. 2 is a networking schematic diagram of view networking of the present invention. As shown in Fig. 2, view networking is divided into two parts: the access network and the metropolitan area network.
The equipment of the access network part can be mainly divided into 3 classes: node servers, access switches, and terminals (including various set-top boxes, encoding boards, storage devices, etc.). A node server is connected to access switches; an access switch can be connected to multiple terminals and can also be connected to Ethernet.
The node server is the node that performs centralized control in the access network, and can control the access switches and terminals. A node server can be connected directly to an access switch, and can also be connected directly to a terminal.
Fig. 3 is a hardware structure diagram of a node server of the present invention. As shown in Fig. 3, the node server mainly includes a network interface module 301, a switching engine module 302, a CPU module 303, and a disk array module 304.
Packets arriving from the network interface module 301, the CPU module 303, and the disk array module 304 all enter the switching engine module 302. The switching engine module 302 looks up the address table 305 for each incoming packet to obtain the forwarding information of the packet, and stores the packet in the corresponding queue of the packet buffer 306 according to that forwarding information; if the queue of the packet buffer 306 is nearly full, the packet is discarded. The switching engine module 302 polls all packet buffer queues and forwards from a queue if the following conditions are met: 1) the send buffer of the port is not full; 2) the packet counter of the queue is greater than zero. The disk array module 304 mainly controls the hard disks, including operations such as initialization, reading, and writing. The CPU module 303 is mainly responsible for protocol processing with the access switches and terminals (not shown), for configuring the address table 305 (including the downlink protocol packet address table, the uplink protocol packet address table, and the data packet address table), and for configuring the disk array module 304.
Fig. 4 is a hardware structure diagram of an access switch of the present invention. As shown in Fig. 4, the access switch mainly includes network interface modules (a downlink network interface module 401 and an uplink network interface module 402), a switching engine module 403, and a CPU module 404.
A packet arriving from the downlink network interface module 401 (upstream data) enters the packet detection module 405. The packet detection module 405 checks whether the destination address (DA), source address (SA), packet type, and packet length of the packet meet the requirements; if so, it assigns a corresponding stream identifier (stream-id) and the packet enters the switching engine module 403; otherwise the packet is discarded. A packet arriving from the uplink network interface module 402 (downstream data) enters the switching engine module 403, as does a data packet arriving from the CPU module 404. The switching engine module 403 looks up the address table 406 for each incoming packet to obtain the forwarding information of the packet.
If a packet entering the switching engine module 403 is going from a downlink network interface to an uplink network interface, the packet is stored in the corresponding queue of the packet buffer 407 according to its stream identifier (stream-id); if that queue of the packet buffer 407 is nearly full, the packet is discarded. If a packet entering the switching engine module 403 is not going from a downlink network interface to an uplink network interface, the packet is stored in the corresponding queue of the packet buffer 407 according to its forwarding information; if that queue of the packet buffer 407 is nearly full, the packet is discarded.
The switching engine module 403 polls all packet buffer queues. In embodiments of the present invention, two cases are distinguished:
If the queue goes from a downlink network interface to an uplink network interface, a packet is forwarded when the following conditions are met: 1) the send buffer of the port is not full; 2) the packet counter of the queue is greater than zero; 3) a token generated by the rate control module has been obtained.
If the queue does not go from a downlink network interface to an uplink network interface, a packet is forwarded when the following conditions are met: 1) the send buffer of the port is not full; 2) the packet counter of the queue is greater than zero.
The rate control module 408 is configured by the CPU module 404. At programmable intervals it generates tokens for all packet buffer queues going from downlink network interfaces to uplink network interfaces, in order to control the bit rate of upstream forwarding.
The CPU module 404 is mainly responsible for protocol processing with the node server, for configuring the address table 406, and for configuring the rate control module 408.
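The forwarding conditions of the access switch can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names PacketQueue, grant_tokens, and poll_once are hypothetical, the per-port send buffer is reduced to a single flag, and one token is assumed to permit one packet.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PacketQueue:
    """One packet buffer queue (hypothetical model, not the patented hardware)."""
    upstream: bool                       # True: downlink interface -> uplink interface
    packets: deque = field(default_factory=deque)
    tokens: int = 0                      # tokens granted by the rate control module


def grant_tokens(queues, tokens_per_interval=1):
    """Rate control: once per interval, give tokens to every upstream queue."""
    for q in queues:
        if q.upstream:
            q.tokens += tokens_per_interval


def poll_once(queues, send_buffer_full):
    """Poll every queue once and return the packets that were forwarded."""
    forwarded = []
    for q in queues:
        # Conditions 1) and 2): send buffer not full, queue not empty.
        if send_buffer_full or len(q.packets) == 0:
            continue
        if q.upstream:
            # Condition 3): an upstream queue also needs a token.
            if q.tokens == 0:
                continue
            q.tokens -= 1
        forwarded.append(q.packets.popleft())
    return forwarded
```

With one upstream queue and one other queue, the upstream queue stays blocked until grant_tokens has run, which is how the token interval bounds the upstream bit rate in this sketch.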
The equipment of the access network part further includes an Ethernet protocol conversion gateway. Fig. 5 is a hardware structure diagram of an Ethernet protocol conversion gateway of the present invention. As shown in Fig. 5, the Ethernet protocol conversion gateway mainly includes network interface modules (a downlink network interface module 501 and an uplink network interface module 502), a switching engine module 503, a CPU module 504, a packet detection module 505, a rate control module 508, an address table 506, a packet buffer 507, a MAC adding module 509, and a MAC removing module 510.
A data packet arriving from the downlink network interface module 501 enters the packet detection module 505. The packet detection module 505 checks whether the Ethernet MAC DA, Ethernet MAC SA, Ethernet length or frame type, view networking destination address DA, view networking source address SA, view networking packet type, and packet length of the data packet meet the requirements. If so, a corresponding stream identifier (stream-id) is assigned; the MAC removing module 510 then strips the MAC DA, the MAC SA, and the length or frame type (2 bytes), and the packet enters the corresponding receive buffer. Otherwise the packet is discarded.
The downlink network interface module 501 checks the send buffer of the port. If there is a packet, it determines the Ethernet MAC DA of the corresponding terminal from the view networking destination address DA of the packet, adds the Ethernet MAC DA of the terminal, the MAC SA of the Ethernet protocol conversion gateway, and the Ethernet length or frame type, and sends the packet.
The functions of the other modules in the Ethernet protocol conversion gateway are similar to those of the access switch.
A terminal of the access network part mainly includes a network interface module, a service processing module, and a CPU module. For example, a set-top box mainly includes a network interface module, a video and audio encoding and decoding engine module, and a CPU module; an encoding board mainly includes a network interface module, a video encoding engine module, and a CPU module; a storage device mainly includes a network interface module, a CPU module, and a disk array module.
Similarly, the equipment of the metropolitan area network part can also be divided into 3 classes: metropolitan area servers, node switches, and node servers. A metropolitan area server is connected to node switches, and a node switch can be connected to multiple node servers. A node switch mainly includes a network interface module, a switching engine module, and a CPU module; a metropolitan area server likewise mainly includes a network interface module, a switching engine module, and a CPU module.
Here, the node server is the node server of the access network part; that is, the node server belongs both to the access network part and to the metropolitan area network part.
The metropolitan area server is the node that performs centralized control in the metropolitan area network, and can control the node switches and node servers. A metropolitan area server can be connected directly to a node switch, and can also be connected directly to a node server.
It can be seen that the view networking network as a whole is a layered, centrally controlled network structure, while the networks controlled under the node servers and metropolitan area servers can have various structures such as tree, star, and ring.
Figuratively speaking, an access network part can form a unified video platform (the part inside the dashed circle), and multiple unified video platforms can form view networking; the unified video platforms can be interconnected through metropolitan area and wide area view networking.
View networking data packets include access network data packets and metropolitan area network data packets.
An access network data packet mainly includes the following parts: destination address (DA), source address (SA), reserved bytes, payload (PDU), and CRC.
As shown in the table below, a data packet of the access network mainly includes the following parts:
DA | SA | Reserved | Payload | CRC
The destination address (DA) consists of 8 bytes. The first byte indicates the type of the data packet (for example, the various protocol packets, multicast data packets, unicast data packets, etc.), with at most 256 possibilities; the second to sixth bytes are the metropolitan area network address; and the seventh and eighth bytes are the access network address.
The source address (SA) also consists of 8 bytes and is defined in the same way as the destination address (DA).
The reserved bytes consist of 2 bytes.
The payload has a different length depending on the type of data packet: 64 bytes for the various protocol packets, and 32 + 1024 = 1056 bytes for unicast or multicast data packets; it is of course not limited to these 2 cases.
The CRC consists of 4 bytes, and its calculation follows the standard Ethernet CRC algorithm.
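The field layout described above can be sketched in Python as follows. This is a hedged illustration, not the actual wire format: the function names are hypothetical, big-endian packing of the address fields is an assumption, and appending the CRC least-significant byte first is likewise assumed. zlib.crc32 is used because it implements the same CRC-32 polynomial as the Ethernet frame check sequence.

```python
import struct
import zlib

PROTOCOL_PAYLOAD_LEN = 64          # protocol packets
DATA_PAYLOAD_LEN = 32 + 1024       # unicast or multicast data packets


def make_address(packet_type, metro_addr, access_addr):
    """Pack an 8-byte address: 1 type byte, 5 metro bytes, 2 access bytes."""
    return (bytes([packet_type])
            + metro_addr.to_bytes(5, "big")      # byte order is an assumption
            + access_addr.to_bytes(2, "big"))


def parse_address(addr):
    """Split an 8-byte address into (type, metro address, access address)."""
    if len(addr) != 8:
        raise ValueError("address must be 8 bytes")
    return addr[0], int.from_bytes(addr[1:6], "big"), int.from_bytes(addr[6:8], "big")


def build_access_packet(da, sa, payload):
    """Assemble DA | SA | Reserved (2 bytes) | Payload | CRC (4 bytes)."""
    body = da + sa + b"\x00\x00" + payload
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return body + struct.pack("<I", crc)         # CRC byte order is an assumption
```

A protocol packet built this way is 8 + 8 + 2 + 64 + 4 = 86 bytes long, and parse_address applied to its first 8 bytes recovers the type, metropolitan area network address, and access network address.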
The topology of the metropolitan area network is a graph, and there may be 2 or even more connections between two devices; that is, there may be more than 2 connections between a node switch and a node server, or between two node switches. However, the metropolitan area network address of a metropolitan area network device is unique. To describe the connection relationships between metropolitan area network devices precisely, embodiments of the present invention introduce a parameter, the label, to uniquely describe a metropolitan area network device.
In view networking, the definition of the label is similar to that of the label in MPLS (Multi-Protocol Label Switching). Assuming there are two connections between device A and device B, a data packet going from device A to device B has 2 labels available, and a data packet going from device B to device A also has 2 labels available. Labels are divided into in-labels and out-labels: assuming the label of a data packet entering device A (the in-label) is 0x0000, the label when this data packet leaves device A (the out-label) may become 0x0001. Joining the metropolitan area network is a process carried out under centralized control, which means that both address allocation and label allocation in the metropolitan area network are directed by the metropolitan area server, while node switches and node servers only execute passively. This differs from label allocation in MPLS, where labels are the result of negotiation between switches and servers.
As shown in the table below, a data packet of the metropolitan area network mainly includes the following parts:
DA | SA | Reserved | Label | Payload | CRC
That is: destination address (DA), source address (SA), reserved bytes (Reserved), label, payload (PDU), and CRC. The format of the label may be defined as follows: the label is 32 bits, of which the high 16 bits are reserved and only the low 16 bits are used; it is located between the reserved bytes and the payload of the data packet.
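As a rough illustration of the label field and of in-label to out-label replacement, the following Python sketch packs a 16-bit label into the 32-bit field and rewrites it at a fixed offset. The function names and the label table are hypothetical, big-endian byte order is an assumption, and recomputation of the trailing CRC after the rewrite is deliberately left out.

```python
LABEL_OFFSET = 8 + 8 + 2      # DA (8) + SA (8) + Reserved (2)
LABEL_LEN = 4                 # 32-bit label field


def pack_label(label):
    """Encode a label: high 16 bits reserved (zero), low 16 bits used."""
    if not 0 <= label <= 0xFFFF:
        raise ValueError("only the low 16 bits of the label are used")
    return label.to_bytes(LABEL_LEN, "big")      # byte order is an assumption


def unpack_label(field):
    """Decode a label field, ignoring the reserved high 16 bits."""
    return int.from_bytes(field, "big") & 0xFFFF


def swap_label(packet, label_table):
    """Replace the in-label of a packet with the out-label from label_table.

    label_table is a hypothetical mapping that, in the scheme described
    above, would be assigned by the metropolitan area server.
    """
    end = LABEL_OFFSET + LABEL_LEN
    in_label = unpack_label(packet[LABEL_OFFSET:end])
    out_label = label_table[in_label]
    return packet[:LABEL_OFFSET] + pack_label(out_label) + packet[end:]
```

With the table {0x0000: 0x0001}, a packet that enters with in-label 0x0000 leaves with out-label 0x0001, matching the example in the text.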
In embodiments of the present invention, view networking may include a view networking server, a first video conference terminal, and a second video conference terminal.
The first video conference terminal and the second video conference terminal can each be a set-top box (SetTopBox, STB), a device that connects a television set to an external signal source; it converts a compressed digital signal into television content and displays it on the television set.
In general, a set-top box can be connected to a camera and a microphone for capturing multimedia data such as video data and audio data, and can also be connected to a television set for playing multimedia data such as video data and audio data.
In application scenarios such as video conferencing, the first video conference terminal and the second video conference terminal act as external signal sources for each other. That is, the first video conference terminal can capture multimedia data and send it through the view networking server to the second video conference terminal, which plays the received multimedia data; at the same time, the second video conference terminal can also capture multimedia data and send it through the view networking server to the first video conference terminal, which plays the received multimedia data.
The present embodiment is described using the example in which audio data is captured by the second video conference terminal and sent through the view networking server to the first video conference terminal. It should be noted that, in a video conference scenario, the operations performed by the first video conference terminal and the second video conference terminal should be the same. That is, while the first video conference terminal receives the audio data sent by the second video conference terminal and performs an echo cancellation operation on it, the second video conference terminal can also use the method of this embodiment to perform an echo cancellation operation on the audio data it receives that was captured by the first video conference terminal.
In embodiments of the present invention, before performing the echo cancellation operation, the first video conference terminal can first determine a filter coefficient. The filter coefficient refers to the coefficient of the adaptive filter in the first video conference terminal that actually performs the echo cancellation operation on the received audio data.
In a specific implementation, the filter coefficient can be fixed at a particular value. Once the filter coefficient is determined, the range of working delay of the adaptive filter corresponding to that filter coefficient can be determined accordingly. The filter coefficient determines the convergence of the echo cancellation algorithm; in practical applications the algorithm is required to converge quickly and remain stable, that is, the filter is required to converge quickly and then work stably with this coefficient.
It should be noted that those skilled in the art can set the specific value of the filter coefficient according to actual needs; the embodiments of the present invention do not limit this.
Step 102, the first video conference terminal calculates the fixed delay between itself and the second video conference terminal;
In embodiments of the present invention, the first video conference terminal, the second video conference terminal, the view networking server, and other equipment can together form a video conferencing system. The fixed delay between the first video conference terminal and the second video conference terminal refers to the fixed delay of the current video conferencing system.
In a specific implementation, the fixed delay can be obtained as follows: without any data buffering at the first video conference terminal, the captured audio data and the played audio data are saved in real time as audio files, and the files are analyzed with an audio analysis tool.
Step 103, the first video conference terminal obtains the initial data amount of a reference data buffer;
In embodiments of the present invention, the initial data amount refers to the amount of data buffered in the reference data buffer before the first video conference terminal performs the echo cancellation operation.
Step 104, the first video conference terminal adjusts the initial data amount to a target data amount according to the filter coefficient and the fixed delay;
In general, after receiving the audio data captured by the second video conference terminal, the first video conference terminal sends the received audio data to the sound card and, at the same time, also needs to send it to the echo cancellation algorithm as a reference. The echo cancellation operation performed by the echo cancellation algorithm is an adaptive filtering process. Echo cancellation mainly works by echo offsetting: the magnitude of the echo data is estimated adaptively, and this estimate is then subtracted from the received signal to cancel the echo. This requires that the reference data arrive before the echo data.
Therefore, the system delay can be changed by adjusting the initial data amount in the reference data buffer, so that the fixed delay in the system is close to the best working delay corresponding to the filter coefficient and the above requirement is met.
For example, if the best working delay corresponding to the filter coefficient is 200 ms and the fixed delay of the system is 300 ms, then, to bring the fixed delay of the system close to the algorithmic delay of the filter, the amount of data in the reference data buffer can be increased so that an additional 100 ms of data is buffered.
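The arithmetic of this example can be written out as a short Python sketch. The audio format (16 kHz, 16-bit, mono PCM) and the function names are assumptions chosen only for illustration; the source does not specify them.

```python
def extra_buffer_ms(fixed_delay_ms, working_delay_ms):
    """Delay that the reference data buffer has to absorb, in milliseconds."""
    return fixed_delay_ms - working_delay_ms


def delay_to_bytes(delay_ms, sample_rate_hz=16000, bytes_per_sample=2, channels=1):
    """Convert a delay to a PCM data amount (the audio format is an assumption)."""
    return delay_ms * sample_rate_hz * bytes_per_sample * channels // 1000


# Example from the text: 300 ms fixed delay, 200 ms best working delay.
extra_ms = extra_buffer_ms(300, 200)
extra_bytes = delay_to_bytes(extra_ms)
```

Under these assumptions the extra 100 ms corresponds to 3200 bytes of reference data.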
Step 105, the first video conference terminal receives audio data sent by the view networking server over a downlink communication link, the audio data having been captured by the second video conference terminal;
In a specific implementation, during a video conference, the far-end video conference terminal, i.e., the second video conference terminal, can capture audio data and send it to the view networking server over an uplink communication link. After receiving the audio data, the view networking server first determines the destination address of the audio data and then sends it to the first video conference terminal over the downlink communication link. After receiving the audio data, the first video conference terminal, i.e., the local video conference terminal, can perform the echo cancellation operation on it.
Step 106, the first video conference terminal performs an echo cancellation operation on the audio data according to the target data amount.
In general, after the far-end audio data is played by the local sound card, it passes through the echo path and is captured again together with the local speech; the resulting data is the echo that the echo cancellation algorithm, i.e., the adaptive filter, has to eliminate.
For example, after the first video conference terminal receives the audio data sent by the second video conference terminal and plays it locally, the sound played by the loudspeaker travels through the air or is reflected by walls, reaches the microphone again, and is captured together with the local speech. If it is transmitted back to the second video conference terminal, the far end hears a noticeable echo, which interferes with normal conversation. Therefore, to improve call quality during a video conference, this part of the audio data should be eliminated as far as possible.
In a specific implementation, the local voice data captured while the audio data sent by the far end is being played can be passed through the reference data buffer to the adaptive filter, and the adaptive filter performs echo cancellation processing to eliminate the echo.
In embodiments of the present invention, by determining the filter coefficient and the fixed delay between itself and the second video conference terminal, the first video conference terminal can adjust the initial data amount in the reference data buffer to the target data amount. Then, after receiving the audio data captured by the second video conference terminal and sent through the view networking server, it can perform the echo cancellation operation on that audio data to eliminate the echo. By adjusting the amount of data in the reference data buffer, the present embodiment brings the fixed delay of the system close to the working delay corresponding to the filter coefficient, so that the timing between the reference data and the echo data reaches a dynamic balance. This satisfies the timing synchronization that the echo cancellation algorithm requires of the audio data, thereby eliminating the echo and improving call quality in the video conference.
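The source does not name a specific adaptive filtering algorithm. As one common choice, the following Python sketch uses a normalized least-mean-squares (NLMS) filter to show why the reference data has to line up with the echo. NLMS itself, the function name, and the values of taps and step are assumptions made for illustration, not the method of the embodiment.

```python
def nlms_echo_cancel(reference, microphone, taps=8, step=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo from the microphone signal.

    reference  : far-end samples as played back (the reference data)
    microphone : captured samples containing the echo of the reference
    Returns the residual signal after echo cancellation.
    """
    weights = [0.0] * taps
    history = [0.0] * taps                  # recent reference samples, newest first
    residual = []
    for ref, mic in zip(reference, microphone):
        history = [ref] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = mic - echo_estimate         # what remains after subtracting the estimate
        power = sum(x * x for x in history) + eps
        weights = [w + step * error * x / power
                   for w, x in zip(weights, history)]
        residual.append(error)
    return residual
```

The filter can only model an echo whose delay relative to the reference falls within its taps. This is the role that adjusting the reference data buffer plays in the method described above: it keeps the reference and the echo aligned within the range the filter can handle.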
Referring to Fig. 6, a flow chart of the steps of embodiment two of an echo cancellation method of the present invention is shown, which may specifically include the following steps:
Step 601, the first video conference terminal determines a filter coefficient;
It should be noted that this method can be applied to view networking, which may include a view networking server and video conference terminals. There may be at least two video conference terminals, namely the first video conference terminal and the second video conference terminal. One video conference terminal can capture multimedia data such as video data and audio data and transmit it through the view networking server to another video conference terminal, which plays the multimedia data on receipt, thereby enabling a real-time video conference between at least two parties.
The embodiments of the present invention are described with the first video conference terminal as the local video conference terminal and the second video conference terminal as the far-end video conference terminal. That is, the first video conference terminal receives multimedia data such as video data and audio data sent by the second video conference terminal and plays it locally, to realize a video conference between the local user and the far-end user. Of course, during the video conference the local video conference terminal can also capture local multimedia data such as video data and audio data in real time and transmit it to the far-end video conference terminal for playback; the operating procedures of the two parties are essentially the same.
In general, when the first video conference terminal receives the audio data sent by the second video conference terminal and plays it through a loudspeaker, the played sound travels through the air or is reflected by walls, reaches the microphone again, and is captured together with the local speech. If the captured local speech is transmitted to the second video conference terminal, the far-end user hears a very noticeable echo. Therefore, to improve call quality during a video conference, this echo needs to be eliminated.
In embodiments of the present invention, before the echo is eliminated, the filter coefficient of the adaptive filter can first be determined. In practice, after receiving the far-end audio data, the first video conference terminal passes the audio data to the adaptive filter, and the adaptive filter performs echo cancellation processing on the audio data using the echo cancellation algorithm.
Step 602, the fixed delay between the first video conference terminal and the second video conference terminal is calculated;
In embodiments of the present invention, the fixed delay between the first video conference terminal and the second video conference terminal refers to the fixed delay of the video conferencing system formed jointly by the various devices, such as the multiple video conference terminals described above.
In embodiments of the present invention, without any data buffering at the first video conference terminal, the captured audio data and the played audio data can be saved in real time as audio files, and the saved audio files can be analyzed with an audio analysis tool to obtain the fixed delay.
In a specific implementation, the reference data buffer can first be emptied, and target audio data is then captured and played. A first audio file and a second audio file are generated from the captured and the played target audio data respectively, and an audio analysis tool calculates the delay between the first audio file and the second audio file, thereby obtaining the fixed delay of the system.
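The source leaves the audio analysis tool unspecified. One way such a tool could estimate the delay between the played and the captured recordings is a brute-force cross-correlation search, sketched here in Python. The technique, the function names, and the max_lag parameter are assumptions made for illustration.

```python
def estimate_delay_samples(played, captured, max_lag):
    """Return the lag (in samples) at which captured best matches played."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(played), len(captured) - lag)
        if n <= 0:
            break
        # Correlation between played and captured shifted back by lag samples.
        score = sum(played[i] * captured[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def samples_to_ms(lag_samples, sample_rate_hz):
    """Convert a lag in samples to milliseconds."""
    return lag_samples * 1000 / sample_rate_hz
```

The two sample lists stand in for the decoded contents of the two audio files. A real tool would likely use an FFT-based correlation for long recordings; the search above is quadratic and only meant to show the idea.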
Step 603, the initial data amount of the reference data buffer is obtained;
In embodiments of the present invention, the initial data amount refers to the amount of data buffered in the reference data buffer before the first video conference terminal performs the echo cancellation operation.
Step 604, the working delay corresponding to the filter coefficient is determined;
In embodiments of the present invention, the working delay refers to the best working delay corresponding to the filter coefficient of the adaptive filter. In general, once the filter coefficient is determined, the working delay can be obtained.
Step 605, the delay difference between the fixed delay and the working delay is calculated;
For example, assuming the fixed delay of the system is 300 ms and the best working delay corresponding to the filter coefficient is 200 ms, the delay difference between the two is 100 ms.
Of course, the fixed delay of the system may also be smaller than the best working delay corresponding to the filter coefficient. For example, if the fixed delay of the system is 200 ms and the best working delay corresponding to the filter coefficient is 250 ms, the delay difference between the two is -50 ms.
Step 606, the initial data amount is adjusted to the target data amount according to the delay difference;
In embodiments of the present invention, the primary data amount in reference data buffer area is adjusted to target data amount, it can be with
Change system delay, to make the constant time lag in system close to the corresponding best effort delay of filter factor.
Therefore, in a specific implementation, when the delay difference is greater than zero, data may be buffered in the reference data buffer so that the amount of data after buffering equals the data amount corresponding to the delay difference; when the delay difference is less than zero, part of the data may be discarded so that the amount of data remaining in the reference data buffer equals the data amount corresponding to the delay difference.
It should be noted that the target data amount in the adjusted buffer need not be exactly equal to the data amount corresponding to the delay difference; it only needs to lie within a certain range of it.
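Steps 605 and 606 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, padding with zero bytes (silence) is an assumption, and the negative case is read as "drop the oldest data worth the magnitude of the delay difference", which is only one possible reading of the text. A rate of 64 bytes per millisecond corresponds to 32 kHz, 16-bit, mono data.

```python
def delay_difference_ms(fixed_delay_ms: int, working_delay_ms: int) -> int:
    """Step 605: fixed delay minus the filter's optimal working delay."""
    return fixed_delay_ms - working_delay_ms


def adjust_reference_buffer(buffer: bytearray, diff_ms: int, bytes_per_ms: int) -> None:
    """Step 606 (sketch): bring the buffered amount in line with the delay difference.

    Assumptions: padding uses zero bytes (silence) and the oldest data is
    dropped first; the source text specifies neither.
    """
    if diff_ms > 0:
        target = diff_ms * bytes_per_ms
        if len(buffer) < target:
            buffer.extend(bytes(target - len(buffer)))  # pad up to the target
        else:
            del buffer[: len(buffer) - target]          # trim down to the target
    elif diff_ms < 0:
        drop = min(len(buffer), -diff_ms * bytes_per_ms)
        del buffer[:drop]                               # discard oldest data
```

With the figures from the text, `delay_difference_ms(300, 200)` yields 100 and `delay_difference_ms(200, 250)` yields -50. Because the text allows the target to be met only approximately, a real implementation could accept any buffered amount within a tolerance of the computed value.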
Step 607, receiving audio data sent by the video networking server over a downlink communication link, the audio data being captured by the second video conference terminal;
In a specific implementation, during a video conference, the far-end video conference terminal, i.e. the second video conference terminal, may capture audio data and send it to the video networking server over an uplink communication link. After receiving the audio data, the video networking server first determines the destination address of the audio data and then sends it to the first video conference terminal over a downlink communication link. After receiving the audio data, the first video conference terminal, i.e. the local video conference terminal, may perform step 608 and step 609 in sequence to carry out the echo cancellation operation on the audio data.
Step 608, capturing local voice data while the audio data is played;
In an embodiment of the present invention, the captured local voice data includes the audio data sent by the second video conference terminal that, after being played through the loudspeaker, travels through the air or is reflected by walls, enters the microphone again, and is captured once more together with the local speech. This part of the data is the echo that the adaptive filter needs to eliminate during echo cancellation.
Step 609, transmitting the local voice data to the adaptive filter via the reference data buffer, the adaptive filter performing the echo cancellation operation on the local voice data.
In an embodiment of the present invention, the local voice data may be transmitted to the adaptive filter via the reference data buffer. Because the data amount in the reference data buffer has been adjusted, the timing of the reference data and the echo data can be synchronized. The adaptive filter can therefore perform echo cancellation effectively and eliminate the echo.
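The text does not say which adaptation rule the adaptive filter uses. As an illustration only, the sketch below assumes a normalized least-mean-squares (NLMS) filter, a widely used choice for echo cancellation; the function name, tap count and step size are hypothetical. It shows why the timing alignment matters: the filter can model only echoes whose lag relative to the reference falls within its `taps` most recent reference samples.

```python
def nlms_echo_cancel(reference, microphone, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the echo of `reference` from `microphone`.

    Assumption: NLMS adaptation. Returns the residual (echo-reduced) signal.
    """
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent reference samples, newest first
    residual = []
    for ref, mic in zip(reference, microphone):
        history = [ref] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = mic - echo_estimate
        # Normalize the step by the reference energy in the window.
        norm = sum(x * x for x in history) + eps
        weights = [w + mu * error * x / norm for w, x in zip(weights, history)]
        residual.append(error)
    return residual
```

If the echo arrived later than `taps` samples after its reference sample, no weight could represent it and the residual would stay large, which is the situation the buffer adjustment in step 606 is meant to avoid.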
For ease of understanding, the echo cancellation method of the present invention is described below with a specific example.
Take a certain video conferencing system as an example. First, the parameters of the audio system are as follows:
Capture side: audio sampling rate 32 kHz, sample precision 16 bit, mono;
Playback side: audio sampling rate 32 kHz, sample precision 16 bit, stereo.
Because the parameters of the reference data and of the captured data must be identical during echo cancellation, the reference data is obtained by converting the playback data to mono, i.e. audio sampling rate 32 kHz, sample precision 16 bit, mono.
The bit rate of audio capture is 512 kbps, so 1 ms of data is 64 B;
The bit rate of audio playback is 1024 kbps, so 1 ms of data is 128 B.
To make comparison with the delay requirements of the echo cancellation algorithm easier, data amounts are usually converted to amounts of time in the following description.
Second, the caches used for audio capture and playback in the system are identified:
Capture cache: data is used as soon as it is captured, and the cache typically holds no more than the minimum data amount required by the next-stage task;
Playback cache: usually the sound card cache. For timing synchronization in echo cancellation, the data amount in the sound card cache may be kept between 3 kB and 6 kB, i.e. between 24 ms and 48 ms of data, to minimize delay jitter.
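The figures above follow directly from the audio parameters. The short sketch below reproduces the arithmetic; the helper names are hypothetical. Note that the 24 ms to 48 ms range matches 3 kB to 6 kB only if 1 kB is taken as 1024 B, while kbps is taken as 1000 bit/s.

```python
def bytes_per_ms(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Data amount of 1 ms of uncompressed PCM audio, in bytes."""
    bits_per_second = sample_rate_hz * bits_per_sample * channels
    return bits_per_second // 8 // 1000


def bytes_to_ms(num_bytes: int, rate_bytes_per_ms: int) -> float:
    """Convert a data amount to the playback time it represents."""
    return num_bytes / rate_bytes_per_ms


CAPTURE_BYTES_PER_MS = bytes_per_ms(32_000, 16, 1)    # 512 kbps -> 64 B per ms
PLAYBACK_BYTES_PER_MS = bytes_per_ms(32_000, 16, 2)   # 1024 kbps -> 128 B per ms

# Sound card cache limits from the text, expressed as time.
CACHE_MIN_MS = bytes_to_ms(3 * 1024, PLAYBACK_BYTES_PER_MS)   # 24.0
CACHE_MAX_MS = bytes_to_ms(6 * 1024, PLAYBACK_BYTES_PER_MS)   # 48.0
```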
One, correctly estimate the fixed delay in the working environment of the current system, and set the filter coefficient of the echo cancellation algorithm to a particular value, which fixes the delay range within which the algorithm works effectively;
For example, if estimation over multiple samples in different working environments shows that the fixed delay of the audio system of the video conference terminal is 200 ms to 300 ms, the coefficient of the filter in the echo cancellation algorithm may be set to a value such that its effective working delay is 150 ms to 250 ms.
Two, by adjusting the data amount in the reference data buffer, bring the delay in the audio system close to the optimal working delay corresponding to the coefficient of the filter in the echo cancellation algorithm;
While the two-channel playback data is sent to the sound card, it is also copied, converted to mono data and sent to the reference data buffer. Assuming that the fixed delay of the current system is 250 ms, the initial data amount in the reference data buffer is set to 100 ms of data. At this point the capture buffer contains no data, so the difference between the data amounts of the reference data buffer and the captured data buffer is 100 ms of data. The system delay thus becomes 150 ms, which satisfies the delay requirement for the algorithm to work effectively.
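The two operations in this example, the stereo-to-mono conversion and the effect of the buffered reference data on the delay, can be sketched as follows. The text does not say how the channels are combined; averaging left and right is assumed here, as is interleaved 16-bit little-endian PCM. The function names are hypothetical.

```python
import struct


def stereo_to_mono_s16le(stereo: bytes) -> bytes:
    """Downmix interleaved 16-bit little-endian stereo PCM to mono.

    Assumption: left and right samples are averaged.
    """
    frame_count = len(stereo) // 4            # 2 channels x 2 bytes per frame
    samples = struct.unpack("<%dh" % (frame_count * 2), stereo[: frame_count * 4])
    mono = [(samples[2 * i] + samples[2 * i + 1]) // 2 for i in range(frame_count)]
    return struct.pack("<%dh" % frame_count, *mono)


def delay_after_buffering_ms(fixed_delay_ms: int, reference_ms: int, capture_ms: int = 0) -> int:
    """System delay once the reference buffer leads the capture buffer.

    Follows the example in the text: 250 ms fixed delay with 100 ms of
    reference data and an empty capture buffer gives 150 ms.
    """
    return fixed_delay_ms - (reference_ms - capture_ms)
```

One millisecond of playback data (128 B) therefore becomes 64 B of reference data, matching the capture side so that both streams can be compared sample for sample.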
It should be noted that the size of the initial data amount in the reference data buffer is correlated with the system delay. The reference data should be fed into the echo cancellation algorithm before the echo data. The larger the initial data amount in the reference data buffer, the later the reference data is fed into the algorithm, so the smaller the timing difference between the reference data and the echo data, i.e. the smaller the delay. Factors such as VFP audio scheduling, network packet loss and multi-channel audio mixing can interrupt the playback data, so that the reference data cannot be replenished; the data in the reference data buffer is then consumed, and the delay always tends to grow. The principle for setting the data amount of the reference data buffer is therefore: set as large a buffer as possible, so that the echo cancellation algorithm works in a smaller delay range and does not fail because data jitter increases the delay. Because the data amount in the reference data buffer is set dynamically, it can be abstracted in the application as a parameter that sets the initial data amount of the buffer.
Three, establish a data synchronization mechanism so that the reference data and the captured data reach dynamic equilibrium during real-time communication.
The ultimate purpose of synchronization is to keep the delay between the reference data and the captured data as close as possible to the algorithm delay and to keep it stable, so that both can be processed together by the echo cancellation algorithm and the echo eliminated. This requires strict control over every link and external factor in the audio system, including the fixed delay of system operation, the sound card playback cache, the reference data cache, the captured data cache, and network packet loss.
Once the fixed delay of the system has been confirmed, the algorithm delay has been obtained, and the initial data amount in the reference data buffer has been adjusted so that the system delay matches the algorithm delay, the initial condition of the synchronization mechanism is established.
In practical applications, on the one hand, factors such as network jitter, audio data scheduling and multi-channel audio mixing can interrupt the playback data, so that the reference data drops sharply and the synchronization condition established initially is seriously damaged. In this case the playback data must be replenished promptly, and the reference data buffer can usually be padded with silence data. On the other hand, because of factors such as network delay or system congestion, a large amount of data may arrive shortly after data has already been padded automatically, causing redundancy that likewise destroys the synchronization condition established initially. In this case the redundant data must be discarded. A data balance therefore needs to be established on top of the initial condition of the synchronization mechanism: for the reference data buffer, a lower data-amount bound below which data must be supplemented, and an upper data-amount bound above which redundant data must be discarded. The upper and lower bounds refer to the difference between the data amount in the reference data buffer and the data amount in the captured data buffer. These bounds are related to the performance of the adaptive filter in the echo cancellation algorithm, i.e. the algorithm delay corresponding to the filter coefficient and the echo estimation range.
For example, if the algorithm delay corresponding to the filter coefficient is 200 ms and the echo estimation range is ±50 ms, the effective working delay range of the filter is 150 ms to 250 ms, and the difference between the upper and lower data-amount bounds of the reference data buffer can be 100 ms of data.
Therefore, data can be supplemented or discarded according to the data amounts at the lower and upper bounds, so that the reference data and the captured data reach dynamic equilibrium during real-time communication and timing synchronization is achieved.
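The balancing rule described above can be sketched as follows. Several details are assumptions rather than statements from the text: the bounds are derived from the relation "system delay = fixed delay minus buffered difference" used in the earlier example, padding refills only up to the lower bound, silence is represented by zero bytes, and the oldest data is discarded first. All names are hypothetical.

```python
def buffer_bounds_ms(fixed_delay_ms: int, algorithm_delay_ms: int, estimation_range_ms: int):
    """Lower and upper bound on (reference amount - capture amount), in ms.

    Keeps fixed_delay - difference inside algorithm_delay +/- estimation_range.
    """
    lower = max(0, fixed_delay_ms - (algorithm_delay_ms + estimation_range_ms))
    upper = fixed_delay_ms - (algorithm_delay_ms - estimation_range_ms)
    return lower, upper


def rebalance_reference_buffer(reference: bytearray, capture_len: int,
                               lower_bytes: int, upper_bytes: int) -> int:
    """Pad or trim `reference` so its lead over the capture buffer stays in bounds.

    Returns the number of bytes added (positive) or discarded (negative).
    """
    diff = len(reference) - capture_len
    if diff < lower_bytes:
        pad = lower_bytes - diff
        reference.extend(bytes(pad))   # supplement with silence
        return pad
    if diff > upper_bytes:
        drop = diff - upper_bytes
        del reference[:drop]           # discard the oldest redundant data
        return -drop
    return 0
```

For a fixed delay of 300 ms, an algorithm delay of 200 ms and an estimation range of ±50 ms, `buffer_bounds_ms` gives bounds of 50 ms and 150 ms, which are 100 ms apart as in the example above.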
It should be noted that, for simplicity of description, the method embodiments are expressed as a series of combined actions, but those skilled in the art should understand that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments of the present invention some steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions involved are not necessarily required by the embodiments of the present invention.
Referring to Fig. 7, a structural block diagram of an embodiment of an echo cancellation apparatus of the present invention is shown. The apparatus may be applied to video networking, and the video networking may include a video networking server, a first video conference terminal and a second video conference terminal. The apparatus may specifically include the following modules:
Determining module 701, configured to determine the filter coefficient of the first video conference terminal;
Calculation module 702, configured to calculate the fixed delay between the first video conference terminal and the second video conference terminal;
Obtaining module 703, configured to obtain the initial data amount of the reference data buffer of the first video conference terminal;
Adjusting module 704, configured to adjust the initial data amount to a target data amount according to the filter coefficient and the fixed delay;
Receiving module 705, configured to receive audio data sent by the video networking server over a downlink communication link, where the audio data may be captured by the second video conference terminal;
Execution module 706, configured to perform an echo cancellation operation on the audio data according to the target data amount.
In an embodiment of the present invention, the calculation module 702 may specifically include the following submodules:
Data amount emptying submodule, configured to empty the reference data buffer;
Target audio data capture and playback submodule, configured to capture and play target audio data;
Audio file generation submodule, configured to generate a first audio file and a second audio file from the captured and the played target audio data, respectively;
Fixed delay calculation submodule, configured to calculate the fixed delay between the first audio file and the second audio file.
In an embodiment of the present invention, the adjusting module 704 may specifically include the following submodules:
Working delay determination submodule, configured to determine the working delay corresponding to the filter coefficient;
Delay difference calculation submodule, configured to calculate the delay difference between the fixed delay and the working delay;
Target data amount adjustment submodule, configured to adjust the initial data amount to the target data amount according to the delay difference.
In an embodiment of the present invention, the target data amount adjustment submodule may specifically include the following units:
Buffering unit, configured to buffer data in the reference data buffer when the delay difference is greater than zero, so that the amount of data after buffering equals the data amount corresponding to the delay difference;
Discarding unit, configured to discard part of the data when the delay difference is less than zero, so that the amount of data remaining in the reference data buffer equals the data amount corresponding to the delay difference.
In an embodiment of the present invention, the execution module 706 may specifically include the following submodules:
Local voice data capture submodule, configured to capture local voice data while the audio data is played;
Local voice data transmission submodule, configured to transmit the local voice data to the adaptive filter via the reference data buffer, the adaptive filter performing the echo cancellation operation on the local voice data.
Since the apparatus embodiment is basically similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the description of the method embodiment.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, an apparatus, or a computer program product. Therefore, the embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, the terminal device (system) and the computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing terminal device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, so that a series of operational steps is performed on the computer or other programmable terminal device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present invention.
Finally, it should also be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or terminal device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, method, article or terminal device. Without further limitation, an element defined by the statement "including a ..." does not exclude the existence of other identical elements in the process, method, article or terminal device that includes the element.
The echo cancellation method and the echo cancellation apparatus provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. At the same time, for those of ordinary skill in the art, there will be changes in the specific implementation and the scope of application in accordance with the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.