CN101001485A

CN101001485A - Limited-sound-source multichannel sound field system and sound field simulation method

Info

Publication number: CN101001485A
Application number: CN 200610113968
Authority: CN
Inventors: 张勤; 刘剑波; 王京玲; 蔡娟娟
Original assignee: Communication University of China
Current assignee: Communication University of China
Priority date: 2006-10-23
Filing date: 2006-10-23
Publication date: 2007-07-18

Abstract

The invention relates to a limited-sound-source multichannel sound field system and a sound field simulation method. The system comprises: a microphone array for recording M channels of audio information and detecting sound field characteristics; an audio collection subsystem for analog-to-digital conversion of each channel of audio information and for packing the audio data together with channel numbers and timestamps; a server that processes the audio data from the microphones, separates and processes the sound sources, compresses and stores the data, and mixes the sound source data and converts them into output data and control signals for N loudspeaker channels; an audio playback subsystem that forms multichannel analog signals from the audio data of the different sound sources and controls the synchronization of the loudspeakers; and a loudspeaker array for playing the N channels of audio signals.

Description

A limited-sound-source multichannel sound field system and sound field simulation method

Technical field

The present invention relates to audio collection and reproduction technology, and specifically to a limited-sound-source multichannel sound field system.

Background technology

Music is an important part of the history of human civilization. Since the invention of recording technology, people have tried to record beautiful melodies faithfully. Recording and playback technology has developed from mono to two-channel stereo, five-channel and further multichannel formats, and its constant goal has been a more lifelike reproduction of the original sound field or sense of space. Compared with two-channel stereo, multichannel systems give listeners an excellent listening experience, but in practice such systems cannot reproduce the sound source without distortion: they suffer from sound field phase distortion, intermodulation distortion between sound sources, dynamic range compression and similar problems, and the development of more advanced multichannel sound field systems faces unprecedented technical complexity and theoretical challenges. Vinyl records and tapes go without saying; even a stereo source recorded with digital technology cannot truly reproduce a three-dimensional sense of space when played on the currently popular 5.1/6.1-channel playback systems. In addition, conventional loudspeakers all have a certain radiation axis angle, so a two-channel stereo system has a pronounced "emperor's seat" (sweet spot) located on the perpendicular bisector of the line joining the two loudspeakers. When the listener moves left or right off this axis, the sound image shifts markedly and in proportion. Whether in home sound systems or in professional cinema playback systems, directional distortion of the sound source is a widespread problem. In a multichannel surround system, constrained by each loudspeaker, the best listening zone shrinks to a narrow point, namely the center of the area enclosed by the loudspeakers. As soon as the listener leaves this point, he or she is "engulfed" by the sound field of one loudspeaker, and the sound field becomes seriously unbalanced.

At present, stereo recording and playback formats have been superseded by surround audio formats. Various surround sound techniques have been developed in artistic and technical applications; they give the listener a sense of space through suitable recording and playback methods. Research in this area falls mainly into two classes of methods: perceptual simulation and sound field simulation.

Perceptual simulation methods:

Binaural technology belongs to the perceptual simulation methods. This class of techniques holds that the spatial impression of hearing can be reproduced effectively only by reproducing the sound pressure at the listener's eardrums. This rests on a well-known and verified fact: the ears, together with the head and torso, can resolve the direction and distance of a sound source. Crosstalk cancellation also belongs to binaural technology; it can eliminate the crosstalk between the left loudspeaker and the listener's right ear. Conventional recording and stereo techniques are based on observations of perceptual phenomena and on experience, from which guidelines for artificial synthesis have been derived. At present, common multichannel playback systems are all designed using techniques based on perceptual simulation. A great many sound field technical specifications are in use today; the most common surround formats are Dolby, DTS (Digital Theatre System), SACD (Super Audio CD) and DVD Audio. Of these, SACD and DVD Audio are high-resolution disc formats that are not used for films; films mainly use the DTS ES 6.1 and Dolby Digital EX 6.1 surround formats.

Sound field simulation methods:

Systems based on sound field simulation are rare, because the technical and physical concepts involved are very complex and require a deep background in acoustics and signal processing. Berkhout proposed wave field synthesis (WFS) and wave field analysis (WFA) theory in 1988 and 1997, respectively. Building on this theory, Berkhout and J. Meyer studied sound field analysis and acoustic holographic recording with microphone arrays, while Paul D. Henderson and X. Shen studied sound field reproduction with loudspeaker arrays.

The basic assumption of sound field simulation is that a spatially distributed sound pressure is used to reproduce the sound field in the reproduction space, so that the complete auditory system (including the outer ear) receives a natural stimulus; the stimulus to be reproduced is a virtual one. This task is clearly more natural physically, but it is difficult to achieve with spatial sound perception methods based on theory or intuition alone. Wave field synthesis (WFS) and holophony are two sound field simulation methods; both reproduce the sound field within a region surrounded by loudspeakers. Ambisonics is also a sound field simulation method, in which the sound field is partially reproduced at the center of a surrounding Ambisonic loudspeaker array. Generating a directivity pattern around the reproduced sound source is likewise a sound field simulation technique.

All of the above research decomposes and synthesizes the sound field on the basis of Huygens' principle. In theory such a sound field reconstruction system requires infinitely many secondary sound sources, which cannot be realized in practice. Extensive simplification and approximation are needed, which readily introduce source information distortion, sound field phase distortion, intermodulation distortion between sound sources and dynamic range compression.

For a playback system to truly reproduce the original sound field, recording technology is a key issue. Capturing the music played by a symphony orchestra, made up of instruments of different pitch and loudness, harmoniously, in balance, clearly and accurately is the key to a successful recording. Modern recording often uses high-fidelity microphones with multi-microphone, multi-track recording to capture the real sound as closely as possible, followed by remixing to balance the volume of each track so that the instruments are presented in an accurate and harmonious state. Undistorted sound source extraction is a source separation problem: its purpose is to separate the target source from other interfering sources and from noise. Statistical methods such as neural networks, hidden Markov models (HMM) and support vector machines (SVM) are commonly used in source separation today. In current source separation research, however, statistical methods are limited by the need to assume that the noise is white Gaussian noise. Existing source separation research can therefore only be regarded as a form of signal enhancement; when the noise is interference from other sources (such as musical instruments), existing solutions lose their practical value because the system is non-Gaussian.

Summary of the invention

The technical problem to be solved by the present invention is to provide a limited-sound-source multichannel sound field system and sound field simulation method that achieve undistorted source separation and collection and a comprehensive, accurate reproduction of the original sound field.

To solve the above technical problem, the invention provides a limited-sound-source multichannel sound field system comprising a microphone array having a plurality of microphones and a loudspeaker array having a plurality of loudspeakers, and further comprising an audio collection subsystem, a server and an audio playback subsystem, wherein:

The microphone array is used to record M channels of audio information and to detect the sound field characteristics;

The audio collection subsystem is used to perform analog-to-digital conversion on each channel of audio signal collected from the microphone array, to mark the converted audio data with the acquisition channel number and a timestamp, and to pack and send it;

The server, during sound field collection, receives and parses the audio data packets sent by the audio collection subsystem, converts the audio data collected from each microphone of the array into data for different single sound sources, converts and compresses the resulting single-source data into an audio file format, and saves it; during sound field playback, it reads the saved audio files and, according to the M channels of sound source data and the characteristics of the sound field to be reconstructed, mixes the sound source data and converts them by intelligent matching into output data and control signals for N loudspeaker channels, which are sent to the audio playback subsystem;

The audio playback subsystem synchronizes the output data for each loudspeaker received from the server according to the control signals received from the server, restores them to multichannel analog audio signals, and sends these to the loudspeaker array for playback;

The loudspeaker array is used to play the N channels of audio signals and reconstruct the sound field.

In a preferred embodiment, the audio collection subsystem further comprises a plurality of audio collection daughter boards and one audio collection motherboard. Each audio collection daughter board comprises one or more audio collection channels, an analog-to-digital converter group and a logic processing device. The audio collection motherboard comprises a daughter board data interface and a server communication interface. Each audio collection daughter board collects audio signals from the microphone array through its audio collection channels and sends them to the analog-to-digital converter group, which converts the audio signals into audio data and sends the data to the logic processing device. The logic processing device marks the audio data output by each analog-to-digital converter in the group with a channel number and a timestamp and sends it to the daughter board data interface of the audio collection motherboard; the server communication interface of the motherboard then sends the audio data to the server.

In a preferred embodiment, the audio collection motherboard further comprises a daughter board control interface. Through this interface the server sends control commands to the audio collection daughter boards and obtains the status information that the daughter boards feed back.

In a preferred embodiment, the server further comprises: a monitoring and acquisition module, which listens for audio data arriving at the server and acquires the data once its arrival is detected; an audio data processing module, comprising a particle filter and an equalizer, which converts the collected audio data into data for different single sound sources; a storage module, which converts and compresses the single-source data into an audio file format, adds file description information and saves it; and a transmit control module, which reads the saved audio files and, according to the M channels of sound source data and the characteristics of the sound field to be reconstructed, mixes the sound source data and converts them by intelligent matching into output data and control signals for N loudspeaker channels, which are sent to the audio playback subsystem.

In a preferred embodiment, the audio playback subsystem further comprises a plurality of audio playback daughter boards and one audio playback motherboard. Each audio playback daughter board comprises one or more audio playback channels, a digital-to-analog converter group and a logic processing device. The audio playback motherboard comprises a playback daughter board control interface, a playback daughter board data interface and a server communication interface. The audio playback motherboard receives the output data for each loudspeaker and the control signals from the server through the server communication interface, sends the output data for each loudspeaker to the audio playback daughter boards through the playback daughter board data interface, and at the same time sends the control signals to the audio playback daughter boards through the playback daughter board control interface. The logic processing device in each audio playback daughter board synchronizes the loudspeaker output data received from the audio playback motherboard according to the control signals received from the motherboard and sends them to the digital-to-analog converter group, where they are converted into multichannel analog audio signals and sent through the audio playback channels to the loudspeaker array for playback.

In a preferred embodiment, the transmit control module further comprises a loudspeaker volume control submodule for controlling the volume of the loudspeaker array.

In a preferred embodiment, the loudspeaker volume control submodule further comprises a single-loudspeaker volume control unit for controlling the volume of an individual loudspeaker in the array, a group volume control unit for controlling the volume of a group of loudspeakers, or an all-loudspeaker volume control unit for controlling the volume of all loudspeakers.

In a preferred embodiment, the transmit control module further comprises a loudspeaker array network monitoring submodule for network monitoring of the loudspeaker array.

To solve the above technical problem, the invention also provides a limited-sound-source multichannel sound field simulation method comprising the following steps:

(a) the audio collection subsystem performs analog-to-digital conversion on the M channels of audio signals collected by the microphone array, marks the converted audio data with the acquisition channel number and a timestamp, and packs and sends it;

(b) a server receives and parses the audio data packets sent by the audio collection subsystem, converts the audio data collected by the microphone array into data for different single sound sources, and converts and compresses the resulting single-source data into an audio file format and saves it;

(c) during sound field playback, the server reads the saved audio files and, according to the M channels of sound source data and the characteristics of the sound field to be reconstructed, mixes the sound source data and converts them by adaptive matching into output data and control signals for N loudspeaker channels, which are sent to the audio playback subsystem;

(d) the audio playback subsystem synchronizes the output data for each loudspeaker received from the server according to the control signals received from the server, restores them to N channels of analog audio signals, and sends these to the loudspeaker array for playback.

In a preferred embodiment, when step (b) converts the collected audio data into data for different single sound sources, a particle filter is used to separate noise and interference from each audio channel. That is, the signals of the other sound sources are treated as non-Gaussian noise interference, which turns the undistorted source extraction problem into a waveform tracking problem.

The invention also provides an audio data packet structure for identifying audio data attribute information during audio collection. It comprises a header that represents the attributes of the audio data and a data part that carries the audio data. The header comprises a start flag field, a channel identifier field and a timestamp field; the data part comprises the audio data field. The packet may further comprise a check field.

As can be seen from the above, in the system of the invention the audio processing module in the server converts each channel of collected audio data into audio data for a plurality of single sound sources. This achieves undistorted source separation and collection, avoids distortion of each single sound source, reduces the influence of sound field phase distortion, and completely avoids intermodulation distortion between sound sources. The transmit control module in the server converts the audio data of the M separated single sound sources, according to the characteristics of the sound field to be reconstructed, into output data for N loudspeaker channels and provides the necessary control signals, so that the original sound field is reproduced comprehensively and accurately. The audio playback subsystem and loudspeaker array technology avoid the problem of a narrow best listening zone. By using the audio data packet structure, in which each packet is identified by channel number and timestamp, the invention clearly records at collection time which channel the audio data came from and when it was collected, providing an important spatial and temporal basis for comprehensive and accurate playback.

The technical problem to be solved by the invention, the main points of the technical solution and its beneficial effects are described further below with reference to the embodiments and the accompanying drawings.

Description of drawings

Fig. 1 is a structural diagram of the system according to the embodiment of the invention;

Fig. 2 is a structural diagram of an audio collection daughter board in the audio collection subsystem of Fig. 1;

Fig. 3 is a structural diagram of the audio collection motherboard in the audio collection subsystem of Fig. 1;

Fig. 4 is a structural diagram of the audio data packet after processing by the audio collection daughter board;

Fig. 5 is a structural diagram of the audio data packet with check code;

Fig. 6 is a flow chart of audio data monitoring, audio data processing and storage in the server of Fig. 1;

Fig. 7 is a structural diagram of the audio data processing module in the server of Fig. 1;

Fig. 8 is a structural diagram of the MINO intelligent matching module in the server of Fig. 1;

Fig. 9 is a structural diagram of an audio playback daughter board in the audio playback subsystem of Fig. 1;

Fig. 10 is a structural diagram of the audio playback motherboard in the audio playback subsystem of Fig. 1;

Fig. 11 is a structural diagram of the transmit control module in the server of Fig. 1.

Embodiment

Fig. 1 is a structural diagram of the system according to the embodiment of the invention.

The system according to the embodiment of the invention comprises a microphone array 1 having a plurality of microphones and a loudspeaker array 5 having a plurality of loudspeakers, and further comprises an audio collection subsystem 2, a server 3 and an audio playback subsystem 4.

Wherein,

The audio collection subsystem 2 performs analog-to-digital conversion on each channel of audio signal collected from the microphone array, marks the converted audio data with the acquisition channel number and a timestamp, and packs and sends it;

The server 3, during sound field collection, receives and parses the audio data packets sent by the audio collection subsystem, converts the audio data collected from each microphone of the array into data for different single sound sources, converts and compresses the resulting single-source data into an audio file format, adds file description information (including the title of the piece, the recording length, the author, the performer, the venue and the microphone position information) and saves it. During sound field playback, it reads the saved audio files and, according to the M channels of sound source data and the characteristics of the sound field to be reconstructed (including the number of loudspeakers, their positions and the venue), mixes the sound source data and converts them by intelligent matching into output data and control signals for N loudspeaker channels, which are sent to the audio playback subsystem.

The audio playback subsystem 4 uses the control signals received from the server, which include information such as the N×M mapping matrix and the optimal loudspeaker configuration, to synchronize the output data for each loudspeaker received from the server, restores them to multichannel analog audio signals, and sends these to the loudspeaker array for playback.
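The conversion of M source signals into N loudspeaker feeds through an N×M mapping matrix can be illustrated with a minimal sketch. The patent does not say how the matrix entries are computed, so the gain values below are purely hypothetical, and the function name `mix_sources` is an illustrative choice rather than anything defined in the source.

```python
def mix_sources(sources, gains):
    """Mix M source signals into N loudspeaker feeds.

    sources: list of M equal-length lists of samples (one per sound source).
    gains:   N x M matrix as a list of N rows; entry gains[n][m] is the
             (hypothetical) weight of source m in loudspeaker n.
    """
    if not sources:
        return [[] for _ in gains]
    length = len(sources[0])
    if any(len(s) != length for s in sources):
        raise ValueError("all sources must have the same length")
    feeds = []
    for row in gains:
        if len(row) != len(sources):
            raise ValueError("each gain row needs one entry per source")
        # Loudspeaker feed = weighted sum of all sources, sample by sample.
        feeds.append([sum(g * s[t] for g, s in zip(row, sources))
                      for t in range(length)])
    return feeds


# M = 2 sources mapped to N = 3 loudspeakers with made-up gains.
sources = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
gains = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
feeds = mix_sources(sources, gains)
```

In a real system the matrix would presumably depend on loudspeaker positions and the venue, and could vary with frequency or time; a constant real-valued matrix is the simplest possible stand-in.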

Through the audio collection subsystem, the data collected from multiple audio channels (for example 256 channels) can be transferred to Ethernet through the data interface and then sent to the server over Ethernet. The server processes the audio data of each microphone according to different sound field processing algorithms, including improved filter and equalizer processing, and finally converts the audio data collected by each microphone into audio data for different single sound sources. These data are converted into a playable audio file format and saved to a local SCSI (Small Computer System Interface) hard disk or another medium. A remote client can access and play the audio files on the SCSI disk through the iSCSI (Internet Small Computer System Interface) protocol, or access the corresponding storage medium through another communication protocol, and the original sound field is reproduced realistically through the audio playback subsystem.

During the collection, transmission and storage of audio, in order to avoid distortion of each single sound source, reduce the influence of sound field phase distortion and completely avoid intermodulation distortion between sound sources, this scheme uses an improved particle filter to separate noise and interference from each audio channel. That is, the signals of the other sound sources are treated as non-Gaussian noise interference, which turns the undistorted source extraction problem into a waveform tracking problem; a new source separation method based on particle filtering is thereby proposed. The pickup scope of this scheme covers both studio recording and live recording, such as large symphony concerts, sports events and gala performances.
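The idea of treating interference as non-Gaussian noise and tracking the target waveform with a particle filter can be sketched as follows. This is a generic bootstrap particle filter, not the improved filter of the patent, whose details are not given. The random-walk state model, the Laplace-shaped likelihood and all parameter values are assumptions made only for illustration.

```python
import math
import random


def particle_filter_track(observations, n_particles=200, step_std=0.1,
                          noise_scale=0.3, seed=0):
    """Track a slowly varying waveform value with a bootstrap particle filter.

    Assumed state model: random walk with Gaussian steps (std step_std).
    Assumed observation model: state plus heavy-tailed interference, with
    likelihood proportional to exp(-|y - x| / noise_scale) (Laplace shape).
    """
    rng = random.Random(seed)
    particles = [observations[0] + rng.gauss(0.0, noise_scale)
                 for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Predict: move every particle through the random-walk model.
        particles = [p + rng.gauss(0.0, step_std) for p in particles]
        # Update: weight particles by the non-Gaussian likelihood. Subtracting
        # the smallest distance keeps the largest weight at 1 (no underflow).
        dists = [abs(y - p) for p in particles]
        d_min = min(dists)
        weights = [math.exp(-(d - d_min) / noise_scale) for d in dists]
        total = sum(weights)
        estimates.append(sum(w * p for w, p in zip(weights, particles)) / total)
        # Resample: draw a new particle set in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


# A sine wave with occasional large impulses standing in for interference.
clean = [math.sin(2 * math.pi * t / 50) for t in range(200)]
noisy = [x + (3.0 if t % 37 == 0 else 0.0) for t, x in enumerate(clean)]
tracked = particle_filter_track(noisy)
```

Because the likelihood is evaluated pointwise, it can take any shape, which is what makes particle filters usable when the Gaussian-noise assumption fails.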

Separating the main limited sound sources makes the analysis and representation of the rest of the sound field simpler and more feasible. A separated sound field system with M inputs and N outputs is then built using limited-sound-source reproduction technology and Huygens' principle, yielding a playback system that is superior to and more flexible than one based on a single sound field synthesis method and that suits a variety of real environments. With the audio playback subsystem and loudspeaker array technology of this scheme, the existing problem of a narrow best listening zone can be solved well. The scheme can be applied in the playback systems of large theaters, concert halls, stadiums and squares, and can give a more lifelike playback experience than current multichannel playback systems.

The audio collection subsystem of the embodiment of the invention comprises a plurality of audio collection daughter boards and one audio collection motherboard. Each audio collection daughter board comprises one or more audio collection channels, an analog-to-digital converter group and a field-programmable gate array (FPGA) or other logic processing device. As shown in Fig. 3, the audio collection motherboard comprises a daughter board control interface (serial or parallel), a daughter board data interface (high-speed serial or parallel) and a server communication interface (for example a wired Ethernet interface, a wireless ultra-wideband interface or a wireless IP interface).

Fig. 2 is a structural diagram of an audio collection daughter board in the audio collection subsystem of the embodiment of the invention. Each audio collection daughter board collects audio signals from the microphone array through its audio collection channels (CH0-CH7) and sends them to the analog-to-digital converter groups (the figure shows 4 A/D groups, each responsible for the analog-to-digital conversion of the audio signals from two audio collection channels). The A/D groups convert the audio signals into audio data and send the data to the field-programmable gate array, which marks the audio data output by each analog-to-digital converter with a channel number and a timestamp and sends it to the daughter board data interface of the audio collection motherboard; the server communication interface of the motherboard then sends the audio data to the server.

Here the audio collection subsystem adopts a modular design, namely audio collection daughter boards (Fig. 2) plus an audio collection motherboard (Fig. 3). Each audio collection daughter board acts as an acquisition terminal and generally collects data from 8 audio collection channels (more than 8 channels can be handled by expansion). The FPGA stamps these data with a timestamp and channel number and passes them to the motherboard, from which the server reads them through the Ethernet interface or another channel and saves them to the SCSI hard disk or another storage medium. Through the daughter board control interface of the audio collection motherboard, server 3 sends control commands to the audio collection daughter boards and obtains the status information they feed back.

In Fig. 2, the A/D groups convert the multichannel analog audio signals into digital audio data. The user can choose to sample the audio data of one particular channel or of several channels. The data of each channel must contain a channel number parameter. Because a single board supports at most 256 channels, a one-byte integer suffices as the channel number parameter, giving channel numbers 0 to 255 (0x00 to 0xff); when there are more than 256 channels, they are distinguished by the server's record file. In addition, the data collected on each channel must be stamped with a timestamp to guarantee the correct ordering of the collected data and to make time-selective playback convenient. To ensure the correctness and reliability of the data in the playback system, error correction coding can also be applied to the audio data.
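How the channel number and timestamp let a receiver restore the correct order can be sketched in a few lines. The record layout (a tuple of channel, timestamp and payload) and the function name `order_by_channel` are assumptions for illustration; the patent only requires that both values accompany the data.

```python
from collections import defaultdict


def order_by_channel(records):
    """Group (channel, timestamp, payload) records by channel, sorted by time."""
    streams = defaultdict(list)
    for channel, timestamp, payload in records:
        # One byte holds the channel number, so valid values are 0..255.
        if not 0 <= channel <= 0xFF:
            raise ValueError("channel number must fit in one byte")
        streams[channel].append((timestamp, payload))
    # Sort on the timestamp only, so payloads never take part in comparisons.
    return {ch: [p for _, p in sorted(items, key=lambda item: item[0])]
            for ch, items in streams.items()}


# Records arriving out of order on two channels.
records = [(1, 20, b"b"), (0, 10, b"x"), (1, 10, b"a")]
streams = order_by_channel(records)
```

Time-selective playback would then amount to filtering each channel's list by a timestamp range before handing it to the playback side.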

Fig. 4 is a structural diagram of the audio data packet after processing by the audio collection daughter board. After the FPGA stamps the converted audio data with a channel number and timestamp, the audio data packet consists of two parts, a header and a data part. Each packet has a fixed length of 128 bytes, that is, 1024 bits in total; its structure is shown in Fig. 4. For network transmission, the audio data packet can also be encapsulated in IP packets using UDP (User Datagram Protocol) or TCP/IP (Transmission Control Protocol/Internet Protocol). Of the 128 bytes in the audio data packet, the header takes 6 bytes (1 byte for the packet start flag, 1 byte for the channel identifier and 4 bytes for the timestamp) and the audio data takes 122 bytes.

Where:

The packet start flag occupies 1 byte:

The first byte of the header (0x77), the packet start flag, is used for synchronization and marks the beginning of a data packet.

The channel identifier occupies 1 byte:

The data of each channel must contain a channel number parameter. Because at most 256 channels are supported, a one-byte integer suffices as the channel number parameter, giving channel numbers 0 to 255 (0x00 to 0xff). By reading the byte that follows the packet start flag, the decoder knows which of the 256 audio channels the received audio signal belongs to.

The timestamp occupies 4 bytes:

The data collected on each channel must be stamped with a timestamp to guarantee the correct ordering of the collected data and to make time-selective playback convenient. The timestamp here is a relative timestamp, which defines the master/slave relationship among the audio channels. The first channel of each group of audio signals is chosen as the master audio, and the other channels are slave audio. Each unit of the master audio is stamped with a timestamp, and the slave audio units that are to be presented at the same time as a master unit are given the same timestamp; the timestamp on each slave audio unit is thus relative to the timestamp of the master audio unit.
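The 128-byte packet layout described above (1-byte start flag 0x77, 1-byte channel number, 4-byte timestamp, 122 bytes of audio data) can be sketched with Python's `struct` module. The big-endian byte order of the timestamp and the function names are assumptions; the source does not specify them.

```python
import struct

PACKET_LEN = 128                         # fixed packet length in bytes
START_FLAG = 0x77                        # packet start flag used for synchronization
HEADER = struct.Struct(">BBI")           # start flag, channel, timestamp (assumed big-endian)
PAYLOAD_LEN = PACKET_LEN - HEADER.size   # 128 - 6 = 122 bytes of audio data


def pack_audio_packet(channel, timestamp, audio):
    """Build one 128-byte audio data packet."""
    if not 0 <= channel <= 0xFF:
        raise ValueError("channel number must fit in one byte")
    if not 0 <= timestamp <= 0xFFFFFFFF:
        raise ValueError("timestamp must fit in four bytes")
    if len(audio) != PAYLOAD_LEN:
        raise ValueError("audio data must be exactly %d bytes" % PAYLOAD_LEN)
    return HEADER.pack(START_FLAG, channel, timestamp) + bytes(audio)


def unpack_audio_packet(packet):
    """Parse a packet into (channel, timestamp, audio bytes)."""
    if len(packet) != PACKET_LEN:
        raise ValueError("packet must be exactly %d bytes" % PACKET_LEN)
    flag, channel, timestamp = HEADER.unpack_from(packet)
    if flag != START_FLAG:
        raise ValueError("missing packet start flag")
    return channel, timestamp, bytes(packet[HEADER.size:])


packet = pack_audio_packet(5, 1000, bytes(PAYLOAD_LEN))
channel, timestamp, audio = unpack_audio_packet(packet)
```

A receiver reading from a byte stream would first scan for the 0x77 start flag to find a packet boundary, then parse fixed 128-byte blocks as above.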

To guarantee the correctness and reliability of the data in the audio playback subsystem, forward error correction may also be applied to the audio signal. Our scheme uses a shortened RS (Reed-Solomon) code with T = 8, appending 16 check bytes to each data packet; in this case the audio data field has only 106 bytes. The frame structure of the RS error-protected packet is shown in Figure 5.

The RS code also covers the sync byte of the data packet. The shortened RS(144,128) code is implemented by prepending 111 bytes, all set to 0, to the information bytes at the input of an RS(255,239) encoder; after encoding, these zero bytes are discarded. Likewise, the audio playback subsystem prepends 111 bytes, all set to 0, to the information bytes at the input of the RS(255,239) decoder; after decoding, these zero bytes are discarded.
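
The zero-padding procedure for the shortened code can be sketched in Python as follows. This is only an illustration of the encoder side: the text does not state the field polynomial or the generator roots, so the primitive polynomial 0x11D and the roots alpha^0 to alpha^15 used here are assumptions, as are all function names.

```python
# Assumptions (not stated in the text): GF(2^8) with primitive polynomial
# 0x11D and generator polynomial roots alpha^0 .. alpha^15.
PRIM_POLY = 0x11D
NSYM = 16                  # 2 * T check bytes, with T = 8
FULL_K = 239               # information bytes of the mother code RS(255,239)
SHORT_K = 128              # information bytes of the shortened code RS(144,128)
PAD = FULL_K - SHORT_K     # 111 zero bytes

# Exponent and logarithm tables for GF(2^8).
EXP = [0] * 512
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM_POLY
for _i in range(255, 512):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(nsym: int = NSYM):
    """Product of (x + alpha^i) for i = 0 .. nsym-1, highest degree first."""
    g = [1]
    for i in range(nsym):
        nxt = [0] * (len(g) + 1)
        for j, c in enumerate(g):
            nxt[j] ^= c
            nxt[j + 1] ^= gf_mul(c, EXP[i])
        g = nxt
    return g


def rs_parity(msg: bytes, nsym: int = NSYM) -> bytes:
    """Remainder of msg(x) * x^nsym divided by the generator polynomial."""
    gen = generator_poly(nsym)
    rem = list(msg) + [0] * nsym
    for i in range(len(msg)):
        coef = rem[i]
        if coef:
            for j in range(1, len(gen)):
                rem[i + j] ^= gf_mul(gen[j], coef)
    return bytes(rem[len(msg):])


def encode_shortened(info: bytes) -> bytes:
    """Pad with 111 zeros, encode as RS(255,239), then drop the zeros."""
    if len(info) != SHORT_K:
        raise ValueError("expected 128 information bytes")
    padded = bytes(PAD) + info            # 239 bytes into the mother encoder
    full = padded + rs_parity(padded)     # 255-byte systematic codeword
    return full[PAD:]                     # 144 bytes actually transmitted


def syndromes(codeword: bytes, nsym: int = NSYM):
    """Evaluate the codeword at alpha^0 .. alpha^(nsym-1); all zero if valid."""
    out = []
    for i in range(nsym):
        acc = 0
        for b in codeword:
            acc = gf_mul(acc, EXP[i]) ^ b
        out.append(acc)
    return out
```

Because the prepended bytes are zero, they do not change the check bytes, which is why they can be dropped before transmission and restored before decoding. Error correction itself (the decoder) is not shown.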

The server of this embodiment of the invention comprises: a monitoring and acquisition module, an audio data processing module, a storage module and a transmission control module.

The monitoring and acquisition module monitors whether audio data has arrived at the server and collects the audio data once its arrival is detected;

The audio data processing module converts the collected audio data into separate single-source data. Here, the audio processors are a particle filter and an equalizer, as shown in Figure 7, which ultimately convert the audio signal collected by each microphone into separate single-source signals. Each audio processor is designed as a plug-in module, so that audio processors can be added or removed as needed without affecting the system as a whole, and better audio processors can be designed as research progresses;

The storage module compresses the converted single-source data, by lossy or lossless conversion, into an audio file format, adds the associated file information, including a flag that identifies systems with more than 256 channels, and saves the result;

The transmission control module reads the saved audio files and, according to the M channels of sound source data and the characteristics of the sound field to be reconstructed, mixes the sound source data and converts it by intelligent matching into the output data and control signals for N loudspeakers, which are sent to the audio playback subsystem. As shown in Figure 8, the transmission control module in this embodiment uses an intelligent conversion module to realize MINO matching, and can perform dynamic weighted matching between M and N during conversion.
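
The text does not specify how the weighted matching between M sources and N loudspeakers is computed. As a purely illustrative sketch, one simple reading is a linear mix in which each loudspeaker output is a weighted sum of the M source signals; the function name `mix_sources` and the N-by-M weight matrix are assumptions, not the patented method.

```python
def mix_sources(sources, weights):
    """Mix M source signals into N loudspeaker signals.

    sources: list of M equal-length sample lists.
    weights: list of N rows, each holding M gains (hypothetical matrix).
    Returns a list of N sample lists.
    """
    m = len(sources)
    length = len(sources[0])
    if any(len(s) != length for s in sources):
        raise ValueError("all source signals must have the same length")
    if any(len(row) != m for row in weights):
        raise ValueError("each weight row needs one gain per source")
    return [
        [sum(row[j] * sources[j][t] for j in range(m)) for t in range(length)]
        for row in weights
    ]
```

Under this reading, "dynamic" weighting would amount to recomputing `weights` for each block of samples, for example when the target sound field characteristics change.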

At the front end of the server, the audio signal collected by each microphone enters the server through an Ethernet interface (communication interface) on the server. The server therefore sets up port listening through the monitoring and acquisition module, as shown in Figure 6. When data arrives at any Ethernet interface, the data packet is read using the TCP/IP protocol, the channel attribute of the audio stream packet is determined from the channel identifier in the packet header, and the audio stream data of the different channels are then stored in separate hard disk files. A remote client can play these audio files on the SCSI disk by accessing them through the iSCSI protocol, or can access the corresponding storage medium through another transport protocol.

During playback, the server reads the saved audio files through the transmission control module and, according to the M channels of sound source data and the characteristics of the sound field to be reconstructed, mixes the sound source data and converts it by intelligent matching into the output data and control signals for N loudspeakers, which are sent to the audio playback subsystem.
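
The per-channel sorting step described above can be sketched as follows. The sketch assumes the packets have already been read from the network into one byte string, uses in-memory buffers in place of the hard disk files, and assumes a big-endian timestamp; `demux_stream` is a hypothetical name.

```python
from collections import defaultdict

PACKET_SIZE = 128   # fixed packet length in bytes
START_BYTE = 0x77   # packet start identifier
HEADER_SIZE = 6     # start identifier + channel identifier + timestamp


def demux_stream(stream: bytes) -> dict:
    """Split a stream of 128-byte packets into per-channel audio buffers.

    Returns {channel_number: audio_bytes}, with each channel's audio
    ordered by timestamp. Packets without the 0x77 start identifier
    are skipped.
    """
    channels = defaultdict(list)
    for offset in range(0, len(stream) - PACKET_SIZE + 1, PACKET_SIZE):
        packet = stream[offset:offset + PACKET_SIZE]
        if packet[0] != START_BYTE:
            continue
        channel = packet[1]
        # Assumption: the 4-byte timestamp is big-endian.
        timestamp = int.from_bytes(packet[2:HEADER_SIZE], "big")
        channels[channel].append((timestamp, packet[HEADER_SIZE:]))
    return {
        ch: b"".join(audio for _, audio in sorted(items, key=lambda it: it[0]))
        for ch, items in channels.items()
    }
```

A real server would write each buffer to its own file and would need resynchronization logic when a packet boundary is lost; neither is shown here.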

The audio playback subsystem of this embodiment of the invention comprises a plurality of audio playback daughter boards and one audio playback motherboard.

Each audio playback daughter board comprises one or more audio playback channels, a digital-to-analog converter group and a field programmable gate array.

The audio playback motherboard comprises a playback daughter board control interface, a playback daughter board data interface and a server communication interface, as shown in Figure 9.

The audio playback motherboard receives the output data and control signals for each loudspeaker from the server through the server communication interface, sends the output data for each loudspeaker to the audio playback daughter boards through the playback daughter board data interface, and at the same time sends the control signals to the audio playback daughter boards through the playback daughter board control interface, as shown in Figure 10. Referring also to Figure 9, the field programmable gate array in an audio playback daughter board synchronizes the loudspeaker output data received from the audio playback motherboard according to the control signals received from the audio playback motherboard, and passes the data to the digital-to-analog (D/A) converter group (Figure 9 shows four dual-channel D/A converters), which converts it into multi-channel analog audio signals. These are sent through the audio playback channels (CH0 to CHN; there are generally no more than 8 audio playback channels) to the loudspeaker array for playback.

As shown in Figure 11, a loudspeaker volume control submodule may also be added to the transmission control module of the server to control the volume of the loudspeaker array, for example by adding a single-loudspeaker volume control unit, a group volume control unit and an all-loudspeaker volume control unit, which control the volume of a single loudspeaker, of a group of loudspeakers, or of all loudspeakers respectively. In addition, because the server is connected to each loudspeaker over a network, a loudspeaker array network monitoring submodule may be added so that the server can monitor the connection status of each loudspeaker and promptly find loudspeakers whose connection is abnormal. Both submodules can be implemented with existing mature technology: the digital or analog volume control used in the audio power amplifier controls the loudspeaker volume, and may do so for a single loudspeaker, a group of loudspeakers, or all loudspeakers.
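
The three levels of volume control can be pictured with a small Python sketch. The class name `VolumeControl`, the group table and the choice to combine the three gains by multiplication are assumptions for illustration only; the text leaves the actual implementation to existing amplifier volume controls.

```python
class VolumeControl:
    """Hypothetical model of single, group and all-loudspeaker volume control."""

    def __init__(self, speaker_count: int, groups: dict):
        # groups maps a group name to the list of loudspeaker indices in it.
        self.single = [1.0] * speaker_count
        self.groups = groups
        self.group_gain = {name: 1.0 for name in groups}
        self.master = 1.0

    def set_single(self, index: int, gain: float) -> None:
        """Single-loudspeaker volume control."""
        self.single[index] = gain

    def set_group(self, name: str, gain: float) -> None:
        """Group volume control."""
        self.group_gain[name] = gain

    def set_all(self, gain: float) -> None:
        """All-loudspeaker volume control."""
        self.master = gain

    def effective_gain(self, index: int) -> float:
        """Gain applied to one loudspeaker (assumed multiplicative)."""
        gain = self.master * self.single[index]
        for name, members in self.groups.items():
            if index in members:
                gain *= self.group_gain[name]
        return gain
```

For example, with four loudspeakers and a group containing loudspeakers 0 and 1, lowering the group gain affects only those two, while `set_all` scales every loudspeaker.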

Monitoring of the loudspeakers includes measuring the loudspeaker input voltage and measuring the loudspeaker sound field output; the measurement results are sent to the server over the monitoring network to ensure that the loudspeakers are working properly. For remote playback, loudspeaker monitoring may be omitted.

The finite sound source multi-channel sound field system and the audio data packet structure of the present invention are not restricted to the uses listed in the specification and embodiments; they can be applied to any field suited to the invention. Those skilled in the art can readily realize additional advantages and make modifications. Therefore, without departing from the spirit and scope of the general concept defined by the claims and their equivalents, the invention is not limited to the specific details, representative equipment and illustrated examples shown and described here.

Claims

1. A finite sound source multi-channel sound field system, comprising a microphone array having a plurality of microphones and a loudspeaker array having a plurality of loudspeakers, characterized in that it further comprises an audio collection subsystem, a server and an audio playback subsystem, wherein:

the loudspeaker array is used to play N channels of audio signals and reconstruct the sound field.

2. The sound field system according to claim 1, characterized in that the audio collection subsystem further comprises a plurality of audio collection daughter boards and one audio collection motherboard; each audio collection daughter board comprises one or more audio collection channels, an analog-to-digital converter group and a logic processing device; the audio collection motherboard comprises a collection daughter board data interface and a server communication interface; wherein each audio collection daughter board collects audio signals from the microphone array through the audio collection channels and sends the collected audio signals to the analog-to-digital converter group; the analog-to-digital converter group converts the audio information into audio data and sends it to the logic processing device, which marks the audio data output by each analog-to-digital converter in the group with a channel number and a timestamp and sends it to the collection daughter board data interface in the audio collection motherboard; the server communication interface in the audio collection motherboard then sends the audio data to the server.

3. The sound field system according to claim 2, characterized in that the audio collection motherboard further comprises a collection daughter board control interface; the server sends control commands to the audio collection daughter boards through the collection daughter board control interface in the audio collection motherboard, and obtains the status information fed back by the audio collection daughter boards from the collection daughter board control interface.

4. The sound field system according to claim 1, characterized in that the server further comprises: a monitoring and acquisition module for monitoring whether audio data has arrived at the server and collecting the audio data once its arrival is detected; an audio data processing module, comprising a particle filter and an equalizer, for converting the collected audio data into separate single-source data; a storage module for converting and compressing the converted single-source data into an audio file format, adding file description information and saving the result; and a transmission control module for reading the saved audio files and, according to the M channels of sound source data and the characteristics of the sound field to be reconstructed, mixing the sound source data and converting it by intelligent matching into the output data and control signals for N loudspeakers, which are sent to the audio playback subsystem.

5. The sound field system according to claim 1, characterized in that the audio playback subsystem further comprises a plurality of audio playback daughter boards and one audio playback motherboard; each audio playback daughter board comprises one or more audio playback channels, a digital-to-analog converter group and a logic processing device; the audio playback motherboard comprises a playback daughter board control interface, a playback daughter board data interface and a server communication interface; wherein the audio playback motherboard receives the output data and control signals for each loudspeaker from the server through the server communication interface, sends the output data for each loudspeaker to the audio playback daughter boards through the playback daughter board data interface, and at the same time sends the control signals to the audio playback daughter boards through the playback daughter board control interface; the logic processing device in an audio playback daughter board synchronizes the loudspeaker output data received from the audio playback motherboard according to the control signals received from the audio playback motherboard, and sends the data to the digital-to-analog converter group for conversion into multi-channel analog audio signals, which are sent through the audio playback channels to the loudspeaker array for playback.

6. The sound field system according to claim 4, characterized in that the transmission control module further comprises a loudspeaker volume control submodule for controlling the volume of the loudspeaker array.

7. The sound field system according to claim 6, characterized in that the loudspeaker volume control submodule further comprises a single-loudspeaker volume control unit for controlling the volume of a single loudspeaker of the loudspeaker array, a group volume control unit for controlling the volume of a group of loudspeakers, or an all-loudspeaker volume control unit for controlling the volume of all loudspeakers.

8. The sound field system according to claim 4, characterized in that the transmission control module further comprises a loudspeaker array network monitoring submodule for network monitoring of the loudspeaker array.

9. A finite sound source multi-channel sound field simulation method, comprising the following steps:

10. The method according to claim 9, characterized in that, when step (b) converts the collected audio data into separate single-source data, a particle filter is used to separate noise and interference from the audio channel; that is, the other sound source signals are treated as non-Gaussian noise interference, which converts the problem of distortion-free source extraction into a waveform tracking problem.

11. An audio data packet structure for identifying audio data attribute information during audio collection, characterized by comprising a header part representing the audio data attributes and a packet data part representing the audio data, wherein the header part comprises a start identifier, a channel identifier and a timestamp, and the packet data part comprises the audio data.

12. The data packet structure according to claim 11, characterized in that it further comprises a check field.