WO2017143910A1 - Acquisition processing method, device and system, and computer storage medium - Google Patents

Acquisition processing method, device and system, and computer storage medium

Info

Publication number
WO2017143910A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound
microphone array
issuer
relative
position information
Application number
PCT/CN2017/073176
Other languages
French (fr)
Chinese (zh)
Inventor
胡钦扬
李星
Original Assignee
中兴通讯股份有限公司
Application filed by 中兴通讯股份有限公司
Publication of WO2017143910A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/18 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
    • G01S5/22 Position of source determined by co-ordinating a plurality of position lines defined by path-difference measurements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/66 Remote control of cameras or camera parts, e.g. by remote control devices

Definitions

  • The present invention relates to the field of communications, and in particular to an acquisition processing method, apparatus, system, and computer storage medium.
  • Embodiments of the present invention provide an acquisition processing method, apparatus, system, and computer storage medium.
  • An aspect of an embodiment of the present invention provides an acquisition processing method, including: determining current position information of a sound issuer through a ring microphone array group; determining, according to the current position information, relative position information of the sound issuer with respect to an image capture device, wherein the image capture device is configured to capture an image of the sound issuer; and adjusting, according to the relative position information, the acquisition angle of the image capture device toward the sound issuer.
  • According to another aspect of the present invention, an acquisition processing apparatus is also provided, comprising: a first determining module configured to determine current position information of a sound issuer through a ring microphone array group; a second determining module configured to determine, according to the current position information, relative position information of the sound issuer with respect to an image capture device, wherein the image capture device is configured to capture an image of the sound issuer; and an adjustment module configured to adjust, according to the relative position information, the acquisition angle of the image capture device toward the sound issuer.
  • Another aspect of the embodiments of the present invention further provides an acquisition processing system, including: an image capture device, a sound issuer, and a ring microphone array group, wherein the ring microphone array group is configured to determine current position information of the sound issuer and to determine, according to the current position information, relative position information of the sound issuer with respect to the image capture device, the image capture device being configured to capture an image of the sound issuer; and the image capture device is configured to, after acquiring the relative position information, adjust its acquisition angle toward the sound issuer according to the relative position information.
  • An embodiment of the present invention further provides a computer storage medium, wherein the computer storage medium stores computer-executable instructions, and the computer-executable instructions are used to execute the foregoing acquisition processing method.
  • The technical solution provided by the embodiments of the present invention determines the current position information of the sound issuer through the ring microphone array group; determines, according to the current position information, the relative position information of the sound issuer with respect to the image capture device, wherein the image capture device is configured to capture an image of the sound issuer; and adjusts the acquisition angle of the image capture device toward the sound issuer according to the relative position information. This solves the problem in the related art that technical solutions for locating the speaker (that is, the sound issuer) have low accuracy, and thereby improves the accuracy of locating the sound issuer.
  • FIG. 1 is a schematic flowchart of a first collection processing method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a first ring microphone array according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a first microphone array group according to an embodiment of the present invention.
  • FIG. 4 is another schematic diagram of a second microphone array group according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a second microphone array position according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of positioning of a first sound issuer according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a first type of collection processing apparatus according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a first determining module 70 of a second type of collection processing apparatus according to an embodiment of the present invention.
  • FIG. 9 is another schematic structural diagram of a third collection processing device according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of an acquisition processing system according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of an acquisition processing method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
  • Step S102: determine current position information of the sound issuer through the ring microphone array group;
  • Step S104: determine, according to the current position information, relative position information of the sound issuer with respect to the image capture device, wherein the image capture device is configured to capture an image of the sound issuer;
  • Step S106: adjust, according to the relative position information, the acquisition angle of the image capture device toward the sound issuer.
  • Through the steps above, the current position information of the sound issuer is determined through the ring microphone array group; the relative position information of the sound issuer with respect to the image capture device is determined according to the current position information, wherein the image capture device is configured to capture an image of the sound issuer; and the acquisition angle of the image capture device toward the sound issuer is adjusted according to the relative position information. This solves the problem in the related art that technical solutions for locating the speaker (that is, the sound issuer) have low accuracy, and thereby improves the accuracy of locating the sound issuer.
  • Before the relative position information of the sound issuer with respect to the image capture device is determined in step S104, the relative relationship between the position information determined by the ring microphone array group and the image capture device is known in advance. In step S104, the relative position information of the sound issuer with respect to the image capture device can then be calculated from this relative relationship together with the current position information acquired by the ring microphone array.
  • In some embodiments, step S104 may further include: sending the current position information determined by the ring microphone array group to another device on the network side, for example a positioning server, which calculates the relative position information of the sound issuer with respect to the image capture device from the current position information and the relative relationship.
  • The relative relationship here may include a first relative position of the ring microphone array with respect to the image capture device.
  • The current position information may be a second relative position of the sound issuer with respect to a specific position in the ring array.
  • From the first relative position and the second relative position, by means such as three-dimensional modeling, the relative position information of the sound issuer with respect to the image capture device can be determined. This is one specific way of locating the sound issuer relative to the image capture device; the method is not limited to it.
  • The multiple microphones in the ring microphone array can collect the sound emitted by the sound issuer from different angles or at different positions. Based on the sound collected at multiple angles or positions and on the propagation of the sound signal, the current position information can be determined accurately, which improves the accuracy of the position information and allows precise adjustment of the acquisition angle.
  • The number of ring microphone arrays included in the ring microphone array group is preferably two, three, or four.
  • The following embodiments of the present invention describe only the cases of two and three arrays. Implementations with four or more arrays use a similar solution and are not described here.
  • Determining the current position information of the sound issuer through the ring microphone array group may be implemented as follows: determining a first relative angle of the sound issuer through the first ring microphone array in the ring microphone array group; determining a second relative angle of the sound issuer through the second ring microphone array in the ring microphone array group; and determining the current position information according to the first relative angle and the second relative angle.
  • the method further includes: determining the first relative angle according to the plurality of microphones of the first ring microphone array; and determining the second relative angle according to the plurality of microphones of the second ring microphone array.
  • the method further includes: acquiring first position information of the image capture device relative to each ring microphone array in the ring microphone array group, and second position information between each pair of ring microphone arrays in the group.
  • the method further includes: determining the relative location information according to the current location information, the first location information, and the second location information.
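The combination of a first and a second relative angle into a position can be sketched as follows. This is a minimal illustration under assumptions that are not stated in the source: array 1 sits at the origin, array 2 sits at a known distance along the shared 0° line, and both angles are measured counter-clockwise from that line. The function name `locate_source` and its parameters are hypothetical; the calculation is the law of sines applied to the triangle formed by the two arrays and the sound source.

```python
import math


def locate_source(alpha_deg, beta_deg, baseline_m):
    """Intersect two bearings to estimate the sound source position.

    Assumed frame (hypothetical, for illustration only):
    array 1 at (0, 0), array 2 at (baseline_m, 0), both 0-degree
    directions pointing along +x, angles measured counter-clockwise.
    """
    alpha = math.radians(alpha_deg)  # bearing reported by array 1
    beta = math.radians(beta_deg)    # bearing reported by array 2

    # The two rays meet at the source; sin(beta - alpha) is zero when
    # the rays are parallel and never intersect.
    sin_apex = math.sin(beta - alpha)
    if abs(sin_apex) < 1e-9:
        raise ValueError("bearings are parallel; the source cannot be located")

    # Law of sines: r1 / sin(pi - beta) = baseline / sin(beta - alpha).
    r1 = baseline_m * math.sin(beta) / sin_apex   # distance from array 1
    r2 = baseline_m * math.sin(alpha) / sin_apex  # distance from array 2
    if r1 <= 0 or r2 <= 0:
        raise ValueError("bearings do not intersect in front of both arrays")

    return r1 * math.cos(alpha), r1 * math.sin(alpha)


if __name__ == "__main__":
    # Arrays 2 m apart; array 1 hears the source at 45 degrees,
    # array 2 hears it at 135 degrees.
    print(locate_source(45.0, 135.0, 2.0))
```

A longer baseline between the arrays generally makes the intersection less sensitive to small angle errors, which is one reason the spacing would be tuned to the room.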
  • FIG. 2 is a schematic diagram of a ring microphone array according to an embodiment of the present invention. As shown in FIG. 2, the four microphones are distributed in a ring and face outward, at 0°, 90°, 180°, and 270°, respectively.
  • Ring microphone arrays with different numbers of microphones and different spacings may be used as alternatives to this embodiment. The number of arrays in the present invention is not limited to two; ring microphone array groups with more than two arrays are also within the protection scope of the patent. See FIG. 3 and FIG. 4.
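To illustrate why four microphones at 0°, 90°, 180°, and 270° are enough to estimate a direction, the sketch below uses a simple far-field time-difference-of-arrival estimate. This is a swapped-in simplification chosen for brevity, not the MVDR beamforming named elsewhere in this document. The function names, the 343 m/s speed of sound, and the 5 cm ring radius are all assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def ring_bearing(tau_x, tau_y):
    """Estimate the source bearing in degrees, in the range [0, 360).

    tau_x: arrival time at the 180-degree mic minus arrival at the 0-degree mic.
    tau_y: arrival time at the 270-degree mic minus arrival at the 90-degree mic.
    For a far-field source at bearing theta on a ring of radius r:
        tau_x = (2 r / c) * cos(theta),  tau_y = (2 r / c) * sin(theta),
    so the radius and the speed of sound cancel in atan2.
    """
    return math.degrees(math.atan2(tau_y, tau_x)) % 360.0


def simulate_delays(theta_deg, radius_m):
    """Generate ideal, noise-free delays for a far-field source (test helper)."""
    theta = math.radians(theta_deg)
    # A microphone facing the source hears the wavefront earlier.
    arrival = {
        phi: -(radius_m / SPEED_OF_SOUND) * math.cos(theta - math.radians(phi))
        for phi in (0, 90, 180, 270)
    }
    return arrival[180] - arrival[0], arrival[270] - arrival[90]


if __name__ == "__main__":
    tau_x, tau_y = simulate_delays(60.0, 0.05)
    print(ring_bearing(tau_x, tau_y))
```

In practice the delays would have to be measured from the microphone signals (for example by cross-correlation), and noise or reverberation would make a beamforming method more robust than this two-pair estimate.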
  • FIG. 5 is a schematic diagram of the position of the microphone arrays according to an embodiment of the present invention. Based on FIG. 5, the setup includes the following steps:
  • Step 1: Referring to FIG. 5, fix the two ring microphone arrays in place. The preferred placement aligns the 0° directions of the two microphone arrays' positioning algorithms on the same straight line. The distance between the two microphone arrays is adjusted according to the actual situation (conference room and conference table size);
  • Step 2: Arrange the microphone array group and the camera (corresponding to the image capture device of the above embodiment) in the environment of use according to the actual situation. The practical effect is best when the camera and the microphone array group are in the same plane;
  • FIG. 6 is a schematic diagram of positioning a sound issuer according to an embodiment of the present invention. Based on FIG. 6, the specific operation flow includes the following steps:
  • Step A: The camera, the microphone arrays, and the sound source shown in FIG. 6 are the projections of the corresponding objects on the ground. This step performs the necessary data measurement, and part of the measured data is input into the sound source localization algorithm. The data to be measured includes:
  • the angle between the 0° line of the camera projection and the projection of microphone array 1;
  • Step B: From the positioning result of microphone array 1, the positioning result of microphone array 2, and the data measured in Step A, the position of the sound source (the distance L5 from the camera to the sound source, and the angle between the camera's 0° line and the sound source) can be calculated using the laws of sines and cosines;
  • Microphone array sound source localization method:
  • MVDR (Minimum Variance Distortionless Response), which can obtain higher azimuth resolution and better noise and interference suppression;
  • beamforming is performed on the sub-array to find the beam direction of the sound source.
  • In the triangle that includes the projection of microphone array 1 and the projection of microphone array 2, L4 can be calculated by the law of sines;
  • Step C: Rotate the camera to the designated position according to the final result of Step B, the relative angle between the camera and the sound source.
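Steps B and C end with a distance from the camera to the sound source and a rotation angle for the camera. The sketch below shows one way to obtain both once the source position is known in a ground-plane coordinate frame. It uses coordinates and `atan2` in place of the triangle relations, which is an equivalent formulation chosen for brevity. The function name `camera_pan`, the coordinate frame, and the parameter `camera_zero_deg` (the direction of the camera's 0° line in that frame) are assumptions for illustration.

```python
import math


def camera_pan(source_xy, camera_xy, camera_zero_deg):
    """Return (distance, pan_deg) from the camera to the sound source.

    source_xy, camera_xy: ground-plane projections in a shared frame
        (hypothetical frame, e.g. the one defined by the two arrays).
    camera_zero_deg: direction of the camera's 0-degree line in that
        frame, measured counter-clockwise from +x.
    pan_deg is normalized to the range [-180, 180).
    """
    dx = source_xy[0] - camera_xy[0]
    dy = source_xy[1] - camera_xy[1]

    distance = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dy, dx))  # direction in the shared frame

    # Express the bearing relative to the camera's own 0-degree line.
    pan = (bearing - camera_zero_deg + 180.0) % 360.0 - 180.0
    return distance, pan


if __name__ == "__main__":
    # Source at (1, 1); camera at (1, -1) with its 0-degree line along +y.
    print(camera_pan((1.0, 1.0), (1.0, -1.0), 90.0))
```

Normalizing the pan angle to a half-open range around zero lets the camera take the shorter rotation; a real pan-tilt unit may also impose mechanical limits that this sketch does not model.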
  • FIG. 7 is a structural block diagram of an acquisition processing apparatus according to an embodiment of the present invention. As shown in Figure 7, the device includes:
  • the first determining module 70 is configured to determine current position information of the sound issuer through the ring microphone array group;
  • the second determining module 72 is configured to determine, according to the current position information, relative position information of the sound issuer with respect to the image capture device, wherein the image capture device is configured to capture an image of the sound issuer;
  • the adjustment module 74 is configured to adjust, according to the relative position information, the acquisition angle of the image capture device toward the sound issuer.
  • Through the combined action of the above modules, the current position information of the sound issuer is determined through the ring microphone array group; the relative position information of the sound issuer with respect to the image capture device is determined according to the current position information, wherein the image capture device is configured to capture an image of the sound issuer; and the acquisition angle of the image capture device toward the sound issuer is adjusted according to the relative position information. This solves the problem in the related art that technical solutions for locating the speaker (that is, the sound issuer) have low accuracy, and thereby improves the accuracy of locating the sound issuer.
  • The number of ring microphone arrays included in the ring microphone array group used by the first determining module 70 may be 2, 3, or 4.
  • FIG. 8 is a structural block diagram of a first determining module 70 of the collection processing device according to an embodiment of the present invention. As shown in FIG. 8, when the ring microphone array group includes two ring microphone arrays, the first determining module 70 includes:
  • the first determining unit 700 is configured to determine, by using the first ring microphone array in the ring microphone array group, a first relative angle of the sound issuer;
  • the second determining unit 702 is configured to determine a second relative angle of the sound issuer by using the second ring microphone array in the ring microphone array group;
  • the third determining unit 704 is configured to determine the current location information according to the first relative angle and the second relative angle.
  • FIG. 9 is another structural block diagram of an acquisition processing apparatus according to an embodiment of the present invention. As shown in FIG. 9, the apparatus further includes:
  • the obtaining module 76 is configured to acquire first position information of the image capture device relative to each ring microphone array in the ring microphone array group, and second position information between each pair of ring microphone arrays in the group.
  • FIG. 10 is a structural block diagram of an acquisition processing system according to an embodiment of the present invention.
  • As shown in FIG. 10, the system includes: an image capture device 100, a sound issuer 102, and a ring microphone array group 104.
  • The ring microphone array group 104 is configured to determine current position information of the sound issuer and to determine, according to the current position information, relative position information of the sound issuer with respect to the image capture device 100, wherein the image capture device 100 is configured to capture an image of the sound issuer. The image capture device 100 is configured to, after acquiring the relative position information, adjust its acquisition angle toward the sound issuer according to the relative position information.
  • The speaker-tracking system of the preferred embodiment of the present invention includes the following components: a ring microphone array group and a camera.
  • The sound source localization designed in the preferred embodiment of the present invention uses multiple ring microphone arrays. The two arrays are arranged so that the absolute 0° directions of their positioning algorithms lie on one line, while the ring microphone array group and the camera can be placed flexibly:
  • Step 1: Install the hardware, including placing the microphone arrays and fixing the camera;
  • Step 2: Measure the necessary data and input it to the calculation module. The data to be measured includes, but is not limited to, the distance between the microphone arrays, the distance from the camera to each microphone array, and the rotation angle between the camera and each microphone array;
  • Step 3 Calculate the precise position of the sound source by using a positioning algorithm according to the data measured in step 2;
  • Step 4 Rotate the camera to the sound source orientation according to the positioning result.
  • the technical solution of the preferred embodiment of the present invention improves the accuracy of sound source localization.
  • The microphone array in the method of the present invention can be used not only for sound source localization but also as a sound pickup device. This reduces hardware cost and greatly reduces the difficulty of implementing the solution. The method can be widely used in video conferencing, security, and various real-time monitoring fields.
  • An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions for performing any one or more of the acquisition processing methods described above, for example, the method shown in FIG. 1.
  • the computer storage medium includes, but is not limited to, an optical disk, a floppy disk, a hard disk, a rewritable memory, etc., optionally a non-transitory storage medium.
  • In the embodiments of the present invention, the ring microphone array group collects the sound of the sound issuer, and the multiple microphones in the group collect the sound from different angles, so the current position information of the sound issuer can be determined accurately. The current position information is then used to determine the relative position information of the sound issuer with respect to the image capture device, wherein the image capture device is configured to capture an image of the sound issuer. Finally, the acquisition angle of the image capture device toward the sound issuer is adjusted according to the relative position information. The process is simple, highly reproducible, and yields accurate position information, which allows a more accurate adjustment of the image capture device's acquisition angle.

Landscapes

  • Engineering & Computer Science
  • Multimedia
  • Signal Processing
  • Physics & Mathematics
  • General Physics & Mathematics
  • Radar, Positioning & Navigation
  • Remote Sensing
  • Circuit For Audible Band Transducer
  • Obtaining Desirable Characteristics In Audible-Bandwidth Transducers
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves

Abstract

The present invention provides an acquisition processing method, device and system, wherein the method comprises: determining current position information of a speaker by means of an annular microphone array group; determining relative position information of the speaker with respect to an image acquisition device according to the current position information, wherein the image acquisition device is used for acquiring an image of the speaker; and adjusting the acquisition angle of the image acquisition device for the speaker according to the relative position information. Embodiments of the present invention also provide a computer storage medium.

Description

Acquisition processing method, device, system and computer storage medium
Technical field
The present invention relates to the field of communications, and in particular to an acquisition processing method, apparatus, system, and computer storage medium.
Background
Existing video tracking systems fall into two main types. (1) A high-definition wide-angle camera shoots the entire scene. The limitation of this approach is that presenting local details of the speaker requires manually adjusting the camera, which is cumbersome and gives a poor user experience. (2) A linear microphone array is combined with a camera to locate the speaker. Although the camera can rotate according to the localization result, this approach still has a problem: the microphone array is used only for localization and serves a single function, which amounts to additional hardware cost. Both of these approaches to locating the speaker suffer from low localization accuracy.
No effective solution has yet been proposed for the low accuracy of speaker-locating technical solutions in the related art.
Summary of the invention
Embodiments of the present invention provide an acquisition processing method, apparatus, system, and computer storage medium.
An aspect of an embodiment of the present invention provides an acquisition processing method, including: determining current position information of a sound issuer through a ring microphone array group; determining, according to the current position information, relative position information of the sound issuer with respect to an image capture device, wherein the image capture device is used to capture an image of the sound issuer; and adjusting, according to the relative position information, the acquisition angle of the image capture device toward the sound issuer.
According to another aspect of the present invention, an acquisition processing apparatus is also provided, comprising: a first determining module configured to determine current position information of a sound issuer through a ring microphone array group; a second determining module configured to determine, according to the current position information, relative position information of the sound issuer with respect to an image capture device, wherein the image capture device is used to capture an image of the sound issuer; and an adjustment module configured to adjust, according to the relative position information, the acquisition angle of the image capture device toward the sound issuer.
Another aspect of the embodiments of the present invention further provides an acquisition processing system, including: an image capture device, a sound issuer, and a ring microphone array group, wherein the ring microphone array group is configured to determine current position information of the sound issuer and to determine, according to the current position information, relative position information of the sound issuer with respect to the image capture device, the image capture device being configured to capture an image of the sound issuer; and the image capture device is used to, after acquiring the relative position information, adjust its acquisition angle toward the sound issuer according to the relative position information.
An embodiment of the present invention further provides a computer storage medium, wherein the computer storage medium stores computer-executable instructions, and the computer-executable instructions are used to execute the foregoing acquisition processing method.
The technical solution provided by the embodiments of the present invention determines the current position information of the sound issuer through the ring microphone array group; determines, according to the current position information, the relative position information of the sound issuer with respect to the image capture device, wherein the image capture device is used to capture an image of the sound issuer; and adjusts the acquisition angle of the image capture device toward the sound issuer according to the relative position information. This solves the problem in the related art that technical solutions for locating the speaker (that is, the sound issuer) have low accuracy, and thereby improves the accuracy of locating the sound issuer.
Brief description of the drawings
FIG. 1 is a schematic flowchart of a first acquisition processing method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a first ring microphone array according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a first microphone array group according to an embodiment of the present invention;
FIG. 4 is another schematic diagram of a second microphone array group according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a second microphone array position according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of positioning a first sound issuer according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a first acquisition processing apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of the first determining module 70 of a second acquisition processing apparatus according to an embodiment of the present invention;
FIG. 9 is another schematic structural diagram of a third acquisition processing apparatus according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of an acquisition processing system according to an embodiment of the present invention.
Detailed description
The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that the embodiments in the present application and the features in the embodiments may be combined with each other where they do not conflict.
Other features and advantages of the present invention will be set forth in the description that follows, and in part will become apparent from the description or be understood by practicing the invention. The objectives and other advantages of the invention may be realized and obtained by the structures particularly pointed out in the written description, the claims, and the drawings.
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
An embodiment of the present invention provides an acquisition processing method. FIG. 1 is a flowchart of the acquisition processing method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
Step S102: determine current position information of a sound issuer through a ring microphone array group;
Step S104: determine, according to the current position information, relative position information of the sound issuer relative to an image acquisition device, where the image acquisition device is used to acquire an image of the sound issuer;
Step S106: adjust, according to the relative position information, the acquisition angle of the image acquisition device toward the sound issuer.
Through the steps provided by this embodiment, the current position information of the sound issuer is determined through the ring microphone array group; the relative position information of the sound issuer relative to the image acquisition device, which is used to acquire an image of the sound issuer, is determined according to the current position information; and the acquisition angle of the image acquisition device toward the sound issuer is adjusted according to the relative position information. This solves the problem in the related art that technical solutions for locating a speaker (that is, the sound issuer) have low accuracy, and thereby improves the accuracy with which the sound issuer is located.
Before the relative position information of the sound issuer relative to the image acquisition device is determined in step S104, the relative relationship between the position information determined by the ring microphone array group and the image acquisition device is known in advance. In step S104, the relative position information of the sound issuer relative to the image acquisition device can then be calculated by combining the current position information acquired by the ring microphone arrays with this relative relationship. In some embodiments, step S104 may further include: sending the current position information determined by the ring microphone array group to another device on the network side, for example a positioning server, which calculates the relative position information of the sound issuer relative to the image acquisition device according to the current position information and the relative relationship. The relative relationship here may include a first relative position of the ring microphone array relative to the image acquisition device. The current position information may be a second relative position of the sound issuer relative to a specific position in the ring array. From the first relative position and the second relative position, by means such as three-dimensional modeling, the relative position information of the sound issuer relative to the image acquisition device can be determined. This is one specific way of determining the relative position information of the sound issuer relative to the image acquisition device, and the method is not limited to it. The multiple microphones in a ring microphone array can collect the sound emitted by the sound issuer from different angles or at different positions. Based on this multi-angle or multi-position collection and on the propagation of the sound signal, the current position information can be determined precisely, which improves the accuracy of the position information and enables precise adjustment of the acquisition angle.
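The combination of the first relative position (array relative to camera) and the second relative position (sound issuer relative to array) can be illustrated with a small two-dimensional sketch. It assumes, purely for illustration, that the array pose is given as an (x, y) position and a heading in the camera's coordinate frame and that the array reports the sound issuer as a range and bearing; the function name and parameters are hypothetical and do not come from the patent.

```python
import math

def source_in_camera_frame(array_xy, array_heading_rad, source_range, source_bearing_rad):
    """Convert a source seen from an array into (distance, bearing) seen from the camera.

    array_xy, array_heading_rad: hypothetical pose of the array in the camera
        frame (the "first relative position").
    source_range, source_bearing_rad: hypothetical polar position of the source
        in the array frame (the "second relative position").
    """
    ax, ay = array_xy
    # Direction from the array to the source, expressed in the camera frame.
    theta = array_heading_rad + source_bearing_rad
    sx = ax + source_range * math.cos(theta)
    sy = ay + source_range * math.sin(theta)
    # Polar coordinates of the source relative to the camera.
    return math.hypot(sx, sy), math.atan2(sy, sx)
```

For instance, an array placed 1 unit along the camera's 0° axis and rotated by 90° that hears a source 1 unit away at its own 0° places the source at a 45° bearing from the camera.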
Optionally, the number of ring microphone arrays included in the ring microphone array group is preferably 2, 3, or 4. The following embodiments of the present invention use only 2 and 3 as examples. Implementations with 4 or more arrays use a similar solution and are not described again here.
Optionally, when the ring microphone array group includes two ring microphone arrays, determining the current position information of the sound issuer through the ring microphone array group may be implemented as follows: determine a first relative angle of the sound issuer through the first ring microphone array in the ring microphone array group; determine a second relative angle of the sound issuer through the second ring microphone array in the ring microphone array group; and determine the current position information according to the first relative angle and the second relative angle.
Specifically, the method further includes: determining the first relative angle according to the multiple microphones of the first ring microphone array; and determining the second relative angle according to the multiple microphones of the second ring microphone array.
Optionally, before the relative position information of the sound issuer relative to the image acquisition device is determined according to the current position information, the method further includes: acquiring first position information of the image acquisition device relative to each ring microphone array in the ring microphone array group, and second position information between each pair of ring microphone arrays.
In an embodiment of the present invention, the method further includes: determining the relative position information according to the current position information, the first position information, and the second position information.
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. However, those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
The above technical solution is described in detail below with reference to an example, which is not intended to limit the scope of protection of the embodiments of the present invention.
FIG. 2 is a schematic diagram of a ring microphone array according to an embodiment of the present invention. As shown in FIG. 2, four microphones are distributed in a ring with their front faces pointing outward, located at 0°, 90°, 180°, and 270° respectively. Ring microphone arrays with a different number of microphones or a different spacing may all be used as alternatives in the embodiments of the present invention. The number of arrays is not limited to two, and ring microphone array groups with more than two arrays all fall within the scope of protection of this patent; see FIG. 3 and FIG. 4.
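The ring layout described for FIG. 2 can be sketched as follows. This is only an illustration: the radius value and the function names are hypothetical, and the sketch assumes evenly spaced microphones with the first one at 0°.

```python
import math

def mic_angles_deg(n_mics):
    """Angular position of each microphone on the ring, evenly spaced from 0 degrees."""
    return [360.0 * k / n_mics for k in range(n_mics)]

def ring_positions(n_mics, radius):
    """(x, y) of each microphone on a ring of the given (hypothetical) radius.

    With outward-facing microphones, each microphone's facing direction is the
    same as its angular position on the ring.
    """
    return [
        (radius * math.cos(math.radians(a)), radius * math.sin(math.radians(a)))
        for a in mic_angles_deg(n_mics)
    ]
```

Calling `mic_angles_deg(4)` reproduces the 0°, 90°, 180°, 270° layout of FIG. 2; other counts give the alternative arrays the text mentions.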
FIG. 5 is a schematic diagram of the microphone array positions according to an embodiment of the present invention. Based on FIG. 5:
Step 1: Referring to FIG. 5, fix the two ring microphone arrays in place. The preferred placement aligns the 0° directions of the two arrays' localization algorithms on the same straight line. The distance between the two microphone arrays is adjusted according to actual conditions (the size of the conference room and conference table);
Step 2: Arrange the microphone array group and the camera (corresponding to the image acquisition device of the above embodiments) in the usage environment according to actual conditions. The practical effect is best when the camera and the microphone array group are in the same plane;
FIG. 6 is a schematic diagram of locating the sound issuer according to an embodiment of the present invention. Based on FIG. 6, the specific operation flow includes the following processing steps:
Step A: The camera, microphone arrays, and sound source shown in FIG. 6 are all projections of the corresponding objects onto the ground. This step measures the necessary data, which will be input into the sound source localization algorithm. The data to be measured include:
the distance L3 between the projections of the two arrays;
the distance L1 from the camera projection to the array 1 projection;
the distance L2 from the camera projection to the array 2 projection;
the angle α between the 0° line of the camera projection and the microphone array 1 projection;
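The four measurements of Step A can be held in a small record before they are passed to the localization algorithm. The sketch below is an assumption-laden illustration: the class name, field names, and the choice of metres and degrees are hypothetical, and the validation simply checks that L1, L2, and L3 can be the sides of the camera/array 1/array 2 triangle.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SiteMeasurements:
    l1_m: float       # camera projection to array 1 projection (unit assumed: metres)
    l2_m: float       # camera projection to array 2 projection
    l3_m: float       # array 1 projection to array 2 projection
    alpha_deg: float  # angle between the camera's 0-degree line and array 1

    def validate(self) -> None:
        """Raise ValueError if the three distances cannot form a triangle."""
        sides = sorted((self.l1_m, self.l2_m, self.l3_m))
        if sides[0] <= 0:
            raise ValueError("distances must be positive")
        if sides[0] + sides[1] < sides[2]:
            raise ValueError("L1, L2, L3 cannot form a triangle")
```

A check of this kind catches mistyped measurements early, before they silently distort the computed camera angle.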
Step B: From the localization result β of microphone array 1, the localization result γ of microphone array 2, and the data measured in step A, the position of the sound source (the distance L5 from the camera to the sound source, and the angle ε between the camera's 0° line and the sound source) can be calculated using the law of sines and the law of cosines;
Microphone array sound source localization method:
1) Based on the number of microphones in the ring, calculate the energy captured by each microphone;
2) Use VAD (voice activity detection) to select the normal speech segments, and compare the energy of the microphones;
3) Select the microphone with the largest energy and combine it with its left and right neighbors, so that the three microphones form a microphone sub-array; the circumferential angle range they span is taken as the preliminary sound source direction;
4) Perform beamforming on the sub-array using the MVDR (Minimum Variance Distortionless Response) method, which provides high azimuth resolution and good noise and interference suppression, to find the beam direction of the sound source;
5) Through multi-frame statistics, find the stable sound source direction, that is, β and γ in this embodiment.
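Steps 1) to 3) and 5) above can be sketched as follows. This is a simplified illustration, not the patent's implementation: a plain energy threshold stands in for a real VAD, the MVDR beamforming of step 4) is not implemented (the per-frame bearings it would produce are taken as input), and the histogram vote is only one possible form of "multi-frame statistics". All function names and parameters are hypothetical.

```python
from collections import Counter

def frame_energies(frames):
    """Step 1: mean-square energy per microphone; frames is a list of sample lists."""
    return [sum(s * s for s in samples) / len(samples) for samples in frames]

def is_speech(energies, threshold):
    """Step 2 stand-in: a bare energy threshold used here in place of a real VAD."""
    return max(energies) > threshold

def coarse_sector(energies, mic_angles_deg):
    """Step 3: loudest microphone, its three-microphone sub-array, and the sector spanned."""
    n = len(energies)
    k = max(range(n), key=lambda i: energies[i])
    left, right = (k - 1) % n, (k + 1) % n
    return k, (left, k, right), (mic_angles_deg[left], mic_angles_deg[right])

def stable_bearing(per_frame_bearings_deg, bin_deg=5):
    """Step 5: most frequent bearing bin over many frames (bin_deg must divide 360)."""
    n_bins = 360 // bin_deg
    bins = Counter(round(b / bin_deg) % n_bins for b in per_frame_bearings_deg)
    best_bin, _ = bins.most_common(1)[0]
    return best_bin * bin_deg
```

Voting over bins rather than averaging keeps a single outlier frame (for example a reflection from the opposite wall) from dragging the reported direction away from the speaker.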
Method for calculating the precise position of the sound source:
In the triangle formed by the sound source projection, the microphone array 1 projection, and the microphone array 2 projection, L4 can be calculated by the law of sines;
Figure PCTCN2017073176-appb-000001
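The equation image itself is not reproduced in this text. A plausible reconstruction, offered only as a sketch, follows from the law of sines under assumptions that the visible text does not confirm: β and γ are taken to be the interior angles of the triangle at the array 1 and array 2 projections, measured from the line joining the two arrays, and L4 is taken to be the distance from the array 1 projection to the sound source projection. The angle at the source is then 180° − β − γ, whose sine equals sin(β + γ):

```latex
\frac{L_4}{\sin\gamma} = \frac{L_3}{\sin\bigl(180^\circ - \beta - \gamma\bigr)}
\quad\Longrightarrow\quad
L_4 = \frac{L_3\,\sin\gamma}{\sin(\beta + \gamma)}
```

As a check under these assumptions, L3 = 2 and β = γ = 45° give L4 = 2·sin 45° / sin 90° ≈ 1.41.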
Given the three side lengths L1, L2, and L3 of the triangle they form, the angle δ can be calculated by the law of cosines;
Figure PCTCN2017073176-appb-000002
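The equation image is not reproduced in this text. As a hedged sketch, if δ is assumed to be the angle at the array 1 projection between the directions to the camera projection and to the array 2 projection (so that L2 is the side opposite δ; the text does not state which vertex δ belongs to), the law of cosines gives:

```latex
L_2^2 = L_1^2 + L_3^2 - 2 L_1 L_3 \cos\delta
\quad\Longrightarrow\quad
\delta = \arccos\!\left(\frac{L_1^2 + L_3^2 - L_2^2}{2 L_1 L_3}\right)
```

For example, L1 = L2 = √2 and L3 = 2 give cos δ = 4 / (4√2) ≈ 0.707, so δ = 45°.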
Based on the triangle formed by the camera projection, the microphone array 1 projection, and the microphone array 2 projection, L5 can be calculated by the law of cosines;
Figure PCTCN2017073176-appb-000003
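The equation image is not reproduced in this text. Because Step B defines L5 as the camera-to-source distance, the following sketch assumes the law of cosines is applied in the triangle whose vertices are the camera, array 1, and sound source projections, with sides L1 (camera to array 1) and L4 (array 1 to source, assumed) enclosing the angle at array 1. It further assumes that β is the interior angle at array 1 between the array baseline and the source, that δ is the angle at array 1 between the camera and array 2, and that the camera and the source lie on opposite sides of the array baseline, so the enclosed angle is β + δ. None of these conventions is confirmed by the visible text; if camera and source were on the same side, the enclosed angle would be β − δ instead.

```latex
L_5 = \sqrt{L_1^2 + L_4^2 - 2 L_1 L_4 \cos(\beta + \delta)}
```

With L1 = L4 = √2 and β = δ = 45°, the enclosed angle is 90° and L5 = √(2 + 2) = 2.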
The angle ε can then be calculated by the law of sines;
Figure PCTCN2017073176-appb-000004
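The equation image is not reproduced in this text. The whole chain from the measured data to (L5, ε) can be sketched in code under explicitly assumed conventions, none of which is confirmed by the visible text: β and γ are interior angles at array 1 and array 2 measured from the array baseline; L4 is the array 1 to source distance; δ is the angle at array 1 between the camera and array 2; the camera and the source lie on opposite sides of the baseline; the angle φ at the camera between array 1 and the source is acute; and ε = α + side·φ, where the hypothetical `side` parameter encodes on which side of the camera-to-array-1 line the source falls.

```python
import math

def locate_source(l1, l2, l3, alpha, beta, gamma, side=1):
    """Return (l5, epsilon): camera-to-source distance and bearing (radians).

    All angle conventions are assumptions stated in the accompanying text,
    not taken from the patent figures.
    """
    # Law of sines in triangle (source, array 1, array 2): array 1 to source.
    l4 = l3 * math.sin(gamma) / math.sin(beta + gamma)
    # Law of cosines in triangle (camera, array 1, array 2): angle at array 1.
    cos_delta = (l1 ** 2 + l3 ** 2 - l2 ** 2) / (2 * l1 * l3)
    delta = math.acos(max(-1.0, min(1.0, cos_delta)))
    # Law of cosines in triangle (camera, array 1, source): camera to source.
    apex = beta + delta
    l5 = math.sqrt(l1 ** 2 + l4 ** 2 - 2 * l1 * l4 * math.cos(apex))
    # Law of sines in the same triangle: angle at the camera (assumed acute).
    sin_phi = l4 * math.sin(apex) / l5
    phi = math.asin(max(-1.0, min(1.0, sin_phi)))
    return l5, alpha + side * phi
```

As a sanity check of the geometry, placing the arrays at (0, 0) and (2, 0), the source at (1, 1), and the camera at (1, −1) gives L1 = L2 = √2, L3 = 2, β = γ = 45°, and the function returns L5 = 2 with φ = 45°, matching the coordinates directly.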
Step C: According to the final result of step B, namely the relative angle ε between the camera and the sound source, rotate the camera to the specified position.
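The patent only states that the camera is rotated to the specified position. One practical detail, sketched here as an assumption rather than as part of the described method, is choosing the shorter direction of rotation when the camera's current pan angle and the target angle ε are both expressed in degrees. The function name is hypothetical.

```python
def pan_delta_deg(current_deg, target_deg):
    """Signed rotation in [-180, 180) that moves the camera from current to target."""
    return (target_deg - current_deg + 180.0) % 360.0 - 180.0
```

For example, moving from 350° to 10° yields +20° rather than −340°, so the camera does not sweep across the whole room to reach a nearby speaker.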
This embodiment further provides an acquisition processing device for implementing the above embodiments and preferred implementations. What has already been described is not repeated; the modules involved in the device are described below. As used below, the term "module" refers to a combination of software and/or hardware that can implement a predetermined function. Although the device described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated. FIG. 7 is a structural block diagram of an acquisition processing device according to an embodiment of the present invention. As shown in FIG. 7, the device includes:
a first determining module 70, configured to determine current position information of the sound issuer through the ring microphone array group;
a second determining module 72, configured to determine, according to the current position information, relative position information of the sound issuer relative to the image acquisition device, where the image acquisition device is used to acquire an image of the sound issuer;
an adjustment module 74, configured to adjust, according to the relative position information, the acquisition angle of the image acquisition device toward the sound issuer.
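The division into three modules can be pictured as a small pipeline. The sketch below is purely illustrative: the class name, the use of plain callables for each module, and the (x, y) position type are hypothetical choices, and the usage lines feed in a fixed position instead of real microphone data.

```python
import math
from dataclasses import dataclass
from typing import Callable, Tuple

Position = Tuple[float, float]  # hypothetical (x, y) position of the sound issuer

@dataclass
class AcquisitionProcessor:
    locate: Callable[[], Position]                # role of first determining module 70
    to_camera_angle: Callable[[Position], float]  # role of second determining module 72
    adjust: Callable[[float], None]               # role of adjustment module 74

    def run_once(self) -> float:
        """Locate the sound issuer, convert to a camera angle, and adjust the camera."""
        position = self.locate()
        angle = self.to_camera_angle(position)
        self.adjust(angle)
        return angle

# Usage with stand-in callables: a fixed position and a list that records commands.
commands = []
processor = AcquisitionProcessor(
    locate=lambda: (1.0, 1.0),
    to_camera_angle=lambda p: math.degrees(math.atan2(p[1], p[0])),
    adjust=commands.append,
)
angle = processor.run_once()
```

Keeping the three roles behind separate callables mirrors the text's point that each module may be realized in software, hardware, or a mix of both.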
Through the combined action of the above modules, the current position information of the sound issuer is determined through the ring microphone array group; the relative position information of the sound issuer relative to the image acquisition device, which is used to acquire an image of the sound issuer, is determined according to the current position information; and the acquisition angle of the image acquisition device toward the sound issuer is adjusted according to the relative position information. This solves the problem in the related art that technical solutions for locating a speaker (that is, the sound issuer) have low accuracy, and thereby improves the accuracy with which the sound issuer is located.
In the first determining module 70, the number of ring microphone arrays included in the ring microphone array group is 2, 3, or 4.
FIG. 8 is a structural block diagram of the first determining module 70 of the acquisition processing device according to an embodiment of the present invention. As shown in FIG. 8, when the ring microphone array group includes two ring microphone arrays, the first determining module 70 includes:
a first determining unit 700, configured to determine a first relative angle of the sound issuer through the first ring microphone array in the ring microphone array group;
a second determining unit 702, configured to determine a second relative angle of the sound issuer through the second ring microphone array in the ring microphone array group;
a third determining unit 704, configured to determine the current position information according to the first relative angle and the second relative angle.
In an embodiment of the present invention, FIG. 9 is another structural block diagram of the acquisition processing device. As shown in FIG. 9, the device further includes:
an acquiring module 76, configured to acquire first position information of the image acquisition device relative to each ring microphone array in the ring microphone array group, and second position information between each pair of ring microphone arrays.
According to another aspect of the present invention, an acquisition processing system is further provided. FIG. 10 is a structural block diagram of the acquisition processing system according to an embodiment of the present invention. As shown in FIG. 10, the system includes: an image acquisition device 100, a sound issuer 102, and a ring microphone array group 104.
The ring microphone array group 104 is configured to determine current position information of the sound issuer and to determine, according to the current position information, relative position information of the sound issuer relative to the image acquisition device 100, where the image acquisition device 100 is configured to acquire an image of the sound issuer. The image acquisition device 100 is configured to, after obtaining the relative position information, adjust its acquisition angle toward the sound issuer according to that relative position information.
To aid understanding of how the above operations are performed, a detailed description follows in conjunction with a preferred embodiment.
The speaker-tracking system of the preferred embodiment of the present invention includes the following parts: a ring microphone array group and a camera.
The sound source localization designed in the preferred embodiment of the present invention uses multiple ring microphone arrays. Two arrays are arranged in a straight line along the absolute 0° direction of the localization algorithm, and the ring microphone array group and the camera can be placed flexibly.
The sound source localization method of the preferred embodiment of the present invention includes the following steps:
Step 1: Install the hardware, including placing the microphone arrays and fixing the camera device;
Step 2: Measure the necessary data and input it into the calculation module. The data to be measured include, but are not limited to, the distance between the microphone arrays, the distance from the camera to each microphone array, and the rotation angle between the camera and each microphone array;
Step 3: Based on the data measured in step 2, calculate the precise position of the sound source using the localization algorithm;
Step 4: Rotate the camera toward the sound source direction according to the localization result.
The technical solution of the preferred embodiment of the present invention improves the accuracy of sound source localization. In addition, the microphone arrays in the described method can be used not only for sound source localization but can also be reused as sound pickup devices. This reduces hardware cost and also greatly reduces the difficulty of implementing the solution. The solution can be widely used in video conferencing, security, and various real-time monitoring fields.
Another embodiment provides a computer storage medium storing computer-executable instructions for performing any one or more of the foregoing acquisition processing methods, for example the method shown in FIG. 1. The computer storage medium includes, but is not limited to, an optical disc, a floppy disk, a hard disk, and rewritable memory, and is optionally a non-transitory storage medium.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that objects so described are interchangeable where appropriate, so that the embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "include" and "have", and any variations of them, are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
Evidently, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed across a network of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases the steps shown or described can be performed in an order different from the one given here, or they can each be made into a separate integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The present invention is thus not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, any modification made in accordance with the principles of the present invention should be understood as falling within the scope of protection of the present invention.
Industrial Applicability
In the embodiments of the present invention, the ring microphone array group is used to collect the sound of the sound issuer. Because the multiple microphones in the ring microphone array group collect sound from different angles, the current position information of the sound issuer can be determined precisely. The determined current position information is then used to determine the relative position information of the sound issuer relative to the image acquisition device, which is used to acquire an image of the sound issuer. Finally, the acquisition angle of the image acquisition device toward the sound issuer is adjusted according to the relative position information. The solution is simple to implement industrially, is highly reproducible, and obtains position information with high accuracy, and can therefore provide more precise adjustment of the acquisition angle of the image acquisition device.

Claims (12)

  1. An acquisition processing method, comprising:
    determining current position information of a sound issuer through a ring microphone array group;
    determining, according to the current position information, relative position information of the sound issuer relative to an image acquisition device, wherein the image acquisition device is used to acquire an image of the sound issuer;
    adjusting, according to the relative position information, the acquisition angle of the image acquisition device toward the sound issuer.
  2. The method according to claim 1, wherein the number of ring microphone arrays included in the ring microphone array group is 2, 3, or 4.
  3. The method according to claim 2, wherein, when the ring microphone array group includes two ring microphone arrays, determining the current position information of the sound issuer through the ring microphone array group comprises:
    determining a first relative angle of the sound issuer through a first ring microphone array in the ring microphone array group;
    determining a second relative angle of the sound issuer through a second ring microphone array in the ring microphone array group;
    determining the current position information according to the first relative angle and the second relative angle.
  4. The method according to claim 3, wherein the method further comprises:
    determining the first relative angle according to multiple microphones of the first ring microphone array;
    determining the second relative angle according to multiple microphones of the second ring microphone array.
  5. The method according to claim 1, wherein, before determining the relative position information of the sound issuer relative to the image acquisition device according to the current position information, the method further comprises:
    acquiring first position information of the image acquisition device relative to each ring microphone array in the ring microphone array group, and second position information between each pair of ring microphone arrays.
  6. The method according to claim 5, wherein the method further comprises: determining the relative position information according to the current position information, the first position information, and the second position information.
  7. An acquisition processing device, comprising:
    a first determining module, configured to determine current position information of a sound issuer through a ring microphone array group;
    a second determining module, configured to determine, according to the current position information, relative position information of the sound issuer relative to an image acquisition device, wherein the image acquisition device is used to acquire an image of the sound issuer;
    an adjustment module, configured to adjust, according to the relative position information, the acquisition angle of the image acquisition device toward the sound issuer.
  8. The device according to claim 7, wherein, in the first determining module, the number of ring microphone arrays included in the ring microphone array group is 2, 3, or 4.
  9. The device according to claim 8, wherein, when the ring microphone array group includes two ring microphone arrays, the first determining module comprises:
    a first determining unit, configured to determine a first relative angle of the sound issuer through a first ring microphone array in the ring microphone array group;
    a second determining unit, configured to determine a second relative angle of the sound issuer through a second ring microphone array in the ring microphone array group;
    a third determining unit, configured to determine the current position information according to the first relative angle and the second relative angle.
  10. The device according to claim 7, wherein the device further comprises:
    an acquiring module, configured to acquire first position information of the image acquisition device relative to each ring microphone array in the ring microphone array group, and second position information between each pair of ring microphone arrays.
  11. An acquisition processing system, comprising: an image acquisition device, a sound issuer, and a ring microphone array group, wherein
    the ring microphone array group is configured to determine current position information of the sound issuer, and to determine, according to the current position information, relative position information of the sound issuer relative to the image acquisition device, wherein the image acquisition device is used to acquire an image of the sound issuer;
    the image acquisition device is configured to, after obtaining the relative position information, adjust its acquisition angle toward the sound issuer according to the relative position information.
  12. A computer storage medium storing computer-executable instructions, the computer-executable instructions being used to perform the method according to any one of claims 1 to 6.
PCT/CN2017/073176 2016-02-25 2017-02-09 Acquisition processing method, device and system, and computer storage medium WO2017143910A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610103955.9A CN107124540A (en) 2016-02-25 2016-02-25 Acquiring and processing method, apparatus and system
CN201610103955.9 2016-02-25

Publications (1)

Publication Number Publication Date
WO2017143910A1 (en) 2017-08-31

Family

ID=59684793

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/073176 WO2017143910A1 (en) 2016-02-25 2017-02-09 Acquisition processing method, device and system, and computer storage medium

Country Status (2)

Country Link
CN (1) CN107124540A (en)
WO (1) WO2017143910A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108735226A (en) * 2018-07-09 2018-11-02 科沃斯商用机器人有限公司 Voice acquisition method, device and equipment
CN111028840A (en) * 2019-12-24 2020-04-17 深圳火星探索科技有限公司 Unmanned aerial vehicle voice control system based on three-dimensional microphone array
CN111526295A (en) * 2020-04-30 2020-08-11 北京臻迪科技股份有限公司 Audio and video processing system, acquisition method, device, equipment and storage medium
US11115625B1 (en) 2020-12-14 2021-09-07 Cisco Technology, Inc. Positional audio metadata generation
US11425502B2 (en) 2020-09-18 2022-08-23 Cisco Technology, Inc. Detection of microphone orientation and location for directional audio pickup

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110300279B (en) * 2019-06-26 2021-11-02 视联动力信息技术股份有限公司 Tracking method and device for conference speaker
CN111935411A (en) * 2020-09-25 2020-11-13 杭州涂鸦信息技术有限公司 Monitoring system and monitoring method based on sound positioning
CN117665705A (en) * 2022-08-26 2024-03-08 华为技术有限公司 Method for emitting and receiving sound signals and detecting relative position between devices

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6826284B1 (en) * 2000-02-04 2004-11-30 Agere Systems Inc. Method and apparatus for passive acoustic source localization for video camera steering applications
CN101567969A (en) * 2009-05-21 2009-10-28 上海交通大学 Intelligent video director method based on microphone array sound guidance
CN102833476A (en) * 2012-08-17 2012-12-19 歌尔声学股份有限公司 Camera for terminal equipment and implementation method of camera for terminal equipment
CN202798928U (en) * 2012-08-17 2013-03-13 歌尔声学股份有限公司 Pick-up head for terminal equipment
CN103685906A (en) * 2012-09-20 2014-03-26 中兴通讯股份有限公司 Control method, control device and control equipment
CN103841357A (en) * 2012-11-21 2014-06-04 中兴通讯股份有限公司 Microphone array sound source positioning method, device and system based on video tracking

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100442837C (en) * 2006-07-25 2008-12-10 华为技术有限公司 Video frequency communication system with sound position information and its obtaining method

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108735226A (en) * 2018-07-09 2018-11-02 科沃斯商用机器人有限公司 Voice acquisition method, device and equipment
CN108735226B (en) * 2018-07-09 2024-04-02 科沃斯商用机器人有限公司 Voice acquisition method, device and equipment
CN111028840A (en) * 2019-12-24 2020-04-17 深圳火星探索科技有限公司 Unmanned aerial vehicle voice control system based on three-dimensional microphone array
CN111526295A (en) * 2020-04-30 2020-08-11 北京臻迪科技股份有限公司 Audio and video processing system, acquisition method, device, equipment and storage medium
US11425502B2 (en) 2020-09-18 2022-08-23 Cisco Technology, Inc. Detection of microphone orientation and location for directional audio pickup
US11115625B1 (en) 2020-12-14 2021-09-07 Cisco Technology, Inc. Positional audio metadata generation

Also Published As

Publication number Publication date
CN107124540A (en) 2017-09-01

Similar Documents

Publication Publication Date Title
WO2017143910A1 (en) Acquisition processing method, device and system, and computer storage medium
KR101724514B1 (en) Sound signal processing method and apparatus
CN101201399B (en) Sound localization method and system
US9516412B2 (en) Directivity control apparatus, directivity control method, storage medium and directivity control system
CN106125048B (en) A kind of sound localization method and device
CN109506568B (en) Sound source positioning method and device based on image recognition and voice recognition
JP4296197B2 (en) Arrangement and method for sound source tracking
US9838646B2 (en) Attenuation of loudspeaker in microphone array
CN101567969B (en) Intelligent video director method based on microphone array sound guidance
EP2519831B1 (en) Method and system for determining the direction between a detection point and an acoustic source
US20150022636A1 (en) Method and system for voice capture using face detection in noisy environments
CN110389597B (en) Camera adjusting method, device and system based on sound source positioning
CN105611167B (en) focusing plane adjusting method and electronic equipment
Stade et al. A spatial audio impulse response compilation captured at the WDR broadcast studios
US9591229B2 (en) Image tracking control method, control device, and control equipment
CN109669158B (en) Sound source positioning method, system, computer equipment and storage medium
WO2018049957A1 (en) Audio signal, image processing method, device, and system
Zunino et al. Seeing the sound: A new multimodal imaging device for computer vision
CN109712188A (en) A kind of method for tracking target and device
Crocco et al. Audio tracking in noisy environments by acoustic map and spectral signature
JP2018019294A5 (en)
US10084965B2 (en) Omnidirectional high resolution tracking and recording apparatus and method
Bai et al. Localization and separation of acoustic sources by using a 2.5-dimensional circular microphone array
US20230086490A1 (en) Conferencing systems and methods for room intelligence
JP2011033369A (en) Conference device

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 17755739

Country of ref document: EP

Kind code of ref document: A1

122 EP: PCT application non-entry in European phase

Ref document number: 17755739

Country of ref document: EP

Kind code of ref document: A1