WO2017143910A1 - Acquisition processing method, device and system, and computer storage medium - Google Patents

Acquisition processing method, device and system, and computer storage medium

Info

Publication number
WO2017143910A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound
microphone array
issuer
relative
position information
Prior art date
Application number
PCT/CN2017/073176
Other languages
English (en)
Chinese (zh)
Inventor
胡钦扬
李星
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2017143910A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/18 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
    • G01S5/22 Position of source determined by co-ordinating a plurality of position lines defined by path-difference measurements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/66 Remote control of cameras or camera parts, e.g. by remote control devices

Definitions

  • The present invention relates to the field of communications, and in particular to an acquisition processing method, apparatus, system, and computer storage medium.
  • Embodiments of the present invention provide an acquisition processing method, apparatus, system, and computer storage medium.
  • An aspect of an embodiment of the present invention provides an acquisition processing method, including: determining current location information of a sound issuer through a ring microphone array group; determining, according to the current location information, relative position information of the sound issuer relative to an image capture device, wherein the image capture device is configured to capture an image of the sound issuer; and adjusting, according to the relative position information, the angle at which the image capture device captures the sound issuer.
  • Another aspect of the embodiments of the present invention provides an acquisition processing apparatus, comprising: a first determining module configured to determine current location information of a sound issuer through a ring microphone array group; a second determining module configured to determine, according to the current location information, relative position information of the sound issuer relative to an image capture device, wherein the image capture device is configured to capture an image of the sound issuer; and an adjustment module configured to adjust, according to the relative position information, the angle at which the image capture device captures the sound issuer.
  • Another aspect of the embodiments of the present invention further provides an acquisition processing system, including: an image capture device, a sound issuer, and a ring microphone array group, wherein the ring microphone array group is configured to determine current location information of the sound issuer and to determine, according to the current location information, relative position information of the sound issuer relative to the image capture device, wherein the image capture device is configured to capture an image of the sound issuer; and the image capture device is configured to, after acquiring the relative position information, adjust the angle at which it captures the sound issuer according to the relative position information.
  • An embodiment of the present invention further provides a computer storage medium, wherein the computer storage medium stores computer-executable instructions, and the computer-executable instructions are used to execute the foregoing acquisition processing method.
  • With the technical solution provided by the embodiments of the present invention, the current location information of the sound issuer is determined through the ring microphone array group; the relative position information of the sound issuer relative to the image capture device is determined according to the current location information, where the image capture device is configured to capture an image of the sound issuer; and the angle at which the image capture device captures the sound issuer is adjusted according to the relative position information. This solves the problem in the related art that technical solutions for locating a speaker (i.e., the sound issuer) have low precision, thereby improving the accuracy with which the sound issuer is located.
  • FIG. 1 is a schematic flowchart of an acquisition processing method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a ring microphone array according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a microphone array group according to an embodiment of the present invention.
  • FIG. 4 is another schematic diagram of a microphone array group according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of microphone array positions according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of positioning a sound issuer according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of an acquisition processing apparatus according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of the first determining module 70 of the acquisition processing apparatus according to an embodiment of the present invention.
  • FIG. 9 is another schematic structural diagram of the acquisition processing apparatus according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of an acquisition processing system according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of an acquisition processing method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
  • Step S102: determining current location information of the sound issuer through the ring microphone array group;
  • Step S104: determining relative position information of the sound issuer relative to the image capture device according to the current location information, wherein the image capture device is configured to capture an image of the sound issuer;
  • Step S106: adjusting, according to the relative position information, the angle at which the image capture device captures the sound issuer.
  • Through the above steps, the current location information of the sound issuer is determined through the ring microphone array group; the relative position information of the sound issuer relative to the image capture device is determined according to the current location information, where the image capture device is configured to capture an image of the sound issuer; and the angle at which the image capture device captures the sound issuer is adjusted according to the relative position information. This solves the problem in the related art that technical solutions for locating a speaker (i.e., the sound issuer) have low precision, thereby improving the accuracy with which the sound issuer is located.
  • Before the relative position information of the sound issuer relative to the image capture device is determined in step S104, the relative relationship between the position information determined by the ring microphone array group and the image capture device is known in advance. In step S104, the relative position information of the sound issuer relative to the image capture device is calculated from the current location information acquired by the ring microphone array group combined with this relative relationship.
  • Step S104 may further include: transmitting the current location information determined by the ring microphone array group to another device on the network side, for example a positioning server, which calculates the relative position information of the sound issuer relative to the image capture device according to the current location information and the relative relationship.
  • The relative relationship here may include a first relative position of the ring microphone array relative to the image capture device.
  • The current location information may be a second relative position of the sound issuer relative to a specific position in the ring array.
  • From the first relative position and the second relative position, combined with three-dimensional modeling, the relative position information of the sound issuer relative to the image capture device may be determined. This is one specific way of positioning the sound issuer relative to the image capture device; implementations are not limited to it.
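Combining the first and second relative positions can be sketched as a rigid transform. The following is a minimal illustration only, assuming a 2D ground-plane projection instead of full three-dimensional modeling. The function name, the argument names, and the `array_heading_deg` parameter (the direction of the array's 0° axis as seen from the camera) are hypothetical and do not come from the patent text.

```python
import math

def source_relative_to_camera(source_in_array, array_in_camera, array_heading_deg):
    """Convert a source position from the array frame to the camera frame.

    source_in_array:   (x, y) of the sound issuer in the array's own frame
                       (the "second relative position").
    array_in_camera:   (x, y) of the array's reference point in the camera
                       frame (the "first relative position").
    array_heading_deg: direction of the array's 0-degree axis measured in the
                       camera frame (assumed known from installation).
    Returns (distance, bearing_deg) of the source as seen from the camera.
    """
    theta = math.radians(array_heading_deg)
    sx, sy = source_in_array
    ax, ay = array_in_camera
    # Rotate the array-frame vector into the camera frame, then translate.
    cx = ax + sx * math.cos(theta) - sy * math.sin(theta)
    cy = ay + sx * math.sin(theta) + sy * math.cos(theta)
    return math.hypot(cx, cy), math.degrees(math.atan2(cy, cx))
```

For example, a source 1 m along the array's 0° axis, with the array 1 m from the camera and its axis turned by 90° in the camera frame, ends up 2 m from the camera at a bearing of 90°.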
  • The plurality of microphones in the ring microphone array can collect the sound produced by the sound issuer from different angles, or collect it at different positions. Based on the sound signals collected from multiple angles or positions, the current location information can be determined precisely, which improves the accuracy of the position information and thereby allows precise adjustment of the capture angle.
  • The number of ring microphone arrays included in the ring microphone array group is preferably two, three, or four.
  • The following embodiments of the present invention describe only the cases of two and three arrays. The cases of four or more arrays use a similar solution and are not described here.
  • Determining the current location information of the sound issuer through the ring microphone array group may be implemented as follows: determining a first relative angle of the sound issuer by using the first ring microphone array in the ring microphone array group; determining a second relative angle of the sound issuer by using the second ring microphone array in the ring microphone array group; and determining the current location information according to the first relative angle and the second relative angle.
  • The method further includes: determining the first relative angle according to the plurality of microphones of the first ring microphone array; and determining the second relative angle according to the plurality of microphones of the second ring microphone array.
  • The method further includes: acquiring first position information of the image capture device relative to each ring microphone array in the ring microphone array group, and second position information between every two ring microphone arrays in the ring microphone array group.
  • The method further includes: determining the relative position information according to the current location information, the first position information, and the second position information.
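One way to turn two relative angles into a position is to intersect the two bearing lines. The sketch below assumes a 2D ground-plane model in which both arrays report bearings in one shared frame, measured counterclockwise from a common 0° axis. The function name and parameters are hypothetical, and the patent does not prescribe this particular formulation.

```python
import math

def intersect_bearings(p1, bearing1_deg, p2, bearing2_deg):
    """Intersect two bearing rays and return the (x, y) of the sound source.

    p1, p2:        (x, y) positions of the two ring microphone arrays.
    bearing*_deg:  direction of the source seen from each array, in a shared
                   frame (assumption: both 0-degree axes are aligned).
    Raises ValueError when the bearings are parallel or the rays diverge.
    """
    u1 = (math.cos(math.radians(bearing1_deg)), math.sin(math.radians(bearing1_deg)))
    u2 = (math.cos(math.radians(bearing2_deg)), math.sin(math.radians(bearing2_deg)))
    cross = u1[0] * u2[1] - u1[1] * u2[0]
    if abs(cross) < 1e-9:
        raise ValueError("bearings are parallel; the source cannot be triangulated")
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    # Solve p1 + t1*u1 == p2 + t2*u2 for the ray parameters t1 and t2.
    t1 = (dx * u2[1] - dy * u2[0]) / cross
    t2 = (dx * u1[1] - dy * u1[0]) / cross
    if t1 < 0 or t2 < 0:
        raise ValueError("bearing rays do not meet in front of both arrays")
    return (p1[0] + t1 * u1[0], p1[1] + t1 * u1[1])
```

With arrays at (0, 0) and (2, 0) reporting 45° and 135°, the rays meet at (1, 1). A nearly parallel pair of bearings makes the estimate very sensitive to angle error, which is one reason the spacing between arrays matters.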
  • FIG. 2 is a schematic diagram of a ring microphone array according to an embodiment of the present invention. As shown in FIG. 2, the four microphones are distributed in a ring and face outward, at 0°, 90°, 180°, and 270°, respectively.
  • A ring microphone array with a different number of microphones or different spacing may be used as an alternative in the embodiments of the present invention. The number of arrays is not limited to two; ring microphone array groups with more than two arrays are also within the protection scope of the patent. See FIG. 3 and FIG. 4.
  • FIG. 5 is a schematic diagram of the positions of the microphone arrays according to an embodiment of the present invention. Based on FIG. 5:
  • Step 1: Referring to FIG. 5, the two ring microphone arrays are fixed in place. The preferred placement aligns the 0° directions used by the positioning algorithms of the two microphone arrays on the same line. The distance between the two microphone arrays is adjusted according to the actual situation (conference room and conference table size);
  • Step 2: According to the actual situation, the microphone array group and the camera (corresponding to the image capture device of the above embodiment) are arranged in the use environment. The practical effect is best when the camera and the microphone array group are in the same plane.
  • FIG. 6 is a schematic diagram of positioning of a sound issuer according to an embodiment of the present invention. Based on the schematic diagram of FIG. 6, the specific operation flow includes the following processing steps:
  • Step A: The camera, the microphone arrays, and the sound source shown in FIG. 6 are the projections of the corresponding objects on the ground.
  • This step performs the necessary data measurements, and part of the data is input into the sound source localization algorithm. The data to be measured includes:
  • the angle ⁇ between the 0° line of the camera projection and the projection of microphone array 1;
  • Step B: According to the positioning result ⁇ of microphone array 1, the positioning result ⁇ of microphone array 2, and the data measured in Step A, the position of the sound source (the distance L5 from the camera to the sound source and the angle ⁇ between the camera's 0° line and the sound source) can be calculated using the law of sines and the law of cosines;
  • Microphone array sound source localization method:
  • MVDR (Minimum Variance Distortionless Response), which can achieve higher azimuth resolution and better noise and interference suppression;
  • beamforming is performed on the sub-array to find the beam direction of the sound source.
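The text names MVDR but gives no implementation details. The following is a hedged, narrowband, far-field sketch for a single four-microphone ring array, written in plain Python so that it is self-contained. The array radius, signal frequency, noise level, speed of sound, and all function names are assumptions made for illustration, and the covariance matrix is synthetic. It scans candidate directions and picks the one that maximizes the MVDR power 1 / (aᴴ R⁻¹ a); it does not model the sub-array processing mentioned above.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def steering_vector(theta_deg, mic_angles_deg, radius, freq):
    """Far-field phase at each microphone of a ring array for direction theta."""
    k = 2 * math.pi * freq / SPEED_OF_SOUND
    theta = math.radians(theta_deg)
    return [cmath.exp(1j * k * radius * math.cos(theta - math.radians(phi)))
            for phi in mic_angles_deg]

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        acc = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = acc / aug[r][r]
    return x

def mvdr_spectrum(cov, mic_angles_deg, radius, freq, scan_deg):
    """MVDR power 1 / (a^H R^-1 a) for each candidate direction."""
    powers = []
    for theta in scan_deg:
        a = steering_vector(theta, mic_angles_deg, radius, freq)
        r_inv_a = solve(cov, a)  # R^-1 a without forming the inverse
        denom = sum(ai.conjugate() * yi for ai, yi in zip(a, r_inv_a)).real
        powers.append(1.0 / denom)
    return powers

def estimate_direction(cov, mic_angles_deg, radius, freq, step_deg=1):
    """Return the scanned direction (degrees) with the highest MVDR power."""
    scan = list(range(0, 360, step_deg))
    powers = mvdr_spectrum(cov, mic_angles_deg, radius, freq, scan)
    return scan[max(range(len(scan)), key=powers.__getitem__)]

# Synthetic example: one source at 60 degrees plus weak uncorrelated noise.
MICS = [0, 90, 180, 270]  # microphone angles on the ring
a0 = steering_vector(60, MICS, radius=0.05, freq=1000.0)
cov = [[a0[i] * a0[j].conjugate() + (0.01 if i == j else 0.0)
        for j in range(4)] for i in range(4)]
estimated = estimate_direction(cov, MICS, radius=0.05, freq=1000.0)
```

In practice the covariance matrix would be estimated from recorded microphone frames per frequency bin, and a wideband system would combine the results across bins; both steps are outside this sketch.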
  • In the triangle formed with the projection of microphone array 1 and the projection of microphone array 2, L4 can be calculated by the law of sines;
  • Step C: Rotate the camera to the designated position according to the final result of Step B, namely the relative angle ⁇ between the camera and the sound source.
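The calculation in Step B can be sketched with the law of sines and the law of cosines. Because the symbols in the extracted text are lost, the layout below is an assumption: array 1 is at the origin, array 2 lies on array 1's 0° axis at distance `baseline`, both arrays measure bearings counterclockwise from that axis, and the source lies on one side of the baseline (0° < alpha < beta < 180°). The parameter names `cam_dist` and `cam_dir_deg` (distance and direction of the camera as seen from array 1) are hypothetical; only L5, the camera-to-source distance, is named in the text.

```python
import math

def locate_with_triangle_laws(alpha_deg, beta_deg, baseline, cam_dist, cam_dir_deg):
    """Return (r1, l5, angle_at_camera_deg) for the assumed layout.

    alpha_deg, beta_deg: bearings of the source measured by arrays 1 and 2.
    baseline:            distance between the two arrays.
    cam_dist:            distance from array 1 to the camera (must be > 0).
    cam_dir_deg:         direction of the camera as seen from array 1.
    r1 is the array-1-to-source distance, l5 the camera-to-source distance,
    and angle_at_camera_deg the unsigned angle at the camera between the
    directions to array 1 and to the source.
    """
    alpha, beta = math.radians(alpha_deg), math.radians(beta_deg)
    apex = beta - alpha  # interior angle of the triangle at the source
    if abs(math.sin(apex)) < 1e-9:
        raise ValueError("bearings are parallel; the triangle is degenerate")
    # Law of sines in triangle (array 1, array 2, source):
    # r1 / sin(pi - beta) = baseline / sin(beta - alpha)
    r1 = baseline * math.sin(beta) / math.sin(apex)
    # Law of cosines in triangle (array 1, camera, source).
    included = alpha - math.radians(cam_dir_deg)
    l5 = math.sqrt(cam_dist ** 2 + r1 ** 2 - 2 * cam_dist * r1 * math.cos(included))
    if l5 < 1e-9:
        raise ValueError("source coincides with the camera")
    # Law of cosines again, solved for the interior angle at the camera.
    cos_c = (cam_dist ** 2 + l5 ** 2 - r1 ** 2) / (2 * cam_dist * l5)
    angle_at_camera = math.degrees(math.acos(max(-1.0, min(1.0, cos_c))))
    return r1, l5, angle_at_camera
```

The returned camera angle is unsigned. Turning it into a pan command still requires the measured angle between the camera's 0° line and array 1, and a sign convention that depends on which side of the camera the source falls on.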
  • FIG. 7 is a structural block diagram of an acquisition processing apparatus according to an embodiment of the present invention. As shown in Figure 7, the device includes:
  • The first determining module 70 is configured to determine current location information of the sound issuer through the ring microphone array group;
  • the second determining module 72 is configured to determine relative position information of the sound issuer relative to the image capture device according to the current position information, wherein the image capture device is configured to collect an image of the sound issuer;
  • The adjustment module 74 is configured to adjust, according to the relative position information, the angle at which the image capture device captures the sound issuer.
  • Through the combined action of the above modules, the current location information of the sound issuer is determined through the ring microphone array group; the relative position information of the sound issuer relative to the image capture device is determined according to the current location information, wherein the image capture device is configured to capture an image of the sound issuer; and the angle at which the image capture device captures the sound issuer is adjusted according to the relative position information. This solves the problem in the related art that technical solutions for locating a speaker (i.e., the sound issuer) have low precision, thereby improving the accuracy with which the sound issuer is located.
  • The number of ring microphone arrays included in the ring microphone array group used by the first determining module 70 may be 2, 3, or 4.
  • FIG. 8 is a structural block diagram of the first determining module 70 of the acquisition processing apparatus according to an embodiment of the present invention. As shown in FIG. 8, when the ring microphone array group includes two ring microphone arrays, the first determining module 70 includes:
  • the first determining unit 700 is configured to determine, by using the first ring microphone array in the ring microphone array group, a first relative angle of the sound issuer;
  • the second determining unit 702 is configured to determine a second relative angle of the sound issuer by using the second ring microphone array in the ring microphone array group;
  • the third determining unit 704 is configured to determine the current location information according to the first relative angle and the second relative angle.
  • FIG. 9 is another structural block diagram of an acquisition processing apparatus according to an embodiment of the present invention. As shown in FIG. 9, the apparatus further includes:
  • The obtaining module 76 is configured to acquire first position information of the image capture device relative to each ring microphone array in the ring microphone array group, and second position information between every two ring microphone arrays in the ring microphone array group.
  • FIG. 10 is a structural block diagram of an acquisition processing system according to an embodiment of the present invention.
  • As shown in FIG. 10, the system includes: an image capture device 100, a sound issuer 102, and a ring microphone array group 104.
  • The ring microphone array group 104 is configured to determine current location information of the sound issuer and to determine, according to the current location information, relative position information of the sound issuer relative to the image capture device 100, wherein the image capture device 100 is configured to capture an image of the sound issuer. The image capture device 100 is further configured to, after acquiring the relative position information, adjust the angle at which it captures the sound issuer according to the relative position information.
  • The speaker-tracking system of the preferred embodiment of the present invention includes the following components: a ring microphone array group and a camera.
  • The sound source positioning designed in the preferred embodiment of the present invention uses a plurality of ring microphone arrays. Two arrays are arranged so that the absolute 0° directions of their positioning algorithms lie on one line, and the ring microphone array group and the camera can be placed flexibly:
  • Step 1: Install the hardware devices, including placing the microphone arrays and fixing the camera;
  • Step 2: Measure the necessary data and input it to the calculation module. The data to be measured includes, but is not limited to, the distance between the microphone arrays, the distance from the camera to each microphone array, and the rotation angle between the camera and each microphone array;
  • Step 3: Calculate the precise position of the sound source by using a positioning algorithm according to the data measured in Step 2;
  • Step 4: Rotate the camera to the sound source orientation according to the positioning result.
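The rotation in Step 4 can be sketched as computing the shortest signed turn from the camera's current pan angle to the target bearing. The helper names and the rate limit below are hypothetical, and the sketch assumes a pan axis that can rotate continuously through 360°; the patent does not describe a camera control protocol.

```python
def pan_delta(current_deg, target_deg):
    """Smallest signed rotation from current to target, in [-180, 180)."""
    return (target_deg - current_deg + 180.0) % 360.0 - 180.0

def step_towards(current_deg, target_deg, max_step_deg):
    """Move the pan angle towards the target by at most max_step_deg."""
    delta = pan_delta(current_deg, target_deg)
    step = max(-max_step_deg, min(max_step_deg, delta))
    return (current_deg + step) % 360.0
```

For instance, going from 350° to 10° is a +20° turn through 0° rather than a 340° turn the other way. Limiting the step per update keeps the camera from jumping abruptly when the localization result changes.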
  • The technical solution of the preferred embodiment of the present invention improves the accuracy of sound source localization.
  • The microphone array of the present invention can be used not only for sound source positioning but also as a sound collecting device. This reduces hardware cost and also greatly reduces the difficulty of implementing the solution, which can be widely used in video conferencing, security, and various real-time monitoring applications.
  • An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions for performing any one or more of the acquisition processing methods described above, for example, the method shown in FIG. 1.
  • The computer storage medium includes, but is not limited to, an optical disk, a floppy disk, a hard disk, a rewritable memory, etc., and is optionally a non-transitory storage medium.
  • In the embodiments of the present invention, the ring microphone array group is used to collect the sound of the sound issuer, and the plurality of microphones in the ring microphone array group collect the sound from different angles, so that the current location information of the sound issuer can be determined accurately. The current location information is then used to determine the relative position information of the sound issuer relative to the image capture device, wherein the image capture device is configured to capture an image of the sound issuer. Finally, the angle at which the image capture device captures the sound issuer is adjusted according to the relative position information. The process is simple and highly reproducible, and the position information obtained is highly accurate, which allows more accurate adjustment of the capture angle of the image capture device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)
  • Obtaining Desirable Characteristics In Audible-Bandwidth Transducers (AREA)

Abstract

The present invention relates to an acquisition processing method, device and system, the method comprising the following steps: determining current position information of a speaker by means of a group of ring microphone arrays; determining relative position information of the speaker with respect to an image acquisition device according to the current position information, the image acquisition device being used to acquire an image of the speaker; and adjusting the acquisition angle of the image acquisition device for the speaker according to the relative position information. Embodiments of the present invention also relate to a computer storage medium.
PCT/CN2017/073176 2016-02-25 2017-02-09 Acquisition processing method, device and system, and computer storage medium WO2017143910A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610103955.9A CN107124540A (zh) 2016-02-25 2016-02-25 采集处理方法、装置及系统
CN201610103955.9 2016-02-25

Publications (1)

Publication Number Publication Date
WO2017143910A1 (fr) 2017-08-31

Family

ID=59684793

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/073176 WO2017143910A1 (fr) 2016-02-25 2017-02-09 Acquisition processing method, device and system, and computer storage medium

Country Status (2)

Country Link
CN (1) CN107124540A (fr)
WO (1) WO2017143910A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108735226A (zh) * 2018-07-09 2018-11-02 科沃斯商用机器人有限公司 语音采集方法、装置及设备
CN111028840A (zh) * 2019-12-24 2020-04-17 深圳火星探索科技有限公司 基于三维麦克风阵列的无人机语音控制系统
CN111526295A (zh) * 2020-04-30 2020-08-11 北京臻迪科技股份有限公司 音视频处理系统、采集方法、装置、设备及存储介质
US11115625B1 (en) 2020-12-14 2021-09-07 Cisco Technology, Inc. Positional audio metadata generation
US11425502B2 (en) 2020-09-18 2022-08-23 Cisco Technology, Inc. Detection of microphone orientation and location for directional audio pickup

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110300279B (zh) * 2019-06-26 2021-11-02 视联动力信息技术股份有限公司 一种会议发言人的追踪方法及装置
CN111935411A (zh) * 2020-09-25 2020-11-13 杭州涂鸦信息技术有限公司 一种基于声音定位的监控系统及监控方法
CN117665705A (zh) * 2022-08-26 2024-03-08 华为技术有限公司 发出、接收声音信号以及检测设备间相对位置的方法

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6826284B1 (en) * 2000-02-04 2004-11-30 Agere Systems Inc. Method and apparatus for passive acoustic source localization for video camera steering applications
CN101567969A (zh) * 2009-05-21 2009-10-28 上海交通大学 基于麦克风阵列声音制导的智能视频导播方法
CN102833476A (zh) * 2012-08-17 2012-12-19 歌尔声学股份有限公司 终端设备用摄像头和终端设备用摄像头的实现方法
CN202798928U (zh) * 2012-08-17 2013-03-13 歌尔声学股份有限公司 终端设备用摄像头
CN103685906A (zh) * 2012-09-20 2014-03-26 中兴通讯股份有限公司 一种控制方法、控制装置及控制设备
CN103841357A (zh) * 2012-11-21 2014-06-04 中兴通讯股份有限公司 基于视频跟踪的麦克风阵列声源定位方法、装置及系统

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100442837C (zh) * 2006-07-25 2008-12-10 华为技术有限公司 一种具有声音位置信息的视频通讯系统及其获取方法


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108735226A (zh) * 2018-07-09 2018-11-02 科沃斯商用机器人有限公司 语音采集方法、装置及设备
CN108735226B (zh) * 2018-07-09 2024-04-02 科沃斯商用机器人有限公司 语音采集方法、装置及设备
CN111028840A (zh) * 2019-12-24 2020-04-17 深圳火星探索科技有限公司 基于三维麦克风阵列的无人机语音控制系统
CN111526295A (zh) * 2020-04-30 2020-08-11 北京臻迪科技股份有限公司 音视频处理系统、采集方法、装置、设备及存储介质
US11425502B2 (en) 2020-09-18 2022-08-23 Cisco Technology, Inc. Detection of microphone orientation and location for directional audio pickup
US11115625B1 (en) 2020-12-14 2021-09-07 Cisco Technology, Inc. Positional audio metadata generation

Also Published As

Publication number Publication date
CN107124540A (zh) 2017-09-01

Similar Documents

Publication Publication Date Title
WO2017143910A1 (fr) Acquisition processing method, device and system, and computer storage medium
KR101724514B1 (ko) 사운드 신호 처리 방법 및 장치
CN101201399B (zh) 一种声源定位方法及系统
US9516412B2 (en) Directivity control apparatus, directivity control method, storage medium and directivity control system
CN109506568B (zh) 一种基于图像识别和语音识别的声源定位方法及装置
CN106125048B (zh) 一种声源定位方法及装置
JP4296197B2 (ja) 音源追跡のための配置及び方法
EP2519831B1 (fr) Procédé et système de détermination de la direction entre un point de détection et une source acoustique
US9838646B2 (en) Attenuation of loudspeaker in microphone array
CN101567969B (zh) 基于麦克风阵列声音制导的智能视频导播方法
US20150022636A1 (en) Method and system for voice capture using face detection in noisy environments
CN110389597B (zh) 基于声源定位的摄像头调整方法、装置和系统
CN105611167B (zh) 一种对焦平面调整方法及电子设备
Stade et al. A spatial audio impulse response compilation captured at the WDR broadcast studios
US9591229B2 (en) Image tracking control method, control device, and control equipment
CN109669158B (zh) 一种声源定位方法、系统、计算机设备及存储介质
Ziegelwanger et al. Modeling the direction-continuous time-of-arrival in head-related transfer functions
CN104898086A (zh) 适用于微型麦克风阵列的声强估计声源定向方法
JP2018019294A5 (fr)
Crocco et al. Audio tracking in noisy environments by acoustic map and spectral signature
US20230086490A1 (en) Conferencing systems and methods for room intelligence
Astapov et al. Simplified acoustic localization by linear arrays for wireless sensor networks
JP2011033369A (ja) 会議装置
WO2016197444A1 (fr) Procédé et terminal pour réaliser une prise de vue
WO2023056905A1 (fr) Procédé et appareil de localisation de source sonore et dispositif

Legal Events

Date Code Title Description
NENP Non-entry into the national phase; Ref country code: DE
121 Ep: the epo has been informed by wipo that ep was designated in this application; Ref document number: 17755739; Country of ref document: EP; Kind code of ref document: A1
122 Ep: pct application non-entry in european phase; Ref document number: 17755739; Country of ref document: EP; Kind code of ref document: A1