WO2023093078A1 - Directing control method and apparatus, storage medium, and computer program product - Google Patents

Directing control method and apparatus, storage medium, and computer program product

Info

Publication number
WO2023093078A1
Authority
WO
WIPO (PCT)
Prior art keywords
microphone array
camera
microphone
sounder
sound
Prior art date
Application number
PCT/CN2022/105499
Other languages
English (en)
French (fr)
Inventor
张磊
刘智辉
Original Assignee
Huawei Technologies Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2023093078A1



Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 - Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 - Control of cameras or camera modules
    • H04N23/695 - Control of camera direction for changing a field of view, e.g. pan, tilt or based on tracking of objects
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 - Details of television systems
    • H04N5/222 - Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 - Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 - Details of television systems
    • H04N5/222 - Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 - Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/268 - Signal distribution or switching
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 - Television systems
    • H04N7/14 - Systems for two-way working
    • H04N7/15 - Conference systems
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04R - LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 - Details of transducers, loudspeakers or microphones
    • H04R1/20 - Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04R - LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 - Details of transducers, loudspeakers or microphones
    • H04R1/20 - Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H04R1/406 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers, microphones

Definitions

  • The present application relates to the field of communications technology, and in particular to a method, apparatus, storage medium and computer program product for directing control.
  • Directing refers to controlling a camera, during video shooting, to shoot key objects (people or things) in the scene according to real-time shooting requirements, so as to output video images. For example, in a video conference, the camera can be controlled to shoot the current speaker, and when the speaker changes, the camera can be controlled to shoot the new speaker.
  • In order to obtain video images containing key objects, the shooting direction of the camera can be adjusted, a video image can be selected from among multiple cameras, or a partial region can be cropped from a video image.
  • In automatic directing, the control device recognizes the video images captured by the camera in real time, determines the object with specified characteristics in the image (i.e., the above-mentioned key object), and controls the camera to shoot that object. For example, in a conference scene, the control device can identify a person who is standing or whose mouth is moving (speaking) in the video image captured in real time, determine that person to be the speaker, and then control the camera to take a close-up of the speaker for broadcast.
  • Embodiments of the present application provide a directing control method, which can solve the problem of poor directing accuracy in the prior art. The technical solution is as follows:
  • In a first aspect, a directing control method is provided. The method is applied to a directing control system that includes a first microphone array, a second microphone array, a camera and a control device. The method includes: the control device determines the position of the first microphone array and the position of the camera; when a sound source object makes a sound, the control device determines the position of the sound source object according to the position of the sound source object relative to the first microphone array, the position of the sound source object relative to the second microphone array, the position of the first microphone array and the position of the second microphone array; and the control device determines a directing operation on the camera based on the position of the sound source object and the position of the camera.
  • each microphone in the first microphone array can detect corresponding audio data, and the first microphone array sends the audio data to the control device.
  • The control device can perform sound source localization according to the audio data and determine the azimuth angle θ1 of the speaker relative to the first microphone array.
  • The algorithm used in the sound source localization process can be a steered-response power (SRP) algorithm, etc.
  • The control device may also perform sound source localization according to the audio data detected by the microphones of the second microphone array, and determine the azimuth angle θ2 of the speaker relative to the second microphone array.
  • The control device can calculate the position of the speaker according to the azimuth angle θ1, the azimuth angle θ2, the position of the first microphone array, the position of the second microphone array, and the geometric relationship among the first microphone array, the second microphone array and the speaker.
  • In other possible cases, the control device may calculate the position of the speaker based on the deflection angle α1 of the first microphone array, the deflection angle α2 of the second microphone array, the azimuth angle θ1, the azimuth angle θ2, the positions of the first and second microphone arrays, and the geometric relationship among the first microphone array, the second microphone array and the speaker, as sketched below.
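  • An illustrative, non-normative sketch of this triangulation (the function name and the sign conventions, counterclockwise angles with the deflection measured from an array's reference direction to the +X axis, are assumptions consistent with the definitions given later in this document):

```python
import math

def locate_speaker(p1, alpha1, theta1, p2, alpha2, theta2):
    """Triangulate the speaker from two bearing measurements.

    p1, p2         -- (x, y) positions of the first / second microphone array
    alpha1, alpha2 -- deflection angles (rad, CCW from each array's reference
                      direction to the +X axis)
    theta1, theta2 -- azimuths of the speaker relative to each array (rad, CCW
                      from the array's reference direction)
    """
    # Absolute bearing of each array->speaker ray, CCW from the +X axis.
    phi1, phi2 = theta1 - alpha1, theta2 - alpha2
    u1 = (math.cos(phi1), math.sin(phi1))
    u2 = (math.cos(phi2), math.sin(phi2))
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    # Solve p1 + t1*u1 = p2 + t2*u2 for t1 (a 2x2 linear system).
    det = u2[0] * u1[1] - u1[0] * u2[1]
    if abs(det) < 1e-9:
        raise ValueError("bearing rays are (nearly) parallel")
    t1 = (u2[0] * dy - u2[1] * dx) / det
    return (p1[0] + t1 * u1[0], p1[1] + t1 * u1[1])
```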
  • After the control device determines the position of the speaker, it can calculate the azimuth angle of the speaker relative to the camera and the distance between the speaker and the camera based on the position of the speaker and the position of the camera.
  • The distance is a plane equivalent distance, that is, the distance between the projections of the camera's equivalent center and the speaker's equivalent center onto the horizontal plane.
  • The directing rotation angle of the camera can be determined based on the speaker's azimuth angle relative to the camera.
  • The camera can include a rotatable camera head and a fixed base; the camera head can rotate relative to the fixed base.
  • An initial shooting direction can be specified for the camera.
  • the initial shooting direction and the reference direction of the camera can be the same.
  • The directing rotation angle can be the angle of the camera's real-time shooting direction relative to its initial shooting direction; the initial shooting direction can be regarded as the 0-degree direction, and the directing rotation angle can be the same as the azimuth angle of the speaker relative to the camera.
  • The directing focal length of the camera can be determined based on the distance.
  • The control device may query a pre-stored first correspondence table to determine the directing focal length corresponding to the distance.
  • the first correspondence table may record the correspondence between the distance of the speaker relative to the camera and the focal length of the camera.
  • In this way, the control device can determine the directing rotation angle and directing focal length of the camera according to the position of the speaker and the position of the camera, control the camera to rotate to the directing rotation angle, and control the camera to shoot at the directing focal length, as sketched below.
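  • A rough Python sketch of these two directing steps; the atan2-based pan angle assumes the camera's initial shooting direction lies along the +X axis, and `focal_table` stands in for the pre-stored first correspondence table (all names are illustrative):

```python
import bisect
import math

def directing_params(speaker, camera, focal_table):
    """Directing rotation angle (degrees) and directing focal length.

    speaker, camera -- (x, y) plane coordinates
    focal_table     -- (distance, focal_length) pairs sorted by distance
    """
    dx, dy = speaker[0] - camera[0], speaker[1] - camera[1]
    pan_angle = math.degrees(math.atan2(dy, dx)) % 360.0
    distance = math.hypot(dx, dy)
    # First table entry whose distance is >= the measured distance,
    # clamped to the last entry.
    i = bisect.bisect_left([d for d, _ in focal_table], distance)
    i = min(i, len(focal_table) - 1)
    return pan_angle, focal_table[i][1]

# Example: pan, focal = directing_params((3.0, 2.0), (0.0, 0.0),
#                                        [(1.0, 4.0), (3.0, 8.0), (6.0, 12.0)])
```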
  • Alternatively, the control device can determine the directing rotation angle and directing focal length of the camera according to the deflection angle of the camera, the position of the speaker, and the position of the camera, so as to control the camera's pan-tilt to rotate to the directing rotation angle and control the camera to shoot at the directing focal length.
  • multiple cameras can be added and arranged in different positions to better capture the participants.
  • Based on the position of the speaker and the positions of the two cameras, the control device can determine which of the two cameras is farther away from the speaker as the target camera, and then determine the directing operation on the target camera based on the position of the speaker and the position of the target camera.
  • The control device can also control multiple cameras to shoot the sound source object to obtain multiple video images, perform image recognition on the obtained video images, and select a video image satisfying a target condition as the directing video image.
  • the face angle in the video image can be determined using a machine learning model for face angle detection.
  • As long as the sound source object is making a sound, it can be located based on the sound. This avoids the requirement, present when the sound source object is located by image recognition, that the speaker make obvious movements (such as obvious mouth movements), thereby avoiding the limitations of prior-art automatic directing based on image recognition and improving directing accuracy.
  • The first microphone array may be integrated with a first sounder, and the second microphone array includes a first microphone and a second microphone. The control device determines the distance D1 between the first sounder and the first microphone and the distance D2 between the first sounder and the second microphone based on the times at which the first microphone and the second microphone receive the sound signal emitted by the first sounder and the time at which the first sounder emits the sound signal. The control device then determines the position of the first microphone array relative to the second microphone array based on the position of the first microphone, the position of the second microphone, the distance D1 and the distance D2.
  • the equivalent centers of the first sound generator and the first microphone array may be the same, that is, the positions of the first sound generator and the first microphone array may be the same.
  • the position of the first microphone array relative to the second microphone array may be the position of the first sound emitter in the first microphone array relative to the second microphone array.
  • A coordinate system can be used to determine the position. For example, when the origin of the coordinate system is set at the center of the second microphone array, the coordinates of the first microphone array reflect the position of the first microphone array relative to the second microphone array.
  • In Mode 1, the first sounder can be set to emit a sound signal every time it is powered on, and the control device can take the time when the first sounder is powered on as the time when the first sounder emits the sound signal.
  • In Mode 2, the control device instructs the first sounder to emit a sound signal; when the first sounder emits the sound signal, it records the time at which the signal is emitted and then sends that time to the control device.
  • When the control device controls the first sounder to emit the sound signal S1, the first sounder sends the time point t1 at which the sound signal S1 is emitted to the control device for recording.
  • Each microphone in the second microphone array can receive the sound signal, record the time point when the sound signal is detected, and send it to the control device.
  • The control device can obtain the time point t2 at which the first microphone in the second microphone array detects the sound signal S1 and the time point t3 at which the second microphone in the second microphone array detects the sound signal S1, and can then calculate the duration ΔT1 between t1 and t2 and the duration ΔT2 between t1 and t3.
  • Based on the pre-stored speed of sound V, the control device can calculate the distance D1 = V·ΔT1 between the first microphone and the first sounder and the distance D2 = V·ΔT2 between the second microphone and the first sounder.
  • The control device can then calculate the position of the first sounder according to the distance D between the first microphone and the second microphone, the distance D1, the distance D2, and the geometric relationship among the first microphone, the second microphone and the first sounder.
  • In this way, the distance D1 between the first sounder and the first microphone and the distance D2 between the first sounder and the second microphone are determined, and the position of the first microphone array relative to the second microphone array is then determined based on the position of the first microphone, the position of the second microphone, the distance D1 and the distance D2, as sketched below. Thus, the device parameters do not need to be calibrated manually, which improves the convenience of calibrating device parameters.
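  • A sketch of this time-of-flight calibration, assuming synchronized clocks and a fixed speed of sound; the two-circle intersection has a mirror ambiguity, assumed here to be resolved by the installation layout:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room temperature

def locate_sounder(mic1, mic2, t_emit, t_arr1, t_arr2):
    """Locate a sounder from its time of flight to two known microphones."""
    d1 = SPEED_OF_SOUND * (t_arr1 - t_emit)   # distance sounder -> mic 1
    d2 = SPEED_OF_SOUND * (t_arr2 - t_emit)   # distance sounder -> mic 2
    d = math.hypot(mic2[0] - mic1[0], mic2[1] - mic1[1])  # mic baseline D
    # Intersect circles |p - mic1| = d1 and |p - mic2| = d2 in a frame
    # whose x axis runs from mic1 to mic2.
    a = (d1 ** 2 - d2 ** 2 + d ** 2) / (2 * d)
    h = math.sqrt(max(d1 ** 2 - a ** 2, 0.0))
    ex = ((mic2[0] - mic1[0]) / d, (mic2[1] - mic1[1]) / d)
    ey = (-ex[1], ex[0])   # pick one of the two symmetric solutions
    return (mic1[0] + a * ex[0] + h * ey[0],
            mic1[1] + a * ex[1] + h * ey[1])
```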
  • The directing control system may further include a second sounder and a third sounder, which are integrated on the same electronic screen as the second microphone array. The control device obtains, from the first microphone array, the azimuth angle θ3 of the second sounder relative to the first microphone array and the azimuth angle θ4 of the third sounder relative to the first microphone array, and then determines the orientation of the first microphone array based on the azimuth angle θ3, the azimuth angle θ4, the position of the second sounder and the position of the third sounder.
  • the position of the second sounder and the position of the third sounder can be pre-set, and the position of the second sounder and the position of the third sounder can be pre-stored in the control device, without the need to read from the microphone array.
  • the orientation of the device refers to the direction that the reference direction of the device is facing. It can be expressed by the angle between the reference direction of the device and the specified direction (that is, the deflection angle of the device).
  • the specified direction can be the X-axis or Y-axis direction.
  • each microphone in the first microphone array can detect corresponding audio data, and the first microphone array sends the audio data to the control device.
  • The control device can perform sound source localization according to the audio data and determine the azimuth angle θ3 of the second sounder relative to the first microphone array.
  • The control device may also perform sound source localization according to the audio data detected by the microphones of the first microphone array, and determine the azimuth angle θ4 of the third sounder relative to the first microphone array.
  • Xm(k) represents the fast Fourier transform (FFT) value of the m-th microphone's signal in frequency band k;
  • s(θ) represents the steering vector corresponding to a sound source at angle θ in the two-dimensional spatial plane;
  • The steering vector can be calculated in advance according to the layout of the microphones inside the microphone array and the angle search range (set manually; it is the angle range over which the maximum is subsequently searched). Taking a linear layout of the microphones as an example, the steering vector takes the standard delay-and-sum form sm(θ) = exp(-j·2π·fk·dm·cosθ/c), where fk is the center frequency of band k, c is the speed of sound, and dm·cosθ represents the path difference between the sound source's arrival at the m-th microphone and its arrival at the reference microphone.
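  • A minimal SRP sketch in Python for a linear array, built from the per-band FFT values Xm(k) and the delay-and-sum steering vector above (framing and names are illustrative assumptions, not the patent's implementation):

```python
import numpy as np

def srp_localize(frames, mic_offsets, fs, c=343.0, n_angles=181):
    """Return the angle (degrees) maximizing the steered-response power.

    frames      -- array (M, N): one time-domain frame per microphone
    mic_offsets -- array (M,): offsets d_m along the array axis relative to
                   the reference microphone (meters)
    fs          -- sampling rate (Hz)
    """
    X = np.fft.rfft(frames, axis=1)             # X_m(k)
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / fs)
    angles = np.linspace(0.0, np.pi, n_angles)  # angle search range (0..180 deg)
    powers = np.empty(n_angles)
    for i, theta in enumerate(angles):
        tau = mic_offsets[:, None] * np.cos(theta) / c  # delays for d_m*cos(theta)
        s = np.exp(-2j * np.pi * freqs[None, :] * tau)  # steering vector per band
        # Align each channel with conj(s), sum over microphones, take the energy.
        powers[i] = np.sum(np.abs(np.sum(X * np.conj(s), axis=0)) ** 2)
    return np.degrees(angles[np.argmax(powers)])
```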
  • The control device may determine the distance L between the second sounder and the third sounder according to their position coordinates. The control device can then calculate the deflection angle α5 of the first microphone array based on the azimuth angle θ3, the azimuth angle θ4, the position of the second sounder, the position of the third sounder, and the positional relationship among the first microphone array, the second sounder and the third sounder.
  • In this way, the azimuth angle θ3 of the second sounder relative to the first microphone array and the azimuth angle θ4 of the third sounder relative to the first microphone array are obtained from the first microphone array, and the orientation of the first microphone array is then determined based on the azimuth angle θ3, the azimuth angle θ4, the position of the second sounder and the position of the third sounder, as in the sketch below. Thus, the device parameters do not need to be calibrated manually, which improves the convenience of calibrating device parameters.
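  • A sketch of the orientation step under the conventions assumed above, in the simplified case where the position of the first microphone array is already known from the preceding calibration (with one sounder, θ - α5 equals the absolute bearing from the array to that sounder; the two-sounder formulation in the text serves the same purpose):

```python
import math

def array_deflection(array_pos, sounder_pos, theta):
    """Deflection angle (radians) of a microphone array with known position,
    from the azimuth theta it measures toward a sounder at a known position."""
    bearing = math.atan2(sounder_pos[1] - array_pos[1],
                         sounder_pos[0] - array_pos[0])
    return (theta - bearing) % (2 * math.pi)
```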
  • The camera may be integrated with a fourth sounder, and the second microphone array includes a first microphone and a second microphone. The control device determines the distance D3 between the first microphone and the fourth sounder and the distance D4 between the second microphone and the fourth sounder based on the times at which the first microphone and the second microphone receive the sound signal emitted by the fourth sounder and the time at which the fourth sounder emits the sound signal. The control device then determines the position of the camera relative to the second microphone array based on the position of the first microphone, the position of the second microphone, the distance D3 and the distance D4.
  • the equivalent centers of the fourth sound generator and the camera may be the same, that is, the positions of the fourth sound generator and the camera may be the same.
  • When the control device controls the fourth sounder to emit the sound signal S4, it can record the time point t4 at which the fourth sounder emits the sound signal S4.
  • Each microphone in the second microphone array can detect corresponding audio data, and record a detection time point corresponding to the audio data, that is, a time point when the audio data is detected.
  • The control device can obtain the time point t5 at which the first microphone in the second microphone array detects the sound signal S4 and the time point t6 at which the second microphone in the second microphone array detects the sound signal S4, and can then calculate the duration ΔT3 between t4 and t5 and the duration ΔT4 between t4 and t6.
  • Based on the pre-stored speed of sound V, the control device can calculate the distance D3 = V·ΔT3 between the first microphone and the fourth sounder and the distance D4 = V·ΔT4 between the second microphone and the fourth sounder.
  • The control device can then calculate the position of the fourth sounder according to the distance D between the first microphone and the second microphone, the distance D3, the distance D4, and the geometric relationship among the first microphone, the second microphone and the fourth sounder.
  • In this way, the distance D3 between the first microphone and the fourth sounder and the distance D4 between the second microphone and the fourth sounder are determined, and the position of the camera relative to the second microphone array is then determined based on the position of the first microphone, the position of the second microphone, the distance D3 and the distance D4. Thus, the device parameters do not need to be calibrated manually, which improves the convenience of calibrating device parameters.
  • The first microphone array may be integrated with a first sounder, and the camera may be integrated with a fourth sounder and a third microphone array. Based on the detection data of the third microphone array when the first sounder emits a sound signal, the control device determines the azimuth angle θ6 of the first sounder relative to the third microphone array, and based on the detection data of the first microphone array when the fourth sounder emits a sound signal, it determines the azimuth angle θ7 of the fourth sounder relative to the first microphone array. The control device then determines the deflection angle of the camera based on the azimuth angle θ6, the azimuth angle θ7 and the orientation of the first microphone array.
  • the orientation of the first microphone array may be manually measured and stored in the control device, or may be determined through a parameter calibration process.
  • The equivalent center of the third microphone array and the equivalent center of the camera may be the same, that is, the positions of the third microphone array and the camera may be the same.
  • The deflection angle of the third microphone array and the deflection angle of the camera may be the same.
  • the equivalent center of the fourth sound generator and the equivalent center of the camera may be the same, that is, the positions of the fourth sound generator and the camera may be the same.
  • each microphone in the third microphone array can detect corresponding audio data, and the third microphone array sends the audio data to the control device.
  • The control device can perform sound source localization according to the audio data and determine the azimuth angle θ6 of the first sounder relative to the third microphone array.
  • The control device may also perform sound source localization according to the audio data detected by the microphones of the first microphone array, and determine the azimuth angle θ7 of the fourth sounder relative to the first microphone array.
  • Based on the azimuth angle θ6, the azimuth angle θ7, the deflection angle α5, and the geometric relationship among the first sounder, the third microphone array and the fourth sounder, the deflection angle α8 of the third microphone array and the camera can be calculated.
  • In this way, the azimuth angle θ6 of the first sounder relative to the third microphone array is determined, the azimuth angle θ7 of the fourth sounder relative to the first microphone array is determined based on the first microphone array, and the deflection angle of the camera is then determined based on the azimuth angle θ6, the azimuth angle θ7 and the orientation of the first microphone array. Thus, the device parameters do not need to be calibrated manually, which improves the convenience of calibrating device parameters.
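  • Under the sign conventions assumed in the earlier sketches, the two reciprocal bearings give a closed form for the camera's deflection angle; a hypothetical sketch:

```python
import math

def camera_deflection(theta6, theta7, alpha5):
    """Deflection angle alpha8 of the camera (radians).

    The absolute bearing first array -> camera is theta7 - alpha5, the reverse
    bearing camera -> first array is theta6 - alpha8, and the two differ by pi.
    """
    return (theta6 - theta7 + alpha5 - math.pi) % (2 * math.pi)
```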
  • The first microphone array may be integrated with a light emitter, and the camera may be integrated with a fourth sounder. The control device determines the position of the light-emitting point in an image captured by the camera (the image is taken while the light emitter is emitting light), and determines the azimuth angle θ9 of the light emitter relative to the camera based on the position of the light-emitting point in the image and the rotation angle of the camera. Based on the detection data of the first microphone array when the fourth sounder emits a sound signal, the control device determines the azimuth angle θ7 of the fourth sounder relative to the first microphone array. The control device then determines the orientation of the camera based on the azimuth angle θ9, the azimuth angle θ7 and the orientation of the first microphone array.
  • the orientation of the first microphone array is the angle of the reference direction of the first microphone array relative to the first specified direction, and the first specified direction may be the positive direction of the X axis, or other specified directions.
  • the orientation of the camera is the angle of the reference direction of the camera relative to the second specified direction, and the second specified direction may be the positive direction of the Y axis.
  • the equivalent center of the light emitter may be the same as the equivalent center of the first microphone array, that is, the position of the light emitter may be the same as that of the first microphone array.
  • the equivalent center of the fourth sound generator and the equivalent center of the camera may be the same, that is, the positions of the fourth sound generator and the camera may be the same.
  • the control device may record the corresponding relationship between the focal length of the camera and the range of the horizontal shooting angle (also called the horizontal field of view). The corresponding relationship may be reported by the camera to the control device, or entered manually into the control device, and so on.
  • The control device can determine the current focal length of the camera and then look up the horizontal shooting angle range β4 corresponding to the current focal length in the above correspondence table.
  • When the control device controls the light emitter to emit light, it can acquire the image captured by the camera and determine the distance L3 between the position of the light-emitting point and the longitudinal central axis of the image. The distance L4 between the left or right boundary of the image and the longitudinal central axis of the image may be recorded in the control device.
  • the real-time shooting direction of the camera corresponds to the longitudinal central axis of the image.
  • In this way, the angle β5 of the light emitter relative to the camera can be determined, where β5 is the counterclockwise angle from the real-time shooting direction of the camera to the line connecting the light emitter and the camera.
  • The control device can also acquire the current rotation angle β6 of the camera.
  • Based on the angle β5 and the rotation angle β6, the azimuth angle θ9 of the light emitter relative to the camera can be calculated.
  • The rotation angle β6 is the rotation angle of the camera head relative to the fixed base.
  • The camera head rotates under the control of the control device, so the control device knows the rotation angle β6.
  • the rotation angle is not a necessary parameter for calculating the orientation of the camera, and in other possible cases, the orientation of the camera may also be calculated without using the rotation angle.
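  • A sketch of the image-based angle recovery; the linear pixel-to-angle mapping and the combination θ9 = β5 + β6 are assumptions under the conventions above (an exact pinhole model would use tangents):

```python
def emitter_azimuth(l3_px, l4_px, beta4_deg, beta6_deg):
    """Azimuth theta9 (degrees) of the light emitter relative to the camera.

    l3_px     -- signed pixel distance from the light-emitting point to the
                 image's longitudinal central axis
    l4_px     -- pixel distance from the image boundary to the central axis
    beta4_deg -- horizontal shooting angle range at the current focal length
    beta6_deg -- current rotation angle of the camera head w.r.t. its base
    """
    beta5 = (l3_px / l4_px) * (beta4_deg / 2.0)  # offset from shooting direction
    return (beta5 + beta6_deg) % 360.0
```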
  • The control device can control the fourth sounder to emit the sound signal S6. When the fourth sounder emits the sound signal S6, each microphone in the first microphone array can detect corresponding audio data, and the first microphone array sends the audio data to the control device.
  • The control device can perform sound source localization according to the audio data and determine the azimuth angle θ7 of the fourth sounder relative to the first microphone array.
  • Based on the azimuth angle θ9, the azimuth angle θ7 and the orientation of the first microphone array, the control device can calculate the deflection angle α8 of the camera.
  • In the solution shown in this embodiment of the present application, the position of the light-emitting point in the image captured by the camera is first determined; the azimuth angle θ9 of the light emitter relative to the camera is then determined based on the position of the light-emitting point in the image and the rotation angle of the camera; the azimuth angle θ7 of the fourth sounder relative to the first microphone array is determined based on the detection data of the first microphone array when the fourth sounder emits a sound signal; and the orientation of the camera is then determined based on the azimuth angle θ9, the azimuth angle θ7 and the orientation of the first microphone array. Thus, the device parameters do not need to be calibrated manually, which improves the convenience of calibrating device parameters.
  • The first microphone array may be integrated with the first sounder, and the second microphone array includes the first microphone and the second microphone. Based on the detection data of the second microphone array when the first sounder emits a sound signal, the control device determines the distance D5 between the first sounder and the second microphone array and the azimuth angle θ10 of the first sounder relative to the second microphone array. The control device then determines the position of the first microphone array based on the distance D5, the azimuth angle θ10 and the position of the second microphone array.
  • the equivalent centers of the first sound generator and the first microphone array may be the same, that is, the positions of the first sound generator and the first microphone array may be the same.
  • When the control device controls the first sounder to emit the sound signal S7, it can record the time point t7 at which the first sounder emits the sound signal S7.
  • The microphones of the second microphone array can detect corresponding audio data and record the detection time point t8 corresponding to the audio data, that is, the time point at which the audio data is detected.
  • The control device can obtain the time point t7 at which the first sounder emitted the sound signal S7 and the time point t8 at which the second microphone array detected the sound signal S7, and can then calculate the duration ΔT5 between t7 and t8.
  • Based on the pre-stored speed of sound V, the control device can calculate the distance D5 = V·ΔT5 between the second microphone array and the first sounder.
  • the second microphone array can send the audio data corresponding to the sound signal S7 to the control device.
  • The control device can perform sound source localization according to the audio data and determine the azimuth angle θ10 of the first sounder relative to the second microphone array.
  • The control device can calculate the position of the first sounder according to the distance D5, the azimuth angle θ10, the position of the second microphone array, and the geometric relationship between the first sounder and the second microphone array.
  • In this way, the distance D5 between the first sounder and the second microphone array and the azimuth angle θ10 of the first sounder relative to the second microphone array are determined, and the position of the first microphone array is then determined based on the distance D5, the azimuth angle θ10 and the position of the second microphone array, as sketched below. Thus, the device parameters do not need to be calibrated manually, which improves the convenience of calibrating device parameters.
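  • A minimal sketch of this range-and-bearing step, under the same assumed angle conventions:

```python
import math

def position_from_range_bearing(array_pos, alpha2, theta10, d5):
    """Position of the first sounder from the second microphone array's
    range d5 and azimuth theta10 (alpha2: the array's deflection angle)."""
    phi = theta10 - alpha2  # absolute bearing, CCW from the +X axis
    return (array_pos[0] + d5 * math.cos(phi),
            array_pos[1] + d5 * math.sin(phi))
```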
  • The first microphone array may be integrated with a first sounder, and the second microphone array may be integrated with a fifth sounder. Based on the detection data of the second microphone array when the first sounder emits a sound signal, the control device determines the azimuth angle θ10 of the first sounder relative to the second microphone array, and based on the detection data of the first microphone array when the fifth sounder emits a sound signal, it determines the azimuth angle θ11 of the fifth sounder relative to the first microphone array. The control device then determines the orientation of the first microphone array based on the azimuth angle θ10, the azimuth angle θ11 and the orientation of the second microphone array.
  • the microphones of the second microphone array can detect corresponding audio data, and the second microphone array can send the audio data to the control device.
  • The control device can perform sound source localization according to the audio data and determine the azimuth angle θ10 of the first sounder relative to the second microphone array.
  • The control device may also perform sound source localization according to the audio data detected by the microphones of the first microphone array, and determine the azimuth angle θ11 of the fifth sounder relative to the first microphone array.
  • The control device can determine the deflection angle α5 of the first microphone array according to the azimuth angle θ10, the azimuth angle θ11, and the geometric relationship between the second microphone array and the first microphone array.
  • Alternatively, the control device can determine the deflection angle α5 of the first microphone array according to the azimuth angle θ10, the azimuth angle θ11, the included angle β12, and the geometric relationship between the second microphone array and the first microphone array.
  • In this way, the azimuth angle θ10 of the first sounder relative to the second microphone array is determined based on the detection data of the second microphone array when the first sounder emits a sound signal, the azimuth angle θ11 of the fifth sounder relative to the first microphone array is determined based on the detection data of the first microphone array when the fifth sounder emits a sound signal, and the orientation of the first microphone array is then determined based on the azimuth angle θ10, the azimuth angle θ11 and the orientation of the second microphone array.
  • The camera may be integrated with a fourth sounder. The control device determines the distance D6 between the first microphone array and the fourth sounder and the distance D7 between the second microphone array and the fourth sounder based on the times at which the first microphone array and the second microphone array receive the sound signal emitted by the fourth sounder and the time at which the fourth sounder emits the sound signal. The control device then determines the position of the camera based on the position of the first microphone array, the position of the second microphone array, the distance D6 and the distance D7.
  • the equivalent centers of the fourth sound generator and the camera may be the same, that is, the positions of the fourth sound generator and the camera may be the same.
  • When the control device controls the fourth sounder to emit the sound signal S9, it may record the time point t9 at which the fourth sounder emits the sound signal S9.
  • the first microphone array and the second microphone array can detect corresponding audio data, and record the detection time point corresponding to the audio data, that is, the time point when the audio data is detected.
  • The control device can obtain the time point t10 at which the first microphone array detects the sound signal S9 and the time point t11 at which the second microphone array detects the sound signal S9, and can then calculate the duration ΔT6 between t9 and t10 and the duration ΔT7 between t9 and t11.
  • Based on the pre-stored speed of sound V, the control device can calculate the distance D6 = V·ΔT6 between the first microphone array and the fourth sounder and the distance D7 = V·ΔT7 between the second microphone array and the fourth sounder.
  • The control device can then calculate the position of the fourth sounder according to the distance D6, the distance D7, the distance D8 between the first microphone array and the second microphone array, and the geometric relationship among the first microphone array, the second microphone array and the fourth sounder.
  • In this way, based on the first microphone array and the second microphone array, the distance D6 between the first microphone array and the fourth sounder and the distance D7 between the second microphone array and the fourth sounder are determined, and the position of the camera is then determined based on the position of the first microphone array, the position of the second microphone array, the distance D6 and the distance D7. Thus, the device parameters do not need to be calibrated manually, which improves the convenience of calibrating device parameters.
  • The control device determines the azimuth angle of the sound source object relative to the camera and the distance between the sound source object and the camera based on the position of the sound source object and the position of the camera. The control device then determines the directing rotation angle of the camera based on the azimuth angle of the sound source object relative to the camera, and determines the directing focal length of the camera based on the distance between the sound source object and the camera.
  • the azimuth of the speaker relative to the camera and the distance of the speaker from the camera can be calculated.
  • The distance is a plane equivalent distance, that is, the distance between the projections of the camera's equivalent center and the speaker's equivalent center onto the horizontal plane.
  • The camera can include a rotatable camera head and a fixed base; the camera head can rotate relative to the fixed base.
  • An initial shooting direction can be specified for the camera.
  • the initial shooting direction and the reference direction of the camera can be the same.
  • The directing rotation angle can be the angle of the camera's real-time shooting direction relative to its initial shooting direction; the initial shooting direction can be regarded as the 0-degree direction, and the directing rotation angle can be the same as the azimuth angle of the speaker relative to the camera.
  • The directing focal length of the camera can be determined based on the distance.
  • The control device may query a pre-stored first correspondence table to determine the directing focal length corresponding to the distance.
  • the first correspondence table may record the correspondence between the distance of the speaker relative to the camera and the focal length of the camera.
  • In this way, the control device determines the azimuth angle of the sound source object relative to the camera and the distance between the sound source object and the camera based on the position of the sound source object and the position of the camera, then determines the directing rotation angle of the camera based on that azimuth angle and the directing focal length of the camera based on that distance. Thus, the directing parameters do not need to be determined manually, which improves the convenience of the directing process.
  • the time when the first sounder emits the sound signal is the time when the first sounder is powered on.
  • The directing control system may further include another camera. Based on the position of the sound source object and the positions of the two cameras, the control device determines which of the two cameras is farther away from the sound source object as the target camera, and then determines the directing operation on the target camera based on the position of the sound source object and the position of the target camera.
  • This processing method may be applicable to the following scenario: a long table is arranged in a conference room, several chairs are arranged on both sides of the long table, and speakers sit on the chairs facing the long table.
  • A camera is installed on the wall on each side of the long table. In this scenario, of the two cameras arranged on the walls on either side of the long table, the camera farther away from the speaker can better capture the speaker's face.
  • In this way, the camera farther away from the sound source object is determined as the target camera (see the sketch below), and the directing operation on the target camera is determined based on the position of the sound source object and the position of the target camera. The speaker's face can thus be better captured in a typical conference scene, improving the accuracy of automatic directing.
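  • A one-line sketch of the farther-camera selection:

```python
import math

def pick_target_camera(speaker, cameras):
    """Return the camera position farthest from the speaker."""
    return max(cameras, key=lambda c: math.hypot(c[0] - speaker[0],
                                                 c[1] - speaker[1]))
```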
  • In a second aspect, a directing control apparatus is provided. The apparatus includes one or more modules, and the one or more modules are used to implement the method of the first aspect and its possible implementations.
  • In a third aspect, a computer device is provided. The computer device includes a memory and a processor; the memory is used to store computer instructions, and the processor executes the computer instructions stored in the memory so that the computer device performs the method of the first aspect and its possible implementations.
  • In a fourth aspect, a computer-readable storage medium is provided, which stores computer program code. When the computer program code is executed by a computer device, the computer device executes the method of the first aspect and its possible implementations.
  • In a fifth aspect, a computer program product is provided, which includes computer program code. When the computer program code is executed by a computer device, the computer device executes the method of the first aspect and its possible implementations.
  • As long as the sound source object is making a sound, it can be located based on the sound. This avoids the requirement, present when the sound source object is located by image recognition, that the speaker make obvious movements (such as obvious mouth movements), thereby avoiding the limitations of prior-art automatic directing based on image recognition and improving directing accuracy.
  • FIG. 1 is a schematic diagram of a directing control system provided by an embodiment of the present application;
  • FIG. 2 is a schematic structural diagram of a computer device provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of a directing control system provided by an embodiment of the present application;
  • FIG. 4 is a flow chart of a directing control method provided by an embodiment of the present application;
  • FIG. 5 is a schematic diagram of processing provided by an embodiment of the present application;
  • FIG. 6 is a schematic diagram of processing provided by an embodiment of the present application;
  • FIG. 7 is a schematic diagram of processing provided by an embodiment of the present application;
  • FIG. 8 is a schematic diagram of a directing control system provided by an embodiment of the present application;
  • FIG. 9 is a schematic diagram of processing provided by an embodiment of the present application;
  • FIG. 10 is a schematic diagram of processing provided by an embodiment of the present application;
  • FIG. 11 is a schematic diagram of processing provided by an embodiment of the present application;
  • FIG. 12 is a schematic diagram of a directing control system provided by an embodiment of the present application;
  • FIG. 13 is a schematic diagram of processing provided by an embodiment of the present application;
  • FIG. 14 is a schematic diagram of processing provided by an embodiment of the present application;
  • FIG. 15 is a schematic diagram of processing provided by an embodiment of the present application;
  • FIG. 16 is a schematic diagram of a directing control system provided by an embodiment of the present application;
  • FIG. 17 is a schematic diagram of processing provided by an embodiment of the present application;
  • FIG. 18 is a schematic diagram of processing provided by an embodiment of the present application;
  • FIG. 19 is a schematic diagram of processing provided by an embodiment of the present application;
  • FIG. 20 is a schematic diagram of a directing control apparatus provided by an embodiment of the present application.
  • Reference direction: all devices in the directing control system can be directional devices. A directional device has a reference direction, which can also be called the positive direction of the device.
  • the reference direction of the device will rotate with the rotation of the device.
  • The reference direction is generally set manually during the production of the equipment, and a corresponding mark can be placed on the equipment to indicate it, so as to facilitate installation by the user.
  • For example, the reference direction of a pan-tilt camera is any specified radial direction of the pan-tilt base, and a line mark can be printed at the position of that radius on the pan-tilt base.
  • the characteristic of directional devices is that the real-time output parameters of the device will include azimuth or rotation angle (which will be introduced separately below), and these angle parameters need to be determined with reference to the reference direction.
  • Azimuth: the azimuth angle of object B relative to device A refers to the in-plane angle from the reference direction of device A to the line connecting the equivalent center of object B and the equivalent center of device A.
  • In this embodiment, the counterclockwise in-plane angle from the reference direction of device A to the line connecting the equivalent center of object B and the equivalent center of device A is defined as the azimuth angle of B relative to device A.
  • Deflection angle: the angle between the reference direction of the device and the specified direction (which can be set manually). In this embodiment, the counterclockwise in-plane angle from the device's reference direction to the specified direction is defined as the deflection angle of the device.
  • the orientation of the device refers to the direction of the reference direction of the device. It can be expressed by the angle between the reference direction of the device and the specified direction (that is, the deflection angle of the device).
  • The specified direction can be the X-axis or Y-axis direction.
  • Specified direction: a direction set in order to determine the deflection angle of the equipment.
  • different designated directions can be set for different devices, or the same designated direction can be set for different devices.
  • In this embodiment, the specified direction is defined with reference to the coordinate axes.
  • the deflection angle of the device is referenced to the specified direction, so in fact the deflection angle of the device is also referenced to the coordinate axis.
  • The azimuth of the sound source object is referenced to the reference direction of the microphone array, so in fact it can also ultimately be expressed as an angle relative to the coordinate axes.
  • the specified direction is generally set to the positive direction of a certain coordinate axis.
  • Rotation angle: a device C can include a part M and a part N, with part M rotatably mounted on part N. The rotation angle of part M refers to the angle of the positive direction of part M relative to the positive direction of part N; the positive direction of part N can be regarded as the reference direction of device C.
  • Sound source object: the person or thing currently making a sound, usually the current speaker.
  • Shooting angle range: also known as field of view, it refers to the horizontal and vertical angular ranges that the camera can currently capture.
  • Image longitudinal central axis: the imaginary vertical line that divides the image into two halves.
  • Sounder: in the embodiments of the present application, a sounder is a device capable of emitting sound under the control of the control device.
  • the sounder mentioned below may be an ultrasonic sounder, and the sound emitted is ultrasonic.
  • An embodiment of the present application provides a directing control method, which can be applied in a directing control system.
  • The directing control system may include microphone arrays, cameras, a control device, and the like. There may be multiple types of microphone arrays, such as distributed microphone arrays ("distributed" meaning not integrated on another device) or microphone arrays integrated on other devices. Likewise, there may be many kinds of cameras, such as distributed cameras or cameras integrated in other devices.
  • the control device may be an independent control device, or a control device integrated with a microphone array and/or a camera.
  • the broadcast control system may also include terminal devices (such as smart screens or projectors) and other devices. One or more of the control device, the microphone array and the camera can be integrated on the terminal device.
  • The directing control system can be used for shooting and directing various scenes, such as meeting scenes, teaching scenes or program recording scenes. In this embodiment, directing a conference scene is taken as an example for illustration; other situations are similar and are not described again here.
  • a very common meeting scene is a long table meeting scene.
  • This meeting scene can be set up with a bar-shaped conference table and several seats arranged around it. During the meeting, participants sit in the seats.
  • the embodiment of the present application uses this conference scene as an example to describe the solution.
  • The directing control system may include a first microphone array, a second microphone array, a control device, a camera, and the like.
  • the first microphone array and the second microphone array may be distributed microphone arrays.
  • the distributed microphone array can be placed anywhere in the conference scene.
  • the control device can be a stand-alone device or integrated on a microphone array or camera.
  • a plane coordinate system can be set.
  • the plane coordinate system can be a two-dimensional rectangular coordinate system in the horizontal plane. Any point in the meeting room space can be set as the origin of the plane coordinate system.
  • the X-axis direction and the Y-axis direction of the system can be any two mutually perpendicular directions in the horizontal plane.
  • the control device can record the position, specified direction and deflection angle of some or all devices such as the microphone array and camera, and the position of the device can be the coordinates of the projection point of the equivalent center of the device on the plane coordinate system.
  • For example, the equivalent center of a device that stays fixed at a certain position in the conference room can be taken as the origin of the coordinate system, and two directions referenced to that device can be taken as the X-axis and Y-axis directions.
  • For example, take the equivalent center of the conference terminal as the origin of the coordinate system, take the normal direction of the conference terminal's screen as the Y-axis direction, and take the direction perpendicular to that normal in the horizontal plane as the X-axis direction.
  • An embodiment of the present application provides a directing control method, which can be executed by the control device in the directing control system.
  • the control device may be a server, a terminal, or a component integrated in other devices. Servers can be individual servers or groups of servers.
  • the terminal can be a device arranged in a meeting room, or a device arranged in an enterprise computer room, or a portable device, such as a smart screen, a desktop computer, a notebook computer, a mobile phone, a tablet computer, a smart watch, etc.
  • the control device can be integrated in devices such as smart screens, cameras, and microphone arrays.
  • FIG. 2 is a schematic structural diagram of a control device provided by an embodiment of the present application. From the perspective of hardware composition, the structure of the control device 20 may be as shown in FIG. 2 , including a processor 201 , a memory 202 and a communication component 203 .
  • The processor 201 can be a central processing unit (CPU) or a system-on-chip (SoC), etc. The processor 201 can be used to determine the azimuth angle θ1 of the sound source object relative to the first microphone array, the azimuth angle θ2 of the sound source object relative to the second microphone array, the position of the sound source object, and so on.
  • The memory 202 may include various volatile or non-volatile memories, such as a solid state disk (SSD) or dynamic random access memory (DRAM).
  • The memory 202 can be used to store the initial data, intermediate data and result data involved in the directing control process, such as the detection data of the first microphone array, the detection data of the second microphone array, the azimuth angle θ1 of the sound source object relative to the first microphone array, the azimuth angle θ2 of the sound source object relative to the second microphone array, the position of the first microphone array, the position of the second microphone array, the position of the sound source object, and so on.
  • The communication component 203 may be a wired network connector, a wireless fidelity (WiFi) module, a Bluetooth module, a cellular network communication module, and the like.
  • the communication component 203 may be used for data transmission with other devices, and the other devices may be servers or terminals.
  • the control device 20 may receive the detection data of the first microphone array and the detection data of the second microphone array, and may also send the position of the sound source object to the server for storage.
  • The directing control system may include a first microphone array, a second microphone array and a camera.
  • the first microphone array and the second microphone array are distributed microphone arrays.
  • There may be one or more first microphone arrays.
  • The cameras may be distributed cameras, and there may be one or more cameras.
  • the positions of the above-mentioned devices in the conference room can be set arbitrarily.
  • For example, the first microphone array and the second microphone array can be placed on the long table, and the two cameras can be hung on the side walls on either side of the long table.
  • the position and deflection angle of the first microphone array, the second microphone array, the camera and other equipment can be recorded in the control device, and the position of the equipment can be the coordinates of the projection point of the equivalent center of the equipment on the plane coordinate system.
  • The following description takes the sound source object being a speaker in a conference scene as an example; other situations are similar and are not repeated here.
  • the control device determines a position of a first microphone array and a position of a camera.
  • the control device can acquire the pre-stored position of the first microphone array and the position of the camera.
  • the control device may determine the position of the first microphone array and the position of the camera through a parameter calibration process, and the specific process of parameter calibration will be described in detail later.
  • the control device determines the speaker's position according to the position of the speaker relative to the first microphone array, the position of the speaker relative to the second microphone array, the position of the first microphone array, and the position of the second microphone array.
  • the position of the speaker relative to the first microphone array and the position of the speaker relative to the second microphone array may be represented by azimuth angles.
  • each microphone in the first microphone array can detect corresponding audio data, and the first microphone array sends the audio data to the control device.
  • the control device can perform sound source localization according to the audio data, and determine the azimuth ⁇ 1 of the speaker relative to the first microphone array.
  • the algorithm used in the sound source localization process can be a steered-response power (SRP) algorithm, etc.
  • the control device may also perform sound source localization according to the audio data detected by the microphones of the second microphone array, and determine the azimuth angle ⁇ 2 of the speaker relative to the second microphone array.
  • azimuth ⁇ 1 is the counterclockwise included angle from the reference direction of the first microphone array to the line connecting the speaker and the first microphone array in the horizontal plane
  • azimuth ⁇ 2 is the reference direction from the first microphone array The counterclockwise angle in the horizontal plane between the line connecting the speaker and the second microphone array.
  • for the case where the deflection angles of the first microphone array and the second microphone array are both 0 degrees, the control device can calculate the position of the speaker according to the azimuth angle θ1, the azimuth angle θ2, the position of the first microphone array and the position of the second microphone array, and the geometric relationship among the first microphone array, the second microphone array and the speaker.
  • the position coordinates of the speaker are expressed as (x, y), the coordinates of the first microphone array as (x1, y1), and the coordinates of the second microphone array as (x2, y2).
  • for the case where the deflection angles are not 0 degrees, the control device can calculate the position of the speaker based on the deflection angle γ1 of the first microphone array, the deflection angle γ2 of the second microphone array, the azimuth angle θ1, the azimuth angle θ2, the positions of the first and second microphone arrays, and the geometric relationship among the first microphone array, the second microphone array and the speaker.
  • with the position coordinates of the speaker expressed as (x, y), the coordinates of the first microphone array as (x1, y1), and the coordinates of the second microphone array as (x2, y2), the position of the speaker can be obtained by intersecting the two bearing lines determined by the azimuth angles, as sketched below.
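  • As an illustrative aid, the following is a minimal Python sketch of this two-array triangulation. It assumes that azimuths are measured counterclockwise from each array's reference direction, that deflection angles rotate those reference directions within the global frame, and that the speaker lies at the intersection of the two bearing lines; all function and variable names are illustrative rather than taken from the patent.

```python
import math

def locate_speaker(p1, gamma1, theta1, p2, gamma2, theta2):
    """Intersect the two bearing lines cast from microphone arrays at p1 and p2.

    p1, p2  -- (x, y) positions of the first and second microphone arrays
    gamma*  -- deflection angles of the arrays (radians, CCW from the X axis)
    theta*  -- azimuths of the speaker relative to each array (radians, CCW
               from the array's reference direction)
    Returns the (x, y) position of the speaker.
    """
    # Global bearing of the speaker as seen from each array.
    a1 = gamma1 + theta1
    a2 = gamma2 + theta2
    (x1, y1), (x2, y2) = p1, p2
    # Solve: p1 + t1*(cos a1, sin a1) == p2 + t2*(cos a2, sin a2)
    det = math.cos(a1) * math.sin(a2) - math.sin(a1) * math.cos(a2)
    if abs(det) < 1e-9:
        raise ValueError("bearing lines are (nearly) parallel")
    t1 = ((x2 - x1) * math.sin(a2) - (y2 - y1) * math.cos(a2)) / det
    return (x1 + t1 * math.cos(a1), y1 + t1 * math.sin(a1))

# Example: arrays at (0, 0) and (4, 0), zero deflection, speaker at (2, 2).
print(locate_speaker((0, 0), 0.0, math.atan2(2, 2), (4, 0), 0.0, math.atan2(2, -2)))
```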
  • the control device determines, based on the position of the speaker and the position of the camera, a directing operation on the camera.
  • after the control device determines the position of the speaker, it can calculate the azimuth angle of the speaker relative to the camera and the distance between the speaker and the camera based on the position of the speaker and the position of the camera.
  • the distance is the plane equivalent distance, that is, the projection distance in the horizontal plane between the equivalent center of the camera and the equivalent center of the speaker.
  • the camera can include a rotatable camera head and a fixed base; the camera head can rotate relative to the fixed base.
  • An initial shooting direction can be specified for the camera.
  • the initial shooting direction and the reference direction of the camera can be the same.
  • the directing rotation angle can be the angle of the camera head's real-time shooting direction relative to the initial shooting direction; the initial shooting direction can be regarded as the 0-degree direction, and the directing rotation angle can be the same as the azimuth angle of the speaker relative to the camera.
  • after the distance between the speaker and the camera is determined, the directing focal length of the camera can be determined based on the distance.
  • the control device may query a pre-stored first correspondence table to determine the directing focal length corresponding to the distance.
  • the first correspondence table may record the correspondence between the distance of the speaker relative to the camera and the focal length of the camera.
  • for the case where the deflection angle of the camera is 0 degrees, the control device can determine the directing rotation angle and directing focal length of the camera according to the position of the speaker and the position of the camera, and can thus control the camera to rotate to the directing rotation angle and shoot at the directing focal length.
  • for the case where the deflection angle of the camera is not 0 degrees, the control device can determine the directing rotation angle and directing focal length of the camera according to the deflection angle of the camera, the position of the speaker and the position of the camera, and can thus control the camera's pan-tilt to rotate to the directing rotation angle and control the camera to shoot at the directing focal length, as sketched below.
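  • For illustration, a small sketch of how the directing rotation angle and directing focal length might be derived from the speaker and camera positions; the correspondence table, the names, and the way the camera deflection is subtracted are assumptions, not the patent's literal procedure.

```python
import bisect
import math

# Hypothetical "first correspondence table": (distance in meters, focal length
# value), sorted by distance; a real table would be measured for the camera.
DIST_TO_FOCAL = [(1.0, 4.0), (2.0, 8.0), (4.0, 16.0), (8.0, 32.0)]

def directing_params(speaker, camera, camera_deflection=0.0):
    """Return (directing rotation angle in radians, directing focal length)."""
    dx, dy = speaker[0] - camera[0], speaker[1] - camera[1]
    distance = math.hypot(dx, dy)  # plane equivalent distance
    # Azimuth of the speaker relative to the camera's reference direction; a
    # non-zero camera deflection shifts the angle the pan-tilt must rotate to.
    rotation = (math.atan2(dy, dx) - camera_deflection) % (2 * math.pi)
    # Pick the table entry whose distance is closest from above.
    idx = min(bisect.bisect_left([d for d, _ in DIST_TO_FOCAL], distance),
              len(DIST_TO_FOCAL) - 1)
    return rotation, DIST_TO_FOCAL[idx][1]

print(directing_params((2.0, 2.0), (0.0, 0.0)))  # ~45 degrees, focal 16.0
```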
  • for the case where there are two cameras, the control device can determine, based on the position of the speaker and the positions of the two cameras, the target camera of the two that is farther away from the speaker, and then determine the directing operation for the target camera based on the position of the speaker and the position of the target camera.
  • This processing method may be applicable to the following scenario: a long table is arranged in a conference room, several chairs are arranged on both sides of the long table, and the speaker sits on the chair facing the long table.
  • a video camera is installed on each of the walls on both sides of the long table.
  • in this scenario, the camera farther away from the speaker can better capture the speaker's face. Therefore, of the two cameras, the one farther away from the speaker may be determined as the target camera, and the directing operation for the target camera is then determined based on the position of the speaker and the position of the target camera, as sketched below.
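  • A minimal sketch of the farther-camera rule just described, assuming planar coordinates; purely illustrative.

```python
import math

def pick_target_camera(speaker, cameras):
    """Of the given cameras, pick the one farther from the speaker, since it
    faces the speaker more frontally in the long-table layout."""
    return max(cameras, key=lambda cam: math.hypot(cam[0] - speaker[0],
                                                   cam[1] - speaker[1]))

# Two wall-mounted cameras on either side of the table (illustrative positions).
print(pick_target_camera((1.0, -0.8), [(0.0, 2.0), (0.0, -2.0)]))
```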
  • the control device may control multiple cameras to shoot the sound source object based on the position of the sound source object and the positions of the multiple cameras, obtaining multiple video images. Image recognition can then be performed on the obtained video images, and a video image satisfying a target condition can be selected as the directing video image.
  • there can be various target conditions; for example, the video image in which the face angle is closest to frontal can be selected as the directing video image, and the face angle in a video image can be determined using a machine learning model for face angle detection, as in the sketch below.
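  • As a sketch of the face-angle target condition, assuming a hypothetical estimate_face_yaw model that returns a face's yaw in degrees (0 meaning frontal); the model and all names are placeholders, not an API from the patent.

```python
def pick_directing_image(frames, estimate_face_yaw):
    """frames: list of video images, one per camera.
    estimate_face_yaw: callable mapping an image to the detected face's yaw
    angle in degrees (0 == frontal); assumed to be provided by some face-angle
    detection model.
    Returns the frame whose face is closest to frontal."""
    return min(frames, key=lambda img: abs(estimate_face_yaw(img)))
```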
  • the parameters involved may include the position and deflection angle of each device. All of these parameters can be entered into the control device in advance: they can be measured and entered after installation, or entered before the equipment leaves the factory, in which case the factory configuration should be followed during installation. Alternatively, some of the parameters can be entered into the control device in advance and the rest determined through a parameter calibration process. Which parameters need to be entered in advance and which need to be calibrated can be decided based on the equipment in the directing control system. For example, the parameters of equipment whose location can change at any time, such as distributed microphone arrays, need to be calibrated, while the parameters of equipment with relatively fixed locations, such as a microphone array integrated in the conference terminal, can be entered in advance.
  • Technicians can pre-enter the position and deflection angle of the designated device in the control device, and then the control device can determine the position and deflection angle of other devices other than the designated device through the parameter calibration process.
  • the specified device may be a certain microphone array or the like. The parameter calibration process is described in detail below for several directing control systems in different situations.
  • the broadcast control system may include a first microphone array, a conference terminal, and a camera.
  • the first microphone array is a distributed microphone array, which may be integrated with a first sounder; there may be one or more first microphone arrays.
  • the conference terminal may be a smart screen, and the conference terminal may be integrated with a control device, a second microphone array, a second sounder, and a third sounder.
  • the camera may be a distributed camera and may be integrated with a fourth sounder and a third microphone array; there may be one or more cameras.
  • the sounder can take many forms, such as an ordinary loudspeaker or an ultrasonic transmitter, etc.
  • the positions of the above-mentioned devices in the conference room can be set arbitrarily.
  • the conference terminal is installed on the wall at one end of the long table, and the second microphone array is installed at the center of the top of the conference terminal.
  • the second sounder and the third sounder are installed on both sides of the conference terminal, the first microphone array can be placed on the long table, and the two cameras can be hung on the walls on both sides of the long table respectively.
  • the control device may pre-record the position of the second microphone array, the deflection angle of the second microphone array, the position of the first microphone in the second microphone array, the position of the second microphone in the second microphone array, the position of the second sounder and the position of the third sounder, and may pre-record the first specified direction corresponding to the first microphone array and the second specified direction corresponding to the camera.
  • the control device establishes a plane Cartesian coordinate system in the horizontal plane with the central position of the second microphone array as the origin of coordinates and with the reference direction of the second microphone array as the positive direction of the X-axis.
  • the reference direction of the second microphone array may also be set as the screen direction, and the first microphone and the second microphone in the second microphone array may be arranged symmetrically with respect to the central position on the conference terminal.
  • the spacing between the microphones in a microphone array is usually known. When the distance between the first microphone and the second microphone is D, the position coordinates of the first microphone can be recorded as (0, -D/2) and the position coordinates of the second microphone as (0, D/2).
  • the second sound generator and the third sound generator are generally arranged symmetrically with respect to the central position on the conference terminal.
  • the position coordinates of the second sounder can be recorded as (0,-L/2), and the position coordinates of the third sounder can be recorded as (0, L/2).
  • the positions of the above-mentioned first microphone, second microphone, second sound generator and third sound generator may be pre-stored before the conference terminal leaves the factory.
  • it may be set and recorded that the first specified direction corresponding to the first microphone array is the positive direction of the X axis, and the second specified direction corresponding to the camera is the positive direction of the Y axis.
  • Position calibration of the first microphone array (if there are multiple first microphone arrays, the position calibration of each first microphone array can adopt the following processing method)
  • the control device controls the first sounder to emit a sound signal S1. Based on the time point when the first sounder emits the sound signal S1 and the time points when the first microphone and the second microphone in the second microphone array detect the sound signal S1, the control device determines the distance D1 between the first microphone and the first sounder and the distance D2 between the second microphone and the first sounder. The control device then determines the positions of the first sounder and the first microphone array based on the position of the first microphone, the position of the second microphone, the distance D1 and the distance D2.
  • the equivalent centers of the first sound generator and the first microphone array may be the same, that is, the positions of the first sound generator and the first microphone array may be the same.
  • when the control device controls the first sounder to emit the sound signal S1, the first sounder sends the time point t1 at which it emits the sound signal S1 to the control device for recording.
  • Each microphone in the second microphone array can receive the sound signal, record the time point when the sound signal is detected, and send it to the control device.
  • the control device can obtain the time point t2 when the first microphone in the second microphone array detects the sound signal S1 and the time point t3 when the second microphone detects the sound signal S1, and can then calculate the duration ΔT1 between time point t1 and time point t2 and the duration ΔT2 between time point t1 and time point t3.
  • further, the control device may calculate, according to the pre-stored speed-of-sound data V, the distance D1 = V·ΔT1 between the first microphone and the first sounder and the distance D2 = V·ΔT2 between the second microphone and the first sounder.
  • the control device can calculate the position of the first sound generator according to the distance D, the distance D 1 and the distance D 2 , and the geometric relationship between the first microphone, the second microphone and the first sound generator.
  • with the counterclockwise angle in the horizontal plane from the line connecting the first microphone and the first sounder to the line connecting the first microphone and the second microphone expressed as ∠3, and the coordinates of the first sounder expressed as (x1, y1), the position of the first sounder can be obtained through geometric operations from the above data.
  • there are many possibilities for the positional relationship among the first microphone, the second microphone and the first sounder; the above description takes only one of them as an example. For other positional relationships, the position of the first sounder can likewise be obtained through geometric operations according to the above relevant data; the positional relationships and calculation methods used above do not limit this embodiment. A sketch follows.
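  • The following sketch combines the time-of-flight distances with a two-circle intersection to recover the sounder position, assuming the two microphones sit at (0, -D/2) and (0, D/2) as above; the sign chosen for x selects the half-plane in front of the screen, which corresponds to one of the many possible positional relationships mentioned. All names are illustrative.

```python
import math

V = 343.0  # assumed speed-of-sound data V, in m/s

def locate_sounder(d_mics, t_emit, t_mic1, t_mic2):
    """Microphones at (0, -d_mics/2) and (0, d_mics/2); returns sounder (x, y).

    t_emit          -- time point t1 when the sounder emits the signal
    t_mic1, t_mic2  -- time points t2, t3 when each microphone detects it
    """
    d1 = V * (t_mic1 - t_emit)  # distance D1 to the first microphone
    d2 = V * (t_mic2 - t_emit)  # distance D2 to the second microphone
    # Circle intersection: points at distance d1 from mic 1 and d2 from mic 2.
    y = (d1 ** 2 - d2 ** 2) / (2 * d_mics)
    x_sq = d1 ** 2 - (y + d_mics / 2) ** 2
    if x_sq < 0:
        raise ValueError("inconsistent distances")
    return (math.sqrt(x_sq), y)  # positive x: sounder in front of the screen
```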
  • here, the calculation principle of the azimuth angle, that is, the SRP algorithm mentioned above, is introduced. The algorithm can be formulated as Y(θ) = Σ_k |s(θ)^H X(k)|², where X(k) = [X_1(k), …, X_M(k)]^T and X_m(k) represents the fast Fourier transform (FFT) value of the m-th microphone in frequency band k.
  • s(θ) represents the steering vector corresponding to a sound source at angle θ in the two-dimensional spatial plane.
  • the steering vector can be calculated in advance according to the layout of the microphones inside the microphone array and the angle search range (set manually; the angle range over which the maximum extreme point is subsequently determined). Taking a linear layout of the microphones in the array as an example, the steering vector can be written as s(θ) = [e^(−j2πf_k d_1 cosθ/c), …, e^(−j2πf_k d_M cosθ/c)]^T, where c is the speed of sound.
  • taking the first microphone as the reference microphone, d_m cosθ represents the path difference between the sound source's arrival at the m-th microphone and at the reference microphone. For single-sound-source localization, with θ restricted to the angle search range, the angle θ at which Y(θ) attains its maximum extreme point is the azimuth angle of the sound source, as in the sketch below.
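  • A compact sketch of the SRP search for a linear array under the same definitions (X_m(k): FFT value of microphone m in band k; d_m: spacing of microphone m to the reference microphone); the framing, weighting and search grid are simplifications, not the patent's exact procedure.

```python
import numpy as np

def srp_azimuth(frames, d_m, fs, c=343.0, n_angles=180):
    """frames: array of shape (M, N), one time-domain frame per microphone.
    d_m: distances of each microphone to the reference microphone (d_m[0] == 0).
    Returns the angle theta (radians) maximizing the steered response Y(theta)."""
    M, N = frames.shape
    X = np.fft.rfft(frames, axis=1)          # X_m(k), shape (M, K)
    f_k = np.fft.rfftfreq(N, d=1.0 / fs)     # frequency of each band k
    thetas = np.linspace(0.0, np.pi, n_angles)  # angle search range
    # Steering vectors: s_m(theta, k) = exp(-j*2*pi*f_k*d_m*cos(theta)/c)
    delays = np.outer(np.cos(thetas), d_m) / c               # (n_angles, M)
    phase = np.exp(-2j * np.pi * delays[:, :, None] * f_k[None, None, :])
    # Beamformer output |s(theta)^H X(k)|^2, summed over frequency bands.
    Y = np.abs(np.sum(phase.conj() * X[None, :, :], axis=1)) ** 2
    return thetas[np.argmax(Y.sum(axis=1))]
```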
  • deflection angle calibration of the first microphone array (if there are multiple first microphone arrays, the deflection angle calibration of each first microphone array can adopt the following processing method)
  • the deflection angle of the first microphone array is an included angle between the reference direction of the first microphone array and the first specified direction, and the first specified direction may be the positive direction of the X axis.
  • the control device controls the second sounder to emit a sound signal S2 and, based on the detection data of the first microphone array, determines the azimuth angle θ3 of the second sounder relative to the first microphone array; the control device controls the third sounder to emit a sound signal S3 and, based on the detection data of the first microphone array, determines the azimuth angle θ4 of the third sounder relative to the first microphone array. Based on the azimuth angle θ3, the azimuth angle θ4, the position of the second sounder and the position of the third sounder, the control device determines the deflection angle θ5 of the first microphone array.
  • each microphone in the first microphone array can detect corresponding audio data, and the first microphone array sends the audio data to the control device.
  • the control device can perform sound source localization according to the audio data, and determine the azimuth angle ⁇ 3 of the second sound generator relative to the first microphone array.
  • the control device may also perform sound source localization according to the audio data detected by the microphones of the first microphone array, and determine the azimuth angle ⁇ 4 of the third sound generator relative to the first microphone array.
  • the control device may determine the distance L between the second sounder and the third sounder according to the position coordinates of the second sounder and the position coordinates of the third sounder.
  • the control device can then calculate the deflection angle θ5 of the first microphone array based on the azimuth angle θ3, the azimuth angle θ4, the position of the second sounder, the position of the third sounder, and the positional relationship among the first microphone array, the second sounder and the third sounder.
  • referring to the corresponding figure, with the coordinates of the first microphone array expressed as (x1, y1), the coordinates of the second sounder as (0, -L/2), the coordinates of the third sounder as (0, L/2), the distance between the second sounder and the first microphone array as L1, and the distance between the third sounder and the first microphone array as L2, the deflection angle can be obtained through geometric operations.
  • there are many possibilities for the positional relationship among the first microphone array, the second sounder and the third sounder; the above description takes only one of them as an example. For other positional relationships, the deflection angle of the first microphone array can likewise be obtained through geometric operations according to the above data; the positional relationships and calculation methods used above do not limit this embodiment. A sketch follows.
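  • A sketch of this deflection-angle step: given the already-calibrated array position and the known sounder positions, each measured azimuth fixes the array's reference direction, and the two estimates can be averaged; the names and the circular averaging are illustrative.

```python
import math

def array_deflection(array_pos, sounder_positions, measured_azimuths):
    """Each measured azimuth theta satisfies: global bearing to the sounder
    = deflection + theta (angles CCW, X axis = first specified direction).
    Averaging over the sounders (e.g., the second and third) reduces noise."""
    estimates = []
    for (sx, sy), theta in zip(sounder_positions, measured_azimuths):
        bearing = math.atan2(sy - array_pos[1], sx - array_pos[0])
        estimates.append((bearing - theta) % (2 * math.pi))
    # Average on the unit circle to avoid wrap-around artifacts.
    s = sum(math.sin(e) for e in estimates)
    c = sum(math.cos(e) for e in estimates)
    return math.atan2(s, c) % (2 * math.pi)

# Sounders at (0, -L/2) and (0, L/2) with L = 1.0 (illustrative values):
print(array_deflection((1.5, 0.2), [(0.0, -0.5), (0.0, 0.5)], [3.0, 2.4]))
```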
  • the control device controls the fourth sounder to emit a sound signal S4. Based on the time point when the fourth sounder emits the sound signal S4 and the time points when the first microphone and the second microphone in the second microphone array detect the sound signal S4, the control device determines the distance D3 between the first microphone and the fourth sounder and the distance D4 between the second microphone and the fourth sounder, and then determines the positions of the fourth sounder and the camera based on the position of the first microphone, the position of the second microphone, the distance D3 and the distance D4.
  • the equivalent centers of the fourth sound generator and the camera may be the same, that is, the positions of the fourth sound generator and the camera may be the same.
  • when the control device controls the fourth sounder to emit the sound signal S4, it may record the time point t4 when the fourth sounder emits the sound signal S4.
  • Each microphone in the second microphone array can detect corresponding audio data, and record a detection time point corresponding to the audio data, that is, a time point when the audio data is detected.
  • the control device can obtain the time point t5 when the first microphone in the second microphone array detects the sound signal S4 and the time point t6 when the second microphone detects the sound signal S4, and can then calculate the duration ΔT3 between time point t4 and time point t5 and the duration ΔT4 between time point t4 and time point t6.
  • further, the control device can calculate, according to the pre-stored speed-of-sound data V, the distance D3 = V·ΔT3 between the first microphone and the fourth sounder and the distance D4 = V·ΔT4 between the second microphone and the fourth sounder.
  • the control device can calculate the position of the fourth sound generator according to the distance D, the distance D 3 and the distance D 4 , and the geometric relationship among the first microphone, the second microphone and the fourth sound generator.
  • the calculation process for determining the position of the fourth sound generator is similar to the process for determining the position of the first sound generator in Case 1, and reference may be made to relevant descriptions of the position calibration of the first microphone array in Case 1.
  • the control device controls the first sounder to emit a sound signal S5 and, based on the detection data of the third microphone array, determines the azimuth angle θ6 of the first sounder relative to the third microphone array; it controls the fourth sounder to emit a sound signal S6 and, based on the detection data of the first microphone array, determines the azimuth angle θ7 of the fourth sounder relative to the first microphone array. Based on the azimuth angle θ6, the azimuth angle θ7 and the deflection angle θ5 of the first microphone array, the control device determines the deflection angle θ8 of the third microphone array and the camera.
  • the equivalent center of the third microphone array and the equivalent center of the camera may be the same, that is, the position of the third microphone array and the position of the camera may be the same; the deflection angle of the third microphone array and the deflection angle of the camera may be the same.
  • the equivalent center of the fourth sound generator and the equivalent center of the camera may be the same, that is, the positions of the fourth sound generator and the camera may be the same.
  • each microphone in the third microphone array can detect corresponding audio data, and the third microphone array sends the audio data to the control device.
  • the control device can perform sound source localization according to the audio data, and determine the azimuth ⁇ 6 of the first sound generator relative to the third microphone array.
  • the control device may also perform sound source localization according to the audio data detected by the microphones of the first microphone array, and determine the azimuth angle ⁇ 7 of the fourth sound generator relative to the first microphone array.
  • according to the azimuth angle θ6, the azimuth angle θ7, the deflection angle θ5, and the geometric relationship among the first sounder, the third microphone array and the fourth sounder, the deflection angle θ8 can be calculated, as sketched below.
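  • As a hedged sketch of this mutual-bearing idea: the first microphone array "sees" the camera's sounder at θ7 while the camera's third microphone array "sees" the first sounder at θ6, and the two sightlines are the same line traversed in opposite directions. The relation below assumes all angles are counterclockwise and both deflections are measured from the X axis; measuring the camera deflection from the second specified direction (the Y axis) instead only shifts the result by a constant offset.

```python
import math

def camera_deflection(theta5, theta6, theta7):
    """theta5: deflection of the first microphone array (vs. the X axis).
    theta7: azimuth of the camera's sounder seen from the first array.
    theta6: azimuth of the first array's sounder seen from the camera's array.
    The global bearing array->camera is theta5 + theta7; the reverse bearing
    camera->array is that plus pi, and it equals camera deflection + theta6."""
    return (theta5 + theta7 + math.pi - theta6) % (2 * math.pi)
```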
  • Case 2: as shown in Figure 12, the structure of the directing control system is similar to Case 1; the differences are that the camera may not integrate a third microphone array, and the first microphone array may also integrate an illuminator.
  • the illuminator can take many forms, such as an ordinary LED light source or an infrared LED light source, etc.
  • the control device may pre-record the position of the second microphone array, the deflection angle of the second microphone array, the position of the first microphone in the second microphone array, the position of the second microphone in the second microphone array, the position of the second sounder and the position of the third sounder, and may pre-record the first specified direction corresponding to the first microphone array and the second specified direction corresponding to the camera.
  • the control device establishes a plane Cartesian coordinate system in the horizontal plane with the position of the second microphone array as the origin of coordinates and with the reference direction of the second microphone array as the positive direction of the X-axis.
  • the first microphone and the second microphone in the second microphone array may be arranged symmetrically with respect to the central position on the conference terminal; the position coordinates of the first microphone in the second microphone array can be recorded as (0, -D/2), and the position coordinates of the second microphone as (0, D/2).
  • the second sound generator and the third sound generator are generally arranged symmetrically with respect to the central position on the conference terminal.
  • the position coordinates of the second sounder can be recorded as (0,-L/2), and the position coordinates of the third sounder can be recorded as (0, L/2).
  • it may be set and recorded that the first specified direction corresponding to the first microphone array is the positive direction of the X axis, and the second specified direction corresponding to the camera is the positive direction of the Y axis.
  • the position calibration of the first microphone array, the deflection angle calibration of the first microphone array, and the position calibration of the camera in Case 2 are similar to the corresponding processing in Case 1; reference may be made to the descriptions of Case 1, which will not be repeated here.
  • the deflection angle calibration of the camera in Case 2 differs from the corresponding processing in Case 1 and is described in detail below:
  • Deflection angle calibration of the camera (if there are multiple cameras, the deflection angle calibration of each camera can adopt the following processing method)
  • the deflection angle of the camera is an angle between the reference direction of the camera and the second specified direction, and the second specified direction may be the positive direction of the Y axis.
  • the control device controls the illuminator to emit light, determines the position of the luminous point in the image captured by the camera, and determines the azimuth angle θ9 of the illuminator relative to the camera based on the position of the luminous point in the image; the control device controls the fourth sounder to emit a sound signal S6 and, based on the detection data of the first microphone array, determines the azimuth angle θ7 of the fourth sounder relative to the first microphone array; the control device then determines the deflection angle θ8 of the camera based on the azimuth angle θ9, the azimuth angle θ7 and the included angle θ5 between the reference direction of the first microphone array and the first specified direction.
  • the equivalent center of the light emitter and the equivalent center of the first microphone array may be the same, that is, the positions of the light emitter and the first microphone array may be the same.
  • the equivalent center of the fourth sound generator and the equivalent center of the camera may be the same, that is, the positions of the fourth sound generator and the camera may be the same.
  • the control device may record the corresponding relationship between the focal length of the camera and the range of the horizontal shooting angle (also referred to as the horizontal field of view).
  • the corresponding relationship may be reported by the camera to the control device, or entered manually into the control device, and so on.
  • the control device can determine the current focal length of the camera, and then look up the horizontal shooting angle range γ4 corresponding to the current focal length in the above correspondence table.
  • after controlling the illuminator to emit light, the control device can acquire the image captured by the camera and determine, in the image, the distance L3 between the position of the luminous point and the longitudinal central axis of the image.
  • the distance L 4 between the left or right boundary of the image and the longitudinal central axis of the image may be recorded in the control device.
  • the real-time shooting direction of the camera corresponds to the longitudinal central axis of the image.
  • the horizontal shooting angle ⁇ 4 , the distance L 3 and the distance L 4 , the azimuth ⁇ 5 of the illuminator relative to the camera can be determined, and the azimuth ⁇ 5 is the counterclockwise direction from the real-time shooting direction of the camera to the line connecting the illuminator and the camera angle.
  • the calculation process can be as follows:
  • at this time, the control device can also acquire the current rotation angle γ6 of the camera.
  • according to the azimuth angle γ5 and the rotation angle γ6, the azimuth angle θ9 of the illuminator relative to the camera can be calculated; referring to Figure 14, with both angles measured counterclockwise this can be, for example, θ9 = γ5 + γ6.
  • the rotation angle γ6 is the rotation angle of the camera head relative to the fixed base; since the camera head generally rotates under the control of the control device, the rotation angle γ6 is known to the control device.
  • the control device can control the fourth sounder to emit the sound signal S6; when the fourth sounder emits the sound signal S6, each microphone in the first microphone array can detect corresponding audio data, and the first microphone array can send these audio data to the control device.
  • the control device can perform sound source localization according to the audio data, and determine the azimuth angle ⁇ 7 of the fourth sound generator relative to the first microphone array.
  • based on the azimuth angle θ9, the azimuth angle θ7 and the deflection angle θ5 of the first microphone array, as well as the geometric relationship among the first microphone array, the camera and the fourth sounder, the control device can calculate the deflection angle θ8 of the camera.
  • the value of θ8 can be normalized to the range of 0 to 2π; for example, if θ8 is 560°, it can be adjusted to 200° (i.e., 560° − 360°), as in the sketch below.
  • the broadcast control system may include a first microphone array, a second microphone array, and a camera. Both the first microphone array and the second microphone array are distributed microphone arrays, the first microphone array may be integrated with a first sound generator and a light emitter, and the second microphone array may be integrated with a fifth sound generator.
  • there may be one or more first microphone arrays.
  • the cameras may be distributed cameras and may be integrated with a fourth sounder; there may be one or more cameras.
  • the sound generator can have many possibilities, such as a normal speaker or an ultrasonic transmitter, etc.
  • the illuminator can have many possibilities, such as ordinary LED light source or infrared LED light source, etc.
  • the broadcast control system may also include a conference terminal, and the control device may be integrated in the conference terminal, or may be integrated in other devices, or may be an additional independent terminal device.
  • the positions of the above-mentioned devices in the conference room can be set arbitrarily.
  • the first microphone array and the second microphone array can be placed on the long table, and the two cameras can be hung on the walls on both sides of the long table.
  • the control device may pre-record the position of the second microphone array and the deflection angle of the second microphone array, and pre-record the first designated direction corresponding to the first microphone array, and the second designated direction corresponding to the camera.
  • the control device establishes a plane Cartesian coordinate system in the horizontal plane with the position of the second microphone array as the origin of coordinates and with the reference direction of the second microphone array as the positive direction of the X-axis. It may be set and recorded that the first specified direction corresponding to the first microphone array is the positive direction of the X axis, and the second specified direction corresponding to the camera is the positive direction of the Y axis.
  • Position calibration of the first microphone array (if there are multiple first microphone arrays, the position calibration of each first microphone array can adopt the following processing method)
  • the control device controls the first sounder to emit a sound signal S7. Based on the time point when the first sounder emits the sound signal S7 and the time point when the second microphone array detects the sound signal S7, the control device determines the distance D5 between the second microphone array and the first sounder; based on the detection data of the second microphone array, it determines the azimuth angle θ10 of the first sounder relative to the second microphone array. The control device then determines the positions of the first sounder and the first microphone array based on the distance D5, the azimuth angle θ10 and the position of the second microphone array.
  • the equivalent centers of the first sound generator and the first microphone array may be the same, that is, the positions of the first sound generator and the first microphone array may be the same.
  • when the control device controls the first sounder to emit the sound signal S7, it may record the time point t7 when the first sounder emits the sound signal S7.
  • the microphones of the second microphone array can detect corresponding audio data, and record the detection time point t 8 corresponding to the audio data, that is, the time point when the audio data is detected.
  • the control device can obtain the time point t7 when the first sounder emits the sound signal S7 and the time point t8 when the second microphone array detects the sound signal S7, and can then calculate the duration ΔT5 between time point t7 and time point t8.
  • further, the control device may calculate the distance D5 = V·ΔT5 between the second microphone array and the first sounder according to the pre-stored speed-of-sound data V.
  • the second microphone array can send the audio data corresponding to the sound signal S7 to the control device.
  • the control device can perform sound source localization according to the audio data, and determine the azimuth angle ⁇ 10 of the first sound generator relative to the second microphone array.
  • the control device can calculate the position of the first sound generator according to the distance D 5 , the azimuth angle ⁇ 10 , the position of the second microphone array, and the geometric relationship between the first sound generator and the second microphone array.
  • with the coordinates of the first sounder expressed as (x1, y1) and referring to Fig. 17, the calculation can be as follows: since the second microphone array lies at the origin with its reference direction along the positive X axis, x1 = D5·cosθ10 and y1 = D5·sinθ10, as sketched below.
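  • A one-step sketch of this distance-plus-azimuth localization in the coordinate frame defined above (second microphone array at the origin, reference direction along the positive X axis); names are illustrative.

```python
import math

V = 343.0  # assumed speed-of-sound data V, in m/s

def locate_from_distance_and_azimuth(t7, t8, theta10):
    """t7: emission time of S7; t8: detection time at the second array;
    theta10: azimuth of the first sounder relative to the second array, CCW
    from its reference direction (here, the positive X axis)."""
    d5 = V * (t8 - t7)                      # distance D5
    return (d5 * math.cos(theta10), d5 * math.sin(theta10))

print(locate_from_distance_and_azimuth(0.0, 0.01, math.radians(30)))
```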
  • Deflection angle calibration of the first microphone array (if there are multiple first microphone arrays, the deflection angle calibration of each first microphone array can adopt the following processing method)
  • the control device controls the first sounder to emit a sound signal S7 and, based on the detection data of the second microphone array, determines the azimuth angle θ10 of the first sounder relative to the second microphone array; it controls the fifth sounder to emit a sound signal S8 and, based on the detection data of the first microphone array, determines the azimuth angle θ11 of the fifth sounder relative to the first microphone array. The control device then determines the deflection angle θ5 of the first microphone array based on the azimuth angle θ10, the azimuth angle θ11 and the included angle θ12 between the reference direction of the second microphone array and the first specified direction.
  • the microphones of the second microphone array can detect corresponding audio data, and the second microphone array can send the audio data to the control device.
  • the control device can perform sound source localization according to the audio data, and determine the azimuth angle ⁇ 10 of the first sound generator relative to the second microphone array.
  • the control device may also perform sound source localization according to the audio data detected by the microphones of the first microphone array, and determine the azimuth angle ⁇ 11 of the fifth sound generator relative to the first microphone array.
  • the control device can determine the deflection angle ⁇ 5 of the first microphone array according to the azimuth angle ⁇ 10 , the azimuth angle ⁇ 11 , and the geometric relationship between the second microphone array and the first microphone array.
  • the calculation process can be as follows:
  • ⁇ 5 ⁇ 11 - ⁇ 10 - ⁇
  • for the case where the included angle θ12 is not 0 degrees, the control device can determine the deflection angle θ5 of the first microphone array according to the azimuth angle θ10, the azimuth angle θ11, the included angle θ12, and the geometric relationship between the second microphone array and the first microphone array. The calculation process can be as follows: θ5 = θ12 + θ11 − θ10 − π
  • the control device controls the fourth sounder to emit a sound signal S9. Based on the time point when the fourth sounder emits the sound signal S9 and the time points when the first microphone array and the second microphone array detect the sound signal S9, the control device determines the distance D6 between the first microphone array and the fourth sounder and the distance D7 between the second microphone array and the fourth sounder, and then determines the positions of the fourth sounder and the camera based on the position of the first microphone array, the position of the second microphone array, the distance D6 and the distance D7.
  • the equivalent centers of the fourth sound generator and the camera may be the same, that is, the positions of the fourth sound generator and the camera may be the same.
  • the control device when the control device controls the fourth sounder to emit the sound signal S 9 , it may record the time point t 9 when the fourth sounder emits the sound signal S 9 .
  • the first microphone array and the second microphone array can detect corresponding audio data, and record the detection time point corresponding to the audio data, that is, the time point when the audio data is detected.
  • the control device can obtain the time point t10 when the first microphone array detects the sound signal S9 and the time point t11 when the second microphone array detects the sound signal S9, and can then calculate the duration ΔT6 between time point t9 and time point t10 and the duration ΔT7 between time point t9 and time point t11.
  • further, the control device can calculate, according to the pre-stored speed-of-sound data V, the distance D6 = V·ΔT6 between the first microphone array and the fourth sounder and the distance D7 = V·ΔT7 between the second microphone array and the fourth sounder.
  • the control device can calculate the position of the fourth sounder according to the distance D6, the distance D7 and the distance D8 between the first microphone array and the second microphone array, as well as the geometric relationship among the first microphone array, the second microphone array and the fourth sounder.
  • the calculation process for determining the position of the fourth sound generator is similar to the process for determining the position of the first sound generator in Case 1, and reference may be made to the relevant description of the position calibration of the first microphone array in Case 1.
  • the deflection angle calibration of the camera in Case 3 is similar to the corresponding processing in Case 2; reference may be made to the description of the camera deflection angle calibration in Case 2, which will not be repeated here.
  • an embodiment of this application further provides an apparatus for directing control, which can be applied to the control device in the directing control system mentioned in the above embodiments.
  • the directing control system includes a first microphone array, a second microphone array, a camera and the control device; as shown in Figure 20, the apparatus includes:
  • a calibration module 2001, configured to determine the position of the first microphone array and the position of the camera; it can specifically implement the calibration function of step 401 above and other implicit steps.
  • a determining module 2002, configured to, when a sound source object makes a sound, determine the position of the sound source object according to the position of the sound source object relative to the first microphone array, the position of the sound source object relative to the second microphone array, the position of the first microphone array and the position of the second microphone array; it can specifically implement the determining function of step 402 above and other implicit steps.
  • a control module 2003, configured to determine the directing operation on the camera based on the position of the sound source object and the position of the camera; it can specifically implement the control function of step 403 above and other implicit steps.
  • the first microphone array is integrated with a first sound generator
  • the second microphone array includes a first microphone and a second microphone
  • the calibration module 2001 is configured to: determine the distance D1 between the first sounder and the first microphone and the distance D2 between the first sounder and the second microphone based on the time when the first microphone and the second microphone receive the sound signal emitted by the first sounder and the time when the first sounder emits the sound signal; and determine the position of the first microphone array relative to the second microphone array based on the position of the first microphone, the position of the second microphone, the distance D1 and the distance D2.
  • the broadcast control system further includes a second sounder and a third sounder, and the second sounder and the third sounder are integrated with the second microphone array on the same electronic screen
  • the calibration module 2001 is further configured to: obtain the azimuth angle θ3 of the second sounder relative to the first microphone array and the azimuth angle θ4 of the third sounder relative to the first microphone array sent by the first microphone array; and determine the orientation of the first microphone array based on the azimuth angle θ3, the azimuth angle θ4, the position of the second sounder and the position of the third sounder.
  • the camera is integrated with a fourth sounder
  • the second microphone array includes a first microphone and a second microphone
  • the calibration module 2001 is configured to: determine the distance D3 between the first microphone and the fourth sounder and the distance D4 between the second microphone and the fourth sounder based on the time when the first microphone and the second microphone receive the sound signal emitted by the fourth sounder and the time when the fourth sounder emits the sound signal; and determine the position of the camera relative to the second microphone array based on the position of the first microphone, the position of the second microphone, the distance D3 and the distance D4.
  • the first microphone array is integrated with a first sounder
  • the camera is integrated with a fourth sounder and a third microphone array
  • the calibration module 2001 is configured to: determine the azimuth angle θ6 of the first sounder relative to the third microphone array based on the detection data of the third microphone array when the first sounder emits a sound signal; determine the azimuth angle θ7 of the fourth sounder relative to the first microphone array based on the detection data of the first microphone array when the fourth sounder emits a sound signal; and determine the deflection angle of the camera based on the azimuth angle θ6, the azimuth angle θ7 and the orientation of the first microphone array.
  • the first microphone array is integrated with a light emitter
  • the camera is integrated with a fourth sound emitter
  • the calibration module 2001 is configured to: determine the position of the luminous point in an image captured by the camera, the image being taken when the light emitter emits light; determine the azimuth angle θ9 of the light emitter relative to the camera based on the position of the luminous point in the image and the rotation angle of the camera; determine the azimuth angle θ7 of the fourth sounder relative to the first microphone array based on the detection data of the first microphone array when the fourth sounder emits a sound signal; and determine the orientation of the camera based on the azimuth angle θ9, the azimuth angle θ7 and the orientation of the first microphone array.
  • the first microphone array is integrated with a first sound generator
  • the second microphone array includes a first microphone and a second microphone
  • the calibration module 2001 is configured to: determine the distance D5 between the first sounder and the second microphone array and the azimuth angle θ10 of the first sounder relative to the second microphone array based on the detection data of the second microphone array when the first sounder emits a sound signal; and determine the position of the first microphone array based on the distance D5, the azimuth angle θ10 and the position of the second microphone array.
  • the first microphone array is integrated with a first sound generator
  • the second microphone array is integrated with a fifth sound generator
  • the calibration module 2001 is configured to: determine the azimuth angle θ10 of the first sounder relative to the second microphone array based on the detection data of the second microphone array when the first sounder emits a sound signal, and determine the azimuth angle θ11 of the fifth sounder relative to the first microphone array based on the detection data of the first microphone array when the fifth sounder emits a sound signal; and determine the orientation of the first microphone array based on the azimuth angle θ10, the azimuth angle θ11 and the orientation of the second microphone array.
  • the camera is integrated with a fourth sounder
  • the calibration module 2001 is further configured to: determine the distance D6 between the first microphone array and the fourth sounder and the distance D7 between the second microphone array and the fourth sounder based on the time when the first microphone array and the second microphone array receive the sound signal emitted by the fourth sounder and the time when the fourth sounder emits the sound signal; and determine the position of the camera based on the position of the first microphone array, the position of the second microphone array, the distance D6 and the distance D7.
  • the control module 2003 is configured to: determine the azimuth angle of the sound source object relative to the camera and the distance between the sound source object and the camera based on the position of the sound source object and the position of the camera; determine the directing rotation angle of the camera based on the azimuth angle of the sound source object relative to the camera; and determine the directing focal length of the camera based on the distance between the sound source object and the camera.
  • the directing control system further includes another camera; the control module 2003 is configured to: determine, based on the position of the sound source object and the positions of the two cameras, the target camera of the two that is farther away from the sound source object, and determine the directing operation for the target camera based on the position of the sound source object and the position of the target camera.
  • the above-mentioned calibration module 2001, determination module 2002 and control module 2003 may be implemented by a processor, or implemented by a processor together with a memory and a transceiver.
  • it should be noted that, when the directing control apparatus provided by the above embodiments executes the directing control process, the division into the above functional modules is used only as an example for illustration. In practical applications, the above functions can be assigned to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
  • the directing control apparatus provided by the above embodiments and the embodiments of the directing control method belong to the same concept; for its specific implementation process, refer to the method embodiments, which will not be repeated here.
  • all or part may be realized by software, hardware, firmware or any combination thereof, and when software is used, all or part may be realized in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the device, all or part of the processes or functions according to the embodiments of the present application will be generated.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, radio, microwave) means.
  • the computer-readable storage medium may be any available medium that can be accessed by the device, or a data storage device such as a server or a data center integrated with one or more available media.
  • the available medium may be a magnetic medium (such as a floppy disk, a hard disk, and a magnetic tape, etc.), an optical medium (such as a digital video disk (DVD), etc.), or a semiconductor medium (such as a solid-state hard disk, etc.).
  • all or part of the steps may be completed by a program instructing relevant hardware; the program can be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.


Abstract

The embodiments of this application disclose a method, apparatus, storage medium and computer program product for directing control, belonging to the field of communication technology. Applied to a directing control system, the method includes: when a sound source object makes a sound, a control device determines the azimuth angle θ1 of the sound source object relative to a first microphone array based on detection data of the first microphone array, and determines the azimuth angle θ2 of the sound source object relative to a second microphone array based on detection data of the second microphone array; the control device determines the position of the sound source object based on the azimuth angle θ1, the azimuth angle θ2, the position of the first microphone array and the position of the second microphone array; and the control device controls a camera to shoot the sound source object based on the position of the sound source object, obtaining a directing video image. With this application, the speaker can be accurately identified, thereby improving the accuracy of automatic directing.

Description

Method, apparatus, storage medium and computer program product for directing control
This application claims priority to Chinese patent application No. 202111415949.4, entitled "Hardware apparatus and method for distributed directing", filed on November 25, 2021, and to Chinese patent application No. 202210119348.7, entitled "Method, apparatus, storage medium and computer program product for directing control", filed on February 8, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of communication technology, and in particular to a method, apparatus, storage medium and computer program product for directing control.
Background
Directing refers to controlling cameras during video shooting, based on real-time shooting requirements, to shoot key objects (people or things) in a scene and output video images. For example, in a video conference, a camera can be controlled to shoot the current speaker; when the speaker changes, the camera can be controlled to shoot the new speaker. During directing, in order to obtain video images containing the key object, the shooting direction of a camera can be adjusted, video images can be selected among multiple cameras, and partial crops can be taken from a video image.
At present, with the development of computer technology, automatic directing has developed rapidly and is gradually replacing manual directing. In general, the processing of automatic directing is as follows: a control device recognizes the video images shot by a camera in real time, identifies an object with specified features in the images (i.e., the above key object), and controls the camera to shoot that object. For example, in a conference scene, the control device can identify a person in the real-time video images who is standing or has mouth movements (is speaking), determine that person as the speaker, and then control the camera to shoot a close-up of the speaker for playback.
However, automatic directing methods in the prior art have obvious limitations, and the directing accuracy is sometimes poor.
Summary
The embodiments of this application provide a directing control method, which can solve the problem of poor directing accuracy in the prior art. The technical solutions are as follows:
In a first aspect, a method for directing control is provided. The method is applied to a directing control system, and the directing control system includes a first microphone array, a second microphone array, a camera and a control device. The method includes: the control device determines the position of the first microphone array and the position of the camera; when a sound source object makes a sound, the control device determines the position of the sound source object according to the position of the sound source object relative to the first microphone array, the position of the sound source object relative to the second microphone array, the position of the first microphone array and the position of the second microphone array; and the control device determines a directing operation on the camera based on the position of the sound source object and the position of the camera.
When the speaker is talking, each microphone in the first microphone array can detect corresponding audio data, and the first microphone array sends the audio data to the control device. The control device can perform sound source localization based on the audio data and determine the azimuth angle θ1 of the speaker relative to the first microphone array; the algorithm used in the sound source localization process can be a steered-response power (SRP) algorithm, etc. Likewise, the control device can also perform sound source localization based on the audio data detected by the microphones of the second microphone array and determine the azimuth angle θ2 of the speaker relative to the second microphone array.
For the case where the deflection angles of the first microphone array and the second microphone array are both 0 degrees, the control device can calculate the position of the speaker according to the azimuth angle θ1, the azimuth angle θ2, the position of the first microphone array and the position of the second microphone array, and the geometric relationship among the first microphone array, the second microphone array and the speaker.
For the case where the deflection angles of the first microphone array and the second microphone array are not 0 degrees, the control device can calculate the position of the speaker according to the deflection angle γ1 of the first microphone array, the deflection angle γ2 of the second microphone array, the azimuth angle θ1, the azimuth angle θ2, the position of the first microphone array and the position of the second microphone array, and the geometric relationship among the first microphone array, the second microphone array and the speaker.
After the control device determines the position of the speaker, it can calculate the azimuth angle of the speaker relative to the camera and the distance between the speaker and the camera based on the position of the speaker and the position of the camera. The distance is the plane equivalent distance, that is, the projection distance in the horizontal plane between the equivalent center of the camera and the equivalent center of the speaker.
The directing rotation angle of the camera can be determined based on the azimuth angle of the speaker relative to the camera. The camera can include a rotatable camera head and a fixed base; the camera head can rotate relative to the fixed base, and an initial shooting direction can be specified for the camera head, which can be the same as the camera head's reference direction. The directing rotation angle can be the angle of the camera head's real-time shooting direction relative to the initial shooting direction; the initial shooting direction can be regarded as the 0-degree direction, and the directing rotation angle can be the same as the azimuth angle of the speaker relative to the camera.
After the distance between the speaker and the camera is determined, the directing focal length of the camera can be determined based on the distance. The control device may query a pre-stored first correspondence table to determine the directing focal length corresponding to the distance; the first correspondence table may record the correspondence between the distance of the speaker relative to the camera and the focal length of the camera.
For the case where the deflection angle of the camera is 0 degrees, the control device can determine the directing rotation angle and directing focal length of the camera according to the position of the speaker and the position of the camera, and can thus control the camera to rotate to the directing rotation angle and shoot at the directing focal length.
For the case where the deflection angle of the camera is not 0 degrees, the control device can determine the directing rotation angle and directing focal length of the camera according to the deflection angle of the camera, the position of the speaker and the position of the camera, and can thus control the camera's pan-tilt to rotate to the directing rotation angle and control the camera to shoot at the directing focal length.
It should be noted that, in the above example of the directing control system, multiple cameras can be added and arranged at different positions to better shoot the conference participants.
For the case where there are at least two cameras in the directing control system, the control device can determine, based on the position of the speaker and the positions of the two cameras, the target camera of the two that is farther away from the speaker, and determine the directing operation for the target camera based on the position of the speaker and the position of the target camera.
The control device can control multiple cameras to shoot the sound source object based on the position of the sound source object and the positions of the multiple cameras, obtaining multiple video images. Image recognition can then be performed on the obtained video images, and a video image satisfying a target condition can be selected as the directing video image. There can be various target conditions; for example, the video image in which the face angle is closest to frontal can be selected as the directing video image, and the face angle in a video image can be determined using a machine learning model for face angle detection.
In the solution shown in the embodiments of this application, as long as the sound source object is making a sound, it can be located based on the sound. This avoids the requirement, when locating a sound source object based on image recognition, that the speaker must have obvious movements (such as obvious mouth movements); thus, the limitations of prior-art automatic directing methods based on image recognition are overcome, and the accuracy of directing is improved.
In a possible implementation, the first microphone array is integrated with a first sounder, and the second microphone array includes a first microphone and a second microphone. The control device determines the distance D1 between the first sounder and the first microphone and the distance D2 between the first sounder and the second microphone based on the time when the first microphone and the second microphone receive the sound signal emitted by the first sounder and the time when the first sounder emits the sound signal; the control device determines the position of the first microphone array relative to the second microphone array based on the position of the first microphone, the position of the second microphone, the distance D1 and the distance D2.
The equivalent centers of the first sounder and the first microphone array may be the same, that is, the position of the first sounder and the position of the first microphone array may be the same. The position of the first microphone array relative to the second microphone array may be the position of the first sounder in the first microphone array relative to the second microphone array. In a specific implementation, a coordinate system can be used to determine the position; for example, when the origin of the coordinate system is set at the center of the second microphone array, the coordinates of the first microphone array reflect the position of the first microphone array relative to the second microphone array.
There are multiple ways to obtain the time at which the first sounder emits the sound signal; the time at which a sounder emits a sound signal in subsequent processing can refer to the explanation here.
Way 1: the first sounder can be set to emit a sound signal every time it is powered on, and the control device can obtain the power-on time of the first sounder as the time at which the first sounder emits the sound signal.
Way 2: the control device instructs the first sounder to emit a sound signal; when the first sounder emits the sound signal, it can record the time at which the sound signal is emitted and then send this time to the control device.
When the control device controls the first sounder to emit the sound signal S1, the first sounder sends the time point t1 at which it emits the sound signal S1 to the control device for recording. Each microphone in the second microphone array can receive the sound signal, record the time point at which it detects the sound signal, and send it to the control device. The control device can obtain the time point t2 when the first microphone in the second microphone array detects the sound signal S1 and the time point t3 when the second microphone in the second microphone array detects the sound signal S1, and can then calculate the duration ΔT1 between time point t1 and time point t2 and the duration ΔT2 between time point t1 and time point t3. Further, the control device can calculate the distance D1 between the first microphone and the first sounder and the distance D2 between the second microphone and the first sounder according to the pre-stored speed-of-sound data V.
According to the positions of the first microphone and the second microphone, the distance between the first microphone and the second microphone can be determined as D. Then, the control device can calculate the position of the first sounder according to the distance D, the distance D1 and the distance D2, and the geometric relationship among the first microphone, the second microphone and the first sounder.
In the solution shown in this embodiment of the application, the distance D1 between the first sounder and the first microphone and the distance D2 between the first sounder and the second microphone are determined based on the time when the first microphone and the second microphone receive the sound signal emitted by the first sounder and the time when the first sounder emits the sound signal, and the position of the first microphone array relative to the second microphone array is then determined based on the position of the first microphone, the position of the second microphone, the distance D1 and the distance D2. In this way, device parameters do not need to be calibrated manually, which improves the convenience of calibrating device parameters.
In a possible implementation, the directing control system further includes a second sounder and a third sounder, and the second sounder and the third sounder are integrated with the second microphone array on the same electronic screen. The control device obtains the azimuth angle θ3 of the second sounder relative to the first microphone array and the azimuth angle θ4 of the third sounder relative to the first microphone array sent by the first microphone array; the control device determines the orientation of the first microphone array based on the azimuth angle θ3, the azimuth angle θ4, the position of the second sounder and the position of the third sounder.
The positions of the second sounder and the third sounder may be preset, and the control device may pre-store the position of the second sounder and the position of the third sounder without needing to obtain them from the microphone array. The orientation of a device refers to the direction its reference direction points toward, and can be expressed as the angle between the device's reference direction and a specified direction (that is, the device's deflection angle); the specified direction can be the X-axis or Y-axis direction.
When the second sounder emits the sound signal S2, each microphone in the first microphone array can detect corresponding audio data, and the first microphone array sends the audio data to the control device. The control device can perform sound source localization based on the audio data and determine the azimuth angle θ3 of the second sounder relative to the first microphone array. Likewise, when the third sounder emits a sound, the control device can perform sound source localization based on the audio data detected by the microphones of the first microphone array and determine the azimuth angle θ4 of the third sounder relative to the first microphone array. The calculation principle of the azimuth angle, that is, the SRP algorithm mentioned above, is introduced here; the calculation formula of the algorithm is as follows:
Y(θ) = Σ_k |s(θ)^H X(k)|², with X(k) = [X_1(k), …, X_M(k)]^T
X_m(k) represents the fast Fourier transform (FFT) value of the m-th microphone in frequency band k, and s(θ) represents the steering vector corresponding to a sound source at angle θ in the two-dimensional spatial plane. The steering vector can be calculated in advance according to the layout of the microphones inside the microphone array and the angle search range (set manually; the angle range over which the maximum extreme point is subsequently determined). Taking a linear layout of the microphones in the microphone array as an example, the calculation formula of the steering vector is:
s(θ) = [e^(−j2πf_k d_1 cosθ/c), …, e^(−j2πf_k d_M cosθ/c)]^T, where c is the speed of sound
We select the first microphone as the reference microphone; d_m cosθ represents the path difference between the sound source's arrival at the m-th microphone and at the reference microphone. For single-sound-source localization, with θ restricted to the angle search range, the angle θ corresponding to the maximum extreme point of Y(θ) is the azimuth angle of the sound source object.
The control device can determine the distance L between the second sounder and the third sounder according to the position coordinates of the second sounder and the position coordinates of the third sounder. The control device can then determine the deflection angle θ5 of the first microphone array by calculation based on the azimuth angle θ3, the azimuth angle θ4, the position of the second sounder, the position of the third sounder, and the positional relationship among the first microphone array, the second sounder and the third sounder.
In the solution shown in this embodiment of the application, the azimuth angle θ3 of the second sounder relative to the first microphone array and the azimuth angle θ4 of the third sounder relative to the first microphone array sent by the first microphone array are first obtained, and the orientation of the first microphone array is then determined based on the azimuth angle θ3, the azimuth angle θ4, the position of the second sounder and the position of the third sounder. In this way, device parameters do not need to be calibrated manually, which improves the convenience of calibrating device parameters.
In a possible implementation, the camera is integrated with a fourth sounder, and the second microphone array includes a first microphone and a second microphone. The control device determines the distance D3 between the first microphone and the fourth sounder and the distance D4 between the second microphone and the fourth sounder based on the time when the first microphone and the second microphone receive the sound signal emitted by the fourth sounder and the time when the fourth sounder emits the sound signal; the control device determines the position of the camera relative to the second microphone array based on the position of the first microphone, the position of the second microphone, the distance D3 and the distance D4.
The equivalent centers of the fourth sounder and the camera may be the same, that is, the position of the fourth sounder and the position of the camera may be the same.
When the control device controls the fourth sounder to emit the sound signal S4, it may record the time point t4 when the fourth sounder emits the sound signal S4. Each microphone in the second microphone array can detect corresponding audio data and record the detection time point corresponding to the audio data, that is, the time point at which the audio data is detected. The control device can obtain the time point t5 when the first microphone in the second microphone array detects the sound signal S4 and the time point t6 when the second microphone in the second microphone array detects the sound signal S4, and can then calculate the duration ΔT3 between time point t4 and time point t5 and the duration ΔT4 between time point t4 and time point t6. Further, the control device can calculate the distance D3 between the first microphone and the fourth sounder and the distance D4 between the second microphone and the fourth sounder according to the pre-stored speed-of-sound data V.
According to the positions of the first microphone and the second microphone, the distance between them can be determined as D. Then, the control device can calculate the position of the fourth sounder according to the distance D, the distance D3 and the distance D4, and the geometric relationship among the first microphone, the second microphone and the fourth sounder.
In the solution shown in this embodiment of the application, the distance D3 between the first microphone and the fourth sounder and the distance D4 between the second microphone and the fourth sounder are first determined based on the time when the first microphone and the second microphone receive the sound signal emitted by the fourth sounder and the time when the fourth sounder emits the sound signal, and the position of the camera relative to the second microphone array is then determined based on the position of the first microphone, the position of the second microphone, the distance D3 and the distance D4. In this way, device parameters do not need to be calibrated manually, which improves the convenience of calibrating device parameters.
在一种可能的实现方式中,第一麦克风阵列集成有第一发声器,摄像机集成有第四发声器和第三麦克风阵列,控制设备基于第三麦克风阵列在第一发声器发出声音信号时的检测数据,确定第一发声器相对于第三麦克风阵列的方位角θ 6,基于第一麦克风阵列在第四发声器发出声音信号时的检测数据,确定第四发声器相对于第一麦克风阵列的方位角θ 7;控制设备基于方位角θ 6、方位角θ 7和第一麦克风阵列的方位,确定摄像机的偏角。
其中，第一麦克风阵列的方位可以是人工测量并存储到控制设备中的，也可以是通过参数标定过程测定的。第三麦克风阵列的等效中心和摄像机的等效中心可以相同，即第三麦克风阵列的位置和摄像机的位置可以相同。第三麦克风阵列的偏角和摄像机的偏角可以相同。第四发声器的等效中心和摄像机的等效中心可以相同，即第四发声器的位置和摄像机的位置可以相同。
第一发声器发出声音信号S5时，第三麦克风阵列中的每个麦克风可以检测到相应的音频数据，第三麦克风阵列将这些音频数据发送至控制设备。控制设备可以根据这些音频数据进行声源定位，确定第一发声器相对于第三麦克风阵列的方位角θ6。同样地，第四发声器发声时，控制设备也可以根据第一麦克风阵列的麦克风检测到的音频数据进行声源定位，确定第四发声器相对于第一麦克风阵列的方位角θ7。根据方位角θ6、方位角θ7、偏角θ5，以及第一发声器、第三麦克风阵列和第四发声器之间的几何关系，可以计算得到第三麦克风阵列和摄像机的偏角θ8。
本申请实施例所示的方案,首先基于第三麦克风阵列在第一发声器发出声音信号时的检测数据,确定第一发声器相对于第三麦克风阵列的方位角θ 6,基于第一麦克风阵列在第四发声器发出声音信号时的检测数据,确定第四发声器相对于第一麦克风阵列的方位角θ 7,然后基于方位角θ 6、方位角θ 7和第一麦克风阵列的方位,确定摄像机的偏角。这样,无需人工标定设备参数,从而,提高了标定设备参数的便捷性。
在一种可能的实现方式中,第一麦克风阵列集成有发光器,摄像机集成有第四发声器,控制设备确定摄像机拍摄的图像中的发光点位置,图像是发光器发光时拍摄的,基于图像中 的发光点位置以及摄像机的旋转角,确定发光器相对于摄像机的方位角θ 9;控制设备基于第一麦克风阵列在第四发声器发出声音信号时的检测数据,确定第四发声器相对于第一麦克风阵列的方位角θ 7;控制设备基于方位角θ 9、方位角θ 7和第一麦克风阵列的方位,确定摄像机的方位。
其中,第一麦克风阵列的方位是第一麦克风阵列的基准方向相对于第一指定方向的角度,第一指定方向可以是X轴正向,或者其他指定的方向。摄像机的方位是摄像机的基准方向相对于第二指定方向的角度,第二指定方向可以是Y轴正向。发光器的等效中心与第一麦克风阵列的等效中心可以相同,即发光器的位置与第一麦克风阵列的位置可以相同。第四发声器的等效中心和摄像机的等效中心可以相同,即第四发声器的位置和摄像机的位置可以相同。
控制设备可以记录有摄像机焦距与水平拍摄角度范围（又称为水平视场角）的对应关系。该对应关系可以是摄像机上报给控制设备的，也可以是人工录入控制设备的，等等。控制设备可以确定摄像机当前的焦距，然后在上述对应关系表中查找当前的焦距对应的水平拍摄角度范围γ4。控制设备在控制发光器发光之后，可以获取摄像机拍摄的图像，并在图像中确定发光点位置与图像纵向中轴线的距离L3。控制设备中可以记录有图像左侧或右侧边界与图像纵向中轴线的距离L4。摄像头的实时拍摄方向对应于图像的纵向中轴线。根据水平拍摄角度范围γ4、距离L3和距离L4，可以确定发光器相对于摄像头的方位角γ5，方位角γ5是从摄像头的实时拍摄方向到发光器与摄像头的连线的逆时针夹角。此时控制设备还可以获取摄像机当前的旋转角γ6。根据方位角γ5和旋转角γ6，可以计算得到发光器相对于摄像机的方位角θ9。旋转角γ6是摄像机的摄像头相对于固定底座的旋转角度，一般摄像头是在控制设备的控制下转动的，所以控制设备是已知该旋转角γ6的。需要说明的是，旋转角并非计算摄像机的方位的必要参数，在其他可能的情况中，也可以不使用旋转角而计算得到摄像机的方位。
控制设备可以控制第四发声器发出声音信号S 6,第四发声器发出声音信号S 6时,第一麦克风阵列中的每个麦克风可以检测到相应的音频数据,第一麦克风阵列可以将这些音频数据发送至控制设备。控制设备可以根据这些音频数据进行声源定位,确定第四发声器相对于第一麦克风阵列的方位角θ 7
控制设备基于方位角θ 9、方位角θ 7和第一麦克风阵列的偏角θ 5,以及第一麦克风阵列、摄像机和第四发声器之间的几何关系,可以计算得到摄像机的偏角θ 8
本申请实施例所示的方案,首先确定摄像机拍摄的图像中的发光点位置,基于图像中的发光点位置以及摄像机的旋转角,确定发光器相对于摄像机的方位角θ 9,然后基于第一麦克风阵列在第四发声器发出声音信号时的检测数据,确定第四发声器相对于第一麦克风阵列的方位角θ 7,进而基于方位角θ 9、方位角θ 7和第一麦克风阵列的方位,确定摄像机的方位。这样,无需人工标定设备参数,从而,提高了标定设备参数的便捷性。
在一种可能的实现方式中,第一麦克风阵列集成有第一发声器,第二麦克风阵列包括第一麦克风和第二麦克风,控制设备基于第二麦克风阵列在第一发声器发出声音信号时的检测数据确定第一发声器与第二麦克风阵列之间的距离D 5以及第一发声器相对于第二麦克风阵列的方位角θ 10;控制设备基于距离D 5、方位角θ 10和第二麦克风阵列的位置,确定第一麦克风阵列的位置。
其中,第一发声器和第一麦克风阵列的等效中心可以相同,即第一发声器的位置和第一麦克风阵列的位置可以相同。
控制设备控制第一发声器发出声音信号S7时，可以记录第一发声器发出声音信号S7的时间点t7。第二麦克风阵列的麦克风可以检测到相应的音频数据，并记录有音频数据对应的检测时间点t8，即检测到该音频数据的时间点。控制设备可以获取第一发声器发出声音信号S7的时间点t7、以及第二麦克风阵列检测到声音信号S7的时间点t8，然后，可以计算得到时间点t7与时间点t8之间的时长ΔT5。进而，控制设备可以根据预先存储的音速数据V，计算得到第二麦克风阵列与第一发声器的距离D5。
同时,第二麦克风阵列可以将声音信号S 7对应的音频数据发送至控制设备。控制设备可以根据这些音频数据进行声源定位,确定第一发声器相对于第二麦克风阵列的方位角θ 10
控制设备可以根据距离D 5、方位角θ 10和第二麦克风阵列的位置,以及第一发声器与第二麦克风阵列的几何关系,计算得到第一发声器的位置。
本申请实施例所示的方案,首先基于第二麦克风阵列在第一发声器发出声音信号时的检测数据确定第一发声器与第二麦克风阵列之间的距离D 5以及第一发声器相对于第二麦克风阵列的方位角θ 10,然后基于距离D 5、方位角θ 10和第二麦克风阵列的位置,确定第一麦克风阵列的位置。这样,无需人工标定设备参数,从而,提高了标定设备参数的便捷性。
在一种可能的实现方式中,第一麦克风阵列集成有第一发声器,第二麦克风阵列集成有第五发声器,控制设备基于第二麦克风阵列在第一发声器发出声音信号时的检测数据,确定第一发声器相对于第二麦克风阵列的方位角θ 10,以及基于第一麦克风阵列在第五发声器发出声音信号时的检测数据,确定第五发声器相对于第一麦克风阵列的方位角θ 11;控制设备基于方位角θ 10、方位角θ 11和第二麦克风阵列的方位,确定第一麦克风阵列的方位。
第一发声器发出声音信号S 7时,第二麦克风阵列的麦克风可以检测到相应的音频数据,第二麦克风阵列可以将这些音频数据发送至控制设备。控制设备可以根据这些音频数据进行声源定位,确定第一发声器相对于第二麦克风阵列的方位角θ 10。同样地,第五发声器发声时,控制设备也可以根据第一麦克风阵列的麦克风检测到的音频数据进行声源定位,确定第五发声器相对于第一麦克风阵列的方位角θ 11
其中，θ12是第二麦克风阵列的基准方向与第一指定方向的夹角。对于θ12为0度的情况，控制设备可以根据方位角θ10、方位角θ11，以及第二麦克风阵列和第一麦克风阵列的几何关系，确定第一麦克风阵列的偏角θ5。对于θ12不为0度的情况，控制设备可以根据方位角θ10、方位角θ11、夹角θ12，以及第二麦克风阵列和第一麦克风阵列的几何关系，确定第一麦克风阵列的偏角θ5。
需要说明的是,第一麦克风阵列和第二麦克风阵列之间的位置关系有多种可能,上述说明过程仅以其中的一种位置关系为例进行说明,对于其他可能的位置关系,均可以根据上述相关数据通过几何运算得到第一麦克风阵列的偏角。
本申请实施例所示的方案，首先基于第二麦克风阵列在第一发声器发出声音信号时的检测数据，确定第一发声器相对于第二麦克风阵列的方位角θ10，以及基于第一麦克风阵列在第五发声器发出声音信号时的检测数据，确定第五发声器相对于第一麦克风阵列的方位角θ11，然后控制设备基于方位角θ10、方位角θ11和第二麦克风阵列的方位，确定第一麦克风阵列的方位。这样，无需人工标定设备参数，从而，提高了标定设备参数的便捷性。
在一种可能的实现方式中,摄像机集成有第四发声器,控制设备基于第一麦克风阵列和第二麦克风阵列接收到第四发声器发出的声音信号的时间和第四发声器发出声音信号的时间,确定第一麦克风阵列与第四发声器的距离D 6、以及第二麦克风阵列与第四发声器的距离D 7; 控制设备基于第一麦克风阵列的位置、第二麦克风阵列的位置、距离D 6和距离D 7,确定摄像机的位置。
其中,第四发声器和摄像机的等效中心可以相同,即第四发声器的位置和摄像机的位置可以相同。
控制设备控制第四发声器发出声音信号S9时，可以记录第四发声器发出声音信号S9的时间点t9。第一麦克风阵列和第二麦克风阵列可以检测到相应的音频数据，并记录有音频数据对应的检测时间点，即检测到该音频数据的时间点。控制设备可以获取第一麦克风阵列检测到声音信号S9的时间点t10、以及第二麦克风阵列检测到声音信号S9的时间点t11，然后，可以计算得到时间点t9与时间点t10之间的时长ΔT6、时间点t9与时间点t11之间的时长ΔT7。进而，控制设备可以根据预先存储的音速数据V，计算得到第一麦克风阵列与第四发声器的距离D6和第二麦克风阵列与第四发声器的距离D7。
根据第一麦克风阵列和第二麦克风阵列的位置,可以确定第一麦克风阵列和第二麦克风阵列之间的距离为D 8。然后,控制设备可以根据距离D 6、距离D 7和距离D 8,以及第一麦克风阵列、第二麦克风阵列和第四发声器之间的几何关系,通过计算得到第四发声器的位置。
本申请实施例所示的方案,首先基于第一麦克风阵列和第二麦克风阵列接收到第四发声器发出的声音信号的时间和第四发声器发出声音信号的时间,确定第一麦克风阵列与第四发声器的距离D 6、以及第二麦克风阵列与第四发声器的距离D 7,然后基于第一麦克风阵列的位置、第二麦克风阵列的位置、距离D 6和距离D 7,确定摄像机的位置。这样,无需人工标定设备参数,从而,提高了标定设备参数的便捷性。
在一种可能的实现方式中，控制设备基于声源对象的位置和摄像机的位置，确定声源对象相对于摄像机的方位角、以及声源对象与摄像机的距离；控制设备基于声源对象相对于摄像机的方位角，确定摄像机的导播旋转角，并基于声源对象与摄像机的距离，确定摄像机的导播焦距。
可以基于发言人的位置和摄像机的位置,计算发言人相对于摄像机的方位角以及发言人与摄像机的距离。该距离是平面等效距离,也即摄像机的等效中心和发言人的等效中心在平面内的投影距离。
然后,可以基于发言人相对于摄像机的方位角,确定摄像机的导播旋转角。摄像机可以包括可旋转摄像头和固定底座,摄像头可以相对于固定底座进行旋转,可以为摄像头指定一个初始拍摄方向,初始拍摄方向和摄像头的基准方向可以相同,该导播旋转角可以是摄像头实时的拍摄方向相对于初始拍摄方向的角度,初始拍摄方向可以认为是0度方向,导播旋转角和发言人相对于摄像机的方位角可以相同。
在确定发言人相对于摄像机的距离之后,可以基于该距离,确定摄像机的导播焦距。控制设备可以查询预先存储的第一对应关系表,确定该距离对应的导播焦距。第一对应关系表中可以记录有发言人相对于摄像机的距离和摄像机焦距的对应关系。
本申请实施例所示的方案，首先基于声源对象的位置和摄像机的位置，确定声源对象相对于摄像机的方位角、以及声源对象与摄像机的距离，然后基于声源对象相对于摄像机的方位角，确定摄像机的导播旋转角，并基于声源对象与摄像机的距离，确定摄像机的导播焦距。这样，无需人工确定导播参数，从而，提高了导播过程的便捷性。
在一种可能的实现方式中,第一发声器发出声音信号的时间是第一发声器上电的时间。
在一种可能的实现方式中,导播控制系统还包括另一摄像机;控制设备基于声源对象的位置和两个摄像机的位置,确定两个摄像机中与声源对象距离较远的目标摄像机,基于声源对象的位置以及目标摄像机的位置,确定对目标摄像机的导播操作。
这种处理方式可以适用于如下场景:会议室中布置有长条桌,长条桌两侧布置有若干椅子,发言人面向长条桌坐在椅子上。长条桌两侧的墙壁上分别设置有一个摄像机。在这种场景下,对于分别布置在长条桌两侧的墙壁上的两个摄像机而言,距离发言人较远的摄像机能够更好地拍摄到发言人的人脸。
本申请实施例所示的方案,基于声源对象的位置和两个摄像机的位置,确定两个摄像机中与声源对象距离较远的目标摄像机,基于声源对象的位置以及目标摄像机的位置,确定对目标摄像机的导播操作。这样,能够在常规会议场景下更好地拍摄到发言人的人脸,提高了自动导播的准确度。
第二方面,提供了一种导播控制的装置,该装置包括一个或多个模块,该一个或多个模块用于实现第一方面及其可能的实现方式的方法。
第三方面,提供了一种计算机设备,计算机设备包括存储器和处理器,存储器用于存储计算机指令;处理器执行存储器存储的计算机指令,以使计算机设备执行第一方面及其可能的实现方式的方法。
第四方面,提供了一种计算机可读存储介质,计算机可读存储介质存储有计算机程序代码,当计算机程序代码被计算机设备执行时,计算机设备执行第一方面及其可能的实现方式的方法。
第五方面,提供了一种计算机程序产品,计算机程序产品包括计算机程序代码,在计算机程序代码被计算机设备执行时,计算机设备执行第一方面及其可能的实现方式的方法。
本申请实施例提供的技术方案带来的有益效果是:
本申请实施例所示的方案，只要声源对象在发声，就可以基于声音对其进行定位。这样，避免了在基于图像识别进行声源对象定位时要求发言者必须具有明显动作（如明显的嘴部动作）的问题，摆脱了现有技术中基于图像识别的自动导播方法的局限性，提高了导播的准确度。
附图说明
图1是本申请实施例提供的一种导播控制系统的示意图;
图2是本申请实施例提供的一种计算机设备的结构示意图;
图3是本申请实施例提供的一种导播控制系统的示意图;
图4是本申请实施例提供的一种导播控制方法的流程图;
图5是本申请实施例提供的一种处理示意图;
图6是本申请实施例提供的一种处理示意图;
图7是本申请实施例提供的一种处理示意图;
图8是本申请实施例提供的一种导播控制系统的示意图;
图9是本申请实施例提供的一种处理示意图;
图10是本申请实施例提供的一种处理示意图;
图11是本申请实施例提供的一种处理示意图;
图12是本申请实施例提供的一种导播控制系统的示意图;
图13是本申请实施例提供的一种处理示意图;
图14是本申请实施例提供的一种处理示意图;
图15是本申请实施例提供的一种处理示意图;
图16是本申请实施例提供的一种导播控制系统的示意图;
图17是本申请实施例提供的一种处理示意图;
图18是本申请实施例提供的一种处理示意图;
图19是本申请实施例提供的一种处理示意图;
图20是本申请实施例提供的一种导播控制的装置示意图。
具体实施方式
下面对本实施例中使用的一些名词进行解释。
基准方向:导播控制系统中的设备都可以是有向设备,有向设备具有基准方向,也可以称作设备的正方向,设备的基准方向会随设备的旋转而旋转。基准方向一般在设备生产过程中就已经人为设定好,而且还可以在设备上设置相应的图标进行标记,以方便用户安装。例如,某云台摄像机的基准方向是云台座的任意指定半径方向,在云台座上,该半径的位置处,可以印制有一个线条标记。
有向设备的特点是,设备的实时输出参数中会包括方位角或旋转角(下面会分别进行介绍),这类角度参数都需要以基准方向为参照来确定。
方位角：B对象相对于A设备的方位角，指的是从A设备的基准方向到B对象等效中心与A设备等效中心的连线在平面内的夹角。本实施例将从A设备的基准方向到B对象等效中心与A设备等效中心的连线在平面内的逆时针夹角定义为B对象相对于A设备的方位角。
偏角：设备的基准方向相对于指定方向（可以人为设置）的夹角。本实施例将从设备的基准方向到指定方向在平面内的逆时针夹角定义为该设备的偏角。
方位：设备的方位指的是设备的基准方向朝向的方向，可以用设备的基准方向相对于指定方向的夹角表示（也即设备的偏角），指定方向可以是X轴或Y轴方向。
指定方向:是用于确定设备的偏角而设置的一个方向,导播控制系统中,可以针对不同的设备设置不同的指定方向,也可以对不同的设备设置相同的指定方向。在建立直角坐标系的情况下,指定方向是以坐标轴为参照的方向。上面已经提到,设备的偏角是以指定方向为参照,那么实际上设备的偏角也是以坐标轴为参照的。导播操作的过程中,声源对象的方位以麦克风阵列的基准方向为参照,那么实际上声源对象的方位也可以最终表示为与坐标轴的相对角度。实际应用中,为了计算方便,一般会把指定方向设置为某坐标轴的正向。
旋转角：C设备中可以包括M部件和N部件，M部件可旋转地安装在N部件上，M部件的旋转角，指的是M部件的正方向相对于N部件的正方向的旋转角，这里，N部件的正方向可以认为是C设备的基准方向。
声源对象:当前发声的人或物,一般是当前的发言人。
拍摄角度范围:又称视场角,指的是摄像机当前能够拍摄到的水平方向的角度和竖直方向的角度。
图像纵向中轴线:指的是图像竖直方向上能够将图像均分为二的假想线。
发声器:本申请实施例中的发声器是能够在控制设备控制下发出声音的器件。下面涉及的发声器可以为超声波发声器,发出的声音为超声波。
本申请实施例提供了一种导播控制方法,该方法可以应用在导播控制系统中。该导播控制系统可以包括麦克风阵列、摄像机和控制设备等。麦克风阵列可以有多种,例如分布式麦克风阵列(“分布式”是指不集成在其它设备上)或者集成在其它设备上的麦克风阵列。摄像机可以有多种,例如分布式摄像机或集成在其他设备上的摄像机。控制设备可以是独立的控制设备,也可以是集成有麦克风阵列和/或摄像机的控制设备。该导播控制系统还可以包括终端设备(如智慧屏或投影仪)等设备。控制设备、麦克风阵列和摄像机中的一种或多种设备可以集成在终端设备上。导播控制系统可以用于多种场景的拍摄和导播,例如会议场景、教学场景或节目录制场景,等等。本实施例以会议场景的导播为例进行说明,其他情况与之类似,在此不作赘述。
会议场景可以有多种,一种非常常见的会议场景是长条桌会议场景,该会议场景可以设置有条形会议桌和若干个座位,座位布置在条形会议桌周围,会议进行过程中,参会人员可以坐在座位上进行会议。本申请实施例以这种会议场景为例进行方案说明。
如图1所示,导播控制系统可以包括第一麦克风阵列、第二麦克风阵列、控制设备和摄像机等。第一麦克风阵列和第二麦克风阵列可以是分布式麦克风阵列。分布式麦克风阵列可以摆放在会议场景中的任意位置。控制设备可以是独立的设备,也可以集成在麦克风阵列或摄像机上。
在控制设备执行导播控制方法之前,可以设定平面坐标系,该平面坐标系可以是水平面内的二维直角坐标系,可以设定会议室空间内的任一点为平面坐标系的原点,平面坐标系的X轴方向和Y轴方向可以是水平面内任意两个相互垂直的方向。控制设备中可以记录有麦克风阵列、摄像机等部分或全部设备的位置、指定方向和偏角,设备的位置可以是该设备的等效中心在平面坐标系的投影点的坐标。一般会将会议室中某个位置不会随意移动的设备的等效中心作为坐标系的原点,并将以该设备作为参照的方向为X轴方向和Y轴方向。例如,将会议终端的等效中心作为坐标系的原点,将会议终端的屏幕法向作为Y轴方向,将水平面内与法向垂直的方向作为X轴方向。
基于上述导播控制系统,本申请实施例提供了一种导播控制方法,该方法可以由导播控制系统中的控制设备来执行。该控制设备可以是服务器、终端或集成在其他设备中的一个组件等。服务器可以是单独的服务器或服务器组。终端可以是布置在会议室中的设备,或者是布置在企业机房中的设备,还可以是便携设备,如智慧屏、台式计算机、笔记本计算机、手机、平板电脑、智能手表等。该控制设备可以集成在智慧屏、摄像机、麦克风阵列等设备中。
图2是本申请实施例提供的一种控制设备的结构示意图,从硬件组成上来看,控制设备20的结构可以如图2所示,包括处理器201、存储器202和通信部件203。
处理器201可以是中央处理器(central processing unit,CPU)或系统级芯片(system on chip,SoC)等,处理器201可以用于确定声源对象相对于第一麦克风阵列的方位角θ 1、声源对象相对于第二麦克风阵列的方位角θ 2,还可以用于确定声源对象的位置等等。
存储器202可以包括各种易失性存储器或非易失性存储器,如固态硬盘(solid state disk,SSD)、动态随机存取存储器(dynamic random access memory,DRAM)内存等。存储器202可以用于存储记录有导播控制的过程中使用到的初始数据、中间数据和结果数据,例如第一麦克风阵列的检测数据、第二麦克风阵列的检测数据、声源对象相对于第一麦克风阵列的方位角θ 1、声源对象相对于第二麦克风阵列的方位角θ 2、第一麦克风阵列的位置、第二麦克风阵列的位置和声源对象的位置,等等。
通信部件203可以是有线网络连接器、无线保真(wireless fidelity,WiFi)模块、蓝牙模块、蜂巢网通信模块等。通信部件203可以用于与其他设备进行数据传输,其他设备可以是服务器、也可以是终端等。例如,控制设备20可以接收第一麦克风阵列的检测数据、第二麦克风阵列的检测数据,还可以将声源对象的位置发送至服务器进行存储。
如图3所示,导播控制系统可以包括第一麦克风阵列,第二麦克风阵列和摄像机。第一麦克风阵列和第二麦克风阵列为分布式麦克风阵列。第一麦克风阵列的数量可以包括一个或多个。摄像机可以是分布式摄像机,摄像机的数量可以包括一个或多个。
在会议室中上述各设备的位置可以任意设置,例如,对于长条桌会议室,第一麦克风阵列和第二麦克风阵列可以放置于长条桌上,两个摄像机可以分别悬挂在长条桌两侧的墙面上。控制设备中可以记录有第一麦克风阵列、第二麦克风阵列、摄像机等设备的位置和偏角,设备的位置可以是该设备的等效中心在平面坐标系的投影点的坐标。
下面针对图3所示的导播控制系统,对本申请实施例提供的导播控制方法的处理流程进行详细说明,该处理流程可以如图4所示。本申请实施例以声源对象为会议场景下的发言人为例进行说明,其他情况与之类似在此不做赘述。
401,控制设备确定第一麦克风阵列的位置以及摄像机的位置。
控制设备可以获取预先存储的第一麦克风阵列的位置以及摄像机的位置。或者,控制设备可以通过参数标定过程来测定第一麦克风阵列的位置以及摄像机的位置,参数标定的具体处理过程在后面内容中会进行详细说明。
402,在发言人发声时,控制设备根据发言人相对于第一麦克风阵列的位置、发言人相对于第二麦克风阵列的位置、第一麦克风阵列的位置和第二麦克风阵列的位置,确定发言人的位置。
其中,发言人相对于第一麦克风阵列的位置和发言人相对于第二麦克风阵列的位置可以通过方位角来表示。
在发言人讲话时，第一麦克风阵列中的每个麦克风可以检测到相应的音频数据，第一麦克风阵列将这些音频数据发送至控制设备。控制设备可以根据这些音频数据进行声源定位，确定发言人相对于第一麦克风阵列的方位角θ1，声源定位过程使用的算法可以是可控响应功率（steered-response power，SRP）算法等。同样地，控制设备也可以根据第二麦克风阵列的麦克风检测到的音频数据进行声源定位，确定发言人相对于第二麦克风阵列的方位角θ2。参考图5，方位角θ1是从第一麦克风阵列的基准方向到发言人与第一麦克风阵列的连线在水平面内的逆时针夹角，方位角θ2是从第二麦克风阵列的基准方向到发言人与第二麦克风阵列的连线在水平面内的逆时针夹角。
下面分两种具体情况介绍一下确定发言人位置的处理:
情况一
对于第一麦克风阵列和第二麦克风阵列的偏角都为0度的情况，控制设备可以根据方位角θ1、方位角θ2、第一麦克风阵列的位置和第二麦克风阵列的位置，以及第一麦克风阵列、第二麦克风阵列和发言人之间的几何关系，通过计算得到发言人的位置。
参考图6，发言人的位置坐标表示为(x,y)，第一麦克风阵列的坐标表示为(x1,y1)，第二麦克风阵列的坐标表示为(x2,y2)，计算过程可以如下：

$$\tan\theta_{1}=\frac{y-y_{1}}{x-x_{1}}$$

$$\tan\theta_{2}=\frac{y-y_{2}}{x-x_{2}}$$

进一步计算可以得到发言人的位置坐标(x,y)。
情况二
对于第一麦克风阵列和第二麦克风阵列的偏角都不为0度的情况，控制设备可以根据第一麦克风阵列的偏角γ1、第二麦克风阵列的偏角γ2、方位角θ1、方位角θ2、第一麦克风阵列的位置和第二麦克风阵列的位置，以及第一麦克风阵列、第二麦克风阵列、发言人之间的几何关系，通过计算得到发言人的位置。参考图7，发言人的位置坐标表示为(x,y)，第一麦克风阵列的坐标表示为(x1,y1)，第二麦克风阵列的坐标表示为(x2,y2)，计算过程可以如下：

$$\tan(\theta_{1}-\gamma_{1})=\frac{y-y_{1}}{x-x_{1}}$$

$$\tan(\theta_{2}-\gamma_{2})=\frac{y-y_{2}}{x-x_{2}}$$

进一步计算可以得到发言人的位置坐标(x,y)。
需要说明的是,第一麦克风阵列、第二麦克风阵列、发言人之间的位置关系有多种可能,上述说明过程仅以其中的一种位置关系为例进行说明,对于其他可能的位置关系,均可以根据上述相关数据通过几何运算得到发言人的位置。上述说明过程中采用的位置关系和计算方法不构成对本实施例的限定。
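为便于理解上述两种情况下的交汇定位计算，下面给出一个统一处理偏角为0和不为0两种情况的Python示意草图（按上文角度约定将方位射线的绝对角度取为θ-γ，属于一种简化假设，函数名与参数均为说明而设）：

```python
import math

def locate_speaker(p1, gamma1, theta1, p2, gamma2, theta2):
    """由两个麦克风阵列的方位角交汇解算发言人平面坐标。
    p1/p2: 两个阵列的坐标(x, y); gamma1/gamma2: 阵列偏角(为0时退化为情况一);
    theta1/theta2: 发言人相对两个阵列的方位角; 角度均为弧度。
    """
    x1, y1 = p1
    x2, y2 = p2
    # 每条方位射线相对X轴的绝对角度按θ-γ取值
    t1 = math.tan(theta1 - gamma1)
    t2 = math.tan(theta2 - gamma2)
    if abs(t1 - t2) < 1e-9:
        raise ValueError("两条方位射线近似平行, 无法交汇定位")
    # 联立 y-y1 = t1·(x-x1) 与 y-y2 = t2·(x-x2)
    x = (y2 - y1 + t1 * x1 - t2 * x2) / (t1 - t2)
    y = y1 + t1 * (x - x1)
    return x, y

# 用法示例: 两阵列相距4米, 方位射线分别为60°与120°, 交点约为(2.0, 3.46)
# locate_speaker((0, 0), 0.0, math.radians(60), (4, 0), 0.0, math.radians(120))
```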
403,控制设备基于发言人的位置以及摄像机的位置,确定对摄像机的导播操作。
控制设备确定发言人的位置后,可以基于发言人的位置和摄像机的位置,计算发言人相对于摄像机的方位角以及发言人与摄像机的距离。该距离是平面等效距离,也即摄像机的等效中心和发言人的等效中心在平面内的投影距离。
然后,可以基于发言人相对于摄像机的方位角,确定摄像机的导播旋转角。摄像机可以包括可旋转摄像头和固定底座,摄像头可以相对于固定底座进行旋转,可以为摄像头指定一 个初始拍摄方向,初始拍摄方向和摄像头的基准方向可以相同,该导播旋转角可以是摄像头实时的拍摄方向相对于初始拍摄方向的角度,初始拍摄方向可以认为是0度方向,导播旋转角和发言人相对于摄像机的方位角可以相同。
在确定发言人相对于摄像机的距离之后,可以基于该距离,确定摄像机的导播焦距。控制设备可以查询预先存储的第一对应关系表,确定该距离对应的导播焦距。第一对应关系表中可以记录有发言人相对于摄像机的距离和摄像机焦距的对应关系。
下面分两种具体情况介绍一下控制摄像机的导播操作的处理:
情况一
对于摄像机的偏角为0度的情况,控制设备可以根据发言人的位置、摄像机的位置确定摄像机的导播旋转角和导播焦距,从而可以控制摄像机旋转至导播旋转角,并控制摄像机按照导播焦距进行拍摄。
情况二
对于摄像机的偏角不为0度的情况,控制设备可以根据摄像机的偏角、发言人的位置、摄像机的位置确定摄像机的导播旋转角和导播焦距,从而可以控制摄像机云台旋转至导播旋转角,并控制摄像机按照导播焦距进行拍摄。
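下面给出一个计算导播旋转角与导播焦距的Python示意草图（其中FOCUS_TABLE是上文第一对应关系表的一个虚构示例，表中数值并非本申请给定；初始拍摄方向与基准方向一致的取法亦为简化假设）：

```python
import bisect
import math

# 第一对应关系表的假设示例: (发言人与摄像机的距离上限(米), 导播焦距挡位)
FOCUS_TABLE = [(1.0, 1), (2.0, 2), (4.0, 4), (8.0, 8)]

def direct_camera(speaker, camera, camera_deflection=0.0):
    """根据发言人位置计算摄像机的导播旋转角与导播焦距(草图)。
    speaker/camera: 平面坐标(x, y); camera_deflection: 摄像机偏角(弧度, 为0时即情况一)。
    """
    dx = speaker[0] - camera[0]
    dy = speaker[1] - camera[1]
    distance = math.hypot(dx, dy)                  # 平面等效距离
    bearing = math.atan2(dy, dx) % (2 * math.pi)   # 发言人方向相对X轴的绝对角度
    # 按偏角约定换算为相对初始拍摄方向(此处假设即基准方向)的导播旋转角
    rotation = (bearing + camera_deflection) % (2 * math.pi)
    # 查询第一对应关系表, 确定该距离对应的导播焦距
    limits = [d for d, _ in FOCUS_TABLE]
    idx = min(bisect.bisect_left(limits, distance), len(FOCUS_TABLE) - 1)
    return rotation, FOCUS_TABLE[idx][1]
```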
需要说明的是,上述导播控制系统的示例中,可以添加多个摄像头布置在不同的位置,以更好地拍摄参会成员。以下针对多摄像头的情况介绍几种不同的处理方式:
方式一,对于导播控制系统中存在至少两个摄像头的情况,控制设备可以基于发言人的位置和两个摄像机的位置,确定两个摄像机中与发言人距离较远的目标摄像机,基于发言人的位置以及目标摄像机的位置,确定对目标摄像机的导播操作。
这种处理方式可以适用于如下场景:会议室中布置有长条桌,长条桌两侧布置有若干椅子,发言人面向长条桌坐在椅子上。长条桌两侧的墙壁上分别设置有一个摄像机。在这种场景下,对于分别布置在长条桌两侧的墙壁上的两个摄像机而言,距离发言人较远的摄像机能够更好地拍摄到发言人的人脸。因此,可以将两个摄像机中与发言人距离较远的摄像机确定为目标摄像机,然后基于发言人的位置以及目标摄像机的位置,确定对目标摄像机的导播操作。
方式二,控制设备可以基于声源对象的位置和多个摄像机的位置,控制多个摄像机对声源对象进行拍摄,得到多个视频图像。然后,可以对得到的多个视频图像进行图像识别,选取满足目标条件的视频图像作为导播视频图像。目标条件可以有多种,例如,选取人脸角度最接近正面的视频图像作为导播视频图像等,视频图像中的人脸角度可以使用人脸角度检测的机器学习模型来确定。
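以方式二为例，下面的Python示意草图演示了按人脸角度选取导播视频图像的流程（其中estimate_face_yaw代表上文提到的人脸角度检测机器学习模型，作为假设的外部函数传入）：

```python
def pick_director_view(frames, estimate_face_yaw):
    """从多路摄像机画面中选取人脸角度最接近正面的一路作为导播视频图像。
    frames: 各摄像机当前拍摄的视频图像列表
    estimate_face_yaw: 人脸角度检测模型的封装, 输入图像返回人脸偏转角(0表示正面)
    """
    best_frame, best_yaw = None, float("inf")
    for frame in frames:
        yaw = abs(estimate_face_yaw(frame))   # 绝对值越小越接近正面
        if yaw < best_yaw:
            best_frame, best_yaw = frame, yaw
    return best_frame
```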
上述对发言人的定位过程中,可能涉及的参数包括各设备的位置以及各设备的偏角。这些参数可以全部是预先录入控制设备的,可以在安装后测量录入,或者也可以在设备出厂前录入,这种情况下,安装时要考虑该出厂配置。这些参数也可以有一部分是预先录入控制设备的,另一部分则可以通过参数标定过程来测定。具体哪些参数需要预先录入、哪些参数需要标定,可以基于导播控制系统中的设备情况来确定,例如,位置可以随时改变的设备的参数需要标定,如分布式的麦克风阵列等,位置相对固定的设备的参数可以预先录入,如会议终端上集成的麦克风阵列。
技术人员可以在控制设备中预先录入指定设备的位置和偏角,然后由控制设备通过参数标定过程来测定指定设备外的其他设备的位置和偏角。例如,指定设备可以是某个麦克风阵列等。下面针对几种不同情况的导播控制系统,对参数标定的过程进行详细说明。
情况一,如图8所示,导播控制系统可以包括第一麦克风阵列、会议终端和摄像机。第一麦克风阵列为分布式麦克风阵列,可以集成有第一发声器,第一麦克风阵列的数量可以包括一个或多个。会议终端可以是智慧屏,会议终端可以集成有控制设备、第二麦克风阵列、第二发声器和第三发声器。摄像机可以是分布式摄像机,可以集成有第四发声器和第三麦克风阵列,摄像机的数量可以包括一个或多个。发声器可以有多种可能性,如普通扬声器或超声波发射器等。
在会议室中上述各设备的位置可以任意设置,例如,对于长条桌会议室,会议终端安装在长条桌一端的墙面上,第二麦克风阵列安装于会议终端的顶部正中位置,第二发声器和第三发声器安装于会议终端两侧,第一麦克风阵列可以放置于长条桌上,两个摄像机可以分别悬挂在长条桌两侧的墙面上。
控制设备可以预先记录有第二麦克风阵列的位置、第二麦克风阵列的偏角、第二麦克风阵列中第一麦克风的位置、第二麦克风阵列中第二麦克风的位置、第二发声器的位置、第三发声器的位置，并预先记录第一麦克风阵列对应的第一指定方向，摄像机对应的第二指定方向。示例性地，控制设备以第二麦克风阵列的中心位置为坐标原点、以第二麦克风阵列的基准方向为X轴正向在水平面内建立平面直角坐标系。可替代地，也可以设置第二麦克风阵列的基准方向为屏幕方向。第二麦克风阵列中的第一麦克风和第二麦克风在会议终端上可以是相对于中心位置对称设置的。麦克风阵列中的麦克风之间的距离通常是明确的，当第一麦克风和第二麦克风之间的距离为D时，第一麦克风的位置坐标可以记录为(0,-D/2)，第二麦克风的位置坐标可以记录为(0,D/2)。同样的，第二发声器和第三发声器在会议终端上一般是相对于中心位置对称设置的。当第二发声器和第三发声器之间的距离为L时，第二发声器的位置坐标可以记录为(0,-L/2)，第三发声器的位置坐标可以记录为(0,L/2)。上述第一麦克风、第二麦克风、第二发声器和第三发声器的位置，可以在会议终端出厂前预先存储。另外，可以设置并记录第一麦克风阵列对应的第一指定方向为X轴正向，摄像机对应的第二指定方向为Y轴正向。
基于上述导播控制系统,下面分别介绍第一麦克风阵列的位置、第一麦克风阵列的偏角、摄像机的位置和摄像机的偏角的标定过程。
(1)第一麦克风阵列的位置标定(如果存在多个第一麦克风阵列,则每个第一麦克风阵列的位置标定均可以采用如下处理方式)
控制设备控制第一发声器发出声音信号S 1,基于第一发声器发出声音信号S 1的时间点和第二麦克风阵列中的第一麦克风、第二麦克风检测到声音信号S 1的时间点,确定第一麦克风与第一发声器的距离D 1、以及第二麦克风与第一发声器的距离D 2,控制设备基于第一麦克风的位置、第二麦克风的位置、距离D 1和距离D 2,确定第一发声器和第一麦克风阵列的位置。
其中,第一发声器和第一麦克风阵列的等效中心可以相同,即第一发声器的位置和第一麦克风阵列的位置可以相同。
在实施中，控制设备控制第一发声器发出声音信号S1时，第一发声器将发出声音信号S1的时间点t1发送给控制设备进行记录。第二麦克风阵列中的每个麦克风可以接收到声音信号，并记录检测到该声音信号的时间点，发送给控制设备。控制设备可以获取第二麦克风阵列中的第一麦克风检测到声音信号S1的时间点t2、以及第二麦克风阵列中的第二麦克风检测到声音信号S1的时间点t3，然后，可以计算得到时间点t1与时间点t2之间的时长ΔT1、时间点t1与时间点t3之间的时长ΔT2。进而，控制设备可以根据预先存储的音速数据V，计算得到第一麦克风与第一发声器的距离D1=V·ΔT1和第二麦克风与第一发声器的距离D2=V·ΔT2。
根据第一麦克风和第二麦克风的位置,可以确定第一麦克风和第二麦克风之间的距离为D。然后,控制设备可以根据距离D、距离D 1和距离D 2,以及第一麦克风、第二麦克风和第一发声器之间的几何关系,通过计算得到第一发声器的位置。参考图9,从第一麦克风与第一发声器的连线到第一麦克风与第二麦克风的连线在水平面内的逆时针夹角表示为γ 3,第一发声器的坐标表示为(x 1,y 1),计算过程可以如下:
$$\cos\gamma_{3}=\frac{D_{1}^{2}+D^{2}-D_{2}^{2}}{2\,D_{1}D}$$

$$x_{1}=D_{1}\sin\gamma_{3}$$

$$y_{1}=D_{1}\cos\gamma_{3}-\frac{D}{2}$$
需要说明的是,第一麦克风、第二麦克风、第一发声器之间的位置关系有多种可能,上述说明过程仅以其中的一种位置关系为例进行说明,对于其他可能的位置关系,均可以根据上述相关数据通过几何运算得到第一发声器的位置。上述说明过程中采用的位置关系和计算方法不构成对本实施例的限定。
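结合图9的几何关系，下面给出一个解算第一发声器位置的Python示意草图（仅为一种可能实现：假设第一麦克风位于(0,-D/2)、第二麦克风位于(0,D/2)，音速取340米/秒为假设值，函数名为说明而设）：

```python
import math

def locate_sounder(t_emit, t_mic1, t_mic2, d, v=340.0):
    """由发声时刻与两个麦克风的检测时刻解算发声器位置(草图)。
    t_emit: 第一发声器发出声音信号的时间点t1
    t_mic1/t_mic2: 第一/第二麦克风检测到该声音信号的时间点t2/t3
    d: 第一麦克风与第二麦克风之间的距离D; v: 预先存储的音速数据V
    """
    d1 = v * (t_mic1 - t_emit)   # D1 = V·ΔT1
    d2 = v * (t_mic2 - t_emit)   # D2 = V·ΔT2
    # 余弦定理: cosγ3 = (D1² + D² - D2²) / (2·D1·D)
    cos_g3 = (d1 * d1 + d * d - d2 * d2) / (2.0 * d1 * d)
    cos_g3 = max(-1.0, min(1.0, cos_g3))   # 数值裁剪, 避免测量误差导致越界
    g3 = math.acos(cos_g3)
    x1 = d1 * math.sin(g3)                 # 取x>0一侧的解
    y1 = d1 * math.cos(g3) - d / 2.0
    return x1, y1
```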
基于图9,对方位角的计算原理进行介绍,也即对前面提到的SRP算法进行介绍:
该算法的计算公式如下:
$$Y(\theta)=\sum_{k}\left|\sum_{m=1}^{M}X_{m}(k)\,s_{m}(\theta,k)\right|^{2}$$
X m(k)代表第m个麦克风k频段的快速傅里叶变换(fast fourier transform,FFT)值,s(θ)代表二维空间平面位于角度θ的声源对应的导向矢量,导向矢量可以根据麦克风阵列内部麦克风的布局以及角度搜索范围(人为设置,后续进行最大极值点的确定时所针对的角度范围)提前计算好。以麦克风阵列中各麦克风线型布局为例,导向矢量的计算公式为:
$$s_{m}(\theta,k)=\exp\!\left(-j\,\frac{2\pi f_{k}\,d_{m}\cos\theta}{V}\right)$$
我们选取第一麦克风为参考麦克风，d_m·cosθ代表声源到达第m个麦克风与参考麦克风之间的路程差。
对于单声源定位,在θ属于角度搜索范围前提下,确定Y(θ)的最大极值点对应的角度θ,即为声源对象的方位角。
(2)第一麦克风阵列的偏角标定(如果存在多个第一麦克风阵列,则每个第一麦克风阵列的偏角标定均可以采用如下处理方式)
第一麦克风阵列的偏角是第一麦克风阵列的基准方向相对于第一指定方向的夹角,第一指定方向可以是X轴正向。
控制设备控制第二发声器发出声音信号S 2,基于第一麦克风阵列的检测数据,确定第二发声器相对于第一麦克风阵列的方位角θ 3,控制设备控制第三发声器发出声音信号S 3,基于第一麦克风阵列的检测数据,确定第三发声器相对于第一麦克风阵列的方位角θ 4,基于方位角θ 3、方位角θ 4、第二发声器的位置与第三发声器的位置,确定第一麦克风阵列的偏角θ 5
在实施中,第二发声器发出声音信号S 2时,第一麦克风阵列中的每个麦克风可以检测到相应的音频数据,第一麦克风阵列将这些音频数据发送至控制设备。控制设备可以根据这些音频数据进行声源定位,确定第二发声器相对于第一麦克风阵列的方位角θ 3。同样地,第三发声器发声时,控制设备也可以根据第一麦克风阵列的麦克风检测到的音频数据进行声源定位,确定第三发声器相对于第一麦克风阵列的方位角θ 4。控制设备可以根据第二发声器的位置坐标、第三发声器的位置坐标,确定第二发声器和第三发声器的距离L。然后控制设备可以基于方位角θ 3、方位角θ 4、第二发声器的位置、第三发声器的位置,以及第一麦克风阵列、第二发声器和第三发声器之间的位置关系,通过计算确定第一麦克风阵列的偏角θ 5。参考图10,第一麦克风阵列的坐标表示为(x 1,y 1),第二发声器的坐标表示为(0,-L/2),第三发声器的坐标表示为(0,L/2),第二发声器和第一麦克风阵列的距离表示为L 1、第三发声器和第一麦克风阵列的距离表示为L 2,计算过程可以如下:
$$L_{1}=\sqrt{x_{1}^{2}+\left(y_{1}+\frac{L}{2}\right)^{2}}$$

$$L_{2}=\sqrt{x_{1}^{2}+\left(y_{1}-\frac{L}{2}\right)^{2}}$$

$$\cos\!\left((\theta_{3}-\theta_{5})-\pi\right)\cdot L_{1}=\cos\!\left(\pi-(\theta_{4}-\theta_{5})\right)\cdot L_{2}$$

$$\sin\!\left((\theta_{3}-\theta_{5})-\pi\right)\cdot L_{1}+\sin\!\left(\pi-(\theta_{4}-\theta_{5})\right)\cdot L_{2}=L$$

$$\tan\theta_{5}=\frac{L_{2}\cos\theta_{4}-L_{1}\cos\theta_{3}}{L_{1}\sin\theta_{3}-L_{2}\sin\theta_{4}}$$
进一步计算可以得到偏角θ 5
需要说明的是,第一麦克风阵列、第二发声器、第三发声器之间的位置关系有多种可能,上述说明过程仅以其中的一种位置关系为例进行说明,对于其他可能的位置关系,均可以根据上述相关数据通过几何运算得到第一麦克风阵列的偏角。上述说明过程中采用的位置关系和计算方法不构成对本实施例的限定。
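下面给出一个按上述方程解算偏角θ5的Python示意草图（闭式解由上述第一个三角方程整理得到，仅对应图10所示的一种位置关系；函数名与参数均为说明而设）：

```python
import math

def array_deflection(x1, y1, theta3, theta4, span_l):
    """由两个发声器的方位角标定第一麦克风阵列的偏角θ5(草图)。
    第二/第三发声器按上文约定位于(0, -L/2)与(0, L/2); 角度均为弧度。
    """
    l1 = math.hypot(x1, y1 + span_l / 2.0)   # L1: 阵列到第二发声器的距离
    l2 = math.hypot(x1, y1 - span_l / 2.0)   # L2: 阵列到第三发声器的距离
    # 由 L1·cos(θ3-θ5) = L2·cos(θ4-θ5) 整理得到的闭式解
    num = l2 * math.cos(theta4) - l1 * math.cos(theta3)
    den = l1 * math.sin(theta3) - l2 * math.sin(theta4)
    return math.atan2(num, den) % (2.0 * math.pi)   # tanθ5 = num/den
```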
(3)摄像机的位置标定(如果存在多个摄像机,则每个摄像机的位置标定均可以采用如 下处理方式)
控制设备控制第四发声器发出声音信号S 4,基于第四发声器发出声音信号S 4的时间点和第二麦克风阵列中的第一麦克风、第二麦克风检测到声音信号S 4的时间点,确定第一麦克风与第四发声器的距离D 3、以及第二麦克风与第四发声器的距离D 4,控制设备基于第一麦克风的位置、第二麦克风的位置、距离D 3和距离D 4,确定第四发声器和摄像机的位置。
其中,第四发声器和摄像机的等效中心可以相同,即第四发声器的位置和摄像机的位置可以相同。
在实施中，控制设备控制第四发声器发出声音信号S4时，可以记录第四发声器发出声音信号S4的时间点t4。第二麦克风阵列中的每个麦克风可以检测到相应的音频数据，并记录有音频数据对应的检测时间点，即检测到该音频数据的时间点。控制设备可以获取第二麦克风阵列中的第一麦克风检测到声音信号S4的时间点t5、以及第二麦克风阵列中的第二麦克风检测到声音信号S4的时间点t6，然后，可以计算得到时间点t4与时间点t5之间的时长ΔT3、时间点t4与时间点t6之间的时长ΔT4。进而，控制设备可以根据预先存储的音速数据V，计算得到第一麦克风与第四发声器的距离D3和第二麦克风与第四发声器的距离D4。
根据第一麦克风和第二麦克风的位置,可以确定第一麦克风和第二麦克风之间的距离为D。然后,控制设备可以根据距离D、距离D 3和距离D 4,以及第一麦克风、第二麦克风和第四发声器之间的几何关系,通过计算得到第四发声器的位置。确定第四发声器的位置的计算过程与情况一中确定第一发声器的位置的过程相似,可以参照情况一中第一麦克风阵列的位置标定的相关说明。
(4)摄像机的偏角标定(如果存在多个摄像机,则每个摄像机的偏角标定均可以采用如下处理方式)
控制设备控制第一发声器发出声音信号S 5,基于第三麦克风阵列的检测数据,确定第一发声器相对于第三麦克风阵列的方位角θ 6,并控制第四发声器发出声音信号S 6,基于第一麦克风阵列的检测数据,确定第四发声器相对于第一麦克风阵列的方位角θ 7,控制设备基于方位角θ 6、方位角θ 7和第一麦克风阵列的偏角θ 5,确定第三麦克风阵列和摄像机的偏角θ 8
其中，第三麦克风阵列的等效中心和摄像机的等效中心可以相同，即第三麦克风阵列的位置和摄像机的位置可以相同。第三麦克风阵列的偏角和摄像机的偏角可以相同。第四发声器的等效中心和摄像机的等效中心可以相同，即第四发声器的位置和摄像机的位置可以相同。
在实施中,第一发声器发出声音信号S 5时,第三麦克风阵列中的每个麦克风可以检测到相应的音频数据,第三麦克风阵列将这些音频数据发送至控制设备。控制设备可以根据这些音频数据进行声源定位,确定第一发声器相对于第三麦克风阵列的方位角θ 6。同样地,第四发声器发声时,控制设备也可以根据第一麦克风阵列的麦克风检测到的音频数据进行声源定位,确定第四发声器相对于第一麦克风阵列的方位角θ 7
根据方位角θ6、方位角θ7、偏角θ5，以及第一发声器、第三麦克风阵列和第四发声器之间的几何关系，可以计算得到第三麦克风阵列和摄像机的偏角θ8。参考图11，计算过程可以如下：
$$\theta_{8}=\theta_{5}+\theta_{6}-\theta_{7}-\frac{\pi}{2}$$
需要说明的是,第一发声器、第三麦克风阵列、第四发声器之间的位置关系有多种可能,上述说明过程仅以其中的一种位置关系为例进行说明,对于其他可能的位置关系,均可以根 据上述相关数据通过几何运算得到第三麦克风阵列和摄像机的偏角。上述说明过程中采用的位置关系和计算方法不构成对本实施例的限定。
情况二,如图12所示,导播控制系统的架构与情况一相似,不同之处在于摄像机可以没有集成第三麦克风阵列,以及第一麦克风阵列除了集成有第一发声器外,还可以集成有发光器。发光器可以有多种可能性,如普通LED光源或红外LED光源等。
控制设备可以预先记录有第二麦克风阵列的位置、第二麦克风阵列的偏角、第二麦克风阵列中第一麦克风的位置、第二麦克风阵列中第二麦克风的位置、第二发声器的位置、第三发声器的位置,并预先记录第一麦克风阵列对应的第一指定方向,摄像机对应的第二指定方向。示例性地,控制设备以第二麦克风阵列的位置为坐标原点、以第二麦克风阵列的基准方向为X轴正向在水平面内建立平面直角坐标系。第二麦克风阵列中的第一麦克风和第二麦克风在会议终端上可以是相对于中心位置对称设置的。当第一麦克风和第二麦克风之间的距离为D时,第二麦克风阵列中第一麦克风的位置坐标可以记录为(0,-D/2),第二麦克风阵列中第二麦克风的位置坐标可以记录为(0,D/2)。同样的,第二发声器和第三发声器在会议终端上一般是相对于中心位置对称设置的。当第二发声器和第三发声器之间的距离为L时,第二发声器的位置坐标可以记录为(0,-L/2),第三发声器的位置坐标可以记录为(0,L/2)。另外,可以设置并记录第一麦克风阵列对应的第一指定方向为X轴正向,摄像机对应的第二指定方向为Y轴正向。
基于上述导播控制系统,下面分别介绍第一麦克风阵列的位置、第一麦克风阵列的偏角、摄像机的位置和摄像机的偏角的标定过程。
情况二的第一麦克风阵列的位置标定、第一麦克风阵列的偏角标定与摄像机的位置标定与情况一的相应处理相似,可以参照情况一相应处理的说明,在此不做赘述。情况二的摄像机的偏角标定与情况一的相应处理不同,下面将详细说明:
摄像机的偏角标定(如果存在多个摄像机,则每个摄像机的偏角标定均可以采用如下处理方式)
摄像机的偏角是摄像机的基准方向相对于第二指定方向的角度,第二指定方向可以是Y轴正向。
控制设备控制发光器发光,确定摄像机拍摄的图像中的发光点位置,基于图像中的发光点位置,确定发光器相对于摄像机的方位角θ 9,并控制第四发声器发出声音信号S 6,基于第一麦克风阵列的检测数据,确定第四发声器相对于第一麦克风阵列的方位角θ 7,控制设备基于方位角θ 9、方位角θ 7和第一麦克风阵列的基准方向与第一指定方向的夹角θ 5,确定摄像机的偏角θ 8
其中,发光器的等效中心与第一麦克风阵列的等效中心可以相同,即发光器的位置与第一麦克风阵列的位置可以相同。第四发声器的等效中心和摄像机的等效中心可以相同,即第四发声器的位置和摄像机的位置可以相同。
在实施中，控制设备可以记录有摄像机焦距与水平拍摄角度范围（又称为水平视场角）的对应关系。该对应关系可以是摄像机上报给控制设备的，也可以是人工录入控制设备的，等等。控制设备可以确定摄像机当前的焦距，然后在上述对应关系表中查找当前的焦距对应的水平拍摄角度范围γ4。控制设备在控制发光器发光之后，可以获取摄像机拍摄的图像，并在图像中确定发光点位置与图像纵向中轴线的距离L3。控制设备中可以记录有图像左侧或右侧边界与图像纵向中轴线的距离L4。摄像头的实时拍摄方向对应于图像的纵向中轴线。根据水平拍摄角度范围γ4、距离L3和距离L4，可以确定发光器相对于摄像头的方位角γ5，方位角γ5是从摄像头的实时拍摄方向到发光器与摄像头的连线的逆时针夹角。参考图13和图14，计算过程可以如下：
$$\gamma_{5}=\arctan\!\left(\frac{L_{3}}{L_{4}}\cdot\tan\frac{\gamma_{4}}{2}\right)$$
此时控制设备还可以获取摄像机当前的旋转角γ 6。根据方位角γ 5和旋转角γ 6,可以计算得到发光器相对于摄像机的方位角θ 9。参考图14,计算过程可以如下:
$$\theta_{9}=\gamma_{6}+\gamma_{5}$$
旋转角γ 6是摄像机的摄像头相对于固定底座的旋转角度,一般摄像头是在控制设备的控制下转动的,所以控制设备是已知该旋转角γ 6的。
需要说明的是,发光器、摄像头和固定底座之间的位置关系有多种可能,上述说明过程仅以其中的一种位置关系为例进行说明,对于其他可能的位置关系,均可以根据上述相关数据通过几何运算得到方位角θ 9。上述说明过程中采用的位置关系和计算方法不构成对本实施例的限定。
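下面给出一个由发光点位置计算方位角θ9的Python示意草图（其中发光点位于纵向中轴线左侧时γ5取正的符号约定为假设，实际符号取决于具体的图像坐标定义；函数名与参数均为说明而设）：

```python
import math

def light_spot_azimuth(spot_x, image_width, fov_h, gamma6):
    """由图像中发光点位置计算发光器相对于摄像机的方位角θ9(草图)。
    spot_x: 发光点的横向像素坐标; image_width: 图像宽度(像素);
    fov_h: 当前焦距对应的水平视场角γ4(弧度); gamma6: 摄像头当前旋转角(弧度)。
    """
    l4 = image_width / 2.0                  # 图像边界到纵向中轴线的距离L4
    l3 = l4 - spot_x                        # 发光点到纵向中轴线的带符号距离L3
    # γ5 = arctan((L3/L4)·tan(γ4/2))
    gamma5 = math.atan((l3 / l4) * math.tan(fov_h / 2.0))
    return (gamma6 + gamma5) % (2.0 * math.pi)   # θ9 = γ6 + γ5
```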
控制设备可以控制第四发声器发出声音信号S 6,第四发声器发出声音信号S 6时,第一麦克风阵列中的每个麦克风可以检测到相应的音频数据,第一麦克风阵列可以将这些音频数据发送至控制设备。控制设备可以根据这些音频数据进行声源定位,确定第四发声器相对于第一麦克风阵列的方位角θ 7
控制设备基于方位角θ 9、方位角θ 7和第一麦克风阵列的偏角θ 5,以及第一麦克风阵列、摄像机和第四发声器之间的几何关系,可以计算得到摄像机的偏角θ 8。参考图15,计算过程可以如下:
$$\theta_{8}=\theta_{5}+\theta_{9}-\theta_{7}-\frac{\pi}{2}$$
对于计算出的θ 8,可以将其数值调整到0~2π的范围内,例如,θ 8为560°,可以将其调整为200°(即560°-360°)。
需要说明的是,第一麦克风阵列、摄像机和第四发声器之间的位置关系有多种可能,上述说明过程仅以其中的一种位置关系为例进行说明,对于其他可能的位置关系,均可以根据上述相关数据通过几何运算得到摄像机的偏角。上述说明过程中采用的位置关系和计算方法不构成对本实施例的限定。
情况三,如图16所示,导播控制系统可以包括第一麦克风阵列、第二麦克风阵列和摄像机。第一麦克风阵列和第二麦克风阵列均为分布式麦克风阵列,第一麦克风阵列可以集成有第一发声器和发光器,第二麦克风阵列可以集成有第五发声器。第一麦克风阵列的数量可以包括一个或多个。摄像机可以是分布式摄像机,可以集成有第四发声器,摄像机的数量可以包括一个或多个。发声器可以有多种可能性,如普通扬声器或超声波发射器等。发光器可以有多种可能性,如普通LED光源或红外LED光源等。导播控制系统中还可以包括会议终端, 控制设备可以集成在会议终端中,或者也可以集成在其他设备中,或者也可以是一个额外的单独的终端设备。
在会议室中上述各设备的位置可以任意设置,例如,对于长条桌会议室,第一麦克风阵列和第二麦克风阵列可以放置于长条桌上,两个摄像机可以分别悬挂在长条桌两侧的墙面上。
控制设备可以预先记录有第二麦克风阵列的位置、第二麦克风阵列的偏角,并预先记录第一麦克风阵列对应的第一指定方向,摄像机对应的第二指定方向。示例性地,控制设备以第二麦克风阵列的位置为坐标原点、以第二麦克风阵列的基准方向为X轴正向在水平面内建立平面直角坐标系。可以设置并记录第一麦克风阵列对应的第一指定方向为X轴正向,摄像机对应的第二指定方向为Y轴正向。
基于上述导播控制系统,下面分别介绍第一麦克风阵列的位置、第一麦克风阵列的偏角、摄像机的位置和摄像机的偏角的标定过程。
(1)第一麦克风阵列的位置标定(如果存在多个第一麦克风阵列,则每个第一麦克风阵列的位置标定均可以采用如下处理方式)
控制设备控制第一发声器发出声音信号S 7,基于第一发声器发出声音信号S 7的时间点和第二麦克风阵列检测到声音信号S 7的时间点,确定第二麦克风阵列与第一发声器的距离D 5,并基于第二麦克风阵列的检测数据,确定第一发声器相对于第二麦克风阵列的方位角θ 10,控制设备基于距离D 5、方位角θ 10和第二麦克风阵列的位置,确定第一发声器和第一麦克风阵列的位置。
其中,第一发声器和第一麦克风阵列的等效中心可以相同,即第一发声器的位置和第一麦克风阵列的位置可以相同。
在实施中，控制设备控制第一发声器发出声音信号S7时，可以记录第一发声器发出声音信号S7的时间点t7。第二麦克风阵列的麦克风可以检测到相应的音频数据，并记录有音频数据对应的检测时间点t8，即检测到该音频数据的时间点。控制设备可以获取第一发声器发出声音信号S7的时间点t7、以及第二麦克风阵列检测到声音信号S7的时间点t8，然后，可以计算得到时间点t7与时间点t8之间的时长ΔT5。进而，控制设备可以根据预先存储的音速数据V，计算得到第二麦克风阵列与第一发声器的距离D5。
同时,第二麦克风阵列可以将声音信号S 7对应的音频数据发送至控制设备。控制设备可以根据这些音频数据进行声源定位,确定第一发声器相对于第二麦克风阵列的方位角θ 10
控制设备可以根据距离D 5、方位角θ 10和第二麦克风阵列的位置,以及第一发声器与第二麦克风阵列的几何关系,计算得到第一发声器的位置。第一发声器的坐标表示为(x 1,y 1),参考图17,计算过程可以如下:
$$x_{1}=D_{5}\sin\theta_{10}$$

$$y_{1}=D_{5}\cos\theta_{10}$$
需要说明的是,第一发声器和第二麦克风阵列之间的位置关系有多种可能,上述说明过程仅以其中的一种位置关系为例进行说明,对于其他可能的位置关系,均可以根据上述相关数据通过几何运算得到第一发声器的位置。上述说明过程中采用的位置关系和计算方法不构成对本实施例的限定。
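下面给出一个由距离D5与方位角θ10解算位置的Python示意草图（默认第二麦克风阵列位于坐标原点，与本情况的坐标系设定一致；函数名为说明而设）：

```python
import math

def locate_by_range_bearing(d5, theta10, origin=(0.0, 0.0)):
    """由距离D5与方位角θ10解算第一发声器位置(草图, 对应图17的几何关系)。
    origin: 第二麦克风阵列的坐标, 默认取坐标原点。
    """
    x1 = origin[0] + d5 * math.sin(theta10)   # x1 = D5·sinθ10
    y1 = origin[1] + d5 * math.cos(theta10)   # y1 = D5·cosθ10
    return x1, y1
```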
(2)第一麦克风阵列的偏角标定(如果存在多个第一麦克风阵列,则每个第一麦克风阵 列的偏角标定均可以采用如下处理方式)
控制设备控制第一发声器发出声音信号S 7,基于第二麦克风阵列的检测数据,确定第一发声器相对于第二麦克风阵列的方位角θ 10,并控制第五发声器发出声音信号S 8,基于第一麦克风阵列的检测数据,确定第五发声器相对于第一麦克风阵列的方位角θ 11,控制设备基于方位角θ 10、方位角θ 11和第二麦克风阵列的基准方向与第一指定方向的夹角θ 12,确定第一麦克风阵列的偏角θ 5
在实施中,第一发声器发出声音信号S 7时,第二麦克风阵列的麦克风可以检测到相应的音频数据,第二麦克风阵列可以将这些音频数据发送至控制设备。控制设备可以根据这些音频数据进行声源定位,确定第一发声器相对于第二麦克风阵列的方位角θ 10。同样地,第五发声器发声时,控制设备也可以根据第一麦克风阵列的麦克风检测到的音频数据进行声源定位,确定第五发声器相对于第一麦克风阵列的方位角θ 11
对于θ 12为0度的情况,控制设备可以根据方位角θ 10、方位角θ 11,以及第二麦克风阵列和第一麦克风阵列的几何关系,确定第一麦克风阵列的偏角θ 5。参考图18,计算过程可以如下:
$$\theta_{5}=\theta_{11}-\theta_{10}$$
对于θ 12不为0度的情况,控制设备可以根据方位角θ 10、方位角θ 11、夹角θ 12,以及第二麦克风阵列和第一麦克风阵列的几何关系,确定第一麦克风阵列的偏角θ 5。参考图19,计算过程可以如下:
$$\theta_{5}=\theta_{12}+\theta_{11}-\theta_{10}$$
需要说明的是,第一麦克风阵列和第二麦克风阵列之间的位置关系有多种可能,上述说明过程仅以其中的一种位置关系为例进行说明,对于其他可能的位置关系,均可以根据上述相关数据通过几何运算得到第一麦克风阵列的偏角。上述说明过程中采用的位置关系和计算方法不构成对本实施例的限定。
(3)摄像机的位置标定(如果存在多个摄像机,则每个摄像机的位置标定均可以采用如下处理方式)
控制设备控制第四发声器发出声音信号S 9,基于第四发声器发出声音信号S 9的时间点和第一麦克风阵列、第二麦克风阵列检测到声音信号S 9的时间点,确定第一麦克风阵列与第四发声器的距离D 6、以及第二麦克风阵列与第四发声器的距离D 7,基于第一麦克风阵列的位置、第二麦克风阵列的位置、距离D 6和距离D 7,确定第四发声器和摄像机的位置。
其中,第四发声器和摄像机的等效中心可以相同,即第四发声器的位置和摄像机的位置可以相同。
在实施中，控制设备控制第四发声器发出声音信号S9时，可以记录第四发声器发出声音信号S9的时间点t9。第一麦克风阵列和第二麦克风阵列可以检测到相应的音频数据，并记录有音频数据对应的检测时间点，即检测到该音频数据的时间点。控制设备可以获取第一麦克风阵列检测到声音信号S9的时间点t10、以及第二麦克风阵列检测到声音信号S9的时间点t11，然后，可以计算得到时间点t9与时间点t10之间的时长ΔT6、时间点t9与时间点t11之间的时长ΔT7。进而，控制设备可以根据预先存储的音速数据V，计算得到第一麦克风阵列与第四发声器的距离D6和第二麦克风阵列与第四发声器的距离D7。
根据第一麦克风阵列和第二麦克风阵列的位置,可以确定第一麦克风阵列和第二麦克风阵列之间的距离为D 8。然后,控制设备可以根据距离D 6、距离D 7和距离D 8,以及第一麦克风阵列、第二麦克风阵列和第四发声器之间的几何关系,通过计算得到第四发声器的位置。确定第四发声器的位置的计算过程与情况一中确定第一发声器的位置的过程相似,可以参照情况一的第一麦克风阵列的位置标定的相关说明。
(4)摄像机的偏角标定(如果存在多个摄像机,则每个摄像机的偏角标定均可以采用如下处理方式)
情况三的摄像机的偏角标定与情况二的相应处理相似,可以参照情况二的摄像机的偏角标定的说明,在此不做赘述。
基于相同的技术构思,本申请实施例还提供了一种导播控制的装置,该装置可以应用于上述实施例提到的导播控制系统中的控制设备,导播控制系统包括第一麦克风阵列、第二麦克风阵列、摄像机和控制设备,如图20所示,该装置包括:
标定模块2001,用于确定第一麦克风阵列的位置以及摄像机的位置。具体可以实现上述步骤401的标定功能,以及其他隐含步骤。
确定模块2002,用于在声源对象发声时,根据声源对象相对于第一麦克风阵列的位置、声源对象相对于第二麦克风阵列的位置、第一麦克风阵列的位置和第二麦克风阵列的位置,确定声源对象的位置。具体可以实现上述步骤402的确定功能,以及其他隐含步骤。
控制模块2003,用于基于声源对象的位置以及摄像机的位置,确定对摄像机的导播操作。具体可以实现上述步骤403的控制功能,以及其他隐含步骤。
在一种可能的实现方式中,第一麦克风阵列中集成有第一发声器,第二麦克风阵列包括第一麦克风和第二麦克风,标定模块2001用于:基于第一麦克风和第二麦克风接收到第一发声器发出的声音信号的时间以及第一发声器发出声音信号的时间确定第一发声器与第一麦克风之间的距离D 1以及第一发声器与第二麦克风之间的距离D 2;基于第一麦克风的位置、第二麦克风的位置、距离D 1和距离D 2,确定第一麦克风阵列相对于第二麦克风阵列的位置。
在一种可能的实现方式中,导播控制系统还包括第二发声器和第三发声器,第二发声器和第三发声器与第二麦克风阵列集成在同一电子屏幕上,标定模块2001还用于:获得第一麦克风阵列发送的第二发声器相对于第一麦克风阵列的方位角θ 3和第三发声器相对于第一麦克风阵列的方位角θ 4;基于方位角θ 3、方位角θ 4、第二发声器的位置与第三发声器的位置,确定第一麦克风阵列的方位。
在一种可能的实现方式中,摄像机集成有第四发声器,第二麦克风阵列包括第一麦克风和第二麦克风,标定模块2001用于:基于第一麦克风和第二麦克风接收到第四发声器发出声音信号的时间以及第四发声器发出声音信号的时间,确定第一麦克风与第四发声器的距离D 3、以及第二麦克风与第四发声器的距离D 4;基于第一麦克风的位置、第二麦克风的位置、距离D 3和距离D 4,确定摄像机相对于第二麦克风阵列的位置。
在一种可能的实现方式中,第一麦克风阵列集成有第一发声器,摄像机集成有第四发声器和第三麦克风阵列,标定模块2001用于:基于第三麦克风阵列在第一发声器发出声音信号时的检测数据,确定第一发声器相对于第三麦克风阵列的方位角θ 6,基于第一麦克风阵列在第四发声器发出声音信号时的检测数据,确定第四发声器相对于第一麦克风阵列的方位角θ 7; 基于方位角θ 6、方位角θ 7和第一麦克风阵列的方位,确定摄像机的偏角。
在一种可能的实现方式中，第一麦克风阵列集成有发光器，摄像机集成有第四发声器，标定模块2001用于：确定摄像机拍摄的图像中的发光点位置，图像是发光器发光时拍摄的，基于图像中的发光点位置以及摄像机的旋转角，确定发光器相对于摄像机的方位角θ9；基于第一麦克风阵列在第四发声器发出声音信号时的检测数据，确定第四发声器相对于第一麦克风阵列的方位角θ7；基于方位角θ9、方位角θ7和第一麦克风阵列的方位，确定摄像机的方位。
在一种可能的实现方式中,第一麦克风阵列集成有第一发声器,第二麦克风阵列包括第一麦克风和第二麦克风,标定模块2001用于:基于第二麦克风阵列在第一发声器发出声音信号时的检测数据确定第一发声器与第二麦克风阵列之间的距离D 5以及第一发声器相对于第二麦克风阵列的方位角θ 10;基于距离D 5、方位角θ 10和第二麦克风阵列的位置,确定第一麦克风阵列的位置。
在一种可能的实现方式中,第一麦克风阵列集成有第一发声器,第二麦克风阵列集成有第五发声器,标定模块2001用于:基于第二麦克风阵列在第一发声器发出声音信号时的检测数据,确定第一发声器相对于第二麦克风阵列的方位角θ 10,以及基于第一麦克风阵列在第五发声器发出声音信号时的检测数据,确定第五发声器相对于第一麦克风阵列的方位角θ 11;基于方位角θ 10、方位角θ 11和第二麦克风阵列的方位,确定第一麦克风阵列的方位。
在一种可能的实现方式中,摄像机集成有第四发声器,标定模块2001还用于:基于第一麦克风阵列和第二麦克风阵列接收到第四发声器发出的声音信号的时间和第四发声器发出声音信号的时间,确定第一麦克风阵列与第四发声器的距离D 6、以及第二麦克风阵列与第四发声器的距离D 7;基于第一麦克风阵列的位置、第二麦克风阵列的位置、距离D 6和距离D 7,确定摄像机的位置。
在一种可能的实现方式中,控制模块2003,用于:基于声源对象的位置和摄像机的位置,确定声源对象相对于摄像机的方位角、以及声源对象与摄像机的距离;基于声源对象相对于摄像机的方位角,确定摄像机的导播旋转角,并基于声源对象与摄像机的距离,确定摄像机的导播焦距。
在一种可能的实现方式中,导播控制系统还包括另一摄像机;控制模块2003,用于:基于声源对象的位置和两个摄像机的位置,确定两个摄像机中与声源对象距离较远的目标摄像机,基于声源对象的位置以及目标摄像机的位置,确定对目标摄像机的导播操作。
需要说明的是,上述标定模块2001、确定模块2002和控制模块2003可以由处理器实现,或者由处理器配合存储器、收发器来实现。
需要说明的是:上述实施例提供的导播控制的装置在执行导播控制处理时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的导播控制的装置与导播控制的方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
在上述实施例中，可以全部或部分地通过软件、硬件、固件或者其任意组合来实现，当使用软件实现时，可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令，在设备上加载和执行所述计算机程序指令时，全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机指令可以存储在计算机可读存储介质中，或者从一个计算机可读存储介质向另一个计算机可读存储介质传输，例如，所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线（例如同轴电缆、光纤、数字用户线）或无线（例如红外、无线、微波等）方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是设备能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质（如软盘、硬盘和磁带等），也可以是光介质（如数字视盘（digital video disk，DVD）等），或者半导体介质（如固态硬盘等）。
本领域普通技术人员可以理解实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。
以上所述仅为本申请一个实施例,并不用以限制本申请,凡在本申请的原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (25)

  1. 一种导播控制的方法,其特征在于,所述方法应用于导播控制系统,所述导播控制系统包括第一麦克风阵列、第二麦克风阵列、摄像机和控制设备,所述方法包括:
    所述控制设备确定第一麦克风阵列的位置以及所述摄像机的位置;
    在声源对象发声时,所述控制设备根据所述声源对象相对于所述第一麦克风阵列的位置、所述声源对象相对于所述第二麦克风阵列的位置、所述第一麦克风阵列的位置和所述第二麦克风阵列的位置,确定所述声源对象的位置;
    所述控制设备基于所述声源对象的位置以及所述摄像机的位置,确定对所述摄像机的导播操作。
  2. 根据权利要求1所述的方法,其特征在于,所述第一麦克风阵列中集成有第一发声器,所述第二麦克风阵列包括第一麦克风和第二麦克风,所述确定所述第一麦克风阵列的位置包括:
    所述控制设备基于所述第一麦克风和所述第二麦克风接收到所述第一发声器发出的声音信号的时间以及所述第一发声器发出声音信号的时间确定所述第一发声器与所述第一麦克风之间的距离D 1以及所述第一发声器与所述第二麦克风之间的距离D 2
    所述控制设备基于所述第一麦克风的位置、所述第二麦克风的位置、所述距离D 1和所述距离D 2,确定所述第一麦克风阵列相对于所述第二麦克风阵列的位置。
  3. 根据权利要求1或2所述的方法,其特征在于,所述导播控制系统还包括第二发声器和第三发声器,所述第二发声器和所述第三发声器与所述第二麦克风阵列集成在同一电子屏幕上,所述确定所述第一麦克风阵列的位置还包括:
    所述控制设备获得所述第一麦克风阵列发送的所述第二发声器相对于所述第一麦克风阵列的方位角θ 3和所述第三发声器相对于所述第一麦克风阵列的方位角θ 4
    所述控制设备基于所述方位角θ 3、所述方位角θ 4、所述第二发声器的位置与所述第三发声器的位置,确定所述第一麦克风阵列的方位。
  4. 根据权利要求1所述的方法,其特征在于,所述摄像机集成有第四发声器,所述第二麦克风阵列包括第一麦克风和第二麦克风,所述确定所述摄像机的位置包括:
    所述控制设备基于所述第一麦克风和所述第二麦克风接收到所述第四发声器发出声音信号的时间以及所述第四发声器发出声音信号的时间,确定所述第一麦克风与所述第四发声器的距离D 3、以及所述第二麦克风与所述第四发声器的距离D 4
    所述控制设备基于所述第一麦克风的位置、所述第二麦克风的位置、所述距离D 3和所述距离D 4,确定所述摄像机相对于所述第二麦克风阵列的位置。
  5. 根据权利要求3所述的方法,其特征在于,所述第一麦克风阵列集成有第一发声器,所述摄像机集成有第四发声器和第三麦克风阵列,所述确定所述摄像机的位置包括:
    所述控制设备基于所述第三麦克风阵列在所述第一发声器发出声音信号时的检测数据,确定所述第一发声器相对于所述第三麦克风阵列的方位角θ 6,基于所述第一麦克风阵列在第四发声器发出声音信号时的检测数据,确定所述第四发声器相对于所述第一麦克风阵列的方位角θ 7
    所述控制设备基于所述方位角θ 6、所述方位角θ 7和所述第一麦克风阵列的方位,确定所述摄像机的偏角。
  6. 根据权利要求3所述的方法,其特征在于,所述第一麦克风阵列集成有发光器,所述摄像机集成有第四发声器,所述确定所述摄像机的位置包括:
    所述控制设备确定所述摄像机拍摄的图像中的发光点位置,所述图像是发光器发光时拍摄的,基于所述图像中的发光点位置以及所述摄像机的旋转角,确定所述发光器相对于所述摄像机的方位角θ 9
    所述控制设备基于所述第一麦克风阵列在所述第四发声器发出声音信号时的检测数据,确定所述第四发声器相对于所述第一麦克风阵列的方位角θ 7
    所述控制设备基于所述方位角θ 9、所述方位角θ 7和所述第一麦克风阵列的方位,确定所述摄像机的方位。
  7. 根据权利要求1所述的方法,其特征在于,所述第一麦克风阵列集成有第一发声器,所述第二麦克风阵列包括第一麦克风和第二麦克风,所述确定所述第一麦克风阵列的位置包括:
    所述控制设备基于所述第二麦克风阵列在所述第一发声器发出声音信号时的检测数据确定所述第一发声器与所述第二麦克风阵列之间的距离D 5以及所述第一发声器相对于所述第二麦克风阵列的方位角θ 10
    所述控制设备基于所述距离D 5、所述方位角θ 10和所述第二麦克风阵列的位置,确定所述第一麦克风阵列的位置。
  8. 根据权利要求1所述的方法，其特征在于，所述第一麦克风阵列集成有第一发声器，第二麦克风阵列集成有第五发声器，所述确定所述第一麦克风阵列的位置包括：
    所述控制设备基于所述第二麦克风阵列在所述第一发声器发出声音信号时的检测数据,确定所述第一发声器相对于所述第二麦克风阵列的方位角θ 10,以及基于所述第一麦克风阵列在所述第五发声器发出声音信号时的检测数据,确定所述第五发声器相对于所述第一麦克风阵列的方位角θ 11
    所述控制设备基于所述方位角θ 10、所述方位角θ 11和所述第二麦克风阵列的方位,确定所述第一麦克风阵列的方位。
  9. 根据权利要求1所述的方法,其特征在于,所述摄像机集成有第四发声器,所述方法还包括:
    所述控制设备基于所述第一麦克风阵列和所述第二麦克风阵列接收到所述第四发声器发出的声音信号的时间和所述第四发声器发出所述声音信号的时间,确定所述第一麦克风阵列与所述第四发声器的距离D 6、以及所述第二麦克风阵列与所述第四发声器的距离D 7
    所述控制设备基于所述第一麦克风阵列的位置、所述第二麦克风阵列的位置、所述距离D 6和所述距离D 7,确定所述摄像机的位置。
  10. 根据权利要求1所述的方法,其特征在于,所述控制设备基于所述声源对象的位置以及所述摄像机的位置,确定对所述摄像机的导播操作,包括:
    所述控制设备基于所述声源对象的位置和所述摄像机的位置,确定所述声源对象相对于所述摄像机的方位角、以及所述声源对象与所述摄像机的距离;
    所述控制设备基于所述声源对象相对于所述摄像机的方位角,确定所述摄像机的导播旋转角,并基于所述声源对象与所述摄像机的距离,确定所述摄像机的导播焦距。
  11. 根据权利要求1所述的方法,其特征在于,所述导播控制系统还包括另一摄像机;
    所述控制设备基于所述声源对象的位置以及所述摄像机的位置,确定对所述摄像机的导播操作,包括:
    所述控制设备基于所述声源对象的位置和所述两个摄像机的位置,确定两个摄像机中与所述声源对象距离较远的目标摄像机,基于所述声源对象的位置以及所述目标摄像机的位置,确定对所述目标摄像机的导播操作。
  12. 一种导播控制的装置,其特征在于,所述装置应用于导播控制系统中的控制设备,所述导播控制系统包括第一麦克风阵列、第二麦克风阵列、摄像机和所述控制设备,所述装置包括:
    标定模块,用于确定第一麦克风阵列的位置以及所述摄像机的位置;
    确定模块,用于在声源对象发声时,根据所述声源对象相对于所述第一麦克风阵列的位置、所述声源对象相对于所述第二麦克风阵列的位置、所述第一麦克风阵列的位置和所述第二麦克风阵列的位置,确定所述声源对象的位置;
    控制模块,用于基于所述声源对象的位置以及所述摄像机的位置,确定对所述摄像机的导播操作。
  13. 根据权利要求12所述的装置,其特征在于,所述第一麦克风阵列中集成有第一发声器,所述第二麦克风阵列包括第一麦克风和第二麦克风,所述标定模块用于:
    基于所述第一麦克风和所述第二麦克风接收到所述第一发声器发出的声音信号的时间以及所述第一发声器发出声音信号的时间确定所述第一发声器与所述第一麦克风之间的距离D 1以及所述第一发声器与所述第二麦克风之间的距离D 2
    基于所述第一麦克风的位置、所述第二麦克风的位置、所述距离D 1和所述距离D 2,确定所述第一麦克风阵列相对于所述第二麦克风阵列的位置。
  14. 根据权利要求12或13所述的装置,其特征在于,所述导播控制系统还包括第二发声器和第三发声器,所述第二发声器和所述第三发声器与所述第二麦克风阵列集成在同一电子屏幕上,所述标定模块还用于:
    获得所述第一麦克风阵列发送的所述第二发声器相对于所述第一麦克风阵列的方位角θ 3 和所述第三发声器相对于所述第一麦克风阵列的方位角θ 4
    基于所述方位角θ 3、所述方位角θ 4、所述第二发声器的位置与所述第三发声器的位置,确定所述第一麦克风阵列的方位。
  15. 根据权利要求12所述的装置,其特征在于,所述摄像机集成有第四发声器,所述第二麦克风阵列包括第一麦克风和第二麦克风,所述标定模块用于:
    基于所述第一麦克风和所述第二麦克风接收到所述第四发声器发出声音信号的时间以及所述第四发声器发出声音信号的时间,确定所述第一麦克风与所述第四发声器的距离D 3、以及所述第二麦克风与所述第四发声器的距离D 4
    基于所述第一麦克风的位置、所述第二麦克风的位置、所述距离D 3和所述距离D 4,确定所述摄像机相对于所述第二麦克风阵列的位置。
  16. 根据权利要求14所述的装置,其特征在于,所述第一麦克风阵列集成有第一发声器,所述摄像机集成有第四发声器和第三麦克风阵列,所述标定模块用于:
    基于所述第三麦克风阵列在所述第一发声器发出声音信号时的检测数据,确定所述第一发声器相对于所述第三麦克风阵列的方位角θ 6,基于所述第一麦克风阵列在第四发声器发出声音信号时的检测数据,确定所述第四发声器相对于所述第一麦克风阵列的方位角θ 7
    基于所述方位角θ 6、所述方位角θ 7和所述第一麦克风阵列的方位,确定所述摄像机的偏角。
  17. 根据权利要求14所述的装置,其特征在于,所述第一麦克风阵列集成有发光器,所述摄像机集成有第四发声器,所述标定模块用于:
    确定所述摄像机拍摄的图像中的发光点位置,所述图像是发光器发光时拍摄的,基于所述图像中的发光点位置以及所述摄像机的旋转角,确定所述发光器相对于所述摄像机的方位角θ 9
    基于所述第一麦克风阵列在所述第四发声器发出声音信号时的检测数据,确定所述第四发声器相对于所述第一麦克风阵列的方位角θ 7
    基于所述方位角θ 9、所述方位角θ 7和所述第一麦克风阵列的方位,确定所述摄像机的方位。
  18. 根据权利要求12所述的装置,其特征在于,所述第一麦克风阵列集成有第一发声器,所述第二麦克风阵列包括第一麦克风和第二麦克风,所述标定模块用于:
    基于所述第二麦克风阵列在所述第一发声器发出声音信号时的检测数据确定所述第一发声器与所述第二麦克风阵列之间的距离D 5以及所述第一发声器相对于所述第二麦克风阵列的方位角θ 10
    基于所述距离D 5、所述方位角θ 10和所述第二麦克风阵列的位置,确定所述第一麦克风阵列的位置。
  19. 根据权利要求12所述的装置,其特征在于,所述第一麦克风阵列集成有第一发声器, 第二麦克风阵列集成有第五发声器,所述标定模块用于:
    基于所述第二麦克风阵列在所述第一发声器发出声音信号时的检测数据,确定所述第一发声器相对于所述第二麦克风阵列的方位角θ 10,以及基于所述第一麦克风阵列在所述第五发声器发出声音信号时的检测数据,确定所述第五发声器相对于所述第一麦克风阵列的方位角θ 11
    基于所述方位角θ 10、所述方位角θ 11和所述第二麦克风阵列的方位,确定所述第一麦克风阵列的方位。
  20. 根据权利要求12所述的装置,其特征在于,所述摄像机集成有第四发声器,所述标定模块还用于:
    基于所述第一麦克风阵列和所述第二麦克风阵列接收到所述第四发声器发出的声音信号的时间和所述第四发声器发出所述声音信号的时间,确定所述第一麦克风阵列与所述第四发声器的距离D 6、以及所述第二麦克风阵列与所述第四发声器的距离D 7
    基于所述第一麦克风阵列的位置、所述第二麦克风阵列的位置、所述距离D 6和所述距离D 7,确定所述摄像机的位置。
  21. 根据权利要求12所述的装置,其特征在于,所述控制模块,用于:
    基于所述声源对象的位置和所述摄像机的位置,确定所述声源对象相对于所述摄像机的方位角、以及所述声源对象与所述摄像机的距离;
    基于所述声源对象相对于所述摄像机的方位角,确定所述摄像机的导播旋转角,并基于所述声源对象与所述摄像机的距离,确定所述摄像机的导播焦距。
  22. 根据权利要求12所述的装置,其特征在于,所述导播控制系统还包括另一摄像机;
    所述控制模块,用于:
    基于所述声源对象的位置和所述两个摄像机的位置,确定两个摄像机中与所述声源对象距离较远的目标摄像机,基于所述声源对象的位置以及所述目标摄像机的位置,确定对所述目标摄像机的导播操作。
  23. 一种计算机设备,其特征在于,所述计算机设备包括存储器和处理器,所述存储器用于存储计算机指令;所述处理器用于执行所述存储器存储的计算机指令,以使所述计算机设备执行上述权利要求1至11中任一项所述的方法。
  24. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质存储有计算机程序代码,当所述计算机程序代码被计算机设备执行时,所述计算机设备执行上述权利要求1至11中任一项所述的方法。
  25. 一种计算机程序产品,其特征在于,所述计算机程序产品包括计算机程序代码,在所述计算机程序代码被计算机设备执行时,所述计算机设备执行上述权利要求1至11中任一项所述的方法。