CN103685906A - Control method, control device and control equipment - Google Patents

Control method, control device and control equipment Download PDF

Info

Publication number
CN103685906A
CN103685906A (application CN201210350741.3A)
Authority
CN
China
Prior art keywords
sound source
information
target sound
capture apparatus
control
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210350741.3A
Other languages
Chinese (zh)
Other versions
CN103685906B (en)
Inventor
陈军
黄强
黄志宏
袁洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CN201210350741.3A priority Critical patent/CN103685906B/en
Publication of CN103685906A publication Critical patent/CN103685906A/en
Application granted granted Critical
Publication of CN103685906B publication Critical patent/CN103685906B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Circuit For Audible Band Transducer (AREA)
  • Manipulator (AREA)

Abstract

Embodiments of the invention provide a control method, a control device and control equipment. The control method includes: acquiring audio data that contain sound information of a target sound source; determining position range information of the target sound source according to the audio data; and controlling, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, so that the shooting device can shoot the target sound source. The embodiments thus enable the shooting device to shoot a target sound source lying outside the original screen range.

Description

Control method, control device and control equipment
Technical field
The embodiments of the present invention relate to the field of image tracking, and in particular to a control method, a control device and control equipment.
Background art
In a video communication process, the camera needs to be aimed at the speaker. The existing solution uses image recognition technology to recognize a human face and then remotely controls the camera so that it aims at the position of the face; however, this scheme cannot track a speaker who has moved beyond the screen range, or another speaker outside the screen range.
Summary of the invention
In view of this, an object of the embodiments of the present invention is to provide a control method, a control device and control equipment, so that a shooting device can be supported in shooting a target sound source outside the original screen range.
To solve the above technical problem, the embodiments of the present invention provide the following solutions:
An embodiment of the present invention provides a control method, and the control method comprises:
acquiring audio data that contain sound information of a target sound source;
determining position range information of the target sound source according to the audio data;
controlling, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, so that the shooting device can shoot the target sound source.
Preferably, the position range information is direction information of the target sound source relative to the shooting device, and controlling, according to the position range information, the rotation of the shooting device that currently cannot shoot the target sound source specifically comprises:
determining a rotation control parameter of the shooting device corresponding to the direction information;
controlling the rotation of the shooting device according to the rotation control parameter.
Preferably, the audio data are collected by a sound collection device, and determining the position range information of the target sound source according to the audio data specifically comprises:
determining, according to the audio data, azimuth information of the target sound source relative to the sound collection device;
determining the direction information according to the azimuth information.
Preferably, determining the direction information according to the azimuth information specifically comprises:
determining the direction information according to the azimuth information and a preset correspondence between the azimuth information and the direction information.
Preferably, the sound collection device corresponds to a preset plane used for determining the azimuth information and a preset reference point in the preset plane, the shooting device corresponds to a first corresponding point in the preset plane, and the target sound source corresponds to a second corresponding point in the preset plane,
the azimuth information is the position coordinates of the second corresponding point relative to the preset reference point, the direction information is a direction characterization coordinate of the second corresponding point relative to the first corresponding point,
and the correspondence is a plane geometry function that takes the position coordinates of the first corresponding point relative to the preset reference point as its parameters, takes the position coordinates of a sounding corresponding point in the preset plane relative to the preset reference point as its independent variable, and takes the direction characterization coordinate of the sounding corresponding point relative to the first corresponding point as its dependent variable.
Preferably, the position coordinates of the first corresponding point relative to the preset reference point are the coordinates (a1, a2) in a rectangular coordinate system that lies in the preset plane and takes the preset reference point as a first origin, the position coordinates of the sounding corresponding point relative to the preset reference point are the coordinates (x, y) in the rectangular coordinate system, y is greater than a2, and the direction characterization coordinate is the angular coordinate b in a polar coordinate system that lies in the preset plane and takes the first corresponding point as a second origin,
when a2 is 0, the polar axis of the polar coordinate system has the same direction as the x axis of the rectangular coordinate system; when a2 is not 0, the polar axis is parallel to the x axis of the rectangular coordinate system and points in the same direction,
and the plane geometry function is b = arctan((y - a2)/(x - a1)), where x is not equal to a1; or
the plane geometry function is: when x is not equal to a1, b = arctan((y - a2)/(x - a1)); when x equals a1, b = 90 degrees.
Preferably, the parameters are determined, in a learning-and-training manner, from the position coordinates relative to the preset reference point and the direction characterization coordinate relative to the first corresponding point of at least one training point corresponding to a training sound source in the preset plane.
An embodiment of the present invention provides a control device, and the control device comprises:
an acquisition module, configured to acquire audio data that contain sound information of a target sound source;
a determination module, configured to determine position range information of the target sound source according to the audio data;
a control module, configured to control, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, so that the shooting device can shoot the target sound source.
Preferably, the position range information is direction information of the target sound source relative to the shooting device, and the control module comprises:
a first determining unit, configured to determine a rotation control parameter of the shooting device corresponding to the direction information;
a control unit, configured to control the rotation of the shooting device according to the rotation control parameter, so that the shooting device can shoot the target sound source.
Preferably, the audio data are collected by a sound collection device, and the determination module comprises:
a second determining unit, configured to determine, according to the audio data, azimuth information of the target sound source relative to the sound collection device;
a third determining unit, configured to determine the direction information according to the azimuth information.
Preferably, the third determining unit comprises:
a determining subunit, configured to determine the direction information according to the azimuth information and a preset correspondence between the azimuth information and the direction information.
An embodiment of the present invention provides control equipment that comprises the control device described above.
From the above it can be seen that the control method, control device and control equipment provided by the embodiments of the present invention have at least the following technical effect:
by acquiring audio data that contain sound information of a target sound source, determining position range information of the target sound source accordingly, and controlling, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, the shooting device is enabled to shoot the target sound source, so that the shooting device is supported in shooting a target sound source outside the original screen range.
Brief description of the drawings
Fig. 1 is a flow chart of a control method provided by an embodiment of the present invention;
Fig. 2 is a position coordinate diagram of the microphone array and the sound source in preferred embodiment one of the control method provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of the microphone array of preferred embodiment two of the control method provided by an embodiment of the present invention, placed directly in front of the camera;
Fig. 4 is a position coordinate diagram of the microphone array and the sound source in preferred embodiment two of the control method provided by an embodiment of the present invention;
Fig. 5 is a training schematic diagram of preferred embodiment two of the control method provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of preferred embodiment three of the control method provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the embodiments of the present invention are described in detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flow chart of a control method provided by an embodiment of the present invention. Referring to Fig. 1, the embodiment of the present invention provides a control method, and the control method comprises the following steps:
Step 101: acquire audio data that contain sound information of a target sound source;
Step 102: determine position range information of the target sound source according to the audio data;
Step 103: control, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, so that the shooting device can shoot the target sound source.
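Purely as an illustration of the flow of Steps 101 to 103, a minimal Python sketch is given below; the audio-source and camera interfaces and the localization routine are hypothetical placeholders and are not part of the embodiments.

```python
# Hypothetical sketch of Steps 101-103. The audio_source/camera interfaces and the
# localize() routine are assumptions made only for this illustration.
def control_step(audio_source, camera, localize):
    # Step 101: acquire audio data containing the target sound source's sound information.
    frames = audio_source.read_frames()
    # Step 102: determine position range information (here a bearing angle in degrees).
    bearing_deg = localize(frames)
    if bearing_deg is None:
        return  # no usable sound information in this batch of audio data
    # Step 103: if the shooting device currently cannot shoot the source, rotate it.
    if not camera.can_shoot(bearing_deg):
        camera.pan_to(bearing_deg)
```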
It can be seen that, by acquiring audio data that contain sound information of the target sound source, determining position range information of the target sound source accordingly, and controlling, according to this position range information, the rotation of a shooting device that currently cannot shoot the target sound source, the shooting device is enabled to shoot the target sound source, thereby supporting the shooting device in shooting a target sound source outside the original screen range.
Obviously, the target sound source should lie within the overall range that the shooting device can cover by rotating.
The target sound source may be a speaker, or may be a sound-emitting device.
The shooting device may be a video camera or a still camera.
Specifically, for example, the sound information may contain preset keyword content that expresses the position range information, and the position range information can then be determined from the audio data by speech recognition.
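As an illustration only of this keyword-based variant, the following sketch maps recognized keywords to coarse position range information; the keyword set, the pan angles and the assumption that a speech recognizer has already produced the text are all invented for the example.

```python
# Hypothetical mapping from preset keywords in the recognized speech to a coarse
# position range (a pan angle in degrees). Keywords and angles are illustrative only.
KEYWORD_TO_PAN_DEG = {"left": -60.0, "front": 0.0, "right": 60.0}

def position_range_from_text(recognized_text):
    """Return a pan angle if the recognized speech contains a preset keyword, else None."""
    text = recognized_text.lower()
    for keyword, pan_deg in KEYWORD_TO_PAN_DEG.items():
        if keyword in text:
            return pan_deg
    return None
```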
Or, for example, the position range information may be direction information of the target sound source relative to the shooting device, and controlling, according to the position range information, the rotation of the shooting device that currently cannot shoot the target sound source may specifically be:
determining a rotation control parameter of the shooting device corresponding to the direction information;
controlling the rotation of the shooting device according to the rotation control parameter.
The rotation control parameter is, for example, an identifier of one of several adjustable angles of the shooting device, the rotation angle of the pan-tilt controller of a camera, the direction parameter of the optical axis of a camera, and so on.
Specifically, the audio data may be collected by a sound collection device, and determining the position range information of the target sound source according to the audio data may specifically be:
determining, according to the audio data, azimuth information of the target sound source relative to the sound collection device;
determining the direction information according to the azimuth information.
The sound collection device is, for example, a microphone array.
The azimuth information may be direction information or position information.
Further, determining the direction information according to the azimuth information may specifically be:
determining the direction information according to the azimuth information and a preset correspondence between the azimuth information and the direction information.
Specifically, for example, enough combinations of azimuth information and direction information are obtained by training at a sufficient number of points, and the correspondence is obtained by fitting these combinations; for example, a training sound source may be placed or moved at a granularity of 0.1 m.
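As a sketch of one way such a correspondence could be obtained from the training combinations, the snippet below fits the azimuth-to-direction mapping by linear interpolation with NumPy; the tabulated values and the choice of interpolation are assumptions, since the embodiment only requires that the combinations be fitted in some manner.

```python
import numpy as np

# Training combinations (illustrative values): azimuth of the training sound source
# relative to the sound collection device, and the matching direction information
# relative to the shooting device, both in degrees.
train_azimuth_deg = np.array([-60.0, -30.0, 0.0, 30.0, 60.0])
train_direction_deg = np.array([-48.0, -22.0, 3.0, 27.0, 52.0])

def direction_from_azimuth(azimuth_deg):
    """Preset correspondence obtained by fitting the training combinations."""
    return float(np.interp(azimuth_deg, train_azimuth_deg, train_direction_deg))
```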
As another example, the shooting device and the sound collection device are placed in a certain positional relationship such that, wherever the target sound source is, the direction represented by the azimuth information is consistent with the corresponding direction information, and the correspondence is determined on the basis of this positional relationship. For instance, the shooting device and the sound collection device may be placed together, or, when the sound collection device is placed horizontally, the shooting device may be placed directly above it. Preferred embodiment one below adopts the placed-together arrangement.
It should be noted that, considering the practical restrictions on where the shooting device can be placed, a certain deviation of its placement is allowed. Because the shooting device captures a fairly wide range at any given moment, as long as the deviation still lets the shooting device shoot the direction represented by the azimuth information, this can be handled on site in engineering practice and is not described further here.
As another example, the sound collection device corresponds to a preset plane used for determining the azimuth information and a preset reference point in the preset plane, the shooting device corresponds to a first corresponding point in the preset plane, and the target sound source corresponds to a second corresponding point in the preset plane,
the azimuth information is the position coordinates of the second corresponding point relative to the preset reference point, the direction information is a direction characterization value of the second corresponding point relative to the first corresponding point,
and the correspondence is a plane geometry function that takes the position coordinates of the first corresponding point relative to the preset reference point as its parameters, takes the position coordinates of a sounding corresponding point in the preset plane relative to the preset reference point as its independent variable, and takes the direction characterization value of the sounding corresponding point relative to the first corresponding point as its dependent variable.
The first corresponding point is, for example, the optical center of the shooting device, or the projection of the optical center of the shooting device onto the preset plane.
The second corresponding point is, for example, a point of the target sound source that lies in the preset plane, or the projection onto the preset plane of a point of the target sound source that does not lie in the preset plane.
The sounding corresponding point is, for example, the sounding reference point of a sound source when it lies in the preset plane, or the projection onto the preset plane of the sounding reference point of a sound source that does not lie in the preset plane. The sounding reference point may be a point on a person's throat or a point on the sound output unit of a sound source.
The direction characterization value is, for example, the angular coordinate of the sounding corresponding point in a coordinate system in the preset plane whose origin is the first corresponding point.
Which preset plane and which preset reference point correspond to the sound collection device depends on the specific device adopted, for example the orientation plane and the location reference point adopted by a planar microphone array.
It should be noted that, in practical applications, the sound source may lie in the preset plane or on either side of it, and, owing to other factors, the resulting azimuth information may contain a very small error; however, because the shooting device captures a fairly wide range at any given moment, this error does not affect solving the technical problem addressed by the embodiments of the present invention.
A concrete example of the plane geometry function is given here: the position coordinates of the first corresponding point relative to the preset reference point are the coordinates (a1, a2) in a rectangular coordinate system that lies in the preset plane and takes the preset reference point as a first origin, the position coordinates of the sounding corresponding point relative to the preset reference point are the coordinates (x, y) in the rectangular coordinate system, y is greater than a2, and the direction characterization coordinate is the angular coordinate b in a polar coordinate system that lies in the preset plane and takes the first corresponding point as a second origin,
when a2 is 0, the polar axis of the polar coordinate system has the same direction as the x axis of the rectangular coordinate system; when a2 is not 0, the polar axis is parallel to the x axis of the rectangular coordinate system and points in the same direction,
and the plane geometry function is b = arctan((y - a2)/(x - a1)), where x is not equal to a1; or
the plane geometry function is: when x is not equal to a1, b = arctan((y - a2)/(x - a1)); when x equals a1, b = 90 degrees.
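For illustration, the plane geometry function just defined can be written directly as code (angles in degrees); the function below follows the two-case definition in the text, with (a1, a2) as the parameters and (x, y) as the sounding corresponding point.

```python
import math

def plane_geometry_b(x, y, a1, a2):
    """b = arctan((y - a2) / (x - a1)) when x != a1, and b = 90 degrees when x == a1."""
    if x == a1:
        return 90.0
    return math.degrees(math.atan((y - a2) / (x - a1)))

# Example: first corresponding point at (0, 0), sounding corresponding point at (1.0, 1.0) -> 45 degrees.
print(plane_geometry_b(1.0, 1.0, 0.0, 0.0))
```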
The parameters can be obtained by on-site measurement during project implementation; or the parameters can be determined, in a learning-and-training manner, from the position coordinates relative to the preset reference point and the direction characterization value relative to the first corresponding point of at least one training point corresponding to a training sound source in the preset plane.
One example of the learning-and-training approach is:
determining first position coordinates, which are the position coordinates relative to the preset reference point of a first training point corresponding to a first training sound source in the preset plane, and a first direction characterization value, which is its direction characterization value relative to the first corresponding point;
obtaining the parameters according to the first position coordinates and the first direction characterization value;
where the first training point, the first corresponding point and the preset reference point are not collinear.
The learning-and-training procedure in preferred embodiment two below adopts exactly this approach.
Another example of the learning-and-training approach is:
determining second position coordinates, which are the position coordinates relative to the preset reference point of a second training point corresponding to a second training sound source in the preset plane, and a second direction characterization value, which is its direction characterization value relative to the first corresponding point;
determining third position coordinates, which are the position coordinates relative to the preset reference point of a third training point corresponding to a third training sound source in the preset plane, and a third direction characterization value, which is its direction characterization value relative to the first corresponding point;
obtaining the parameters according to the second position coordinates, the second direction characterization value, the third position coordinates and the third direction characterization value;
where the second training point, the third training point and the first corresponding point are not collinear.
The learning-and-training procedure in preferred embodiment three below adopts exactly this approach.
To explain the above control method further, three preferred embodiments of the control method are given below:
Preferred embodiment one:
Fig. 2 is a position coordinate diagram of the microphone array and the sound source in preferred embodiment one of the control method. Referring to Fig. 2, microphone arrays come in many physical forms; this preferred embodiment uses a linear microphone array comprising at least three microphone elements. In addition, the camera and the microphone array are placed together. The steps of this preferred embodiment are as follows:
Step 201: the microphone elements of the array each receive audio data, which are sent to a processing center after background noise has been filtered out, or are sent to the processing center first and the noise is filtered there.
Step 202: the processing center separates out and extracts the speech portion of each channel of audio data according to frequency, and then calculates, from the phase of the speech portion in each channel, the time differences with which the microphone elements receive the speech.
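Step 202 derives the inter-element time differences from the phase of the extracted speech; as a simplified stand-in, the sketch below estimates the delay between two microphone channels from the peak of their cross-correlation, and converts it to a range difference as described in Step 203 below. The sampling rate, the cross-correlation method and the value used for the speed of sound are assumptions, not details taken from this embodiment.

```python
import numpy as np

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air (assumption)

def time_difference_s(sig_a, sig_b, sample_rate_hz=16000):
    """Estimate how much later sig_a arrives than sig_b, in seconds, from the peak
    of their cross-correlation (a stand-in for the phase-based computation of Step 202)."""
    corr = np.correlate(sig_a, sig_b, mode="full")
    lag_samples = int(np.argmax(corr)) - (len(sig_b) - 1)
    return lag_samples / float(sample_rate_hz)

def range_difference_m(sig_a, sig_b, sample_rate_hz=16000):
    # The time difference multiplied by the speed of sound gives the range difference.
    return time_difference_s(sig_a, sig_b, sample_rate_hz) * SPEED_OF_SOUND_M_S
```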
Step 203: the time differences between the microphone elements are multiplied by the speed of sound to obtain range differences, and the bearing of the sound is then calculated from the range differences among the three microphone elements.
Specifically, the spacing between adjacent microphone elements of the array is a known distance, denoted R. Microphone element 2 is taken as the origin of coordinates, microphone element 1 is at (-R, 0) and microphone element 3 is at (R, 0); the sound source coordinates to be calculated are (x, y).
The distances from the sound source to microphone elements 1, 2 and 3 are denoted L1, L2 and L3 respectively. The time differences measured in the previous step, multiplied by the speed of sound, are the differences among L1, L2 and L3; that is, the values of L1-L3 and L2-L3 are known. The known L1-L3 is denoted D13 and L2-L3 is denoted D23.
According to the Pythagorean theorem:
L1 = sqrt((x+R)^2 + y^2) = sqrt(x^2 + y^2 + R^2 + 2xR)
L2 = sqrt(x^2 + y^2)
L3 = sqrt((x-R)^2 + y^2) = sqrt(x^2 + y^2 + R^2 - 2xR)
Therefore:
D13 = L1 - L3 = sqrt(x^2 + y^2 + R^2 + 2xR) - sqrt(x^2 + y^2 + R^2 - 2xR)
Squaring:
D13^2 = 2x^2 + 2y^2 + 2R^2 - 2*sqrt(x^4 + y^4 + R^4 + 2x^2*y^2 - 2x^2*R^2 + 2y^2*R^2)
Rearranging:
sqrt(x^4 + y^4 + R^4 + 2x^2*y^2 - 2x^2*R^2 + 2y^2*R^2) = x^2 + y^2 + R^2 - 0.5*D13^2
Squaring again:
x^4 + y^4 + R^4 + 2x^2*y^2 - 2x^2*R^2 + 2y^2*R^2 = x^4 + y^4 + (R^2 - 0.5*D13^2)^2 + 2x^2*y^2 + 2x^2*(R^2 - 0.5*D13^2) + 2y^2*(R^2 - 0.5*D13^2)
Expanding:
x^4 + y^4 + R^4 + 2x^2*y^2 - 2x^2*R^2 + 2y^2*R^2 = x^4 + y^4 + R^4 - R^2*D13^2 + 0.25*D13^4 + 2x^2*y^2 + 2x^2*R^2 - x^2*D13^2 + 2y^2*R^2 - y^2*D13^2
Cancelling the terms common to both sides:
y^2*D13^2 = -R^2*D13^2 + 0.25*D13^4 + 4x^2*R^2 - x^2*D13^2
Finally:
y = ±sqrt((4R^2/D13^2 - 1)*x^2 + 0.25*D13^2 - R^2)
In the practical application scenario of this preferred embodiment the sound source always lies in front of the array, so the negative sign can be dropped:
y = sqrt((4R^2/D13^2 - 1)*x^2 + 0.25*D13^2 - R^2)    (Formula A)
At the same time, the following must also be satisfied:
D23 = L2 - L3 = sqrt(x^2 + y^2) - sqrt(x^2 + y^2 + R^2 - 2xR)    (Formula B)
Using a software program, the x and y that simultaneously satisfy Formula A and Formula B can easily be obtained. Specifically: the sign of x is judged from the sign of D13; then, with x as the loop variable, Formula A is used to obtain y, and the pair (x, y) is substituted into Formula B until Formula B holds; the (x, y) obtained at that point is the sound source position.
The bearing angle of the sound source is then arctan(y/x).
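A minimal numerical sketch of the search just described is given below: the sign of x is taken from the sign of D13, x is used as the loop variable, Formula A yields y, and the pair is accepted when Formula B is (approximately) satisfied. The search range, step size and tolerance are assumptions introduced for the example, and the routine assumes D13 is non-zero.

```python
import math

def locate_source(d13, d23, r, x_max=10.0, step=0.001, tol=0.005):
    """Solve Formula A and Formula B of preferred embodiment one for (x, y).
    d13 = L1 - L3, d23 = L2 - L3, r = microphone spacing, all in metres; d13 != 0."""
    sign = 1.0 if d13 >= 0 else -1.0  # the sign of x follows the sign of D13
    best = None
    x_abs = step
    while x_abs <= x_max:
        x = sign * x_abs
        under = (4.0 * r * r / (d13 * d13) - 1.0) * x * x + 0.25 * d13 * d13 - r * r
        if under >= 0.0:
            y = math.sqrt(under)  # Formula A
            lhs = math.sqrt(x * x + y * y) - math.sqrt(x * x + y * y + r * r - 2.0 * x * r)
            err = abs(lhs - d23)  # how closely Formula B is satisfied
            if best is None or err < best[0]:
                best = (err, x, y)
            if err < tol:
                break
        x_abs += step
    if best is None:
        return None
    _, x, y = best
    # Bearing angle of the sound source; atan2 also covers the x < 0 half-plane.
    return x, y, math.degrees(math.atan2(y, x))
```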
Step 204: according to the sound bearing calculated in the previous step, the camera is controlled to rotate to and aim at that direction.
Step 205: in the image captured by the camera, face recognition technology is used to find the position of the face, specifically as follows:
first step: input the collected image data;
second step: binarize the image with a skin-colour model, that is, set the non-skin-colour parts to 0 and the skin-colour parts to 1, where the range of skin-colour values can be obtained by statistical learning on the physical device;
third step: filter the result with an erosion-and-dilation algorithm;
fourth step: use connected-region detection, and judge the face position by taking as the criterion a connected region whose size matches a face and whose height is greater than or equal to its width.
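The four face-finding sub-steps can be sketched with OpenCV as follows; the YCrCb skin-colour thresholds, the kernel size and the minimum region area are illustrative assumptions, since the embodiment only states that the skin-colour range is obtained by statistical learning on the physical device.

```python
import cv2
import numpy as np

def find_face_box(bgr_image):
    """Skin-colour binarization, erosion/dilation filtering and connected-region
    detection, following the four sub-steps of Step 205; returns (x, y, w, h) or None."""
    # Sub-steps 1-2: convert to YCrCb and binarize with an illustrative skin-colour range.
    ycrcb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)
    lower = np.array([0, 133, 77], dtype=np.uint8)
    upper = np.array([255, 173, 127], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower, upper)
    # Sub-step 3: erosion followed by dilation to filter out noise.
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.dilate(cv2.erode(mask, kernel), kernel)
    # Sub-step 4: connected-region detection; keep a region whose height is at least
    # its width and whose area is plausible for a face.
    num, _, stats, _ = cv2.connectedComponentsWithStats(mask)
    best = None
    for i in range(1, num):  # label 0 is the background
        x, y, w, h, area = stats[i]
        if h >= w and area > 400 and (best is None or area > best[4]):
            best = (x, y, w, h, area)
    return None if best is None else best[:4]
```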
Step 206: the camera is rotated toward the face until it is aimed at the face.
Preferred embodiment two:
Fig. 3 is a schematic diagram of the microphone array of preferred embodiment two of the control method, placed directly in front of the camera. Referring to Fig. 3, microphone arrays come in many physical forms; this preferred embodiment uses a circular microphone array comprising at least three microphone elements. The camera and the microphone array are not placed together: the microphone array is placed directly in front of the camera. The steps of this preferred embodiment are as follows:
Step 301: the microphone elements of the array each receive audio data, which are sent to a processing center after background noise has been filtered out, or are sent to the processing center first and the noise is filtered there.
Step 302: the processing center separates out and extracts the speech portion of each channel of audio data according to frequency, and then calculates, from the phase of the speech portion in each channel, the time differences with which the microphone elements receive the speech.
Step 303: the time differences between the microphone elements are multiplied by the speed of sound to obtain range differences, and the bearing of the sound is then calculated from the range differences among the three microphone elements.
Specifically, Fig. 4 is a position coordinate diagram of the microphone array and the sound source in preferred embodiment two of the control method. Referring to Fig. 4, the positions of the microphone elements in the array are known and are characterized by a known distance R: with the center of the microphone array as the origin of coordinates, microphone element 1 is at (-R, 0), microphone element 2 is at (0, R) and microphone element 3 is at (R, 0); the sound source coordinates to be calculated are (x, y).
The distances from the sound source to microphone elements 1, 2 and 3 are denoted L1, L2 and L3 respectively. The time differences measured in the previous step, multiplied by the speed of sound, are the differences among L1, L2 and L3; that is, the values of L1-L3 and L2-L3 are known. The known L1-L3 is denoted D13 and L2-L3 is denoted D23.
According to the Pythagorean theorem:
L1 = sqrt((x+R)^2 + y^2) = sqrt(x^2 + y^2 + R^2 + 2xR)
L2 = sqrt(x^2 + (y-R)^2) = sqrt(x^2 + y^2 + R^2 - 2yR)
L3 = sqrt((x-R)^2 + y^2) = sqrt(x^2 + y^2 + R^2 - 2xR)
Therefore:
D13 = L1 - L3 = sqrt(x^2 + y^2 + R^2 + 2xR) - sqrt(x^2 + y^2 + R^2 - 2xR)
From D13, exactly as in preferred embodiment one, Formula A can be derived:
y = sqrt((4R^2/D13^2 - 1)*x^2 + 0.25*D13^2 - R^2)    (Formula A)
At the same time, the following must also be satisfied:
D23 = L2 - L3 = sqrt(x^2 + y^2 + R^2 - 2yR) - sqrt(x^2 + y^2 + R^2 - 2xR)    (Formula C)
Using a software program, the x and y that simultaneously satisfy Formula A and Formula C can easily be obtained. Specifically: the sign of x is judged from the sign of D13; then, with x as the loop variable, Formula A is used to obtain y, and the pair (x, y) is substituted into Formula C until Formula C holds; the (x, y) obtained at that point is the sound source position.
Step 304: the camera is aimed at the direction represented by the angle arctan((d+y)/x).
In an actual usage scenario, because the positions of the microphone array and the camera in the meeting room are fixed and do not move, d can be obtained by learning and training. Specifically, Fig. 5 is a training schematic diagram of preferred embodiment two of the control method. Referring to Fig. 5, during training the speaker stands somewhere other than directly in front of the camera, that is, the angle a in Fig. 5 must not be 90 degrees; the camera is then rotated to aim at the speaker and records the angle b. After the speaker has spoken, the x and y coordinates are obtained with the steps above, and the distance d between the camera and the microphone array can be calculated as d = x/tan(b) - y.
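A small sketch of this training step and of the Step 304 aiming angle is given below, under the assumption that the angle b recorded by the camera is measured from its straight-ahead direction, so that tan(b) = x/(y + d) as the formula in the text implies; the example values are invented.

```python
import math

def calibrate_d(x, y, b_deg):
    """Distance d between camera and microphone array from one training utterance
    at array coordinates (x, y), with camera-recorded angle b: d = x/tan(b) - y."""
    return x / math.tan(math.radians(b_deg)) - y

def aim_angle_deg(x, y, d):
    """Step 304: aim the camera at the direction arctan((d + y) / x)."""
    return math.degrees(math.atan2(d + y, x))

# Example (invented values): training utterance at (1.2, 2.0) m, recorded angle 28 degrees.
d = calibrate_d(1.2, 2.0, 28.0)
print(d, aim_angle_deg(1.2, 2.0, d))
```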
Step 305: in the image captured by the camera, face recognition technology is used to find the position of the face, as follows:
first step: input the collected YUV data;
second step: binarize the image with a skin-colour model, that is, set the non-skin-colour parts to 0 and the skin-colour parts to 1, where the range of skin-colour values can be obtained by statistical learning on the physical device;
third step: filter the result with an erosion-and-dilation algorithm;
fourth step: use connected-region detection, and judge the face position by taking as the criterion a connected region whose size matches a face and whose height is greater than or equal to its width.
Step 306: the camera is then rotated toward the face until it is aimed at the face.
Preferred embodiment three:
Fig. 6 is a schematic diagram of preferred embodiment three of the control method. Referring to Fig. 6, microphone arrays come in many physical forms; this preferred embodiment uses a circular microphone array comprising at least three microphone elements. The camera and the microphone array are not placed together: the microphone array is placed in front of the camera with a horizontal displacement. The sound source position coordinates are (x, y), and the coordinates of the camera relative to the microphone array are (l, -d). The steps of this preferred embodiment are as follows:
Step 401: x and y are obtained in the same way as in steps 301 to 303 of preferred embodiment two.
Step 402: the camera is aimed at the direction represented by the angle b, where b = arctan((y+d)/(x-l)).
In an actual usage scenario, because the positions of the microphone array and the camera in the meeting room are fixed and do not move, d and l can be obtained by learning and training. Specifically, first, the trainer stands directly in front of the camera and speaks; the microphone array calculates the coordinates (x1, y1), and the abscissa of the camera is l = x1. Then the trainer stands somewhere other than directly in front of the camera and speaks; the operator controls the camera so that it aims at the trainer, and the camera records the angle b2; the microphone array calculates the coordinates (x2, y2). Since tan(b2) = (y2+d)/(x2-l) and l = x1, tan(b2) = (y2+d)/(x2-x1), from which d = tan(b2)*(x2-x1) - y2.
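Under similar assumptions, the two-utterance training of preferred embodiment three can be sketched as follows, with b2 the angle recorded by the camera as defined in Step 402 (in degrees); all example values are invented.

```python
import math

def calibrate_l_d(x1, x2, y2, b2_deg):
    """First utterance directly in front of the camera gives l = x1; the second,
    off-axis utterance gives d = tan(b2) * (x2 - x1) - y2."""
    l = x1
    d = math.tan(math.radians(b2_deg)) * (x2 - x1) - y2
    return l, d

def aim_angle_b_deg(x, y, l, d):
    """Step 402: b = arctan((y + d) / (x - l))."""
    return math.degrees(math.atan2(y + d, x - l))

# Example (invented values):
l, d = calibrate_l_d(0.5, 2.0, 1.5, 52.0)
print(l, d, aim_angle_b_deg(2.0, 1.5, l, d))
```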
Step 403: in the image captured by the camera, face recognition technology is used to find the position of the face, as follows:
first step: input the collected YUV data;
second step: binarize the image with a skin-colour model, that is, set the non-skin-colour parts to 0 and the skin-colour parts to 1, where the range of skin-colour values can be obtained by statistical learning on the physical device;
third step: filter the result with an erosion-and-dilation algorithm;
fourth step: use connected-region detection, and judge the face position by taking as the criterion a connected region whose size matches a face and whose height is greater than or equal to its width.
Step 404: the camera is then rotated toward the face until it is aimed at the face.
An embodiment of the present invention also provides a control device, and the control device comprises:
an acquisition module, configured to acquire audio data that contain sound information of a target sound source;
a determination module, configured to determine position range information of the target sound source according to the audio data;
a control module, configured to control, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, so that the shooting device can shoot the target sound source.
It can be seen that, by acquiring audio data that contain sound information of the target sound source, determining position range information of the target sound source accordingly, and controlling, according to this position range information, the rotation of a shooting device that currently cannot shoot the target sound source, the shooting device is enabled to shoot the target sound source, thereby supporting the shooting device in shooting a target sound source outside the original screen range.
Further, the position range information is direction information of the target sound source relative to the shooting device, and the control module comprises:
a first determining unit, configured to determine a rotation control parameter of the shooting device corresponding to the direction information;
a control unit, configured to control the rotation of the shooting device according to the rotation control parameter, so that the shooting device can shoot the target sound source.
Further, the audio data are collected by a sound collection device, and the determination module comprises:
a second determining unit, configured to determine, according to the audio data, azimuth information of the target sound source relative to the sound collection device;
a third determining unit, configured to determine the direction information according to the azimuth information.
Further, the third determining unit comprises:
a determining subunit, configured to determine the direction information according to the azimuth information and a preset correspondence between the azimuth information and the direction information.
An embodiment of the present invention also provides control equipment, and the control equipment comprises the control device described above.
The above are only embodiments of the present invention. It should be pointed out that those skilled in the art can make several improvements and modifications without departing from the principles of the embodiments of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the embodiments of the present invention.

Claims (12)

1. A control method, characterized in that the control method comprises:
acquiring audio data that contain sound information of a target sound source;
determining position range information of the target sound source according to the audio data;
controlling, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, so that the shooting device can shoot the target sound source.
2. The control method according to claim 1, characterized in that the position range information is direction information of the target sound source relative to the shooting device,
and controlling, according to the position range information, the rotation of the shooting device that currently cannot shoot the target sound source specifically comprises:
determining a rotation control parameter of the shooting device corresponding to the direction information;
controlling the rotation of the shooting device according to the rotation control parameter.
3. The control method according to claim 2, characterized in that the audio data are collected by a sound collection device, and determining the position range information of the target sound source according to the audio data specifically comprises:
determining, according to the audio data, azimuth information of the target sound source relative to the sound collection device;
determining the direction information according to the azimuth information.
4. The control method according to claim 3, characterized in that determining the direction information according to the azimuth information specifically comprises:
determining the direction information according to the azimuth information and a preset correspondence between the azimuth information and the direction information.
5. The control method according to claim 4, characterized in that the sound collection device corresponds to a preset plane used for determining the azimuth information and a preset reference point in the preset plane, the shooting device corresponds to a first corresponding point in the preset plane, and the target sound source corresponds to a second corresponding point in the preset plane,
the azimuth information is the position coordinates of the second corresponding point relative to the preset reference point, the direction information is a direction characterization coordinate of the second corresponding point relative to the first corresponding point,
and the correspondence is a plane geometry function that takes the position coordinates of the first corresponding point relative to the preset reference point as its parameters, takes the position coordinates of a sounding corresponding point in the preset plane relative to the preset reference point as its independent variable, and takes the direction characterization coordinate of the sounding corresponding point relative to the first corresponding point as its dependent variable.
6. The control method according to claim 5, characterized in that the position coordinates of the first corresponding point relative to the preset reference point are the coordinates (a1, a2) in a rectangular coordinate system that lies in the preset plane and takes the preset reference point as a first origin, the position coordinates of the sounding corresponding point relative to the preset reference point are the coordinates (x, y) in the rectangular coordinate system, y is greater than a2, and the direction characterization coordinate is the angular coordinate b in a polar coordinate system that lies in the preset plane and takes the first corresponding point as a second origin,
when a2 is 0, the polar axis of the polar coordinate system has the same direction as the x axis of the rectangular coordinate system; when a2 is not 0, the polar axis is parallel to the x axis of the rectangular coordinate system and points in the same direction,
and the plane geometry function is b = arctan((y - a2)/(x - a1)), where x is not equal to a1; or
the plane geometry function is: when x is not equal to a1, b = arctan((y - a2)/(x - a1)); when x equals a1, b = 90 degrees.
7. The control method according to claim 5, characterized in that the parameters are determined, in a learning-and-training manner, from the position coordinates relative to the preset reference point and the direction characterization coordinate relative to the first corresponding point of at least one training point corresponding to a training sound source in the preset plane.
8. A control device, characterized in that the control device comprises:
an acquisition module, configured to acquire audio data that contain sound information of a target sound source;
a determination module, configured to determine position range information of the target sound source according to the audio data;
a control module, configured to control, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, so that the shooting device can shoot the target sound source.
9. The control device according to claim 8, characterized in that the position range information is direction information of the target sound source relative to the shooting device, and the control module comprises:
a first determining unit, configured to determine a rotation control parameter of the shooting device corresponding to the direction information;
a control unit, configured to control the rotation of the shooting device according to the rotation control parameter, so that the shooting device can shoot the target sound source.
10. The control device according to claim 9, characterized in that the audio data are collected by a sound collection device, and the determination module comprises:
a second determining unit, configured to determine, according to the audio data, azimuth information of the target sound source relative to the sound collection device;
a third determining unit, configured to determine the direction information according to the azimuth information.
11. The control device according to claim 10, characterized in that the third determining unit comprises:
a determining subunit, configured to determine the direction information according to the azimuth information and a preset correspondence between the azimuth information and the direction information.
12. Control equipment, characterized in that the control equipment comprises the control device according to any one of claims 8 to 11.
CN201210350741.3A 2012-09-20 2012-09-20 Control method, control device and control equipment Active CN103685906B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210350741.3A CN103685906B (en) 2012-09-20 2012-09-20 Control method, control device and control equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210350741.3A CN103685906B (en) 2012-09-20 2012-09-20 Control method, control device and control equipment

Publications (2)

Publication Number Publication Date
CN103685906A true CN103685906A (en) 2014-03-26
CN103685906B CN103685906B (en) 2018-01-02

Family

ID=50322082

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210350741.3A Active CN103685906B (en) 2012-09-20 2012-09-20 Control method, control device and control equipment

Country Status (1)

Country Link
CN (1) CN103685906B (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015042897A1 (en) * 2013-09-29 2015-04-02 中兴通讯股份有限公司 Control method, control apparatus and control device
CN104883524A (en) * 2015-06-02 2015-09-02 阔地教育科技有限公司 Method and system for automatically tracking and shooting moving object in online class
CN106331466A (en) * 2015-06-30 2017-01-11 芋头科技(杭州)有限公司 Method for rapid location and photographing through voice instruction and photographing system
WO2017143910A1 (en) * 2016-02-25 2017-08-31 中兴通讯股份有限公司 Acquisition processing method, device and system, and computer storage medium
CN108093167A (en) * 2016-11-22 2018-05-29 谷歌有限责任公司 Use the operable camera of natural language instructions
CN108231073A (en) * 2016-12-16 2018-06-29 深圳富泰宏精密工业有限公司 Phonetic controller, system and control method
CN108702458A (en) * 2017-11-30 2018-10-23 深圳市大疆创新科技有限公司 Image pickup method and device
CN110170170A (en) * 2019-05-30 2019-08-27 维沃移动通信有限公司 A kind of information display method and terminal device
CN110876036A (en) * 2018-08-31 2020-03-10 腾讯数码(天津)有限公司 Video generation method and related device
CN111142836A (en) * 2019-12-28 2020-05-12 深圳创维-Rgb电子有限公司 Screen orientation angle adjusting method and device, electronic product and storage medium
CN111193872A (en) * 2020-03-20 2020-05-22 北京文香信息技术有限公司 Method and system for controlling camera equipment and camera equipment
CN111936795A (en) * 2018-04-13 2020-11-13 三星电子株式会社 Air conditioner and method of controlling the same
CN112367473A (en) * 2021-01-13 2021-02-12 北京电信易通信息技术股份有限公司 Rotatable camera device based on voiceprint arrival phase and control method thereof
CN115103115A (en) * 2022-06-16 2022-09-23 北京达佳互联信息技术有限公司 Camera equipment control method and device and electronic equipment
WO2023164814A1 (en) * 2022-03-01 2023-09-07 深圳市大疆创新科技有限公司 Media apparatus and control method and device therefor, and target tracking method and device


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6618073B1 (en) * 1998-11-06 2003-09-09 Vtel Corporation Apparatus and method for avoiding invalid camera positioning in a video conference
CN101449593A (en) * 2006-12-29 2009-06-03 坦德伯格电信公司 Microphone for audio source tracking
CN101567969A (en) * 2009-05-21 2009-10-28 上海交通大学 Intelligent video director method based on microphone array sound guidance
EP2388996A2 (en) * 2010-05-18 2011-11-23 Polycom, Inc. Videoconferencing endpoint having multiple voice-tracking cameras
CN102300043A (en) * 2010-06-23 2011-12-28 中兴通讯股份有限公司 Method for adjusting meeting place camera of remote presentation meeting system

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9591229B2 (en) 2013-09-29 2017-03-07 Zte Corporation Image tracking control method, control device, and control equipment
WO2015042897A1 (en) * 2013-09-29 2015-04-02 中兴通讯股份有限公司 Control method, control apparatus and control device
CN104883524B (en) * 2015-06-02 2018-09-11 阔地教育科技有限公司 Moving target automatic tracking image pickup method and system in a kind of Online class
CN104883524A (en) * 2015-06-02 2015-09-02 阔地教育科技有限公司 Method and system for automatically tracking and shooting moving object in online class
CN106331466B (en) * 2015-06-30 2019-06-07 芋头科技(杭州)有限公司 It is a kind of quickly to position the method taken pictures and camera system by phonetic order
CN106331466A (en) * 2015-06-30 2017-01-11 芋头科技(杭州)有限公司 Method for rapid location and photographing through voice instruction and photographing system
WO2017143910A1 (en) * 2016-02-25 2017-08-31 中兴通讯股份有限公司 Acquisition processing method, device and system, and computer storage medium
CN107124540A (en) * 2016-02-25 2017-09-01 中兴通讯股份有限公司 Acquiring and processing method, apparatus and system
CN108093167A (en) * 2016-11-22 2018-05-29 谷歌有限责任公司 Use the operable camera of natural language instructions
CN108093167B (en) * 2016-11-22 2020-07-17 谷歌有限责任公司 Apparatus, method, system, and computer-readable storage medium for capturing images
CN108231073A (en) * 2016-12-16 2018-06-29 深圳富泰宏精密工业有限公司 Phonetic controller, system and control method
WO2019104681A1 (en) * 2017-11-30 2019-06-06 深圳市大疆创新科技有限公司 Image capture method and device
CN108702458A (en) * 2017-11-30 2018-10-23 深圳市大疆创新科技有限公司 Image pickup method and device
US11388333B2 (en) 2017-11-30 2022-07-12 SZ DJI Technology Co., Ltd. Audio guided image capture method and device
CN111936795B (en) * 2018-04-13 2022-11-04 三星电子株式会社 Air conditioner and method of controlling the same
US11428426B2 (en) 2018-04-13 2022-08-30 Samsung Electronics Co., Ltd. Air conditioner and method for controlling air conditioner
CN111936795A (en) * 2018-04-13 2020-11-13 三星电子株式会社 Air conditioner and method of controlling the same
CN110876036A (en) * 2018-08-31 2020-03-10 腾讯数码(天津)有限公司 Video generation method and related device
CN110876036B (en) * 2018-08-31 2022-08-02 腾讯数码(天津)有限公司 Video generation method and related device
CN110170170A (en) * 2019-05-30 2019-08-27 维沃移动通信有限公司 A kind of information display method and terminal device
CN111142836A (en) * 2019-12-28 2020-05-12 深圳创维-Rgb电子有限公司 Screen orientation angle adjusting method and device, electronic product and storage medium
CN111142836B (en) * 2019-12-28 2023-08-29 深圳创维-Rgb电子有限公司 Screen orientation angle adjusting method and device, electronic product and storage medium
CN111193872A (en) * 2020-03-20 2020-05-22 北京文香信息技术有限公司 Method and system for controlling camera equipment and camera equipment
CN112367473A (en) * 2021-01-13 2021-02-12 北京电信易通信息技术股份有限公司 Rotatable camera device based on voiceprint arrival phase and control method thereof
WO2023164814A1 (en) * 2022-03-01 2023-09-07 深圳市大疆创新科技有限公司 Media apparatus and control method and device therefor, and target tracking method and device
CN115103115A (en) * 2022-06-16 2022-09-23 北京达佳互联信息技术有限公司 Camera equipment control method and device and electronic equipment

Also Published As

Publication number Publication date
CN103685906B (en) 2018-01-02

Similar Documents

Publication Publication Date Title
CN103685906A (en) Control method, control device and control equipment
US9591229B2 (en) Image tracking control method, control device, and control equipment
US9690262B2 (en) Display device and method for regulating viewing angle of display device
CN108089152B (en) Equipment control method, device and system
CN109032039B (en) Voice control method and device
CN111432115B (en) Face tracking method based on voice auxiliary positioning, terminal and storage device
CN103716669A (en) Electronic apparatus and control method of the same
CN103581606B (en) A kind of multimedia collection device and method
US20110274311A1 (en) Sign language recognition system and method
CN111062234A (en) Monitoring method, intelligent terminal and computer readable storage medium
CN110602389B (en) Display method and electronic equipment
CN111602139A (en) Image processing method and device, control terminal and mobile device
US11482237B2 (en) Method and terminal for reconstructing speech signal, and computer storage medium
CN206559550U (en) The remote control and television system of a kind of built-in microphone array
CN105554475A (en) IoT (Internet of Things) intelligent device with full duplex communication function
US20170345437A1 (en) Voice receiving method and device
CN109982054A (en) A kind of projecting method based on location tracking, device, projector and optical projection system
CN106465030A (en) Position determination apparatus, audio apparatus, position determination method, and program
CN204539315U (en) A kind of video conference machine of auditory localization
CN105516692A (en) Intelligent equipment for Internet of Things
CN103414992B (en) A kind of message adjustment system
CN112839165B (en) Method and device for realizing face tracking camera shooting, computer equipment and storage medium
WO2019119290A1 (en) Method and apparatus for determining prompt information, and electronic device and computer program product
CN201839377U (en) Whole scene infrared separation automatic tracking device
CN110133595A (en) A kind of sound source direction-finding method, device and the device for sound source direction finding

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant