CN103685906A - Control method, control device and control equipment - Google Patents

Control method, control device and control equipment Download PDF

Info

Publication number
CN103685906A
CN103685906A (application CN201210350741.3A)
Authority
CN
China
Prior art keywords
sound source
information
target sound
capture apparatus
control
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210350741.3A
Other languages
Chinese (zh)
Other versions
CN103685906B (en)
Inventor
陈军
黄强
黄志宏
袁洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CN201210350741.3A priority Critical patent/CN103685906B/en
Publication of CN103685906A publication Critical patent/CN103685906A/en
Application granted granted Critical
Publication of CN103685906B publication Critical patent/CN103685906B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Circuit For Audible Band Transducer (AREA)
  • Manipulator (AREA)

Abstract

Embodiments of the invention provide a control method, a control device and control equipment. The control method includes: acquiring audio data that contain sound information of a target sound source; determining position range information of the target sound source according to the audio data; and controlling, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, so that the shooting device can shoot the target sound source. The embodiments thus enable the shooting device to shoot a target sound source lying outside the original screen range.

Description

Control method, control device and control equipment
Technical field
The embodiments of the present invention relate to the field of image tracking, and in particular to a control method, a control device and control equipment.
Background art
In a video communication process, the camera needs to be aimed at the speaker. The existing solution uses image recognition technology to recognize a human face and then remotely controls the camera so that it aims at the position of the face; however, this scheme cannot track a speaker who has moved beyond the screen range, or another speaker outside the screen range.
Summary of the invention
In view of this, an object of the embodiments of the present invention is to provide a control method, a control device and control equipment, so that a shooting device can be supported in shooting a target sound source outside the original screen range.
To solve the above technical problem, the embodiments of the present invention provide the following solutions:
An embodiment of the present invention provides a control method, and the control method comprises:
acquiring audio data that contain sound information of a target sound source;
determining position range information of the target sound source according to the audio data;
controlling, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, so that the shooting device can shoot the target sound source.
Preferably, the position range information is direction information of the target sound source relative to the shooting device, and controlling, according to the position range information, the rotation of the shooting device that currently cannot shoot the target sound source specifically comprises:
determining a rotation control parameter of the shooting device corresponding to the direction information;
controlling the rotation of the shooting device according to the rotation control parameter.
Preferably, the audio data are collected by a sound collection device, and determining the position range information of the target sound source according to the audio data specifically comprises:
determining, according to the audio data, azimuth information of the target sound source relative to the sound collection device;
determining the direction information according to the azimuth information.
Preferably, determining the direction information according to the azimuth information specifically comprises:
determining the direction information according to the azimuth information and a preset correspondence between the azimuth information and the direction information.
Preferably, the sound collection device corresponds to a preset plane used for determining the azimuth information and a preset reference point in the preset plane, the shooting device corresponds to a first corresponding point in the preset plane, and the target sound source corresponds to a second corresponding point in the preset plane,
the azimuth information is the position coordinates of the second corresponding point relative to the preset reference point, the direction information is a direction characterization coordinate of the second corresponding point relative to the first corresponding point,
and the correspondence is a plane geometry function that takes the position coordinates of the first corresponding point relative to the preset reference point as its parameters, takes the position coordinates of a sounding corresponding point in the preset plane relative to the preset reference point as its independent variable, and takes the direction characterization coordinate of the sounding corresponding point relative to the first corresponding point as its dependent variable.
Preferably, the position coordinates of the first corresponding point relative to the preset reference point are the coordinates (a1, a2) in a rectangular coordinate system that lies in the preset plane and takes the preset reference point as a first origin, the position coordinates of the sounding corresponding point relative to the preset reference point are the coordinates (x, y) in the rectangular coordinate system, y is greater than a2, and the direction characterization coordinate is the angular coordinate b in a polar coordinate system that lies in the preset plane and takes the first corresponding point as a second origin,
when a2 is 0, the polar axis of the polar coordinate system has the same direction as the x axis of the rectangular coordinate system; when a2 is not 0, the polar axis is parallel to the x axis of the rectangular coordinate system and points in the same direction,
and the plane geometry function is b = arctan((y - a2)/(x - a1)), where x is not equal to a1; or
the plane geometry function is: when x is not equal to a1, b = arctan((y - a2)/(x - a1)); when x equals a1, b = 90 degrees.
Preferably, the parameters are determined, in a learning-and-training manner, from the position coordinates relative to the preset reference point and the direction characterization coordinate relative to the first corresponding point of at least one training point corresponding to a training sound source in the preset plane.
An embodiment of the present invention provides a control device, and the control device comprises:
an acquisition module, configured to acquire audio data that contain sound information of a target sound source;
a determination module, configured to determine position range information of the target sound source according to the audio data;
a control module, configured to control, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, so that the shooting device can shoot the target sound source.
Preferably, the position range information is direction information of the target sound source relative to the shooting device, and the control module comprises:
a first determining unit, configured to determine a rotation control parameter of the shooting device corresponding to the direction information;
a control unit, configured to control the rotation of the shooting device according to the rotation control parameter, so that the shooting device can shoot the target sound source.
Preferably, the audio data are collected by a sound collection device, and the determination module comprises:
a second determining unit, configured to determine, according to the audio data, azimuth information of the target sound source relative to the sound collection device;
a third determining unit, configured to determine the direction information according to the azimuth information.
Preferably, the third determining unit comprises:
a determining subunit, configured to determine the direction information according to the azimuth information and a preset correspondence between the azimuth information and the direction information.
An embodiment of the present invention provides control equipment that comprises the control device described above.
From the above it can be seen that the control method, control device and control equipment provided by the embodiments of the present invention have at least the following technical effect:
by acquiring audio data that contain sound information of a target sound source, determining position range information of the target sound source accordingly, and controlling, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, the shooting device is enabled to shoot the target sound source, so that the shooting device is supported in shooting a target sound source outside the original screen range.
Brief description of the drawings
Fig. 1 is a flow chart of a control method provided by an embodiment of the present invention;
Fig. 2 is a position coordinate diagram of the microphone array and the sound source in preferred embodiment one of the control method provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of the microphone array of preferred embodiment two of the control method provided by an embodiment of the present invention, placed directly in front of the camera;
Fig. 4 is a position coordinate diagram of the microphone array and the sound source in preferred embodiment two of the control method provided by an embodiment of the present invention;
Fig. 5 is a training schematic diagram of preferred embodiment two of the control method provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of preferred embodiment three of the control method provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the embodiments of the present invention are described in detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flow chart of a control method provided by an embodiment of the present invention. Referring to Fig. 1, the embodiment of the present invention provides a control method, and the control method comprises the following steps:
Step 101: acquire audio data that contain sound information of a target sound source;
Step 102: determine position range information of the target sound source according to the audio data;
Step 103: control, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, so that the shooting device can shoot the target sound source.
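Purely as an illustration of the flow of Steps 101 to 103, a minimal Python sketch is given below; the audio-source and camera interfaces and the localization routine are hypothetical placeholders and are not part of the embodiments.

```python
# Hypothetical sketch of Steps 101-103. The audio_source/camera interfaces and the
# localize() routine are assumptions made only for this illustration.
def control_step(audio_source, camera, localize):
    # Step 101: acquire audio data containing the target sound source's sound information.
    frames = audio_source.read_frames()
    # Step 102: determine position range information (here a bearing angle in degrees).
    bearing_deg = localize(frames)
    if bearing_deg is None:
        return  # no usable sound information in this batch of audio data
    # Step 103: if the shooting device currently cannot shoot the source, rotate it.
    if not camera.can_shoot(bearing_deg):
        camera.pan_to(bearing_deg)
```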
It can be seen that, by acquiring audio data that contain sound information of the target sound source, determining position range information of the target sound source accordingly, and controlling, according to this position range information, the rotation of a shooting device that currently cannot shoot the target sound source, the shooting device is enabled to shoot the target sound source, thereby supporting the shooting device in shooting a target sound source outside the original screen range.
Obviously, the target sound source should lie within the overall range that the shooting device can cover by rotating.
The target sound source may be a speaker, or may be a sound-emitting device.
The shooting device may be a video camera or a still camera.
Specifically, for example, the sound information may contain preset keyword content that expresses the position range information, and the position range information can then be determined from the audio data by speech recognition.
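As an illustration only of this keyword-based variant, the following sketch maps recognized keywords to coarse position range information; the keyword set, the pan angles and the assumption that a speech recognizer has already produced the text are all invented for the example.

```python
# Hypothetical mapping from preset keywords in the recognized speech to a coarse
# position range (a pan angle in degrees). Keywords and angles are illustrative only.
KEYWORD_TO_PAN_DEG = {"left": -60.0, "front": 0.0, "right": 60.0}

def position_range_from_text(recognized_text):
    """Return a pan angle if the recognized speech contains a preset keyword, else None."""
    text = recognized_text.lower()
    for keyword, pan_deg in KEYWORD_TO_PAN_DEG.items():
        if keyword in text:
            return pan_deg
    return None
```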
Or, for example, the position range information may be direction information of the target sound source relative to the shooting device, and controlling, according to the position range information, the rotation of the shooting device that currently cannot shoot the target sound source may specifically be:
determining a rotation control parameter of the shooting device corresponding to the direction information;
controlling the rotation of the shooting device according to the rotation control parameter.
The rotation control parameter is, for example, an identifier of one of several adjustable angles of the shooting device, the rotation angle of the pan-tilt controller of a camera, the direction parameter of the optical axis of a camera, and so on.
Specifically, the audio data may be collected by a sound collection device, and determining the position range information of the target sound source according to the audio data may specifically be:
determining, according to the audio data, azimuth information of the target sound source relative to the sound collection device;
determining the direction information according to the azimuth information.
The sound collection device is, for example, a microphone array.
The azimuth information may be direction information or position information.
Further, determining the direction information according to the azimuth information may specifically be:
determining the direction information according to the azimuth information and a preset correspondence between the azimuth information and the direction information.
Specifically, for example, enough combinations of azimuth information and direction information are obtained by training at a sufficient number of points, and the correspondence is obtained by fitting these combinations; for example, a training sound source may be placed or moved at a granularity of 0.1 m.
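As a sketch of one way such a correspondence could be obtained from the training combinations, the snippet below fits the azimuth-to-direction mapping by linear interpolation with NumPy; the tabulated values and the choice of interpolation are assumptions, since the embodiment only requires that the combinations be fitted in some manner.

```python
import numpy as np

# Training combinations (illustrative values): azimuth of the training sound source
# relative to the sound collection device, and the matching direction information
# relative to the shooting device, both in degrees.
train_azimuth_deg = np.array([-60.0, -30.0, 0.0, 30.0, 60.0])
train_direction_deg = np.array([-48.0, -22.0, 3.0, 27.0, 52.0])

def direction_from_azimuth(azimuth_deg):
    """Preset correspondence obtained by fitting the training combinations."""
    return float(np.interp(azimuth_deg, train_azimuth_deg, train_direction_deg))
```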
As another example, the shooting device and the sound collection device are placed in a certain positional relationship such that, wherever the target sound source is, the direction represented by the azimuth information is consistent with the corresponding direction information, and the correspondence is determined on the basis of this positional relationship. For instance, the shooting device and the sound collection device may be placed together, or, when the sound collection device is placed horizontally, the shooting device may be placed directly above it. Preferred embodiment one below adopts the placed-together arrangement.
It should be noted that, considering the practical restrictions on where the shooting device can be placed, a certain deviation of its placement is allowed. Because the shooting device captures a fairly wide range at any given moment, as long as the deviation still lets the shooting device shoot the direction represented by the azimuth information, this can be handled on site in engineering practice and is not described further here.
As another example, the sound collection device corresponds to a preset plane used for determining the azimuth information and a preset reference point in the preset plane, the shooting device corresponds to a first corresponding point in the preset plane, and the target sound source corresponds to a second corresponding point in the preset plane,
the azimuth information is the position coordinates of the second corresponding point relative to the preset reference point, the direction information is a direction characterization value of the second corresponding point relative to the first corresponding point,
and the correspondence is a plane geometry function that takes the position coordinates of the first corresponding point relative to the preset reference point as its parameters, takes the position coordinates of a sounding corresponding point in the preset plane relative to the preset reference point as its independent variable, and takes the direction characterization value of the sounding corresponding point relative to the first corresponding point as its dependent variable.
The first corresponding point is, for example, the optical center of the shooting device, or the projection of the optical center of the shooting device onto the preset plane.
The second corresponding point is, for example, a point of the target sound source that lies in the preset plane, or the projection onto the preset plane of a point of the target sound source that does not lie in the preset plane.
The sounding corresponding point is, for example, the sounding reference point of a sound source when it lies in the preset plane, or the projection onto the preset plane of the sounding reference point of a sound source that does not lie in the preset plane. The sounding reference point may be a point on a person's throat or a point on the sound output unit of a sound source.
The direction characterization value is, for example, the angular coordinate of the sounding corresponding point in a coordinate system in the preset plane whose origin is the first corresponding point.
Which preset plane and which preset reference point correspond to the sound collection device depends on the specific device adopted, for example the orientation plane and the location reference point adopted by a planar microphone array.
It should be noted that, in practical applications, the sound source may lie in the preset plane or on either side of it, and, owing to other factors, the resulting azimuth information may contain a very small error; however, because the shooting device captures a fairly wide range at any given moment, this error does not affect solving the technical problem addressed by the embodiments of the present invention.
A concrete example of the plane geometry function is given here: the position coordinates of the first corresponding point relative to the preset reference point are the coordinates (a1, a2) in a rectangular coordinate system that lies in the preset plane and takes the preset reference point as a first origin, the position coordinates of the sounding corresponding point relative to the preset reference point are the coordinates (x, y) in the rectangular coordinate system, y is greater than a2, and the direction characterization coordinate is the angular coordinate b in a polar coordinate system that lies in the preset plane and takes the first corresponding point as a second origin,
when a2 is 0, the polar axis of the polar coordinate system has the same direction as the x axis of the rectangular coordinate system; when a2 is not 0, the polar axis is parallel to the x axis of the rectangular coordinate system and points in the same direction,
and the plane geometry function is b = arctan((y - a2)/(x - a1)), where x is not equal to a1; or
the plane geometry function is: when x is not equal to a1, b = arctan((y - a2)/(x - a1)); when x equals a1, b = 90 degrees.
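For illustration, the plane geometry function just defined can be written directly as code (angles in degrees); the function below follows the two-case definition in the text, with (a1, a2) as the parameters and (x, y) as the sounding corresponding point.

```python
import math

def plane_geometry_b(x, y, a1, a2):
    """b = arctan((y - a2) / (x - a1)) when x != a1, and b = 90 degrees when x == a1."""
    if x == a1:
        return 90.0
    return math.degrees(math.atan((y - a2) / (x - a1)))

# Example: first corresponding point at (0, 0), sounding corresponding point at (1.0, 1.0) -> 45 degrees.
print(plane_geometry_b(1.0, 1.0, 0.0, 0.0))
```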
The parameters can be obtained by on-site measurement during project implementation; or the parameters can be determined, in a learning-and-training manner, from the position coordinates relative to the preset reference point and the direction characterization value relative to the first corresponding point of at least one training point corresponding to a training sound source in the preset plane.
One example of the learning-and-training approach is:
determining first position coordinates, which are the position coordinates relative to the preset reference point of a first training point corresponding to a first training sound source in the preset plane, and a first direction characterization value, which is its direction characterization value relative to the first corresponding point;
obtaining the parameters according to the first position coordinates and the first direction characterization value;
where the first training point, the first corresponding point and the preset reference point are not collinear.
The learning-and-training procedure in preferred embodiment two below adopts exactly this approach.
Another example of the learning-and-training approach is:
determining second position coordinates, which are the position coordinates relative to the preset reference point of a second training point corresponding to a second training sound source in the preset plane, and a second direction characterization value, which is its direction characterization value relative to the first corresponding point;
determining third position coordinates, which are the position coordinates relative to the preset reference point of a third training point corresponding to a third training sound source in the preset plane, and a third direction characterization value, which is its direction characterization value relative to the first corresponding point;
obtaining the parameters according to the second position coordinates, the second direction characterization value, the third position coordinates and the third direction characterization value;
where the second training point, the third training point and the first corresponding point are not collinear.
The learning-and-training procedure in preferred embodiment three below adopts exactly this approach.
To explain the above control method further, three preferred embodiments of the control method are given below:
Preferred embodiment one:
Fig. 2 is a position coordinate diagram of the microphone array and the sound source in preferred embodiment one of the control method. Referring to Fig. 2, microphone arrays come in many physical forms; this preferred embodiment uses a linear microphone array comprising at least three microphone elements. In addition, the camera and the microphone array are placed together. The steps of this preferred embodiment are as follows:
Step 201: the microphone elements of the array each receive audio data, which are sent to a processing center after background noise has been filtered out, or are sent to the processing center first and the noise is filtered there.
Step 202: the processing center separates out and extracts the speech portion of each channel of audio data according to frequency, and then calculates, from the phase of the speech portion in each channel, the time differences with which the microphone elements receive the speech.
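Step 202 derives the inter-element time differences from the phase of the extracted speech; as a simplified stand-in, the sketch below estimates the delay between two microphone channels from the peak of their cross-correlation, and converts it to a range difference as described in Step 203 below. The sampling rate, the cross-correlation method and the value used for the speed of sound are assumptions, not details taken from this embodiment.

```python
import numpy as np

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air (assumption)

def time_difference_s(sig_a, sig_b, sample_rate_hz=16000):
    """Estimate how much later sig_a arrives than sig_b, in seconds, from the peak
    of their cross-correlation (a stand-in for the phase-based computation of Step 202)."""
    corr = np.correlate(sig_a, sig_b, mode="full")
    lag_samples = int(np.argmax(corr)) - (len(sig_b) - 1)
    return lag_samples / float(sample_rate_hz)

def range_difference_m(sig_a, sig_b, sample_rate_hz=16000):
    # The time difference multiplied by the speed of sound gives the range difference.
    return time_difference_s(sig_a, sig_b, sample_rate_hz) * SPEED_OF_SOUND_M_S
```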
Step 203: the time differences between the microphone elements are multiplied by the speed of sound to obtain range differences, and the bearing of the sound is then calculated from the range differences among the three microphone elements.
Specifically, the spacing between adjacent microphone elements of the array is a known distance, denoted R. Microphone element 2 is taken as the origin of coordinates, microphone element 1 is at (-R, 0) and microphone element 3 is at (R, 0); the sound source coordinates to be calculated are (x, y).
The distances from the sound source to microphone elements 1, 2 and 3 are denoted L1, L2 and L3 respectively. The time differences measured in the previous step, multiplied by the speed of sound, are the differences among L1, L2 and L3; that is, the values of L1-L3 and L2-L3 are known. The known L1-L3 is denoted D13 and L2-L3 is denoted D23.
According to the Pythagorean theorem:
L1 = sqrt((x+R)^2 + y^2) = sqrt(x^2 + y^2 + R^2 + 2xR)
L2 = sqrt(x^2 + y^2)
L3 = sqrt((x-R)^2 + y^2) = sqrt(x^2 + y^2 + R^2 - 2xR)
Therefore:
D13 = L1 - L3 = sqrt(x^2 + y^2 + R^2 + 2xR) - sqrt(x^2 + y^2 + R^2 - 2xR)
Squaring:
D13^2 = 2x^2 + 2y^2 + 2R^2 - 2*sqrt(x^4 + y^4 + R^4 + 2x^2*y^2 - 2x^2*R^2 + 2y^2*R^2)
Rearranging:
sqrt(x^4 + y^4 + R^4 + 2x^2*y^2 - 2x^2*R^2 + 2y^2*R^2) = x^2 + y^2 + R^2 - 0.5*D13^2
Squaring again:
x^4 + y^4 + R^4 + 2x^2*y^2 - 2x^2*R^2 + 2y^2*R^2 = x^4 + y^4 + (R^2 - 0.5*D13^2)^2 + 2x^2*y^2 + 2x^2*(R^2 - 0.5*D13^2) + 2y^2*(R^2 - 0.5*D13^2)
Expanding:
x^4 + y^4 + R^4 + 2x^2*y^2 - 2x^2*R^2 + 2y^2*R^2 = x^4 + y^4 + R^4 - R^2*D13^2 + 0.25*D13^4 + 2x^2*y^2 + 2x^2*R^2 - x^2*D13^2 + 2y^2*R^2 - y^2*D13^2
Cancelling the terms common to both sides:
y^2*D13^2 = -R^2*D13^2 + 0.25*D13^4 + 4x^2*R^2 - x^2*D13^2
Finally:
y = ±sqrt((4R^2/D13^2 - 1)*x^2 + 0.25*D13^2 - R^2)
In the practical application scenario of this preferred embodiment the sound source always lies in front of the array, so the negative sign can be dropped:
y = sqrt((4R^2/D13^2 - 1)*x^2 + 0.25*D13^2 - R^2)    (Formula A)
At the same time, the following must also be satisfied:
D23 = L2 - L3 = sqrt(x^2 + y^2) - sqrt(x^2 + y^2 + R^2 - 2xR)    (Formula B)
Using a software program, the x and y that simultaneously satisfy Formula A and Formula B can easily be obtained. Specifically: the sign of x is judged from the sign of D13; then, with x as the loop variable, Formula A is used to obtain y, and the pair (x, y) is substituted into Formula B until Formula B holds; the (x, y) obtained at that point is the sound source position.
The bearing angle of the sound source is then arctan(y/x).
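A minimal numerical sketch of the search just described is given below: the sign of x is taken from the sign of D13, x is used as the loop variable, Formula A yields y, and the pair is accepted when Formula B is (approximately) satisfied. The search range, step size and tolerance are assumptions introduced for the example, and the routine assumes D13 is non-zero.

```python
import math

def locate_source(d13, d23, r, x_max=10.0, step=0.001, tol=0.005):
    """Solve Formula A and Formula B of preferred embodiment one for (x, y).
    d13 = L1 - L3, d23 = L2 - L3, r = microphone spacing, all in metres; d13 != 0."""
    sign = 1.0 if d13 >= 0 else -1.0  # the sign of x follows the sign of D13
    best = None
    x_abs = step
    while x_abs <= x_max:
        x = sign * x_abs
        under = (4.0 * r * r / (d13 * d13) - 1.0) * x * x + 0.25 * d13 * d13 - r * r
        if under >= 0.0:
            y = math.sqrt(under)  # Formula A
            lhs = math.sqrt(x * x + y * y) - math.sqrt(x * x + y * y + r * r - 2.0 * x * r)
            err = abs(lhs - d23)  # how closely Formula B is satisfied
            if best is None or err < best[0]:
                best = (err, x, y)
            if err < tol:
                break
        x_abs += step
    if best is None:
        return None
    _, x, y = best
    # Bearing angle of the sound source; atan2 also covers the x < 0 half-plane.
    return x, y, math.degrees(math.atan2(y, x))
```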
Step 204: according to the sound bearing calculated in the previous step, the camera is controlled to rotate to and aim at that direction.
Step 205: in the image captured by the camera, face recognition technology is used to find the position of the face, specifically as follows:
first step: input the collected image data;
second step: binarize the image with a skin-colour model, that is, set the non-skin-colour parts to 0 and the skin-colour parts to 1, where the range of skin-colour values can be obtained by statistical learning on the physical device;
third step: filter the result with an erosion-and-dilation algorithm;
fourth step: use connected-region detection, and judge the face position by taking as the criterion a connected region whose size matches a face and whose height is greater than or equal to its width.
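The four face-finding sub-steps can be sketched with OpenCV as follows; the YCrCb skin-colour thresholds, the kernel size and the minimum region area are illustrative assumptions, since the embodiment only states that the skin-colour range is obtained by statistical learning on the physical device.

```python
import cv2
import numpy as np

def find_face_box(bgr_image):
    """Skin-colour binarization, erosion/dilation filtering and connected-region
    detection, following the four sub-steps of Step 205; returns (x, y, w, h) or None."""
    # Sub-steps 1-2: convert to YCrCb and binarize with an illustrative skin-colour range.
    ycrcb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)
    lower = np.array([0, 133, 77], dtype=np.uint8)
    upper = np.array([255, 173, 127], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower, upper)
    # Sub-step 3: erosion followed by dilation to filter out noise.
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.dilate(cv2.erode(mask, kernel), kernel)
    # Sub-step 4: connected-region detection; keep a region whose height is at least
    # its width and whose area is plausible for a face.
    num, _, stats, _ = cv2.connectedComponentsWithStats(mask)
    best = None
    for i in range(1, num):  # label 0 is the background
        x, y, w, h, area = stats[i]
        if h >= w and area > 400 and (best is None or area > best[4]):
            best = (x, y, w, h, area)
    return None if best is None else best[:4]
```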
Step 206: the camera is rotated toward the face until it is aimed at the face.
Preferred embodiment two:
Fig. 3 is a schematic diagram of the microphone array of preferred embodiment two of the control method, placed directly in front of the camera. Referring to Fig. 3, microphone arrays come in many physical forms; this preferred embodiment uses a circular microphone array comprising at least three microphone elements. The camera and the microphone array are not placed together: the microphone array is placed directly in front of the camera. The steps of this preferred embodiment are as follows:
Step 301: the microphone elements of the array each receive audio data, which are sent to a processing center after background noise has been filtered out, or are sent to the processing center first and the noise is filtered there.
Step 302: the processing center separates out and extracts the speech portion of each channel of audio data according to frequency, and then calculates, from the phase of the speech portion in each channel, the time differences with which the microphone elements receive the speech.
Step 303: the time differences between the microphone elements are multiplied by the speed of sound to obtain range differences, and the bearing of the sound is then calculated from the range differences among the three microphone elements.
Specifically, Fig. 4 is a position coordinate diagram of the microphone array and the sound source in preferred embodiment two of the control method. Referring to Fig. 4, the positions of the microphone elements in the array are known and are characterized by a known distance R: with the center of the microphone array as the origin of coordinates, microphone element 1 is at (-R, 0), microphone element 2 is at (0, R) and microphone element 3 is at (R, 0); the sound source coordinates to be calculated are (x, y).
The distances from the sound source to microphone elements 1, 2 and 3 are denoted L1, L2 and L3 respectively. The time differences measured in the previous step, multiplied by the speed of sound, are the differences among L1, L2 and L3; that is, the values of L1-L3 and L2-L3 are known. The known L1-L3 is denoted D13 and L2-L3 is denoted D23.
According to the Pythagorean theorem:
L1 = sqrt((x+R)^2 + y^2) = sqrt(x^2 + y^2 + R^2 + 2xR)
L2 = sqrt(x^2 + (y-R)^2) = sqrt(x^2 + y^2 + R^2 - 2yR)
L3 = sqrt((x-R)^2 + y^2) = sqrt(x^2 + y^2 + R^2 - 2xR)
Therefore:
D13 = L1 - L3 = sqrt(x^2 + y^2 + R^2 + 2xR) - sqrt(x^2 + y^2 + R^2 - 2xR)
From D13, exactly as in preferred embodiment one, Formula A can be derived:
y = sqrt((4R^2/D13^2 - 1)*x^2 + 0.25*D13^2 - R^2)    (Formula A)
At the same time, the following must also be satisfied:
D23 = L2 - L3 = sqrt(x^2 + y^2 + R^2 - 2yR) - sqrt(x^2 + y^2 + R^2 - 2xR)    (Formula C)
Using a software program, the x and y that simultaneously satisfy Formula A and Formula C can easily be obtained. Specifically: the sign of x is judged from the sign of D13; then, with x as the loop variable, Formula A is used to obtain y, and the pair (x, y) is substituted into Formula C until Formula C holds; the (x, y) obtained at that point is the sound source position.
Step 304: the camera is aimed at the direction represented by the angle arctan((d+y)/x).
In an actual usage scenario, because the positions of the microphone array and the camera in the meeting room are fixed and do not move, d can be obtained by learning and training. Specifically, Fig. 5 is a training schematic diagram of preferred embodiment two of the control method. Referring to Fig. 5, during training the speaker stands somewhere other than directly in front of the camera, that is, the angle a in Fig. 5 must not be 90 degrees; the camera is then rotated to aim at the speaker and records the angle b. After the speaker has spoken, the x and y coordinates are obtained with the steps above, and the distance d between the camera and the microphone array can be calculated as d = x/tan(b) - y.
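A small sketch of this training step and of the Step 304 aiming angle is given below, under the assumption that the angle b recorded by the camera is measured from its straight-ahead direction, so that tan(b) = x/(y + d) as the formula in the text implies; the example values are invented.

```python
import math

def calibrate_d(x, y, b_deg):
    """Distance d between camera and microphone array from one training utterance
    at array coordinates (x, y), with camera-recorded angle b: d = x/tan(b) - y."""
    return x / math.tan(math.radians(b_deg)) - y

def aim_angle_deg(x, y, d):
    """Step 304: aim the camera at the direction arctan((d + y) / x)."""
    return math.degrees(math.atan2(d + y, x))

# Example (invented values): training utterance at (1.2, 2.0) m, recorded angle 28 degrees.
d = calibrate_d(1.2, 2.0, 28.0)
print(d, aim_angle_deg(1.2, 2.0, d))
```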
Step 305: in the image captured by the camera, face recognition technology is used to find the position of the face, as follows:
first step: input the collected YUV data;
second step: binarize the image with a skin-colour model, that is, set the non-skin-colour parts to 0 and the skin-colour parts to 1, where the range of skin-colour values can be obtained by statistical learning on the physical device;
third step: filter the result with an erosion-and-dilation algorithm;
fourth step: use connected-region detection, and judge the face position by taking as the criterion a connected region whose size matches a face and whose height is greater than or equal to its width.
Step 306: the camera is then rotated toward the face until it is aimed at the face.
Preferred embodiment three:
Fig. 6 is a schematic diagram of preferred embodiment three of the control method. Referring to Fig. 6, microphone arrays come in many physical forms; this preferred embodiment uses a circular microphone array comprising at least three microphone elements. The camera and the microphone array are not placed together: the microphone array is placed in front of the camera with a horizontal displacement. The sound source position coordinates are (x, y), and the coordinates of the camera relative to the microphone array are (l, -d). The steps of this preferred embodiment are as follows:
Step 401: x and y are obtained in the same way as in steps 301 to 303 of preferred embodiment two.
Step 402: the camera is aimed at the direction represented by the angle b, where b = arctan((y+d)/(x-l)).
In an actual usage scenario, because the positions of the microphone array and the camera in the meeting room are fixed and do not move, d and l can be obtained by learning and training. Specifically, first, the trainer stands directly in front of the camera and speaks; the microphone array calculates the coordinates (x1, y1), and the abscissa of the camera is l = x1. Then the trainer stands somewhere other than directly in front of the camera and speaks; the operator controls the camera so that it aims at the trainer, and the camera records the angle b2; the microphone array calculates the coordinates (x2, y2). Since tan(b2) = (y2+d)/(x2-l) and l = x1, tan(b2) = (y2+d)/(x2-x1), from which d = tan(b2)*(x2-x1) - y2.
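Under similar assumptions, the two-utterance training of preferred embodiment three can be sketched as follows, with b2 the angle recorded by the camera as defined in Step 402 (in degrees); all example values are invented.

```python
import math

def calibrate_l_d(x1, x2, y2, b2_deg):
    """First utterance directly in front of the camera gives l = x1; the second,
    off-axis utterance gives d = tan(b2) * (x2 - x1) - y2."""
    l = x1
    d = math.tan(math.radians(b2_deg)) * (x2 - x1) - y2
    return l, d

def aim_angle_b_deg(x, y, l, d):
    """Step 402: b = arctan((y + d) / (x - l))."""
    return math.degrees(math.atan2(y + d, x - l))

# Example (invented values):
l, d = calibrate_l_d(0.5, 2.0, 1.5, 52.0)
print(l, d, aim_angle_b_deg(2.0, 1.5, l, d))
```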
Step 403: in the image captured by the camera, face recognition technology is used to find the position of the face, as follows:
first step: input the collected YUV data;
second step: binarize the image with a skin-colour model, that is, set the non-skin-colour parts to 0 and the skin-colour parts to 1, where the range of skin-colour values can be obtained by statistical learning on the physical device;
third step: filter the result with an erosion-and-dilation algorithm;
fourth step: use connected-region detection, and judge the face position by taking as the criterion a connected region whose size matches a face and whose height is greater than or equal to its width.
Step 404: the camera is then rotated toward the face until it is aimed at the face.
An embodiment of the present invention also provides a control device, and the control device comprises:
an acquisition module, configured to acquire audio data that contain sound information of a target sound source;
a determination module, configured to determine position range information of the target sound source according to the audio data;
a control module, configured to control, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, so that the shooting device can shoot the target sound source.
It can be seen that, by acquiring audio data that contain sound information of the target sound source, determining position range information of the target sound source accordingly, and controlling, according to this position range information, the rotation of a shooting device that currently cannot shoot the target sound source, the shooting device is enabled to shoot the target sound source, thereby supporting the shooting device in shooting a target sound source outside the original screen range.
Further, the position range information is direction information of the target sound source relative to the shooting device, and the control module comprises:
a first determining unit, configured to determine a rotation control parameter of the shooting device corresponding to the direction information;
a control unit, configured to control the rotation of the shooting device according to the rotation control parameter, so that the shooting device can shoot the target sound source.
Further, the audio data are collected by a sound collection device, and the determination module comprises:
a second determining unit, configured to determine, according to the audio data, azimuth information of the target sound source relative to the sound collection device;
a third determining unit, configured to determine the direction information according to the azimuth information.
Further, the third determining unit comprises:
a determining subunit, configured to determine the direction information according to the azimuth information and a preset correspondence between the azimuth information and the direction information.
An embodiment of the present invention also provides control equipment, and the control equipment comprises the control device described above.
The above are only embodiments of the present invention. It should be pointed out that those skilled in the art can make several improvements and modifications without departing from the principles of the embodiments of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the embodiments of the present invention.

Claims (12)

1. A control method, characterized in that the control method comprises:
acquiring audio data that contain sound information of a target sound source;
determining position range information of the target sound source according to the audio data;
controlling, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, so that the shooting device can shoot the target sound source.
2. The control method according to claim 1, characterized in that the position range information is direction information of the target sound source relative to the shooting device,
and controlling, according to the position range information, the rotation of the shooting device that currently cannot shoot the target sound source specifically comprises:
determining a rotation control parameter of the shooting device corresponding to the direction information;
controlling the rotation of the shooting device according to the rotation control parameter.
3. The control method according to claim 2, characterized in that the audio data are collected by a sound collection device, and determining the position range information of the target sound source according to the audio data specifically comprises:
determining, according to the audio data, azimuth information of the target sound source relative to the sound collection device;
determining the direction information according to the azimuth information.
4. The control method according to claim 3, characterized in that determining the direction information according to the azimuth information specifically comprises:
determining the direction information according to the azimuth information and a preset correspondence between the azimuth information and the direction information.
5. The control method according to claim 4, characterized in that the sound collection device corresponds to a preset plane used for determining the azimuth information and a preset reference point in the preset plane, the shooting device corresponds to a first corresponding point in the preset plane, and the target sound source corresponds to a second corresponding point in the preset plane,
the azimuth information is the position coordinates of the second corresponding point relative to the preset reference point, the direction information is a direction characterization coordinate of the second corresponding point relative to the first corresponding point,
and the correspondence is a plane geometry function that takes the position coordinates of the first corresponding point relative to the preset reference point as its parameters, takes the position coordinates of a sounding corresponding point in the preset plane relative to the preset reference point as its independent variable, and takes the direction characterization coordinate of the sounding corresponding point relative to the first corresponding point as its dependent variable.
6. The control method according to claim 5, characterized in that the position coordinates of the first corresponding point relative to the preset reference point are the coordinates (a1, a2) in a rectangular coordinate system that lies in the preset plane and takes the preset reference point as a first origin, the position coordinates of the sounding corresponding point relative to the preset reference point are the coordinates (x, y) in the rectangular coordinate system, y is greater than a2, and the direction characterization coordinate is the angular coordinate b in a polar coordinate system that lies in the preset plane and takes the first corresponding point as a second origin,
when a2 is 0, the polar axis of the polar coordinate system has the same direction as the x axis of the rectangular coordinate system; when a2 is not 0, the polar axis is parallel to the x axis of the rectangular coordinate system and points in the same direction,
and the plane geometry function is b = arctan((y - a2)/(x - a1)), where x is not equal to a1; or
the plane geometry function is: when x is not equal to a1, b = arctan((y - a2)/(x - a1)); when x equals a1, b = 90 degrees.
7. The control method according to claim 5, characterized in that the parameters are determined, in a learning-and-training manner, from the position coordinates relative to the preset reference point and the direction characterization coordinate relative to the first corresponding point of at least one training point corresponding to a training sound source in the preset plane.
8. A control device, characterized in that the control device comprises:
an acquisition module, configured to acquire audio data that contain sound information of a target sound source;
a determination module, configured to determine position range information of the target sound source according to the audio data;
a control module, configured to control, according to the position range information, the rotation of a shooting device that currently cannot shoot the target sound source, so that the shooting device can shoot the target sound source.
9. The control device according to claim 8, characterized in that the position range information is direction information of the target sound source relative to the shooting device, and the control module comprises:
a first determining unit, configured to determine a rotation control parameter of the shooting device corresponding to the direction information;
a control unit, configured to control the rotation of the shooting device according to the rotation control parameter, so that the shooting device can shoot the target sound source.
10. The control device according to claim 9, characterized in that the audio data are collected by a sound collection device, and the determination module comprises:
a second determining unit, configured to determine, according to the audio data, azimuth information of the target sound source relative to the sound collection device;
a third determining unit, configured to determine the direction information according to the azimuth information.
11. The control device according to claim 10, characterized in that the third determining unit comprises:
a determining subunit, configured to determine the direction information according to the azimuth information and a preset correspondence between the azimuth information and the direction information.
12. Control equipment, characterized in that the control equipment comprises the control device according to any one of claims 8 to 11.
CN201210350741.3A 2012-09-20 2012-09-20 Control method, control device and control equipment Active CN103685906B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210350741.3A CN103685906B (en) 2012-09-20 2012-09-20 Control method, control device and control equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210350741.3A CN103685906B (en) 2012-09-20 2012-09-20 Control method, control device and control equipment

Publications (2)

Publication Number Publication Date
CN103685906A true CN103685906A (en) 2014-03-26
CN103685906B CN103685906B (en) 2018-01-02

Family

ID=50322082

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210350741.3A Active CN103685906B (en) 2012-09-20 2012-09-20 Control method, control device and control equipment

Country Status (1)

Country Link
CN (1) CN103685906B (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015042897A1 (en) * 2013-09-29 2015-04-02 中兴通讯股份有限公司 Control method, control apparatus and control device
CN104883524A (en) * 2015-06-02 2015-09-02 阔地教育科技有限公司 Method and system for automatically tracking and shooting moving object in online class
CN106331466A (en) * 2015-06-30 2017-01-11 芋头科技(杭州)有限公司 Method for rapid location and photographing through voice instruction and photographing system
WO2017143910A1 (en) * 2016-02-25 2017-08-31 中兴通讯股份有限公司 Acquisition processing method, device and system, and computer storage medium
CN108093167A (en) * 2016-11-22 2018-05-29 谷歌有限责任公司 Use the operable camera of natural language instructions
CN108231073A (en) * 2016-12-16 2018-06-29 深圳富泰宏精密工业有限公司 Phonetic controller, system and control method
CN108702458A (en) * 2017-11-30 2018-10-23 深圳市大疆创新科技有限公司 Image pickup method and device
CN110170170A (en) * 2019-05-30 2019-08-27 维沃移动通信有限公司 A kind of information display method and terminal device
CN110876036A (en) * 2018-08-31 2020-03-10 腾讯数码(天津)有限公司 Video generation method and related device
CN111142836A (en) * 2019-12-28 2020-05-12 深圳创维-Rgb电子有限公司 Screen orientation angle adjusting method and device, electronic product and storage medium
CN111193872A (en) * 2020-03-20 2020-05-22 北京文香信息技术有限公司 Method and system for controlling camera equipment and camera equipment
CN111936795A (en) * 2018-04-13 2020-11-13 三星电子株式会社 Air conditioner and method of controlling the same
CN112367473A (en) * 2021-01-13 2021-02-12 北京电信易通信息技术股份有限公司 Rotatable camera device based on voiceprint arrival phase and control method thereof
CN115103115A (en) * 2022-06-16 2022-09-23 北京达佳互联信息技术有限公司 Camera equipment control method and device and electronic equipment
WO2023164814A1 (en) * 2022-03-01 2023-09-07 深圳市大疆创新科技有限公司 Media apparatus and control method and device therefor, and target tracking method and device


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6618073B1 (en) * 1998-11-06 2003-09-09 Vtel Corporation Apparatus and method for avoiding invalid camera positioning in a video conference
CN101449593A (en) * 2006-12-29 2009-06-03 坦德伯格电信公司 Microphone for audio source tracking
CN101567969A (en) * 2009-05-21 2009-10-28 上海交通大学 Intelligent video director method based on microphone array sound guidance
EP2388996A2 (en) * 2010-05-18 2011-11-23 Polycom, Inc. Videoconferencing endpoint having multiple voice-tracking cameras
CN102300043A (en) * 2010-06-23 2011-12-28 中兴通讯股份有限公司 Method for adjusting meeting place camera of remote presentation meeting system

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9591229B2 (en) 2013-09-29 2017-03-07 Zte Corporation Image tracking control method, control device, and control equipment
WO2015042897A1 (en) * 2013-09-29 2015-04-02 中兴通讯股份有限公司 Control method, control apparatus and control device
CN104883524B (en) * 2015-06-02 2018-09-11 阔地教育科技有限公司 Moving target automatic tracking image pickup method and system in a kind of Online class
CN104883524A (en) * 2015-06-02 2015-09-02 阔地教育科技有限公司 Method and system for automatically tracking and shooting moving object in online class
CN106331466B (en) * 2015-06-30 2019-06-07 芋头科技(杭州)有限公司 It is a kind of quickly to position the method taken pictures and camera system by phonetic order
CN106331466A (en) * 2015-06-30 2017-01-11 芋头科技(杭州)有限公司 Method for rapid location and photographing through voice instruction and photographing system
WO2017143910A1 (en) * 2016-02-25 2017-08-31 中兴通讯股份有限公司 Acquisition processing method, device and system, and computer storage medium
CN107124540A (en) * 2016-02-25 2017-09-01 中兴通讯股份有限公司 Acquiring and processing method, apparatus and system
CN108093167A (en) * 2016-11-22 2018-05-29 谷歌有限责任公司 Use the operable camera of natural language instructions
CN108093167B (en) * 2016-11-22 2020-07-17 谷歌有限责任公司 Apparatus, method, system, and computer-readable storage medium for capturing images
CN108231073A (en) * 2016-12-16 2018-06-29 深圳富泰宏精密工业有限公司 Phonetic controller, system and control method
WO2019104681A1 (en) * 2017-11-30 2019-06-06 深圳市大疆创新科技有限公司 Image capture method and device
CN108702458A (en) * 2017-11-30 2018-10-23 深圳市大疆创新科技有限公司 Image pickup method and device
US11388333B2 (en) 2017-11-30 2022-07-12 SZ DJI Technology Co., Ltd. Audio guided image capture method and device
CN111936795B (en) * 2018-04-13 2022-11-04 三星电子株式会社 Air conditioner and method of controlling the same
US11428426B2 (en) 2018-04-13 2022-08-30 Samsung Electronics Co., Ltd. Air conditioner and method for controlling air conditioner
CN111936795A (en) * 2018-04-13 2020-11-13 三星电子株式会社 Air conditioner and method of controlling the same
CN110876036A (en) * 2018-08-31 2020-03-10 腾讯数码(天津)有限公司 Video generation method and related device
CN110876036B (en) * 2018-08-31 2022-08-02 腾讯数码(天津)有限公司 Video generation method and related device
CN110170170A (en) * 2019-05-30 2019-08-27 维沃移动通信有限公司 A kind of information display method and terminal device
CN111142836A (en) * 2019-12-28 2020-05-12 深圳创维-Rgb电子有限公司 Screen orientation angle adjusting method and device, electronic product and storage medium
CN111142836B (en) * 2019-12-28 2023-08-29 深圳创维-Rgb电子有限公司 Screen orientation angle adjusting method and device, electronic product and storage medium
CN111193872A (en) * 2020-03-20 2020-05-22 北京文香信息技术有限公司 Method and system for controlling camera equipment and camera equipment
CN112367473A (en) * 2021-01-13 2021-02-12 北京电信易通信息技术股份有限公司 Rotatable camera device based on voiceprint arrival phase and control method thereof
WO2023164814A1 (en) * 2022-03-01 2023-09-07 深圳市大疆创新科技有限公司 Media apparatus and control method and device therefor, and target tracking method and device
CN115103115A (en) * 2022-06-16 2022-09-23 北京达佳互联信息技术有限公司 Camera equipment control method and device and electronic equipment

Also Published As

Publication number Publication date
CN103685906B (en) 2018-01-02

Similar Documents

Publication Publication Date Title
CN103685906A (en) Control method, control device and control equipment
US9591229B2 (en) Image tracking control method, control device, and control equipment
US9690262B2 (en) Display device and method for regulating viewing angle of display device
CN108089152B (en) Equipment control method, device and system
CN109032039B (en) Voice control method and device
CN111432115B (en) Face tracking method based on voice auxiliary positioning, terminal and storage device
CN103716669A (en) Electronic apparatus and control method of the same
CN103581606B (en) A kind of multimedia collection device and method
US20110274311A1 (en) Sign language recognition system and method
CN111062234A (en) Monitoring method, intelligent terminal and computer readable storage medium
CN110602389B (en) Display method and electronic equipment
CN111602139A (en) Image processing method and device, control terminal and mobile device
US11482237B2 (en) Method and terminal for reconstructing speech signal, and computer storage medium
CN206559550U (en) The remote control and television system of a kind of built-in microphone array
CN105554475A (en) IoT (Internet of Things) intelligent device with full duplex communication function
US20170345437A1 (en) Voice receiving method and device
CN109982054A (en) A kind of projecting method based on location tracking, device, projector and optical projection system
CN106465030A (en) Position determination apparatus, audio apparatus, position determination method, and program
CN204539315U (en) A kind of video conference machine of auditory localization
CN105516692A (en) Intelligent equipment for Internet of Things
CN103414992B (en) A kind of message adjustment system
CN112839165B (en) Method and device for realizing face tracking camera shooting, computer equipment and storage medium
WO2019119290A1 (en) Method and apparatus for determining prompt information, and electronic device and computer program product
CN201839377U (en) Whole scene infrared separation automatic tracking device
CN110133595A (en) A kind of sound source direction-finding method, device and the device for sound source direction finding

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant