WO2024002022A1 - Photographing composition method and apparatus, and computer device and storage medium - Google Patents

Photographing composition method and apparatus, and computer device and storage medium

Info

Publication number
WO2024002022A1
Authority
WO
WIPO (PCT)
Prior art keywords
gesture
state
target
current
subject
Prior art date
Application number
PCT/CN2023/102488
Other languages
French (fr)
Chinese (zh)
Inventor
蔡智
马龙祥
张伟俊
吴烁楠
蒋宪宏
Original Assignee
影石创新科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 影石创新科技股份有限公司 filed Critical 影石创新科技股份有限公司
Publication of WO2024002022A1 publication Critical patent/WO2024002022A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules

Definitions

  • the present application relates to the field of image processing technology, and in particular to a photographing and composition method, device, computer equipment, storage medium and computer program product.
  • Automatic composition technology can be divided into two categories according to the implementation stage of the composition operation: one-time composition, that is, before the shooting action, the image content of the preview screen is analyzed to obtain the deviation information between the current composition and the target composition, and then the corresponding composition prompts are output.
  • Secondary composition means that after the shooting is completed, the software automatically analyzes the content of the shot, confirms the composition method and performs corresponding cropping, zooming and other operations, and finally outputs the image/video after the second composition.
  • in the related art, the user mainly adjusts the position and angle of the shooting device manually to achieve one-time composition, which is inefficient and not accurate enough.
  • this application provides a photographing composition method.
  • the methods include:
  • Obtain the current frame image captured by the shooting device, and obtain the target subject state of the subject in the current frame image and the current gesture state of at least one gesture;
  • obtain a historical gesture trajectory set, and update the historical gesture trajectory set based on the current gesture state of the at least one gesture to obtain a current gesture trajectory set; according to the target subject state, determine the current gesture trajectory triggered by the subject in the current gesture trajectory set, and determine the target gesture state triggered by the subject in the current gesture trajectory;
  • according to the target gesture state, obtain the target working parameter of at least one device in the shooting composition system, and adjust the corresponding device in the shooting composition system according to the target working parameter.
  • the current frame image belongs to an image frame group, the images in the image frame group are sorted according to the shooting sequence of the shooting device, and the current frame image is the last frame image; obtaining the target subject state of the subject in the current frame image includes:
  • the historical subject state and the predicted subject state are integrated to obtain the target subject state of the subject in the current frame image.
  • updating the historical gesture trajectory set based on the current gesture state of at least one gesture to obtain the current gesture trajectory set includes:
  • the method further includes:
  • a new gesture trajectory is created based on the current gesture state that does not match any historical gesture trajectory, and is added to the current gesture trajectory set.
  • the current gesture trajectory set also includes historical gesture trajectories in the historical gesture trajectory set to which no current gesture state has been added; before determining, according to the target subject state, the current gesture trajectory triggered by the subject in the current gesture trajectory set, the method also includes:
  • the target subject state includes the position of the subject; determining, according to the target subject state, the current gesture trajectory triggered by the subject in the current gesture trajectory set, and determining the target gesture state triggered by the subject in the current gesture trajectory, includes:
  • for each gesture trajectory in the current gesture trajectory set, determine the last gesture state added to each gesture trajectory, where the gesture state includes the gesture position;
  • the last gesture state added in the current gesture trajectory is used as the target gesture state.
  • the preset detection condition includes at least one of the following two conditions.
  • the two conditions are: the number of gesture states in the current gesture trajectory is not less than a preset number; and the addition time sequence corresponding to the last k gesture states added to the current gesture trajectory matches the shooting time sequence corresponding to the last k frames of images captured by the shooting device, where k is a positive integer.
  • the target gesture state includes the target gesture type; before obtaining the target working parameter of at least one device in the shooting composition system according to the target gesture state, the method further includes:
  • the step of obtaining the target working parameter of at least one device in the shooting composition system is performed according to the target gesture state.
  • the target subject state includes the position of the subject, and the target gesture state includes the target gesture position; according to the target gesture state, the target working parameter of at least one device in the shooting composition system is obtained, including:
  • the target working parameters of at least one piece of equipment in the shooting composition system are determined.
  • the shooting composition system also includes a control device, which is used to control the movement of the shooting device; the target working parameters include at least one of the zoom parameter of the shooting device, the optical axis orientation of the shooting device, or the position of the shooting device.
  • the target subject state includes the position of the subject, and the target gesture state includes the target gesture position; according to the target gesture state, the target working parameter of at least one device in the shooting composition system is obtained, including:
  • a target working parameter of at least one piece of equipment in the shooting composition system is determined.
  • the shooting composition system also includes a control device, which is used to control the movement of the shooting device; the target working parameters include at least one of the zoom parameter of the shooting device, the optical axis orientation of the shooting device, or the position of the shooting device.
  • the target gesture state includes the target gesture type; according to the target gesture state, the target working parameters of at least one device in the shooting composition system are obtained, including:
  • the preset working parameters corresponding to the preset gesture type are obtained and used as the target working parameters.
  • the preset working parameters include the preset zoom coefficient of the shooting device.
  • the shooting composition system also includes a control device, and the control device is used to control the movement of the shooting device; accordingly,
  • the preset working parameters also include at least one of the preset optical axis orientation of the shooting device or the preset position of the shooting device.
  • this application also provides a shooting composition device.
  • Devices include:
  • a data acquisition module used to acquire the current frame image captured by the shooting device, acquire the target subject state of the subject in the current frame image and the current gesture state of at least one gesture;
  • the gesture update module is used to obtain the historical gesture trajectory collection, update the historical gesture trajectory collection based on the current gesture state of at least one gesture, and obtain the current gesture trajectory collection;
  • the gesture determination module is used to determine the current gesture trajectory triggered by the subject object in the current gesture trajectory set according to the state of the target subject object, and determine the target gesture state triggered by the subject object in the current gesture trajectory;
  • the equipment adjustment module is used to obtain the target working parameters of at least one device in the shooting composition system according to the target gesture state, and adjust the corresponding equipment in the shooting composition system according to the target working parameters.
  • this application also provides a computer device.
  • the computer device includes a memory and a processor, the memory stores a computer program, and the processor implements the following steps when executing the computer program:
  • Obtain the current frame image captured by the shooting device, and obtain the target subject state of the subject in the current frame image and the current gesture state of at least one gesture;
  • obtain a historical gesture trajectory set, and update the historical gesture trajectory set based on the current gesture state of the at least one gesture to obtain a current gesture trajectory set; according to the target subject state, determine the current gesture trajectory triggered by the subject in the current gesture trajectory set, and determine the target gesture state triggered by the subject in the current gesture trajectory;
  • according to the target gesture state, obtain the target working parameter of at least one device in the shooting composition system, and adjust the corresponding device in the shooting composition system according to the target working parameter.
  • this application also provides a computer-readable storage medium.
  • the computer-readable storage medium has a computer program stored thereon, and when the computer program is executed by the processor, the following steps are implemented:
  • Obtain the current frame image captured by the shooting device, and obtain the target subject state of the subject in the current frame image and the current gesture state of at least one gesture;
  • obtain a historical gesture trajectory set, and update the historical gesture trajectory set based on the current gesture state of the at least one gesture to obtain a current gesture trajectory set; according to the target subject state, determine the current gesture trajectory triggered by the subject in the current gesture trajectory set, and determine the target gesture state triggered by the subject in the current gesture trajectory;
  • according to the target gesture state, obtain the target working parameter of at least one device in the shooting composition system, and adjust the corresponding device in the shooting composition system according to the target working parameter.
  • this application also provides a computer program product.
  • the computer program product includes a computer program that implements the following steps when executed by a processor:
  • Obtain the current frame image captured by the shooting device, and obtain the target subject state of the subject in the current frame image and the current gesture state of at least one gesture;
  • obtain a historical gesture trajectory set, and update the historical gesture trajectory set based on the current gesture state of the at least one gesture to obtain a current gesture trajectory set; according to the target subject state, determine the current gesture trajectory triggered by the subject in the current gesture trajectory set, and determine the target gesture state triggered by the subject in the current gesture trajectory;
  • according to the target gesture state, obtain the target working parameter of at least one device in the shooting composition system, and adjust the corresponding device in the shooting composition system according to the target working parameter.
  • the above-mentioned shooting composition method, apparatus, computer device, storage medium and computer program product acquire the current frame image captured by the shooting device, and obtain the target subject state of the subject in the current frame image and the current gesture state of at least one gesture; obtain the historical gesture trajectory set, and update the historical gesture trajectory set based on the current gesture state of the at least one gesture to obtain the current gesture trajectory set; determine, according to the target subject state, the current gesture trajectory triggered by the subject in the current gesture trajectory set, and determine the target gesture state triggered by the subject in the current gesture trajectory; and obtain, according to the target gesture state, the target working parameters of at least one device in the shooting composition system, and adjust the corresponding devices in the shooting composition system according to the target working parameters.
  • in this way, an interactive composition function based on gesture control is realized, enabling non-contact and flexibly configurable composition adjustment.
  • Figure 1 is an application environment diagram of the shooting composition method in one embodiment
  • Figure 2 is a schematic flowchart of a shooting composition method in one embodiment
  • Figure 3 is a schematic flowchart of obtaining the target subject status of the subject in one embodiment
  • Figure 4 is a schematic flowchart of determining the target gesture state in one embodiment
  • Figure 5 is a schematic flowchart of a shooting composition method in another embodiment
  • Figure 6 is a schematic flowchart of a shooting composition method in yet another embodiment
  • Figure 7 is a structural block diagram of a shooting and composition device in one embodiment
  • Figure 8 is an internal structure diagram of a computer device in one embodiment.
  • one-time composition refers to analyzing the image content of the preview screen before the shooting action, obtaining the deviation information between the current composition and the target composition, and then outputting the corresponding composition prompt. Some users then manually adjust the position and angle of the shooting equipment and trigger the corresponding shooting operation after reaching the specified position; alternatively, the shooting equipment can, based on the composition deviation information, control the corresponding actuator by itself to reach the specified composition state and realize the shooting operation.
  • Secondary composition means that after the shooting is completed, the software automatically analyzes the content of the shot, confirms the composition method and implements corresponding cropping, zooming and other operations, and finally outputs the image/video after the second composition.
  • the efficiency is low and the manual adjustment is not accurate enough.
  • the shooting composition method provided by the embodiment of the present application is applied to real-time video shooting scenarios, and can be specifically applied to the application environment as shown in Figure 1.
  • the terminal 102 communicates with the server 104 through the network.
  • the video stream captured in real time or the still image obtained can be transmitted to the server 104, and the server analyzes the main object in the video stream or still image and then makes adjustments to the shooting device.
  • the data storage system may store data that server 104 needs to process.
  • the data storage system can be integrated on the server 104, or placed on the cloud or other network servers.
  • the terminal 102 is a device capable of acquiring video streams or images, and may include but is not limited to various personal computers with cameras, laptops, smart phones, tablets, Internet of Things devices, etc.
  • the server 104 may be an independent physical server, a server cluster or a distributed system composed of multiple physical servers, or a cloud server that provides cloud computing services.
  • embodiments of the present application provide a shooting composition method. It can be understood that, for the current frame image in the video stream, multiple subject objects may actually be detected in the current frame image.
  • the main idea of the embodiment of the present application is to use gestures triggered by the target subject to automatically adjust the shooting composition. It is understandable that among these subjects, only the gesture state triggered by the target subject is meaningful for automatically adjusting the shooting composition. In this way, the gesture status corresponding to the subject can be analyzed, and then the shooting device can be adjusted without contact. Taking this method applied to a computer device (the computer device may be a terminal or a server in Figure 1) as an example, the description includes the following steps:
  • Step 202 Obtain the current frame image captured by the shooting device, obtain the target subject state of the subject in the current frame image and the current gesture state of at least one gesture;
  • the photographing device may be a camera or a mobile terminal with a photographing function, which is not specifically limited in the embodiments of the present application.
  • the current frame image may be an image captured by the shooting device at the current moment, or may be the image frame at the current moment in the video stream captured by the shooting device; this is not specifically limited in the embodiment of the present application. It should be noted that, regardless of whether the shooting device captures a video stream or image frames, the main goal in the embodiment of the present application is to obtain a target gesture state that is triggered by the subject and is meaningful for the shooting composition. It is understandable that such a target gesture state may not necessarily be recognized from only one frame of image. Therefore, in the embodiment of the present application, multiple frames of images are captured, and at the current moment when the "current frame image" is captured, the target gesture state triggered by the subject is obtained by combining the previously captured images.
  • the previous frame image of the current frame image is also processed using the method provided by the embodiment of the present application.
  • regarding the relationship between the current frame image and the previous frame image when the shooting device captures image frames: in the actual implementation process, the current frame image and the previous frame image can be two frames of images continuously acquired by the shooting device.
  • alternatively, a preset number of frames may be spaced between the current frame image and the previous frame image, which is not specifically limited in the embodiments of the present application.
  • the current frame image can be intercepted from the real-time video stream at the current moment, and the previous frame image and the current frame image can be continuous or separated by multiple image frames; this is not specifically limited in the embodiments of this application.
  • the embodiment of the present application takes the current frame image intercepted from the real-time video stream as an example to describe the subsequent process.
  • the subject object refers to an object that can trigger the gesture.
  • the object may refer to a person. It is understandable that there may be more than one subject captured in the current frame image, such as more than one person. These subjects may all trigger gestures, but usually only one subject can trigger a gesture state related to the shooting composition, and this subject can be the target subject.
  • the "target subject state" mentioned in this step mainly refers to the state of the target subject.
  • the gesture state triggered by the target subject is the target gesture state.
  • the image content presented by the subject in the current frame image may be the person's head, the person's upper body, or the person's whole body captured by the shooting device.
  • this is not specifically limited in the embodiments of the present application.
  • the target subject state is mainly used to represent the state of the image content presented by the subject in the current frame image, which may include at least one of the range or position occupied by the subject in the current frame image.
  • the target subject state may include the position and size of the subject's bounding box in the current frame image, and may be obtained by detecting the current frame image using a person detection algorithm, which is not specifically limited in the embodiments of this application.
  • the position of the external frame may be the coordinates of the upper left corner of the external frame or the coordinates of the center point of the external frame, which is not specifically limited in the embodiment of the present application.
  • the current gesture state is mainly used to refer to the state of the image content presented by the subject's hand in the current frame image, which may also include at least one of the range or position occupied in the current frame image.
  • the current gesture status may include the position, size, and gesture category of the gesture, and may be obtained by detecting the current frame image through a gesture detection algorithm.
  • the position and size of the gesture can also be represented by an external frame.
  • the position of the bounding box can also be the coordinates of the upper left corner of the bounding box or the coordinates of the center point of the bounding box.
  • the "current" in "current gesture state" mainly emphasizes that the gesture state is obtained from the current frame image, while the "at least one" in "at least one gesture" is mainly because there may be more than one subject, so the resulting current gesture state may also be more than one.
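  • For illustration only, the subject state and gesture state described above could be represented roughly as follows. This is a non-limiting Python sketch; the class and field names (SubjectState, GestureState, box, gesture_type, frame_index) are hypothetical and not taken from the application.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class SubjectState:
    # Bounding box of the subject in a frame: (center_x, center_y, width, height)
    box: Tuple[float, float, float, float]
    frame_index: int              # frame in which this state was observed

@dataclass
class GestureState:
    # Bounding box of the hand in a frame: (center_x, center_y, width, height)
    box: Tuple[float, float, float, float]
    gesture_type: str             # e.g. "zoom_in", "enter_auto_compose"
    frame_index: int              # frame in which this gesture state was detected

    @property
    def position(self) -> Tuple[float, float]:
        # Center point of the hand's bounding box, used for trajectory matching.
        return self.box[0], self.box[1]
```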
  • the shooting composition method provided by the embodiment of the present application needs to use both the subject object and the gesture state. Therefore, if the gesture state is not detected in the current frame image, the method provided by the embodiment of the present application will continue to be used to process the next frame image of the current frame image.
  • Step 204 Obtain a historical gesture trajectory set, update the historical gesture trajectory set based on the current gesture state of at least one gesture, and obtain the current gesture trajectory set;
  • the gesture trajectory refers to a collection of a series of gesture states recorded in the order of acquisition within a period of time.
  • the gesture trajectory collection is a collection of multiple gesture trajectories. Multiple gesture trajectories are generated because more than one gesture state may be detected in an image frame.
  • the historical gesture trajectory set refers to the gesture trajectory set determined based on the image before the current frame image, and the current gesture trajectory set is obtained by updating the historical gesture trajectory set based on the current frame image.
  • for example, the current gesture states of the n gestures detected in the current frame image may be recorded as {R_{t,i}}, i = 1, ..., n, and the historical gesture trajectory set as {T_j}, j = 1, ..., m, where i represents the i-th gesture, n the total number of gestures, m the number of gesture trajectories in the historical gesture trajectory set, and j the j-th historical gesture trajectory in the historical gesture trajectory set.
  • "updating the historical gesture trajectory set based on the current gesture state of the at least one gesture” mainly means adding the current gesture state to the historical gesture trajectory or forming a new gesture trajectory.
  • specifically, the current gesture state can be matched with each gesture trajectory in the historical gesture trajectory set; if the match succeeds, the current gesture state is added to the successfully matched gesture trajectory, and if the match fails, a new gesture trajectory can be created based on the current gesture state.
  • during matching, the gesture position in the gesture state can be used, for example by calculating the distance between the gesture position in the current gesture state and the gesture position in the last recorded gesture state of each gesture trajectory in the historical gesture trajectory set.
  • the gesture trajectory corresponding to the minimum distance will be regarded as the gesture trajectory that successfully matches the current gesture state.
  • matching may be performed based on the gesture category in the current gesture state, which is not specifically limited in the embodiments of the present application.
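  • A minimal sketch of the distance-based matching just described, reusing the hypothetical GestureState objects from the earlier sketch; the max_match_dist threshold is an assumed parameter, and a gesture farther than it from every trajectory end is treated as unmatched.

```python
import math

def match_gesture_to_trajectory(gesture, trajectories, max_match_dist=50.0):
    """Return the index of the trajectory whose last gesture state is closest
    to `gesture`, or None if every distance exceeds max_match_dist."""
    best_idx, best_dist = None, float("inf")
    for idx, traj in enumerate(trajectories):
        last = traj[-1]                                 # last gesture state recorded in this trajectory
        dist = math.dist(gesture.position, last.position)
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx if best_dist <= max_match_dist else None
```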
  • Step 206 Determine the current gesture trajectory triggered by the subject object in the current gesture trajectory set according to the state of the target subject object, and determine the target gesture state triggered by the subject object in the current gesture trajectory;
  • the current gesture trajectory set obtained in this way will include multiple gesture trajectories. It can be seen from the above process that the process of determining the target subject state of the subject based on the current frame image and the process of determining the current gesture state of a gesture based on the current frame image are independent of each other. Before adjusting the shooting device according to the subject's gesture, it is necessary to establish a connection between the subject and the gesture, so that after the historical gesture trajectory set has been updated to obtain the current gesture trajectory set, the current gesture trajectory corresponding to the subject can be determined in the current gesture trajectory set according to the target subject state of the subject, and then the target gesture state triggered by the subject can be determined.
  • Step 208 Obtain the target working parameters of at least one device in the shooting composition system according to the target gesture state, and adjust the corresponding equipment in the shooting composition system according to the target working parameters.
  • the target gesture state refers to a gesture state that can be used to adjust the working parameters of the shooting composition system.
  • for example, if the gesture category of the target gesture state triggered by the subject is an automatic composition gesture, the orientation of the camera can be adjusted according to the position of the bounding box in the target subject state and the position of a preset bounding box, so that in the next frame of image the bounding box of the subject fits the position of the preset bounding box as closely as possible.
  • the working parameters that need to be adjusted may be different according to different composition requirements. Therefore, more than one parameter may need to be adjusted according to the target gesture state.
  • the composition requirements are determined through the gesture category corresponding to the gesture triggered by the subject object, so as to adjust the working parameters. Understandably, one gesture category usually corresponds to one compositional need.
  • the target working parameters corresponding to this gesture category can be directly determined, or the adjustment method of the working parameters can also be determined, and the target working parameters can be obtained by adjusting the working parameters.
  • for example, if the gesture category in the target gesture state indicates adjusting the focal length of the shooting device, the current zoom factor can be determined from the previous zoom factor and the focal length change amount.
  • the focal length change amount is positive or negative. A positive number indicates that the focal length becomes larger, and a negative number indicates that the focal length becomes smaller.
  • the specific adjustment method can be determined based on the gesture category in the target gesture state.
  • the shooting device is usually adjusted continuously through corresponding gestures showing the same composition requirement in several consecutive frames, so that the shooting device can meet the composition requirement.
  • for example, the focal length may need to be adjusted by a total of 4, but the focal length change determined from each frame of image is 0.5, that is, only 0.5 can be adjusted at a time; the zoom gesture therefore needs to appear in 8 consecutive frames of images, which means 8 consecutive adjustments are required to meet the composition requirement.
  • the zoom coefficient mentioned above is actually the target working parameter of the shooting equipment in the shooting composition system.
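  • The per-frame incremental adjustment described above might look roughly like the following sketch; target_zoom, step and camera.set_zoom are hypothetical placeholders rather than an actual device API.

```python
def adjust_zoom_incrementally(camera, current_zoom, target_zoom, step=0.5):
    """Move the zoom one step towards the target; intended to be called once per
    frame in which the zoom gesture is recognised (e.g. 8 consecutive frames
    for a total change of 4 with a per-frame step of 0.5)."""
    remaining = target_zoom - current_zoom
    if abs(remaining) < 1e-6:
        return current_zoom                       # composition requirement already met
    delta = max(-step, min(step, remaining))      # clamp the per-frame change to +/- step
    new_zoom = current_zoom + delta
    camera.set_zoom(new_zoom)                     # hypothetical camera interface
    return new_zoom
```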
  • the automatic shooting composition can be completed through gestures, so the composition efficiency is higher and the composition result is more accurate.
  • the state of the target subject is also referenced, which can avoid misjudgment of the target gesture state caused by gesture states generated by irrelevant subjects, and in turn can make the shooting composition results more accurate.
  • in some embodiments, the current frame image belongs to an image frame group, the images in the image frame group are sorted according to the shooting timing of the shooting device, and the current frame image is the last frame image; referring to Figure 3, obtaining the target subject state of the subject in the current frame image includes:
  • Step 302 Obtain the historical subject status of the subject in the first frame of the image frame group.
  • the historical subject state refers to the state of the subject in the image frame before the current frame image, which may include position or size, etc. This embodiment of the present application does not specifically limit this.
  • the image frame group is mainly used for target tracking processing. It should be noted that the "first frame image" mentioned here mainly emphasizes the first frame image used for target tracking, rather than the first frame image captured by the shooting device or the first frame image in any other sense.
  • Step 304 Based on the image frame group, perform target tracking on the subject object, and obtain the predicted subject state of the subject object in the current frame image.
  • specifically, the historical subject state of the target subject in the previous frame image can be directly used for prediction, or the historical subject states of the target subject in a series of images before the current frame image can be used for prediction; the embodiments of this application do not specifically limit this.
  • the target tracking algorithm can be used to obtain the predicted subject state of the subject in the current frame image.
  • Step 306 Integrate the historical subject state and the predicted subject state to obtain the target subject state of the subject in the current frame image.
  • the integration process may be to directly use the predicted subject state, that is, directly use the predicted subject state as the target subject state.
  • the integration process can also be other methods such as averaging.
  • for example, the subject state may include the position and size of the subject's bounding box; by averaging the position of the bounding box in the predicted subject state and the position of the bounding box in the historical subject state, the position of the bounding box in the target subject state can be obtained. In the same way, the size of the bounding box in the target subject state can also be calculated by averaging.
  • the historical subject state used for integration here may include only the historical subject state of the subject in the first frame image, or may also include the historical subject states of the subject in other images before the current frame image; the embodiments of this application do not specifically limit this.
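  • A simple sketch of the averaging-style integration mentioned above, assuming each state carries a (center_x, center_y, width, height) bounding box; equal weighting is only one possible choice.

```python
def integrate_subject_state(historical_box, predicted_box):
    """Average the historical and tracker-predicted bounding boxes element-wise
    to obtain the bounding box of the target subject state."""
    return tuple((h + p) / 2.0 for h, p in zip(historical_box, predicted_box))

# Example: historical box and tracker-predicted box for the current frame.
historical = (320.0, 240.0, 80.0, 160.0)
predicted  = (330.0, 238.0, 84.0, 162.0)
target_box = integrate_subject_state(historical, predicted)   # (325.0, 239.0, 82.0, 161.0)
```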
  • in this way, the target subject state of the subject in the current frame image is obtained by integrating the historical subject state of the subject in the images before the current frame image with the predicted subject state obtained through target tracking, and the historical subject state is the definite state of the subject in the images before the current frame image, so the target subject state obtained on this basis can be as accurate as possible.
  • in addition, the state of the target subject can still be obtained relatively accurately even if the subject is temporarily occluded.
  • in some embodiments, updating the historical gesture trajectory set based on the current gesture state of at least one gesture to obtain the current gesture trajectory set includes: matching the current gesture state of each gesture with each historical gesture trajectory in the historical gesture trajectory set to determine the current gesture states and historical gesture trajectories that match each other; and adding each matching current gesture state to the matching historical gesture trajectory to obtain the current gesture trajectory set.
  • the process of matching the current gesture state with the historical gesture trajectory may adopt the Hungarian algorithm.
  • take the current gesture states of a total of n gestures detected in the current frame image, denoted {R_{t,i}}, i = 1, ..., n, as an example.
  • i represents the i-th gesture
  • t represents the current frame image as the t-th frame image.
  • the set of historical gesture trajectories is recorded as {T_j}, which includes m historical gesture trajectories, where j represents the j-th historical gesture trajectory in the set.
  • in the matching result, a value of 1 indicates a match and a value of -1 indicates a mismatch; if R_{t,i} matches the j-th historical gesture trajectory, R_{t,i} can be added to that matching historical gesture trajectory.
  • the current gesture state in the current frame image is added to the matching historical gesture trajectories to form a current gesture trajectory set.
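  • One way to realize the Hungarian-algorithm matching described above is scipy's linear_sum_assignment, sketched below under the assumption that the cost of assigning gesture i to trajectory j is the distance between the gesture position and the trajectory's last recorded gesture position; the max_match_dist threshold used to reject poor assignments (marked -1, mirroring the mismatch value above) is an assumption.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_gestures_to_trajectories(gesture_positions, trajectory_last_positions,
                                    max_match_dist=50.0):
    """gesture_positions: (n, 2) array of current gesture centers.
    trajectory_last_positions: (m, 2) array of each trajectory's last gesture center.
    Returns a length-n array: matched trajectory index, or -1 for a mismatch."""
    gestures = np.asarray(gesture_positions, dtype=float)
    trajs = np.asarray(trajectory_last_positions, dtype=float)
    result = np.full(len(gestures), -1, dtype=int)
    if len(gestures) == 0 or len(trajs) == 0:
        return result
    # Cost matrix: Euclidean distance between every gesture and every trajectory end.
    cost = np.linalg.norm(gestures[:, None, :] - trajs[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    for i, j in zip(rows, cols):
        if cost[i, j] <= max_match_dist:          # reject assignments that are too far apart
            result[i] = j
    return result
```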
  • the target working parameters for adjusting the shooting composition system can be obtained based on the current gesture trajectories. Since multiple image frames can be used for continuous tracking, compared to determining target working parameters based on one image frame, the shooting composition results can be more accurate.
  • the method further includes:
  • a new gesture trajectory is created based on the current gesture state that does not match any historical gesture trajectory, and is added to the current gesture trajectory set.
  • the reason why the embodiment of the present application creates a new gesture trajectory for an unmatched current gesture state is mainly that the subject may not have made any gesture towards the shooting device before, or may not have made any gesture related to the shooting composition, and the related gesture only starts at the moment corresponding to the current frame image and is presented in the current frame image; therefore, there will be a current gesture state that does not match any historical gesture trajectory.
  • the current gesture state that does not match any historical gesture trajectory should not be ignored, because it may be related to the shooting composition. Therefore, in the embodiment of the present application, a new gesture trajectory can be created to record this current gesture state and serve as a new gesture trajectory in the current gesture trajectory set.
  • in some embodiments, the current gesture trajectory set also includes historical gesture trajectories in the historical gesture trajectory set to which no current gesture state has been added; before determining, according to the target subject state, the current gesture trajectory triggered by the subject in the current gesture trajectory set, the method also includes:
  • for each gesture trajectory in the current gesture trajectory set, obtain the addition time corresponding to the last gesture state added to the gesture trajectory; calculate the time interval between each addition time and the acquisition time of the current frame image, and delete from the current gesture trajectory set the gesture trajectories whose time intervals are greater than a preset duration.
  • if a gesture trajectory has been updated based on the current frame image, the last gesture state added to it is the current gesture state obtained from the current frame image, so the time interval between the addition time of that gesture state and the acquisition time of the current frame image will not be too large. Only a historical gesture trajectory that has not been updated based on the current frame image, or even has not been updated based on multiple frames of images before the current frame image, remains in the current gesture trajectory set with a time interval between the addition time of its last gesture state and the acquisition time of the current frame image that is too large.
  • such gesture trajectories with overly large time intervals are therefore filtered out through the preset duration. It is also understandable that the reason for deleting gesture trajectories whose time intervals are greater than the preset duration from the current gesture trajectory set is mainly that such gesture trajectories have not been updated for too long, and it is unlikely that the subject will continue to make gestures on the basis of such a trajectory to control the shooting composition. Therefore, in order to ensure the accuracy of the subsequent determination of the current gesture trajectory triggered by the subject, the gesture trajectories corresponding to time intervals greater than the preset duration can be deleted from the current gesture trajectory set.
  • gesture trajectories that have not been updated for too long can be deleted from the current gesture trajectory set, the accuracy of subsequent determination results when determining the current gesture trajectory triggered by the subject object can be ensured.
  • gesture trajectories that have not been updated for too long are deleted from the current gesture trajectories, the amount of data in the current gesture trajectory collection can also be reduced to save resources.
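  • A sketch of the pruning step described above, assuming each trajectory is a list of gesture states carrying an added_at timestamp (an assumed field) and that the preset duration is expressed in seconds.

```python
def prune_stale_trajectories(trajectories, current_time, max_age_seconds=1.0):
    """Keep only trajectories whose last gesture state was added recently enough.
    `trajectories` is a list of lists of gesture states, each with an `added_at` timestamp."""
    return [
        traj for traj in trajectories
        if traj and (current_time - traj[-1].added_at) <= max_age_seconds
    ]
```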
  • in some embodiments, the target subject state includes the position of the subject; referring to Figure 4, determining, according to the target subject state, the current gesture trajectory triggered by the subject in the current gesture trajectory set, and determining the target gesture state triggered by the subject in the current gesture trajectory, includes:
  • Step 402 For each gesture trajectory in the current gesture trajectory set, determine the gesture state last added in each gesture trajectory, where the gesture state includes the gesture position.
  • the gesture states in each gesture trajectory are usually arranged in the order of adding time, and the gesture state added last is the gesture state arranged last.
  • the gesture position can be represented by the coordinates of the center point of the external frame, or can be represented by other methods, which are not specifically limited in the embodiments of the present application.
  • Step 404 Filter the current gesture trajectory set according to the distance between the position of the subject and the gesture position in each last added gesture state to obtain the current gesture trajectory.
  • the current gesture trajectory set can be filtered according to the distance between the two. Specifically, if the distance between the gesture position in the last gesture state added to a certain gesture trajectory and the position of the subject is greater than a preset threshold, or is not within a certain range, that gesture trajectory can be filtered out of the current gesture trajectory set. Since the gesture trajectories in the current gesture trajectory set are screened based on distance, this process can also be understood as selecting, from the current gesture trajectory set, the current gesture trajectory that matches the subject.
  • Step 406 When the current gesture trajectory meets the preset detection conditions, the last gesture state added in the current gesture trajectory is used as the target gesture state.
  • the preset detection conditions specify what reasonable conditions the current gesture trajectory needs to meet before it can serve as the basis for determining the target gesture state. For example, since the target gesture state needs to be determined from the current gesture trajectory, the current gesture trajectory should be stable; this "stability" can be reflected in the fact that the time intervals between the addition times of different gesture states in the current gesture trajectory are uniform.
  • the reason why the above can reflect "stability" is mainly that if the subject needs to control the shooting composition, the same gesture will usually be performed over a period of time to produce stable recognition results, thus producing a series of gesture states with even time intervals between their addition times.
  • the preset detection conditions may also have other setting bases, which are not specifically limited in the embodiments of this application.
  • the reason why the last added gesture state in the current gesture trajectory is selected as the target gesture state is mainly because the last added gesture state is the latest gesture state in the current gesture trajectory, which can reflect the latest shooting of the subject. Composition intention, so as to achieve precise shooting composition control.
  • in this way, the current gesture trajectory matching the subject is filtered out of the current gesture trajectory set based on distance. Since the calculation process is relatively simple, the processing efficiency can be improved. In addition, after the current gesture trajectory is initially screened based on distance, it needs to be further detected based on the preset detection conditions, so that a more accurate target gesture state can be obtained.
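  • The distance-based screening of steps 402 and 404 might be sketched as follows, assuming 2D center coordinates and reusing the hypothetical gesture-state objects from the earlier sketches; the max_subject_dist threshold is illustrative.

```python
import math

def select_subject_trajectory(subject_position, trajectories, max_subject_dist=120.0):
    """Among trajectories whose last gesture is close enough to the subject,
    return the closest one (the subject's current gesture trajectory), or None."""
    best_traj, best_dist = None, float("inf")
    for traj in trajectories:
        last = traj[-1]                                 # last gesture state added to this trajectory
        dist = math.dist(subject_position, last.position)
        if dist <= max_subject_dist and dist < best_dist:
            best_traj, best_dist = traj, dist
    return best_traj
```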
  • the preset detection conditions include at least one of the following two conditions.
  • the two conditions are: the number of gesture states in the current gesture trajectory is not less than a preset number; and the addition time sequence corresponding to the last k gesture states added to the current gesture trajectory matches the shooting time sequence corresponding to the last k frames of images captured by the shooting device, where k is a positive integer.
  • the reason why the number of gesture states in the gesture trajectory is used as a basis for setting the preset detection conditions is mainly that only when the number of gesture states in the gesture trajectory reaches a certain number can the gesture trajectory be considered "stable", and "stable" gesture trajectories are more conducive to accurately determining the target gesture state.
  • if the addition time sequence corresponding to the last k gesture states added to the current gesture trajectory matches the shooting time sequence corresponding to the last k frames of images captured by the shooting device, this shows that the update progress of the current gesture trajectory is basically synchronized with the shooting progress of the shooting device; that is, basically every time the shooting device captures a frame of image, the current gesture trajectory adds a gesture state based on that captured image. It can also be seen from this that a current gesture trajectory meeting this condition is gradually forming a gesture instruction for controlling the shooting composition, that is, such a current gesture trajectory is "valid".
  • the so-called "matching" can mean that the addition times and shooting times at the same positions in the two sequences are completely consistent, or that the error between them is within an acceptable range; this is not specifically limited in the embodiments of the present application.
  • the current gesture trajectory after the current gesture trajectory is initially screened based on distance, the current gesture trajectory needs to be further detected based on preset detection conditions, so that a more accurate target gesture state can be obtained.
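  • The two example detection conditions can be checked roughly as below; min_states, k, the per-entry tolerance and the added_at field are assumptions, and "matching" of the two time sequences is approximated here by requiring each of the last k addition times to lie within a tolerance of the corresponding shooting time.

```python
def trajectory_passes_detection(traj, shooting_times, min_states=5, k=3, tolerance=0.05):
    """Check the two example conditions; per the text, at least one may be required.
    traj: list of gesture states, each with an `added_at` timestamp (assumed field).
    shooting_times: timestamps of the frames captured by the shooting device."""
    cond_count = len(traj) >= min_states               # enough gesture states accumulated
    cond_sync = False
    if len(traj) >= k and len(shooting_times) >= k:
        last_added = [s.added_at for s in traj[-k:]]   # addition times of the last k states
        last_shot = list(shooting_times)[-k:]          # shooting times of the last k frames
        cond_sync = all(abs(a - b) <= tolerance for a, b in zip(last_added, last_shot))
    return cond_count or cond_sync
```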
  • the target gesture state includes the target gesture type; before obtaining the target working parameter of at least one device in the shooting composition system according to the target gesture state, the method further includes:
  • the embodiment of the present application mainly focuses on the process of starting the automatic shooting composition mode based on a specified gesture.
  • the specified gesture can be used as a trigger condition for entering the automatic shooting composition mode controlled by gestures; that is, only when the specified gesture is recognized is the step of "obtaining, according to the target gesture state, the target working parameter of at least one device in the shooting composition system" executed, and the shooting composition is then controlled according to the target working parameters.
  • the shooting composition system can issue a reminder through an external indicator light to inform the user whether the shooting composition system is currently in the automatic shooting composition mode.
  • for example, the color of the indicator light can be used to distinguish the automatic shooting composition mode from the non-automatic shooting composition mode.
  • the preset designated gestures for entering and exiting the automatic shooting composition mode may be the same or different, and this is not specifically limited in the embodiments of the present application.
  • the automatic shooting composition mode can be entered first through the specified gesture, and then the shooting composition can be controlled.
  • the target subject state includes the position of the subject, and the target gesture state includes the target gesture position; according to the target gesture state, the target working parameter of at least one device in the shooting composition system is obtained, including:
  • Step 502 Based on the position of the main object and the position of the target gesture, determine the relative positional relationship between the main object and the gesture corresponding to the target gesture state in the current frame image.
  • the relative positional relationship between the subject and the gesture refers to the relative positional relationship presented in the two-dimensional image.
  • the relative position relationship can be specifically determined based on the relationship between the center coordinates of the subject's bounding box (x_head, y_head) and the center coordinates of the gesture's bounding box (x_hand, y_hand). Since these are coordinate values, the relative position relationship can include multiple types; taking the subject as the user's head as an example, the relative position relationship may include the hand above the head, the hand below the head, the hand above and to the left of the head, and so on.
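  • A rough sketch of turning the two center coordinates into a coarse relative-position label; the label names and the margin are illustrative only.

```python
def relative_position(head_center, hand_center, margin=10.0):
    """Classify where the hand is relative to the head using 2D image coordinates
    (origin at the top-left, y increasing downwards)."""
    x_head, y_head = head_center
    x_hand, y_hand = hand_center
    vertical = "above" if y_hand < y_head - margin else (
        "below" if y_hand > y_head + margin else "level")
    horizontal = "left" if x_hand < x_head - margin else (
        "right" if x_hand > x_head + margin else "center")
    return f"{vertical}-{horizontal}"        # e.g. "above-left", "level-center"
```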
  • Step 504 Determine the target working parameter of at least one piece of equipment in the shooting composition system based on the relative position relationship.
  • this step is mainly a process of parsing the meaning of the gesture.
  • the analysis process includes two aspects that proceed in sequence.
  • the first aspect is the process of "what to adjust” for the shooting composition
  • the second aspect is the process of "how to adjust” the shooting composition.
  • Optional adjustment objects may be the zoom coefficient or the optical axis orientation of the shooting device, etc. This is not specifically limited in the embodiments of the present application.
  • the specific adjustment object to be selected can be the default, or indicated by the target gesture type in the target gesture state, such as a zoom control gesture or an optical axis direction control gesture.
  • the adjustment object can be the zoom coefficient, and the adjustment method is to increase the zoom coefficient.
  • the adjustment process of the zoom coefficient may be to fix the proportion of the main object but enlarge or reduce the size of the main object.
  • in this way, the automatic shooting composition can be completed according to the relative positional relationship between the subject and the gesture corresponding to the target gesture state, so the composition efficiency is higher and the composition result is more accurate; and because the adjustment object and adjustment method of the shooting composition can be personalized based on gestures, the operation is more flexible and convenient.
  • in some embodiments, the shooting composition system further includes a control device, which is used to control the movement of the shooting device; the target working parameters include at least one of the zoom parameter of the shooting device, the orientation of the optical axis of the shooting device, or the position of the shooting device.
  • the control device refers to a mechanical device that can change the shooting range or shooting angle of the shooting device by adjusting its own position or shape.
  • the control device may be a pan/tilt, and the pan/tilt may include a robotic arm, and a shooting device may be placed on the carrying portion of the robotic arm.
  • the bearing part can expand and contract and translate along with the expansion and contraction of the mechanical arm, and the bearing part can also rotate, so that the shooting equipment placed on the bearing part of the robotic arm can expand, contract, translate or rotate along with the bearing part.
  • if the robotic arm expands and contracts, it will change the viewing range of the shooting equipment; if the robotic arm translates, the viewing area of the shooting device will change; and if the robotic arm flips, it will change the viewing angle of the shooting equipment.
  • in this way, the position of the shooting device can be changed by telescoping and translation, the orientation of the optical axis of the shooting device can be changed by rotation, and the zoom parameters of the shooting device can be changed through the focusing function of the gimbal.
  • in this way, at least one of the zoom parameter, the optical axis orientation, or the position of the shooting device can be determined based on the relative positional relationship between the subject and the gesture corresponding to the target gesture state, so the composition efficiency is higher and the composition result is more accurate; and because the adjustment objects and methods of the shooting composition can be customized based on gestures, the operation is more flexible and convenient.
  • the target subject state includes the position of the subject, and the target gesture state includes the target gesture position; according to the target gesture state, the target working parameter of at least one device in the shooting composition system is obtained, including:
  • Step 602 Obtain the historical subject state of the subject in the historical image captured by the shooting device and the historical gesture state triggered by the subject.
  • since the target gesture state is determined in the current gesture trajectory, and the gesture states in the current gesture trajectory are sorted by addition time, the historical gesture states triggered by the subject can refer to the gesture states preceding the target gesture state in the current gesture trajectory.
  • the historical gesture state is obtained from the historical image, so the historical subject state of the subject can also be obtained in the historical image.
  • for example, R_5 is the target gesture state determined in the current frame image, and R_1, R_2, R_3 and R_4 are the historical gesture states in the 4 consecutive frames of historical images obtained before the current frame image; correspondingly, the historical subject states in the four consecutive frames of historical images can be W_1, W_2, W_3 and W_4 respectively, and the target subject state in the current frame image can be W_5.
  • Step 604 Based on the position of the main object and the position of the target gesture, calculate the first distance between the main object in the current frame image and the gesture corresponding to the target gesture state.
  • for example, the first distance can be calculated as bias_5 = (x_hand5, y_hand5) − (x_head5, y_head5), where (x_hand5, y_hand5) represents the target gesture position and (x_head5, y_head5) represents the position of the subject.
  • Step 606 Calculate the second distance between the subject in the historical image and the gesture corresponding to the historical gesture state based on the historical position of the subject in the historical subject state and the historical gesture position in the historical gesture state.
  • a second distance may be calculated based only on a certain historical gesture position.
  • for example, R_1, R_2, R_3 and R_4 are respectively the historical gesture states in the 4 consecutive frames of images acquired before the current frame image.
  • the second distance can be calculated as bias_4 = (x_hand4, y_hand4) − (x_head4, y_head4), where (x_hand4, y_hand4) represents the historical gesture position in the 4th frame image and (x_head4, y_head4) represents the historical position of the subject in the 4th frame image.
  • the second distance may not necessarily be calculated based on the 4th frame image, but may also be calculated based on other frame historical images, such as the 1st frame image as the initial frame.
  • this is not specifically limited in the embodiments of the present application.
  • Step 608 Determine the target working parameter of at least one device in the shooting composition system based on the difference between the first distance and the second distance.
  • the difference represents how much the distance between the hand and the head has changed at the corresponding moment of the current frame image compared to before.
  • the change, that is, the difference calculated above, can be negative.
  • the positive or negative value of the difference can indicate whether to increase or decrease the target operating parameter, and the numerical value of the difference can indicate how much the target operating parameter has been changed. Therefore, the content stated above can solve the problem of "how to adjust".
  • the optional adjustment object may be the zoom coefficient or the optical axis orientation of the shooting device, which is not specifically limited in the embodiments of the present application.
  • the specific adjustment object to be selected can also be the default, or it can also be indicated by the target gesture type in the target gesture state, such as a zoom control gesture or an optical axis direction control gesture. It should be noted that when the optical axis orientation of the shooting device needs to be adjusted, the difference can be converted into an angle.
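  • Putting steps 602 to 608 together, the following sketch computes the current and historical head-hand distances as scalar magnitudes and maps their signed difference to a zoom change; the gain factor and the treatment of the bias as a scalar are assumptions, and for optical-axis control the difference would instead be converted into an angle as noted above.

```python
import math

def distance_difference_zoom(head5, hand5, head_hist, hand_hist, current_zoom, gain=0.01):
    """head5/hand5: subject and gesture centers in the current frame.
    head_hist/hand_hist: the corresponding centers in a chosen historical frame."""
    first_distance = math.dist(hand5, head5)            # current head-hand distance
    second_distance = math.dist(hand_hist, head_hist)   # historical head-hand distance
    diff = first_distance - second_distance             # positive: hand moved away from the head
    # The sign of diff chooses increase vs. decrease; its magnitude scales the change.
    return current_zoom + gain * diff
```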
  • since the control device itself may move and carry the shooting device with it, if the motion of the shooting device driven by the control device is synchronously coupled with the motion of the hand, the coordinates of the gesture in the images captured by the shooting device will remain unchanged, which causes a hand that is moving in the real world to be misjudged as stationary; and since the shooting composition requires a series of changing gesture positions, this reduces the stability of the shooting composition. By using the change in the distance between the subject and the gesture at different moments to reflect the movement of the hand in the real world, the problem of synchronous coupling between the movement of the shooting device and the movement of the hand can be bypassed, thereby improving the stability of the shooting composition.
  • The shooting composition system also includes a control device, which is used to control the movement of the shooting device; the target working parameters include at least one of the zoom parameter of the shooting device, the optical axis orientation of the shooting device, or the position of the shooting device.
  • At least one of the zoom parameter, the optical axis orientation, or the position of the shooting device can be determined based on the relative positional relationship between the subject and the gesture corresponding to the target gesture state, so that the composition efficiency is higher and the composition result is more accurate. Moreover, because the adjustment objects and methods of the shooting composition can be customized based on gestures, the operation is more flexible and convenient.
  • the target gesture state includes the target gesture type; according to the target gesture state, the target working parameters of at least one device in the shooting composition system are obtained, including:
  • the preset working parameters corresponding to the preset gesture type are obtained and used as the target working parameters.
  • the preset working parameters include the preset zoom coefficient of the shooting device.
  • the specified gesture type can be used as a trigger condition for entering the automatic shooting composition mode controlled by gestures.
  • This is mainly a process of obtaining the default value of the target working parameter of at least one device in the shooting composition system when it is determined that the target gesture type conforms to the preset gesture type.
  • It is first determined whether the target gesture type in the target gesture state conforms to the preset gesture type. If it matches, the target gesture state has triggered the preset gesture. Since a default value of the target working parameter of at least one device in the shooting composition system can be set in advance for the preset gesture, once it is determined that the target gesture state triggers the preset gesture, the step of "obtaining, according to the target gesture state, the target working parameter of at least one device in the shooting composition system" can directly use this default value as the target working parameter.
  • the shooting composition can be controlled simply and conveniently.
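As an illustrative sketch of this preset-parameter lookup only: the gesture names, parameter fields and default values below are assumptions for demonstration, not values defined by the patent.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class PresetParams:
    zoom_coefficient: float                                   # preset zoom coefficient of the shooting device
    optical_axis: Optional[Tuple[float, float]] = None        # preset (yaw, pitch) orientation, degrees
    position: Optional[Tuple[float, float, float]] = None     # preset device position

# Hypothetical preset gesture types and their preset working parameters.
PRESET_GESTURES = {
    "open_palm": PresetParams(zoom_coefficient=1.0),                              # e.g. reset framing
    "ok_sign":   PresetParams(zoom_coefficient=2.0, optical_axis=(0.0, 10.0)),
}

def target_params_for(gesture_type: str) -> Optional[PresetParams]:
    """Return the preset working parameters if the target gesture type matches a preset type."""
    return PRESET_GESTURES.get(gesture_type)
```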
  • The shooting composition system also includes a control device, which is used to control the movement of the shooting device; accordingly, the preset working parameters also include at least one of the preset optical axis orientation of the shooting device or the preset position of the shooting device.
  • the preset working parameters may include at least one of the "preset" optical axis orientation of the shooting device or the "preset" position of the shooting device.
  • In this way, the composition efficiency is higher and the composition result is more accurate.
  • In addition, because the adjustment objects and methods of the shooting composition can be customized based on gestures, the operation is more flexible and convenient.
  • embodiments of the present application also provide a photographing and composition device for implementing the above-mentioned photographing and composition method.
  • The solution provided by this device is similar to the solution described for the above method. Therefore, for the specific limitations in one or more embodiments of the shooting composition device provided below, reference may be made to the limitations on the shooting composition method above; details are not repeated here.
  • a shooting composition device including: a data acquisition module 701, a gesture update module 702, a gesture determination module 703, and a device adjustment module 704, wherein:
  • the data acquisition module 701 is used to acquire the current frame image captured by the shooting device, acquire the target subject state of the subject in the current frame image and the current gesture state of at least one gesture;
  • the gesture update module 702 is used to obtain a set of historical gesture trajectories, update the set of historical gesture trajectories based on the current gesture state of at least one gesture, and obtain a set of current gesture trajectories;
  • the gesture determination module 703 is used to determine the current gesture trajectory triggered by the subject object in the current gesture trajectory set according to the state of the target subject object, and determine the target gesture state triggered by the subject object in the current gesture trajectory;
  • the equipment adjustment module 704 is used to obtain the target working parameters of at least one device in the shooting and composition system according to the target gesture state, and adjust the corresponding equipment in the shooting and composition system according to the target working parameters.
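To make the module split concrete, here is a minimal, hypothetical skeleton that wires modules 701 to 704 in the per-frame order described above; the class, method and backend names are illustrative and not taken from the patent.

```python
class ShootingCompositionDevice:
    """Skeleton of the apparatus: modules 701-704 executed in order for each frame."""

    def __init__(self, detector, tracker, controller):
        self.detector = detector        # subject / gesture detection backend (assumed)
        self.tracker = tracker          # gesture trajectory bookkeeping (assumed)
        self.controller = controller    # shooting-device control backend (assumed)

    def process_frame(self, frame):
        # Module 701: data acquisition
        subject_state, gesture_states = self.detector.detect(frame)
        # Module 702: gesture update
        trajectories = self.tracker.update(gesture_states)
        # Module 703: gesture determination
        target_gesture = self.tracker.select_target(trajectories, subject_state)
        # Module 704: equipment adjustment
        if target_gesture is not None:
            params = self.controller.params_from_gesture(target_gesture, subject_state)
            self.controller.apply(params)
```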
  • the data acquisition module 701 is also used to:
  • integrate the historical subject state and the predicted subject state to obtain the target subject state of the subject in the current frame image.
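The patent does not spell out the integration rule at this point, so the following sketch simply assumes a fixed-weight blend of the historical and predicted bounding boxes as one plausible form of integration.

```python
def integrate_subject_state(historical_box, predicted_box, w_pred=0.5):
    """Blend two (cx, cy, w, h) boxes; the fixed weighting is an assumption,
    not the integration rule specified by the patent."""
    return tuple((1.0 - w_pred) * h + w_pred * p
                 for h, p in zip(historical_box, predicted_box))

# e.g. target_state = integrate_subject_state((320, 240, 80, 180), (328, 242, 82, 184))
```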
  • the gesture update module 702 is also used to: match the current gesture state of each gesture with each historical gesture trajectory in the historical gesture trajectory set, and add each current gesture state that has a matching historical gesture trajectory to the matching trajectory to obtain the current gesture trajectory set.
  • the gesture update module 702 is also used to:
  • a new gesture trajectory is created based on the current gesture state that does not match each historical gesture trajectory and is added to the current gesture trajectory collection.
  • the gesture update module 702 is also used to: obtain, for each gesture trajectory in the current gesture trajectory set, the addition moment corresponding to the last gesture state added to it, calculate the time interval between each addition moment and the acquisition moment of the current frame image, and delete from the current gesture trajectory set the gesture trajectories whose time interval is greater than a preset duration.
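A hedged sketch of this trajectory update might use nearest-position matching with a distance threshold, create new trajectories for unmatched gesture states, and prune stale trajectories; the data layout and the thresholds below are assumptions.

```python
import math

def update_trajectories(trajectories, current_states, now, max_dist=60.0, max_age=1.0):
    """trajectories: list of dicts {"states": [...], "last_time": t};
    current_states: list of dicts {"pos": (x, y), "type": str, "time": t}.
    Matching rule, thresholds and timestamp handling are illustrative assumptions."""
    for state in current_states:
        best, best_d = None, max_dist
        for traj in trajectories:
            d = math.dist(state["pos"], traj["states"][-1]["pos"])
            if d < best_d:
                best, best_d = traj, d
        if best is not None:                       # matched an existing trajectory
            best["states"].append(state)
            best["last_time"] = now
        else:                                      # unmatched -> start a new trajectory
            trajectories.append({"states": [state], "last_time": now})
    # Drop trajectories not updated for longer than max_age seconds.
    return [t for t in trajectories if now - t["last_time"] <= max_age]
```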
  • the gesture determination module 703 is also used to:
  • for each gesture trajectory in the current gesture trajectory set, determine the last gesture state added to that trajectory, where the gesture state includes the gesture position; filter the current gesture trajectory set according to the distance between the position of the subject and the gesture position in each last-added gesture state, so as to obtain the current gesture trajectory;
  • when the current gesture trajectory satisfies the preset detection condition, the last gesture state added to the current gesture trajectory is used as the target gesture state.
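As a sketch of this selection step, under the same assumed data layout as above, the trajectory whose latest gesture lies closest to the subject is chosen, and its last state is returned only if the trajectory passes the detection condition; that condition check is sketched just after the detection conditions listed below.

```python
import math

def select_target_gesture(trajectories, subject_pos, max_subject_dist=150.0):
    """Pick the trajectory closest to the subject and return its last gesture state."""
    best, best_d = None, max_subject_dist
    for traj in trajectories:
        d = math.dist(traj["states"][-1]["pos"], subject_pos)
        if d < best_d:
            best, best_d = traj, d
    if best is None or not passes_detection_condition(best):  # defined in the next sketch
        return None
    return best["states"][-1]
```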
  • the gesture determination module 703 is further configured such that the preset detection condition includes at least one of the following two conditions:
  • the number of gesture states in the current gesture trajectory is not less than a preset number; and
  • the addition time sequence corresponding to the last k gesture states added to the current gesture trajectory matches the shooting time sequence corresponding to the last k frames of images captured by the shooting device, where k is a positive integer.
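A minimal sketch of one possible combination of these two conditions follows; the field names, the preset count and the timestamp tolerance are assumptions, not values given by the patent.

```python
def passes_detection_condition(traj, frame_times=None, k=3, min_states=5, tol=1e-3):
    """Condition 1: the trajectory holds at least min_states gesture states.
    Condition 2 (checked when frame_times is given): the last k addition times
    line up with the capture times of the last k frames."""
    states = traj["states"]
    if len(states) < min_states:
        return False
    if frame_times is not None and len(states) >= k and len(frame_times) >= k:
        added = [s["time"] for s in states[-k:]]
        return all(abs(a - f) <= tol for a, f in zip(added, frame_times[-k:]))
    return True
```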
  • the device adjustment module 704 is also used to:
  • judge whether the target gesture type conforms to the specified gesture type, and, when it does, perform the step of obtaining the target working parameter of at least one device in the shooting composition system according to the target gesture state.
  • the device adjustment module 704 is also used to:
  • determine, based on the relative positional relationship between the subject and the gesture corresponding to the target gesture state in the current frame image, the target working parameters of at least one device in the shooting composition system.
  • In one embodiment, the shooting composition system further includes a control device, and the control device is used to control the movement of the shooting device; the target working parameters include at least one of the zoom parameter of the shooting device, the optical axis orientation of the shooting device, or the position of the shooting device.
  • the device adjustment module 704 is also used to:
  • determine a target working parameter of at least one device in the shooting composition system according to the difference between the first distance and the second distance.
  • In one embodiment, the shooting composition system further includes a control device, and the control device is used to control the movement of the shooting device; the target working parameters include at least one of the zoom parameter of the shooting device, the optical axis orientation of the shooting device, or the position of the shooting device.
  • the device adjustment module 704 is also used to:
  • obtain, when the target gesture type conforms to the preset gesture type, the preset working parameters corresponding to the preset gesture type, and use them as the target working parameters.
  • the preset working parameters include the preset zoom coefficient of the shooting device.
  • In one embodiment, the shooting composition system further includes a control device, and the control device is used to control the movement of the shooting device; accordingly, the preset working parameters also include at least one of the preset optical axis orientation of the shooting device or the preset position of the shooting device.
  • Each module in the above-mentioned shooting and composition device can be realized in whole or in part by software, hardware and combinations thereof.
  • Each of the above modules may be embedded in or independent of the processor of the computer device in the form of hardware, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be shown in Figure 8 .
  • the computer device includes a processor, a memory, and a network interface connected through a system bus, where the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes non-volatile storage media and internal memory.
  • the non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium.
  • the database of the computer device is used to store gesture trajectory data and subject state data.
  • the network interface of the computer device is used to communicate with external terminals through a network connection.
  • the computer program implements a shooting composition method when executed by the processor.
  • Figure 8 is only a block diagram of a partial structure related to the solution of the present application, and does not constitute a limitation on the computer equipment to which the solution of the present application is applied.
  • A specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
  • a computer device including a memory and a processor.
  • a computer program is stored in the memory.
  • when the processor executes the computer program, the steps in the above method embodiments are implemented.
  • a computer-readable storage medium on which a computer program is stored.
  • when the computer program is executed by a processor, the steps in the above method embodiments are implemented.
  • a computer program product including a computer program that implements the steps in each of the above method embodiments when executed by a processor.
  • the computer program can be stored in a non-volatile computer-readable storage medium.
  • when the computer program is executed, it may include the processes of the above method embodiments.
  • Any reference to memory, database or other media used in the embodiments provided in this application may include at least one of non-volatile and volatile memory.
  • Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc.
  • Volatile memory may include random access memory (Random Access Memory, RAM) or external cache memory, etc.
  • the databases involved in the various embodiments provided in this application may include at least one of a relational database and a non-relational database.
  • Non-relational databases may include blockchain-based distributed databases, etc., but are not limited thereto.
  • the processors involved in the various embodiments provided in this application may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to this.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present application relates to a photographing composition method and apparatus, and a computer device, a storage medium and a computer program product. The method comprises: acquiring the current frame of image, and acquiring a target subject state of a subject in the current frame of image, and the current gesture state; updating a historical gesture trajectory set on the basis of the current gesture state, so as to obtain the current gesture trajectory set; determining, from the current gesture trajectory set, the current gesture trajectory triggered by the subject and a target gesture state triggered by the subject; and acquiring target working parameters according to the target gesture state, and adjusting a corresponding device in a photographing composition system according to the target working parameters. By means of analyzing image content in real time, position and size information of a single/plurality of subjects are extracted, and a corresponding composition is generated on the basis of a composition mode; and by means of analyzing a gesture action of a photographed subject, an interactive composition function based on gesture control is realized, thereby realizing non-contact composition adjustment which can be flexibly set.

Description

拍摄构图方法、装置、计算机设备和存储介质Photography composition methods, devices, computer equipment and storage media 技术领域Technical field
本申请涉及图像处理技术领域,特别是涉及一种拍摄构图方法、装置、计算机设备、存储介质和计算机程序产品。The present application relates to the field of image processing technology, and in particular to a photographing and composition method, device, computer equipment, storage medium and computer program product.
背景技术Background technique
自动构图技术可根据构图操作实施阶段分为两个大类:一次构图,即在拍摄行为前,分析预览画面图像内容,得到当前构图与目标构图的偏差信息,进而输出相应的构图提示。二次构图,即在拍摄行为结束后,由软件自动分析所拍摄的画面内容,确认构图方式并实施相应的裁剪、缩放等操作,最后输出二次构图后的图像/视频。在相关技术中,主要是由用户手动调整拍摄设备位置与角度实现一次构图,效率较低且手动调整不够精准。Automatic composition technology can be divided into two categories according to the implementation stage of the composition operation: one-time composition, that is, before the shooting action, the image content of the preview screen is analyzed to obtain the deviation information between the current composition and the target composition, and then the corresponding composition prompts are output. Secondary composition means that after the shooting is completed, the software automatically analyzes the content of the shot, confirms the composition method and performs corresponding cropping, zooming and other operations, and finally outputs the image/video after the second composition. In related technologies, the user mainly manually adjusts the position and angle of the shooting device to achieve one-time composition, which is inefficient and manual adjustment is not accurate enough.
发明内容Contents of the invention
基于此,有必要针对上述技术问题,提供一种能够快捷无接触的拍摄构图方法、装置、计算机设备、计算机可读存储介质和计算机程序产品。Based on this, it is necessary to provide a quick and contactless shooting composition method, device, computer equipment, computer-readable storage medium and computer program product to address the above technical problems.
第一方面,本申请提供了一种拍摄构图方法。所述方法包括:In a first aspect, this application provides a photographing composition method. The methods include:
获取通过拍摄设备拍摄得到的当前帧图像,获取当前帧图像中主体物的目标主体物状态以及至少一个手势的当前手势状态;Obtain the current frame image captured by the shooting device, obtain the target subject state of the subject in the current frame image and the current gesture state of at least one gesture;
获取历史手势轨迹集合,基于至少一个手势的当前手势状态对历史手势轨迹集合进行更新,获得当前手势轨迹集合;Obtain a collection of historical gesture trajectories, update the collection of historical gesture trajectories based on the current gesture state of at least one gesture, and obtain a collection of current gesture trajectories;
根据目标主体物状态,在当前手势轨迹集合中确定由主体物所触发的当前手势轨迹,并在当前手势轨迹中,确定由主体物触发的目标手势状态;According to the state of the target subject, determine the current gesture trajectory triggered by the subject in the current gesture trajectory set, and determine the target gesture state triggered by the subject in the current gesture trajectory;
根据目标手势状态,获取拍摄构图系统中至少一项设备的目标工作参数,并根据目标工作参数对拍摄构图系统中相应的设备进行调节。According to the target gesture state, the target working parameter of at least one device in the shooting composition system is obtained, and the corresponding device in the shooting composition system is adjusted according to the target working parameter.
在其中一个实施例中,当前帧图像从属于图像帧组,图像帧组中的图像是按照拍摄设备的拍摄时序进行排序的,当前帧图像为最后一帧图像;获取当前帧图像中主体物的目标主体物状态,包括:In one embodiment, the current frame image belongs to the image frame group, the images in the image frame group are sorted according to the shooting sequence of the shooting device, and the current frame image is the last frame image; the image of the main object in the current frame image is obtained. Target subject status, including:
获取图像帧组中第一帧图像中主体物的历史主体物状态; Obtain the historical subject status of the subject in the first frame of the image frame group;
基于图像帧组,对主体物进行目标跟踪,获得当前帧图像中主体物的预测主体物状态;Based on the image frame group, perform target tracking on the subject and obtain the predicted subject state of the subject in the current frame image;
对历史主体物状态与预测主体物状态进行整合,获得当前帧图像中主体物的目标主体物状态。The historical subject state and the predicted subject state are integrated to obtain the target subject state of the subject in the current frame image.
在其中一个实施例中,基于至少一个手势的当前手势状态对历史手势轨迹集合进行更新,获得当前手势轨迹集合,包括:In one embodiment, updating the historical gesture trajectory set based on the current gesture state of at least one gesture to obtain the current gesture trajectory set includes:
将每一手势的当前手势状态与历史手势轨迹集合中每一历史手势轨迹进行匹配,确定相互匹配的当前手势状态与历史手势轨迹;Match the current gesture state of each gesture with each historical gesture trajectory in the historical gesture trajectory collection, and determine the matching current gesture state and historical gesture trajectory;
将每一存在相匹配的历史手势轨迹的当前手势状态添加至相匹配的历史手势轨迹,获得当前手势轨迹集合。Add the current gesture state of each matching historical gesture track to the matching historical gesture track to obtain a current gesture track set.
在其中一个实施例中,将每一手势的当前手势状态与历史手势轨迹集合中每一历史手势轨迹进行匹配之后,还包括:In one embodiment, after matching the current gesture state of each gesture with each historical gesture trajectory in the historical gesture trajectory collection, the method further includes:
在存在与每一历史手势轨迹均不匹配的当前手势状态的情况下,基于与每一历史手势轨迹均不匹配的当前手势状态,新建手势轨迹,并添加至当前手势轨迹集合。In the case where there is a current gesture state that does not match each historical gesture trajectory, a new gesture trajectory is created based on the current gesture state that does not match each historical gesture trajectory and is added to the current gesture trajectory collection.
在其中一个实施例中,当前手势轨迹集合中还包括历史手势轨迹集合中未添加当前手势状态的历史手势轨迹;根据目标主体物状态,在当前手势轨迹集合中确定由主体物所触发的当前手势轨迹之前,还包括:In one embodiment, the current gesture trajectory set also includes historical gesture trajectories in which the current gesture state is not added to the historical gesture trajectory set; according to the state of the target subject, the current gesture triggered by the subject is determined in the current gesture trajectory set Before the trajectory, also included:
对于当前手势轨迹集合中每一手势轨迹,获取每一手势轨迹中最后添加的手势状态所对应的添加时刻;For each gesture trajectory in the current gesture trajectory collection, obtain the addition moment corresponding to the last gesture state added in each gesture trajectory;
计算每一添加时刻与当前帧图像的获取时刻之间的时间间隔,将大于预设时长的时间间隔所对应的手势轨迹从当前手势轨迹集合中删除。Calculate the time interval between each addition moment and the acquisition moment of the current frame image, and delete the gesture trajectories corresponding to the time interval greater than the preset time from the current gesture trajectory set.
在其中一个实施例中,目标主体物状态包括主体物的位置;根据目标主体物状态,在当前手势轨迹集合中确定由主体物所触发的当前手势轨迹,并在当前手势轨迹中,确定由主体物触发的目标手势状态,包括:In one embodiment, the target subject state includes the position of the subject; according to the target subject state, the current gesture trajectory triggered by the subject is determined in the current gesture trajectory set, and in the current gesture trajectory, the current gesture trajectory triggered by the subject is determined. Target gesture states triggered by objects, including:
对于当前手势轨迹集合中每一手势轨迹,确定每一手势轨迹中最后添加的手势状态,手势状态包括手势位置;For each gesture trajectory in the current gesture trajectory set, determine the last gesture state added in each gesture trajectory, where the gesture state includes the gesture position;
根据主体物的位置分别与每一最后添加的手势状态中的手势位置之间的距离,对当前手势轨迹集合进行筛选,获得当前手势轨迹;According to the distance between the position of the subject and the gesture position in each last added gesture state, filter the current gesture trajectory set to obtain the current gesture trajectory;
在当前手势轨迹满足预设检测条件的情况下,将当前手势轨迹中最后添加的手势状态作为目标手势状态。When the current gesture trajectory meets the preset detection conditions, the last gesture state added in the current gesture trajectory is used as the target gesture state.
在其中一个实施例中,预设检测条件包括以下两个条件中的至少一项,以下两个条件分别为当前手势轨迹中手势状态的数量不小于预设数量,以及,当前手势轨迹中最后添加的k个手势状态所对应的添加时 刻序列与拍摄设备最后拍摄得到的k帧图像所对应的拍摄时刻序列相匹配;其中,k为正整数。In one embodiment, the preset detection condition includes at least one of the following two conditions. The following two conditions are that the number of gesture states in the current gesture trajectory is not less than the preset number, and that the last addition in the current gesture trajectory is When adding k gesture states corresponding to The moment sequence matches the shooting moment sequence corresponding to the last k frames of images captured by the shooting device; where k is a positive integer.
在其中一个实施例中,目标手势状态包括目标手势类型;根据目标手势状态,获取拍摄构图系统中至少一项设备的目标工作参数之前,还包括:In one embodiment, the target gesture state includes the target gesture type; before obtaining the target working parameter of at least one device in the shooting composition system according to the target gesture state, the method further includes:
判断目标手势类型是否符合指定手势类型;Determine whether the target gesture type matches the specified gesture type;
在目标手势类型符合指定手势类型的情况下,则执行根据目标手势状态,获取拍摄构图系统中至少一项设备的目标工作参数的步骤。When the target gesture type matches the specified gesture type, the step of obtaining the target working parameter of at least one device in the shooting composition system is performed according to the target gesture state.
在其中一个实施例中,目标主体物状态包括主体物的位置,目标手势状态包括目标手势位置;根据目标手势状态,获取拍摄构图系统中至少一项设备的目标工作参数,包括:In one embodiment, the target subject state includes the position of the subject, and the target gesture state includes the target gesture position; according to the target gesture state, the target working parameter of at least one device in the shooting composition system is obtained, including:
基于主体物的位置与目标手势位置,确定当前帧图像中主体物与目标手势状态对应的手势之间的相对位置关系;Based on the position of the subject and the position of the target gesture, determine the relative positional relationship between the gesture corresponding to the subject and the target gesture state in the current frame image;
根据相对位置关系,确定拍摄构图系统中至少一项设备的目标工作参数。According to the relative position relationship, the target working parameters of at least one piece of equipment in the shooting composition system are determined.
在其中一个实施例中,拍摄构图系统还包括操控设备,操控设备用于控制拍摄设备进行运动;目标工作参数包括所述拍摄设备的变焦参数或者拍摄设备的光轴朝向或者拍摄设备的所处位置中的至少一项。In one embodiment, the shooting composition system also includes a control device, which is used to control the movement of the shooting device; the target working parameters include the zoom parameter of the shooting device or the optical axis orientation of the shooting device or the location of the shooting device at least one of them.
在其中一个实施例中,目标主体物状态包括主体物的位置,目标手势状态包括目标手势位置;根据目标手势状态,获取拍摄构图系统中至少一项设备的目标工作参数,包括:In one embodiment, the target subject state includes the position of the subject, and the target gesture state includes the target gesture position; according to the target gesture state, the target working parameter of at least one device in the shooting composition system is obtained, including:
获取拍摄设备所拍摄到的历史图像中主体物的历史主体物状态和由主体物触发的历史手势状态;Obtain the historical subject state of the subject in the historical image captured by the shooting device and the historical gesture state triggered by the subject;
基于主体物的位置与目标手势位置,计算当前帧图像中主体物与目标手势状态对应的手势之间的第一距离;Based on the position of the subject and the target gesture position, calculate the first distance between the subject and the gesture corresponding to the target gesture state in the current frame image;
基于历史主体物状态中的主体物的历史位置与历史手势状态中的历史手势位置,计算历史图像中主体物与历史手势状态对应的手势之间的第二距离;Based on the historical position of the subject in the historical subject state and the historical gesture position in the historical gesture state, calculate the second distance between the subject in the historical image and the gesture corresponding to the historical gesture state;
根据第一距离与第二距离之间的差值,确定拍摄构图系统中至少一项设备的目标工作参数。According to the difference between the first distance and the second distance, a target working parameter of at least one piece of equipment in the shooting composition system is determined.
在其中一个实施例中,拍摄构图系统还包括操控设备,操控设备用于控制拍摄设备进行运动;目标工作参数包括所述拍摄设备的变焦参数或者拍摄设备的光轴朝向或者拍摄设备的所处位置中的至少一项。In one embodiment, the shooting composition system also includes a control device, which is used to control the movement of the shooting device; the target working parameters include the zoom parameter of the shooting device or the optical axis orientation of the shooting device or the location of the shooting device at least one of them.
在其中一个实施例中,目标手势状态包括目标手势类型;根据目标手势状态,获取拍摄构图系统中至少一项设备的目标工作参数,包括:In one embodiment, the target gesture state includes the target gesture type; according to the target gesture state, the target working parameters of at least one device in the shooting composition system are obtained, including:
在目标手势类型符合预设手势类型的情况下,获取预设手势类型相应的预设工作参数,并作为目标工作参数,预设工作参数包括拍摄设备的预设变焦系数。When the target gesture type matches the preset gesture type, the preset working parameters corresponding to the preset gesture type are obtained and used as the target working parameters. The preset working parameters include the preset zoom coefficient of the shooting device.
在其中一个实施例中,拍摄构图系统还包括操控设备,操控设备用于控制拍摄设备进行运动;相应地, 预设工作参数还包括拍摄设备的预设光轴朝向或者拍摄设备的预设所处位置中的至少一项。In one of the embodiments, the shooting composition system also includes a control device, and the control device is used to control the movement of the shooting device; accordingly, The preset working parameters also include at least one of the preset optical axis orientation of the shooting device or the preset position of the shooting device.
第二方面,本申请还提供了一种拍摄构图装置。装置包括:In a second aspect, this application also provides a shooting composition device. Devices include:
数据获取模块,用于获取通过拍摄设备拍摄得到的当前帧图像,获取当前帧图像中主体物的目标主体物状态以及至少一个手势的当前手势状态;A data acquisition module, used to acquire the current frame image captured by the shooting device, acquire the target subject state of the subject in the current frame image and the current gesture state of at least one gesture;
手势更新模块,用于获取历史手势轨迹集合,基于至少一个手势的当前手势状态对历史手势轨迹集合进行更新,获得当前手势轨迹集合;The gesture update module is used to obtain the historical gesture trajectory collection, update the historical gesture trajectory collection based on the current gesture state of at least one gesture, and obtain the current gesture trajectory collection;
手势确定模块,用于根据目标主体物状态,在当前手势轨迹集合中确定由主体物所触发的当前手势轨迹,并在当前手势轨迹中,确定由主体物触发的目标手势状态;The gesture determination module is used to determine the current gesture trajectory triggered by the subject object in the current gesture trajectory set according to the state of the target subject object, and determine the target gesture state triggered by the subject object in the current gesture trajectory;
设备调整模块,用于根据目标手势状态,获取拍摄构图系统中至少一项设备的目标工作参数,并根据目标工作参数对拍摄构图系统中相应的设备进行调节。The equipment adjustment module is used to obtain the target working parameters of at least one device in the shooting composition system according to the target gesture state, and adjust the corresponding equipment in the shooting composition system according to the target working parameters.
第三方面,本申请还提供了一种计算机设备。所述计算机设备包括存储器和处理器,所述存储器存储有计算机程序,所述处理器执行所述计算机程序时实现以下步骤:In a third aspect, this application also provides a computer device. The computer device includes a memory and a processor, the memory stores a computer program, and the processor implements the following steps when executing the computer program:
获取通过拍摄设备拍摄得到的当前帧图像,获取当前帧图像中主体物的目标主体物状态以及至少一个手势的当前手势状态;Obtain the current frame image captured by the shooting device, obtain the target subject state of the subject in the current frame image and the current gesture state of at least one gesture;
获取历史手势轨迹集合,基于至少一个手势的当前手势状态对历史手势轨迹集合进行更新,获得当前手势轨迹集合;Obtain a collection of historical gesture trajectories, update the collection of historical gesture trajectories based on the current gesture state of at least one gesture, and obtain a collection of current gesture trajectories;
根据目标主体物状态,在当前手势轨迹集合中确定由主体物所触发的当前手势轨迹,并在当前手势轨迹中,确定由主体物触发的目标手势状态;According to the state of the target subject, determine the current gesture trajectory triggered by the subject in the current gesture trajectory set, and determine the target gesture state triggered by the subject in the current gesture trajectory;
根据目标手势状态,获取拍摄构图系统中至少一项设备的目标工作参数,并根据目标工作参数对拍摄构图系统中相应的设备进行调节。According to the target gesture state, the target working parameter of at least one device in the shooting composition system is obtained, and the corresponding device in the shooting composition system is adjusted according to the target working parameter.
第四方面,本申请还提供了一种计算机可读存储介质。所述计算机可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时实现以下步骤:In a fourth aspect, this application also provides a computer-readable storage medium. The computer-readable storage medium has a computer program stored thereon, and when the computer program is executed by the processor, the following steps are implemented:
获取通过拍摄设备拍摄得到的当前帧图像,获取当前帧图像中主体物的目标主体物状态以及至少一个手势的当前手势状态;Obtain the current frame image captured by the shooting device, obtain the target subject state of the subject in the current frame image and the current gesture state of at least one gesture;
获取历史手势轨迹集合,基于至少一个手势的当前手势状态对历史手势轨迹集合进行更新,获得当前手势轨迹集合;Obtain a collection of historical gesture trajectories, update the collection of historical gesture trajectories based on the current gesture state of at least one gesture, and obtain a collection of current gesture trajectories;
根据目标主体物状态,在当前手势轨迹集合中确定由主体物所触发的当前手势轨迹,并在当前手势轨迹中,确定由主体物触发的目标手势状态; According to the state of the target subject, determine the current gesture trajectory triggered by the subject in the current gesture trajectory set, and determine the target gesture state triggered by the subject in the current gesture trajectory;
根据目标手势状态,获取拍摄构图系统中至少一项设备的目标工作参数,并根据目标工作参数对拍摄构图系统中相应的设备进行调节。According to the target gesture state, the target working parameter of at least one device in the shooting composition system is obtained, and the corresponding device in the shooting composition system is adjusted according to the target working parameter.
第五方面,本申请还提供了一种计算机程序产品。所述计算机程序产品,包括计算机程序,该计算机程序被处理器执行时实现以下步骤:In a fifth aspect, this application also provides a computer program product. The computer program product includes a computer program that implements the following steps when executed by a processor:
获取通过拍摄设备拍摄得到的当前帧图像,获取当前帧图像中主体物的目标主体物状态以及至少一个手势的当前手势状态;Obtain the current frame image captured by the shooting device, obtain the target subject state of the subject in the current frame image and the current gesture state of at least one gesture;
获取历史手势轨迹集合,基于至少一个手势的当前手势状态对历史手势轨迹集合进行更新,获得当前手势轨迹集合;Obtain a collection of historical gesture trajectories, update the collection of historical gesture trajectories based on the current gesture state of at least one gesture, and obtain a collection of current gesture trajectories;
根据目标主体物状态,在当前手势轨迹集合中确定由主体物所触发的当前手势轨迹,并在当前手势轨迹中,确定由主体物触发的目标手势状态;According to the state of the target subject, determine the current gesture trajectory triggered by the subject in the current gesture trajectory set, and determine the target gesture state triggered by the subject in the current gesture trajectory;
根据目标手势状态,获取拍摄构图系统中至少一项设备的目标工作参数,并根据目标工作参数对拍摄构图系统中相应的设备进行调节。According to the target gesture state, the target working parameter of at least one device in the shooting composition system is obtained, and the corresponding device in the shooting composition system is adjusted according to the target working parameter.
上述拍摄构图方法、装置、计算机设备、存储介质和计算机程序产品,获取通过拍摄设备拍摄得到的当前帧图像,获取当前帧图像中主体物的目标主体物状态以及至少一个手势的当前手势状态;获取历史手势轨迹集合,基于至少一个手势的当前手势状态对历史手势轨迹集合进行更新,获得当前手势轨迹集合;根据目标主体物状态,在当前手势轨迹集合中确定由主体物所触发的当前手势轨迹,并在当前手势轨迹中,确定由主体物触发的目标手势状态;根据目标手势状态,获取拍摄构图系统中至少一项设备的目标工作参数,并根据目标工作参数对拍摄构图系统中相应的设备进行调节。通过实时分析图像内容,提取单个/多个主体物的位置和大小信息,基于构图模式生成相应的构图,并通过对被摄主体物的手势动作进行分析,实现基于手势控制的交互式构图功能,从而实现非接触式的、可灵活设置的构图调整。The above-mentioned shooting and composition methods, devices, computer equipment, storage media and computer program products acquire the current frame image captured by the shooting device, acquire the target subject state of the subject in the current frame image and the current gesture state of at least one gesture; obtain The historical gesture trajectory collection updates the historical gesture trajectory collection based on the current gesture state of at least one gesture to obtain the current gesture trajectory collection; determines the current gesture trajectory triggered by the subject object in the current gesture trajectory collection according to the target subject state, And in the current gesture trajectory, the target gesture state triggered by the subject object is determined; according to the target gesture state, the target working parameters of at least one device in the shooting composition system are obtained, and the corresponding equipment in the shooting composition system is processed according to the target working parameters. adjust. By analyzing the image content in real time, extracting the position and size information of single/multiple subjects, generating the corresponding composition based on the composition mode, and by analyzing the gesture movements of the subjects, the interactive composition function based on gesture control is realized. This enables non-contact, flexibly set composition adjustment.
附图说明Description of drawings
图1为一个实施例中拍摄构图方法的应用环境图;Figure 1 is an application environment diagram of the shooting composition method in one embodiment;
图2为一个实施例中拍摄构图方法的流程示意图;Figure 2 is a schematic flowchart of a shooting composition method in one embodiment;
图3为一个实施例中获取主体物的目标主体物状态的流程示意图;Figure 3 is a schematic flowchart of obtaining the target subject status of the subject in one embodiment;
图4为一个实施例中确定目标手势状态的流程示意图;Figure 4 is a schematic flowchart of determining the target gesture state in one embodiment;
图5为另一个实施例中拍摄构图方法的流程示意图;Figure 5 is a schematic flowchart of a shooting composition method in another embodiment;
图6为又一个实施例中拍摄构图方法的流程示意图; Figure 6 is a schematic flowchart of a shooting composition method in yet another embodiment;
图7为一个实施例中拍摄构图装置的结构框图;Figure 7 is a structural block diagram of a shooting and composition device in one embodiment;
图8为一个实施例中计算机设备的内部结构图。Figure 8 is an internal structure diagram of a computer device in one embodiment.
具体实施方式Detailed ways
为了使本申请的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行进一步详细说明。应当理解,此处描述的具体实施例仅仅用以解释本申请,并不用于限定本申请。In order to make the purpose, technical solutions and advantages of the present application more clear, the present application will be further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not used to limit the present application.
相关技术中的自动构图技术,可根据构图操作实施阶段分为两大类,其中,一次构图是指在拍摄行为之前,分析雨来画面图像内容,得到当前构图与目标构图的偏差信息,进而输出相应的构图提示,有用户手动调整拍摄设备位置与角度,达到制定位置后触发相应的拍摄操作。或者拍摄设备根据构图偏差信息,自行控制相应的执行机构达到指定构图状态,实现拍摄操作。二次构图即在拍摄行为结束后,由软件自动分析所拍摄的画面内容,确认构图方式并实施相应的裁剪、缩放等操作,最后输出二次构图后的图像/视频。针对一次构图的方案,由于是用户手动调整拍摄位置与角度,从而效率较低且手动调整不够精准。Automatic composition technology in related technologies can be divided into two categories according to the implementation stage of the composition operation. Among them, one-time composition refers to analyzing the image content of the rain screen before the shooting action, obtaining the deviation information between the current composition and the target composition, and then outputting According to the corresponding composition prompt, some users manually adjust the position and angle of the shooting equipment, and trigger the corresponding shooting operation after reaching the specified position. Or the shooting equipment can control the corresponding actuator to reach the specified composition state by itself based on the composition deviation information to realize the shooting operation. Secondary composition means that after the shooting is completed, the software automatically analyzes the content of the shot, confirms the composition method and implements corresponding cropping, zooming and other operations, and finally outputs the image/video after the second composition. For the one-time composition solution, since the user manually adjusts the shooting position and angle, the efficiency is low and the manual adjustment is not accurate enough.
本申请实施例提供的拍摄构图方法,应用于实时视频拍摄场景,具体可以应用于如图1所示的应用环境中。其中,终端102通过网络与服务器104进行通信,具体可将实时拍摄到的视频流或获取的静止图像传输至服务器104,由服务器对视频流或静止图像中的主体物进行分析,进而对拍摄设备进行调整。数据存储系统可以存储服务器104需要处理的数据。数据存储系统可以集成在服务器104上,也可以放在云上或其他网络服务器上。The shooting composition method provided by the embodiment of the present application is applied to real-time video shooting scenarios, and can be specifically applied to the application environment as shown in Figure 1. Among them, the terminal 102 communicates with the server 104 through the network. Specifically, the video stream captured in real time or the still image obtained can be transmitted to the server 104, and the server analyzes the main object in the video stream or still image, and then analyzes the shooting device. Make adjustments. The data storage system may store data that server 104 needs to process. The data storage system can be integrated on the server 104, or placed on the cloud or other network servers.
其中,终端102为能够获取视频流或图像的设备,可以包括但不限于具有摄像头的各种个人计算机、笔记本电脑、智能手机、平板电脑及物流网设备等。服务器104服务器可以是独立的物理服务器,也可以是多个物理服务器构成的服务器集群或者分布式系统,还可以是提供云计算服务的云服务器。The terminal 102 is a device capable of acquiring video streams or images, and may include but is not limited to various personal computers with cameras, laptops, smart phones, tablets, logistics network equipment, etc. Server 104 The server may be an independent physical server, a server cluster or a distributed system composed of multiple physical servers, or a cloud server that provides cloud computing services.
在一些实施例中,如图2所示,本申请实施例提供了一种拍摄构图方法。可以理解的是,对于视频流中的当前帧图像,在当前帧图像中实际上可能会检测到多个主体物。而本申请实施例的主要构思是利用其中目标主体物所触发的手势,以自动调整拍摄构图。可以理解的是,这些主体物中只有目标主体物所触发的手势状态对于自动调整拍摄构图是有意义的。由此,可以对主体物对应的手势状态进行分析,进而对拍摄设备进行无接触式构图调整。以该方法应用于计算机设备(该计算机设备具体可以是图1中的终端或服务器)为例进行说明,包括以下步骤:In some embodiments, as shown in Figure 2, embodiments of the present application provide a shooting composition method. It can be understood that, for the current frame image in the video stream, multiple subject objects may actually be detected in the current frame image. The main idea of the embodiment of the present application is to use gestures triggered by the target subject to automatically adjust the shooting composition. It is understandable that among these subjects, only the gesture state triggered by the target subject is meaningful for automatically adjusting the shooting composition. In this way, the gesture status corresponding to the subject can be analyzed, and then the shooting device can be adjusted without contact. Taking this method applied to a computer device (the computer device may be a terminal or a server in Figure 1) as an example, the description includes the following steps:
步骤202,获取通过拍摄设备拍摄得到的当前帧图像,获取当前帧图像中主体物的目标主体物状态以及至少一个手势的当前手势状态; Step 202: Obtain the current frame image captured by the shooting device, obtain the target subject state of the subject in the current frame image and the current gesture state of at least one gesture;
其中,拍摄设备可以为相机,也可以为带有拍摄功能的移动终端,本申请实施例对此不作具体限定。当前帧图像可以是由拍摄设备在当前时刻拍摄得到的图像,也可以是由拍摄设备拍摄得到的视频流中当前时刻的图像帧,本申请实施例对此不作具体限定。需要说明的是,拍摄设备无论拍摄的是视频流,还是图像帧,在本申请实施例中均主要是为了获取由主体物触发的,且针对拍摄构图有意义的目标手势状态。而可以理解的是,针对拍摄构图有意义的目标手势状态,仅通过一帧图像可能不一定能够识别并获取到。因此,在申请实施例才会拍摄多帧图像,并在拍摄得到“当前帧图像”的当前时刻,结合之前拍摄得到的图像,获取由主体物触发的目标手势状态。The photographing device may be a camera or a mobile terminal with a photographing function, which is not specifically limited in the embodiments of the present application. The current frame image may be an image captured by the shooting device at the current moment, or may be an image frame at the current moment in the video stream captured by the shooting device. This is not specifically limited in the embodiment of the present application. It should be noted that, regardless of whether the shooting device captures a video stream or an image frame, in the embodiment of the present application, it is mainly to obtain a target gesture state that is triggered by the subject object and is meaningful for the shooting composition. It is understandable that the target gesture state that is meaningful for the shooting composition may not necessarily be recognized and obtained through only one frame of image. Therefore, only in the application embodiment, multiple frames of images are captured, and at the current moment when the "current frame image" is captured, the target gesture state triggered by the subject object is obtained by combining the previously captured images.
由此可以理解的是,当前帧图像的上一帧图像,同样也是采用本申请实施例提供的方法进行处理。至于当前帧图像与上一帧图像之间的关系,在拍摄设备拍摄图像帧的情况下,实际实施过程中,当前帧图像与上一帧图像可以是拍摄设备连续获取到的两帧图像,当前帧图像与上一帧图像之间也可以间隔预设帧图像,本申请实施例对此不作具体限定。在拍摄设备拍摄实时视频流的情况下,当前帧图像可以是在当前时刻从实时视频流中截取得到的,而上一帧图像与当前帧图像之间,可以是连续的,也可以间隔多个图像帧,本申请实施例对此不作具体限定。为了便于理解和说明,本申请实施例以从当前帧图像是从实时视频流中截取到的为例,对后续过程进行说明。It can be understood from this that the previous frame image of the current frame image is also processed using the method provided by the embodiment of the present application. As for the relationship between the current frame image and the previous frame image, when the shooting device captures the image frame, in the actual implementation process, the current frame image and the previous frame image can be two frames of images continuously acquired by the shooting device. A preset frame image may also be spaced between a frame image and the previous frame image, which is not specifically limited in the embodiments of the present application. When the shooting device captures a real-time video stream, the current frame image can be intercepted from the real-time video stream at the current moment, and the previous frame image and the current frame image can be continuous or separated by multiple Image frames are not specifically limited in the embodiments of this application. In order to facilitate understanding and explanation, the embodiment of the present application takes the current frame image intercepted from the real-time video stream as an example to describe the subsequent process.
主体物指的是可以触发手势的对象,在本申请实施例中,该对象可以指的是人。可以理解的是,当前帧图像中拍摄到的可能不止一个主体物,如不止一个人。这些主体物可能都触发有手势,但通常只有一个主体物可以触发与拍摄构图有关联的手势状态,而该主体物即可为目标主体物,在本步骤中提及的“目标主体物状态”主要指的是目标主体物的状态,由该目标主体物触发的手势状态即为目标手势状态。另外,以主体物为人为例,主体物在当前帧图像所呈现出的图像内容,可以是由拍摄设备拍摄得到的人的头部,人的上半身或者人的全身等,本申请实施例对此不作具体限定。The subject object refers to an object that can trigger the gesture. In the embodiment of the present application, the object may refer to a person. It is understandable that there may be more than one subject captured in the current frame image, such as more than one person. These subjects may all trigger gestures, but usually only one subject can trigger a gesture state related to the shooting composition, and this subject can be the target subject. The "target subject state" mentioned in this step It mainly refers to the state of the target subject. The gesture state triggered by the target subject is the target gesture state. In addition, taking the subject as a person as an example, the image content presented by the subject in the current frame image may be the person's head, the person's upper body, or the person's whole body captured by the shooting device. The embodiments of the present application are No specific limitation is made.
其中,目标主体物状态主要是用于表示主体物在当前帧图像中所呈现的图像内容的状态,其可以包括在当前帧图像中占据的范围或者位置中的至少一种。具体地,目标主体物状态可以包括主体物在当前帧图像中的外接框的位置和尺寸大小,并可以通过人物检测算法对当前帧图像进行检测获取得到,本申请实施例对此不作具体限定。需要说明的是,外接框的位置可以是外接框的左上角坐标,也可以是外接框的中心点坐标,本申请实施例对此不作具体限定。The target subject state is mainly used to represent the state of the image content presented by the subject in the current frame image, which may include at least one of the range or position occupied by the subject in the current frame image. Specifically, the target subject state may include the position and size of the subject's bounding box in the current frame image, and may be obtained by detecting the current frame image using a person detection algorithm, which is not specifically limited in the embodiments of this application. It should be noted that the position of the external frame may be the coordinates of the upper left corner of the external frame or the coordinates of the center point of the external frame, which is not specifically limited in the embodiment of the present application.
当前手势状态主要是用于指代主体物的手部在当前帧图像中所呈现的图像内容的状态,其同样可以包括在当前帧图像中占据的范围或者位置中的至少一种。具体地,当前手势状态可以包括手势的位置、尺寸大小和手势类别,并可以通过手势检测算法对当前帧图像进行检测获取得到。其中,手势的位置和尺寸大 小也可以通过外接框进行表示。与此同时,外接框的位置也可以是外接框的左上角坐标或者外接框的中心点坐标。需要说明的是,“当前手势状态”中的“当前”,主要强调的是在当前帧图像中获取到的手势状态,而“至少一个手势”中的“至少一个”,主要是因为主体物可能会不止一个,由此产生的当前手势状态可能也不止一个。The current gesture state is mainly used to refer to the state of the image content presented by the subject's hand in the current frame image, which may also include at least one of the range or position occupied in the current frame image. Specifically, the current gesture status may include the position, size, and gesture category of the gesture, and may be obtained by detecting the current frame image through a gesture detection algorithm. Among them, the position and size of the gesture are large Small can also be represented by an external frame. At the same time, the position of the bounding box can also be the coordinates of the upper left corner of the bounding box or the coordinates of the center point of the bounding box. It should be noted that the "current" in "current gesture state" mainly emphasizes the gesture state obtained in the current frame image, while the "at least one" in "at least one gesture" is mainly because the subject may There will be more than one, and the resulting current gesture state may also be more than one.
可以理解的是,本申请实施例提供的拍摄构图方法需同时用到主体物和手势状态。由此,若在当前帧图像未检测到手势状态,则继续沿用本申请实施例提供的方法处理当前帧图像的下一帧图像。It can be understood that the shooting composition method provided by the embodiment of the present application needs to use both the subject object and the gesture state. Therefore, if the gesture state is not detected in the current frame image, the method provided by the embodiment of the present application will continue to be used to process the next frame image of the current frame image.
步骤204,获取历史手势轨迹集合,基于至少一个手势的当前手势状态对历史手势轨迹集合进行更新,获得当前手势轨迹集合;Step 204: Obtain a historical gesture trajectory set, update the historical gesture trajectory set based on the current gesture state of at least one gesture, and obtain the current gesture trajectory set;
其中,手势轨迹是指一段时间内,按获取先后顺序所记录的一系列手势状态的集合,手势轨迹集合即为多个手势轨迹所组成的集合。之所以会产生多个手势轨迹,是因为图像帧中可能会检测到不止一个手势状态。历史手势轨迹集合是指基于当前帧图像之前的图像所确定的手势轨迹集合,而当前手势轨迹集合是基于当前帧图像对历史手势轨迹集合进行更新所得到的。Among them, the gesture trajectory refers to a collection of a series of gesture states recorded in the order of acquisition within a period of time. The gesture trajectory collection is a collection of multiple gesture trajectories. Multiple gesture trajectories are generated because more than one gesture state may be detected in an image frame. The historical gesture trajectory set refers to the gesture trajectory set determined based on the image before the current frame image, and the current gesture trajectory set is obtained by updating the historical gesture trajectory set based on the current frame image.
例如,以手势状态包括位置a、尺寸大小b和类别c,则用R=(a,b,c)表示手势状态。而将当前帧图像记为第t帧图像,在当前帧图像中获取到的当前手势状态为Rti,i=1,…,n。其中,i表示第i个手势,n表示手势总数量。第i个手势的历史手势轨迹可表示为T={R1i,…,Rt-1i}。而历史手势轨迹集合可表示为{Tj},j=1,…,m。其中,m表示历史手势轨迹集合中手势轨迹的数量,j表示历史手势轨迹集合中第j个历史手势轨迹。For example, if the gesture state includes position a, size b, and category c, then R=(a, b, c) is used to represent the gesture state. The current frame image is recorded as the t-th frame image, and the current gesture state obtained in the current frame image is R ti , i=1,...,n. Among them, i represents the i-th gesture, and n represents the total number of gestures. The historical gesture trajectory of the i-th gesture can be expressed as T={R 1i ,...,R t-1i }. The historical gesture trajectory set can be expressed as {T j }, j=1,...,m. Among them, m represents the number of gesture trajectories in the historical gesture trajectory collection, and j represents the jth historical gesture trajectory in the historical gesture trajectory collection.
本步骤中“基于所述至少一个手势的当前手势状态对所述历史手势轨迹集合进行更新”主要是将当前手势状态添加至历史手势轨迹或者形成新的手势轨迹。具体地,可以将当前手势状态与历史轨迹集合中的每一手势轨迹进行匹配,若匹配成功,则将当前手势状态加入匹配成功的手势轨迹中,若匹配失败,则可以根据当前手势状态建立新的手势轨迹。在当前手势状态与手势轨迹进行匹配时,可以采用手势状态中的手势位置进行匹配,如计算当前手势状态中的手势位置与历史手势轨迹集合中的每一手势轨迹中最后记录时刻的手势状态中的手势位置之间的距离,若所有距离中的最小值满足预设距离,则将最小距离对应的手势轨迹作为当前手势状态匹配成功的手势轨迹。再例如,可以根据当前手势状态中的手势类别进行匹配,本申请实施例对此不作具体限定。In this step, "updating the historical gesture trajectory set based on the current gesture state of the at least one gesture" mainly means adding the current gesture state to the historical gesture trajectory or forming a new gesture trajectory. Specifically, the current gesture state can be matched with each gesture track in the historical track set. If the match is successful, the current gesture state is added to the successfully matched gesture track. If the match fails, a new gesture track can be created based on the current gesture state. gesture trajectories. When matching the current gesture state with the gesture track, the gesture position in the gesture state can be used for matching, such as calculating the gesture position in the current gesture state and the gesture state at the last recorded moment of each gesture track in the historical gesture track set. The distance between the gesture positions. If the minimum value among all distances meets the preset distance, the gesture trajectory corresponding to the minimum distance will be regarded as the gesture trajectory that successfully matches the current gesture state. For another example, matching may be performed based on the gesture category in the current gesture state, which is not specifically limited in the embodiments of the present application.
步骤206,根据目标主体物状态,在当前手势轨迹集合中确定由主体物所触发的当前手势轨迹,并在当前手势轨迹中,确定由主体物触发的目标手势状态;Step 206: Determine the current gesture trajectory triggered by the subject object in the current gesture trajectory set according to the state of the target subject object, and determine the target gesture state triggered by the subject object in the current gesture trajectory;
可以理解的是,在当前帧图像中可能会存在多个手势的可能,则获取的当前手势轨迹集合中会包括多 个手势轨迹。由上述过程可知,根据当前帧图像确定主体物的目标主体物状态的过程与根据当前帧图像确定手势的当前手势状态的过程是相互独立的,而在根据主体物的手势对拍摄设备调整之前,需要在主体物和手势之间建立联系,从而在对历史手势轨迹集合更新得到当前手势轨迹集合后,可以根据主体物的目标主体物状态在当前手势轨迹集合中确定主体物对应的当前手势轨迹,进而确定由主体物所触发的目标手势状态。It is understandable that there may be multiple gestures in the current frame image, and the current gesture trajectory set obtained will include multiple gestures. gesture trajectory. It can be seen from the above process that the process of determining the target subject state of the subject based on the current frame image and the process of determining the current gesture state of the gesture based on the current frame image are independent of each other. Before adjusting the shooting device according to the subject's gesture, It is necessary to establish a connection between the subject and the gesture, so that after updating the historical gesture trajectory set to obtain the current gesture trajectory set, the current gesture trajectory corresponding to the subject can be determined in the current gesture trajectory set according to the target subject state of the subject. Then determine the target gesture state triggered by the subject object.
步骤208,根据目标手势状态,获取拍摄构图系统中至少一项设备的目标工作参数,并根据目标工作参数对拍摄构图系统中相应的设备进行调节。Step 208: Obtain the target working parameters of at least one device in the shooting composition system according to the target gesture state, and adjust the corresponding equipment in the shooting composition system according to the target working parameters.
其中,目标手势状态是指能够用于调整拍摄构图系统的工作参数的手势状态。例如,在主体物所触发的目标手势状态的手势类别为自动构图手势时,可根据目标主体物状态中的外接框位置和预设外接框位置,调整摄像头的朝向,以使得在下一帧图像中主体物的外接框位置尽量贴合预设外接框位置。The target gesture state refers to a gesture state that can be used to adjust the working parameters of the shooting composition system. For example, when the gesture category of the target gesture state triggered by the subject is an automatic composition gesture, the orientation of the camera can be adjusted according to the position of the bounding box in the target state and the position of the preset bounding frame, so that in the next frame of the image The position of the external frame of the main object should fit as closely as possible to the position of the preset external frame.
可以理解的是,根据不同的构图需求所需要调整的工作参数可能不同。由此,根据目标手势状态所需调整的参数可能不止一个。需要说明的是,本申请实施例是通过主体物触发的手势对应的手势类别,确定构图需求,以对工作参数进行调整。可以理解的是,一种手势类别通常对应一种构图需求。It is understandable that the working parameters that need to be adjusted may be different according to different composition requirements. Therefore, more than one parameter may need to be adjusted according to the target gesture state. It should be noted that in the embodiment of the present application, the composition requirements are determined through the gesture category corresponding to the gesture triggered by the subject object, so as to adjust the working parameters. Understandably, one gesture category usually corresponds to one compositional need.
具体地,根据目标手势状态中的手势类别,可直接确定此手势类别对应的目标工作参数,或者还可以确定工作参数的调整方式,通过对工作参数进行调整以得到目标工作参数。例如,若目标手势状态中的手势类别表示对拍摄设备的焦距进行调整,则可以获取拍摄设备的上一变焦系数,然后根据主体物的位置与手势的相对位置,确定焦距变化量,通过上一变焦系数和焦距变化量确定当前变焦系数。其中,焦距变化量有正负,正数表示焦距变大,负数表示焦距变小,具体调整方式均可以基于目标手势状态中的手势类别所确定。Specifically, according to the gesture category in the target gesture state, the target working parameters corresponding to this gesture category can be directly determined, or the adjustment method of the working parameters can also be determined, and the target working parameters can be obtained by adjusting the working parameters. For example, if the gesture category in the target gesture state indicates adjusting the focal length of the shooting device, you can obtain the previous zoom coefficient of the shooting device, and then determine the amount of change in focal length based on the relative position of the subject and the gesture. The zoom factor and focal length change determine the current zoom factor. Among them, the focal length change amount is positive or negative. A positive number indicates that the focal length becomes larger, and a negative number indicates that the focal length becomes smaller. The specific adjustment method can be determined based on the gesture category in the target gesture state.
In actual implementation, since the time interval between the current frame image and the previous frame image may be small, a single adjustment is unlikely to satisfy the composition requirement. Therefore, the shooting device usually has to be adjusted continuously over several consecutive frames in which gestures expressing the same composition requirement appear before the composition requirement is met. Continuing the above example, when adjusting the zoom coefficient, the focal length may need to change by 4 in total while the focal length change determined from each frame image is 0.5, i.e. only 0.5 can be adjusted at a time; the zoom gesture therefore needs to appear in 8 consecutive frame images, that is, 8 consecutive adjustments are required to meet the composition requirement. In addition, the zoom coefficient mentioned above is in fact a target working parameter of the shooting device in the shooting composition system. Of course, other types of target working parameters are possible in practice, such as the shooting angle of the shooting device, which is not specifically limited in the embodiments of the present application.
In the above shooting composition method, the user does not need to manually adjust the position and angle of the shooting device to achieve one-time composition; automatic shooting composition is instead completed through gestures, so the composition efficiency is higher and the composition result is more accurate. In addition, historical gesture trajectories can be obtained and recorded based on multiple consecutive image frames, and updated based on the current gesture state obtained from the current frame image, so as to obtain the target working parameters for adjusting the shooting composition system. That is, multiple image frames can be used for continuous tracking, which makes the shooting composition result more accurate than determining the target working parameters from a single image frame. Finally, when the target gesture state is obtained, the target subject state is also referenced, which avoids misjudging the target gesture state due to gesture states produced by irrelevant subjects and thus further improves the accuracy of the shooting composition result.
In some embodiments, the current frame image belongs to an image frame group, the images in the image frame group are sorted according to the shooting timing of the shooting device, and the current frame image is the last frame image; referring to Figure 3, obtaining the target subject state of the subject in the current frame image includes:
Step 302: Obtain the historical subject state of the subject in the first frame image of the image frame group.
The historical subject state refers to the state presented by the subject in image frames before the current frame image, and may include position or size, etc., which is not specifically limited in the embodiments of the present application. The image frame group is mainly used for target tracking processing. It should be noted that the "first frame image" mentioned here refers to the first frame image used for target tracking, rather than the first frame image captured by the shooting device or the first frame image in any other sense.
Step 304: Based on the image frame group, perform target tracking on the subject to obtain the predicted subject state of the subject in the current frame image.
In the actual prediction process, the historical subject state of the target subject in the previous frame image may be used directly for prediction, or the historical subject states of the target subject in a series of images before the current frame image may be used, which is not specifically limited in the embodiments of the present application. Specifically, a target tracking algorithm may be used to obtain the predicted subject state of the subject in the current frame image.
Step 306: Integrate the historical subject state and the predicted subject state to obtain the target subject state of the subject in the current frame image.
The integration process may simply adopt the predicted subject state, that is, directly take the predicted subject state as the target subject state. Of course, the integration may also use other methods such as averaging. Specifically, in line with the explanation of the subject state above, the subject state may include the position and size of the subject's bounding box; averaging the bounding box position in the predicted subject state with the bounding box position in the historical subject state gives the bounding box position in the target subject state. Similarly, the bounding box size in the target subject state can be obtained by averaging.
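As a non-limiting sketch (the BoxState fields and the simple two-term average are assumptions made for illustration), the averaging form of the integration might look like this:

```python
from dataclasses import dataclass

@dataclass
class BoxState:
    cx: float  # bounding-box centre x
    cy: float  # bounding-box centre y
    w: float   # bounding-box width
    h: float   # bounding-box height

def integrate(historical: BoxState, predicted: BoxState) -> BoxState:
    """Average the historical and predicted bounding boxes to obtain the target subject state."""
    return BoxState(
        cx=(historical.cx + predicted.cx) / 2,
        cy=(historical.cy + predicted.cy) / 2,
        w=(historical.w + predicted.w) / 2,
        h=(historical.h + predicted.h) / 2,
    )
```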
It should be noted that the "historical subject state" used for the integration here may include only the historical subject state of the subject in the first frame image, or may also include the historical subject states of the subject in other images before the current frame image, which is not specifically limited in the embodiments of the present application.
In the above embodiment, the target subject state of the subject in the current frame image is obtained by integrating the historical subject state of the subject in images before the current frame image with the predicted subject state obtained through target tracking. Since the historical subject state is the definite state of the subject in images before the current frame image, the target subject state obtained on that basis can be as accurate as possible. In addition, by means of target tracking, the target subject state can still be obtained fairly accurately even if the subject is briefly occluded.
In some embodiments, updating the historical gesture trajectory set based on the current gesture state of the at least one gesture to obtain the current gesture trajectory set includes: matching the current gesture state of each gesture against each historical gesture trajectory in the historical gesture trajectory set to determine current gesture states and historical gesture trajectories that match each other; and adding each current gesture state that has a matching historical gesture trajectory to that matching historical gesture trajectory to obtain the current gesture trajectory set.
Specifically, the Hungarian algorithm may be used to match the current gesture states with the historical gesture trajectories. Take as an example that the current gesture states of a total of n gestures detected in the current frame image are denoted {R_ti}, i = 1, ..., n, where i denotes the i-th gesture and t indicates that the current frame image is the t-th frame image. The historical gesture trajectory set is denoted {T_j} and includes m historical gesture trajectories, where j denotes the j-th historical gesture trajectory in the set. Hungarian matching between the gesture states of the n gestures and each of the m historical gesture trajectories yields a matching matrix A_{n×m}.
For example, suppose the elements of the matching matrix take one of two values, where 1 denotes a match and -1 denotes no match. An element a_ij of the matching matrix represents the matching result between the i-th gesture state in the current frame image and the j-th historical gesture trajectory in the historical gesture trajectory set. If a_ij = 1, the i-th current gesture state is successfully matched with the j-th historical gesture trajectory, i.e. the j-th historical gesture trajectory was in fact produced by the gesture corresponding to the i-th current gesture state; R_ti can then be added to the matching historical gesture trajectory, namely the j-th one. If instead a_ij = -1 for every j = 1, ..., m, the i-th current gesture state matches no historical gesture trajectory, i.e. none of the historical gesture trajectories in the set was produced by the gesture corresponding to the i-th current gesture state.
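One possible way to build such a matching matrix, sketched here as an assumption rather than as the disclosed implementation, is to run the Hungarian algorithm on a distance cost between each current gesture position and the last recorded position of each trajectory, gating out implausible assignments:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_gestures(current_positions, trajectory_tails, max_dist=50.0):
    """Hungarian matching of n current gesture states against m historical trajectories.

    current_positions: (n, 2) gesture centre coordinates in the current frame.
    trajectory_tails:  (m, 2) last recorded position of each historical trajectory.
    Returns an n x m matrix A with a_ij = 1 for a match and -1 otherwise.
    """
    current_positions = np.asarray(current_positions, dtype=float)
    trajectory_tails = np.asarray(trajectory_tails, dtype=float)
    cost = np.linalg.norm(
        current_positions[:, None, :] - trajectory_tails[None, :, :], axis=-1)
    A = -np.ones((len(current_positions), len(trajectory_tails)), dtype=int)
    for i, j in zip(*linear_sum_assignment(cost)):
        if cost[i, j] <= max_dist:  # reject assignments that are too far apart
            A[i, j] = 1
    return A
```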
In the above embodiment, the current gesture states in the current frame image are added to the matching historical gesture trajectories to form the current gesture trajectory set, and the target working parameters for adjusting the shooting composition system can subsequently be obtained based on the current gesture trajectory. Since multiple image frames can be used for continuous tracking, the shooting composition result can be more accurate than if the target working parameters were determined from a single image frame.
In some embodiments, after matching the current gesture state of each gesture against each historical gesture trajectory in the historical gesture trajectory set, the method further includes:
when there is a current gesture state that matches no historical gesture trajectory, creating a new gesture trajectory based on that current gesture state and adding it to the current gesture trajectory set.
It can be understood that the embodiments of the present application create a new gesture trajectory for an unmatched current gesture state mainly because the subject may not previously have made any gesture towards the shooting device, or may not have made a gesture related to shooting composition, and only started doing so at the moment corresponding to the current frame image, so that the gesture appears in the current frame image. Only in that case will there be a current gesture state that matches none of the historical gesture trajectories. Clearly, a current gesture state that matches no historical gesture trajectory should not be ignored, since it may be related to the shooting composition. Therefore, in the embodiments of the present application, a new gesture trajectory can be created to record this current gesture state and serve as a newly created gesture trajectory in the current gesture trajectory set.
In the above embodiment, since a new gesture trajectory can be created for a current gesture state that matches no historical gesture trajectory and added to the current gesture trajectory set for shooting composition, gestures that have not appeared before but in fact act on the shooting composition are not missed, which improves the success rate of controlling the shooting composition through gestures.
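Continuing the sketch above, and under the same assumed data layout (which is not taken from the original text), the trajectory set could be updated from the matching matrix like this, appending matched states and opening a new trajectory for each unmatched one:

```python
def update_trajectories(trajectories, current_states, A, frame_time):
    """Append each matched current gesture state to its trajectory; start a new
    trajectory for every unmatched state.

    trajectories:   list of lists of (timestamp, gesture_state), ordered by addition time.
    current_states: gesture states detected in the current frame.
    A:              the n x m matching matrix with entries 1 (match) or -1 (no match).
    """
    num_hist = len(trajectories)  # columns of A refer only to these existing trajectories
    for i, state in enumerate(current_states):
        matched = [j for j in range(num_hist) if A[i][j] == 1]
        if matched:
            trajectories[matched[0]].append((frame_time, state))
        else:
            trajectories.append([(frame_time, state)])  # new trajectory for a new gesture
    return trajectories
```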
In some embodiments, the current gesture trajectory set also includes historical gesture trajectories from the historical gesture trajectory set to which no current gesture state has been added; before determining, according to the target subject state, the current gesture trajectory triggered by the subject in the current gesture trajectory set, the method further includes:
for each gesture trajectory in the current gesture trajectory set, obtaining the addition moment corresponding to the last gesture state added to that trajectory; calculating the time interval between each addition moment and the acquisition moment of the current frame image; and deleting from the current gesture trajectory set the gesture trajectories whose time interval is greater than a preset duration.
It can be understood that if, after a current gesture state is obtained, it is immediately added to a historical gesture trajectory to form a gesture trajectory in the current gesture trajectory set, then the last gesture state added to that trajectory is the current gesture state obtained from the current frame image, so the time interval between its addition moment and the acquisition moment of the current frame image will not be excessively large. Only a historical gesture trajectory that has not been updated from the current frame image, or that has not even been updated for several frames before the current frame image, and that is carried over into the current gesture trajectory set as one of its gesture trajectories, will have an excessively large time interval between the addition moment of its last added gesture state and the acquisition moment of the current frame image.
In the embodiments of the present application, the preset duration is precisely what is used to screen out gesture trajectories with large time intervals. It can likewise be understood that gesture trajectories corresponding to time intervals greater than the preset duration are deleted from the current gesture trajectory set mainly because such trajectories have gone too long without being updated, and the subject is unlikely to continue making gestures on the basis of such a trajectory to control the shooting composition. Therefore, to ensure the accuracy of the subsequent determination of the current gesture trajectory triggered by the subject, the gesture trajectories corresponding to time intervals greater than the preset duration may be deleted from the current gesture trajectory set.
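A minimal sketch of this pruning step, assuming the same (timestamp, gesture_state) trajectory layout as above and an illustrative preset duration of one second:

```python
def prune_stale_trajectories(trajectories, current_frame_time, preset_duration=1.0):
    """Drop trajectories whose last added gesture state is older than the preset duration."""
    return [
        track for track in trajectories
        if track and current_frame_time - track[-1][0] <= preset_duration
    ]
```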
In the above embodiment, since gesture trajectories that have gone too long without updates can be deleted from the current gesture trajectory set, the accuracy of the subsequent determination of the current gesture trajectory triggered by the subject can be ensured. In addition, deleting such trajectories from the current gesture trajectory set also reduces the amount of data it contains, saving resources.
In some embodiments, the target subject state includes the position of the subject; referring to Figure 4, determining, according to the target subject state, the current gesture trajectory triggered by the subject in the current gesture trajectory set, and determining, in the current gesture trajectory, the target gesture state triggered by the subject, includes:
Step 402: For each gesture trajectory in the current gesture trajectory set, determine the gesture state last added to that trajectory, the gesture state including a gesture position.
The gesture states in each gesture trajectory are usually arranged in order of their addition moments, and the last added gesture state is the one arranged last. The gesture position may be represented by the centre coordinates of the bounding box, or in other ways, which is not specifically limited in the embodiments of the present application.
Step 404: Filter the current gesture trajectory set according to the distance between the position of the subject and the gesture position in each last added gesture state, to obtain the current gesture trajectory.
It can be understood that if a gesture state is triggered by the subject, then, because the hand and the subject belong together, the gesture position in that gesture state will not be far from the position of the subject. Therefore, in this step, the current gesture trajectory set can be filtered according to the distance between the two. Specifically, if the distance between the gesture position in the last added gesture state of a gesture trajectory and the position of the subject is greater than a preset threshold, or falls outside a certain range, that trajectory can be screened out of the current gesture trajectory set. Since the gesture trajectories in the current gesture trajectory set are screened by distance, this process can also be understood as selecting, from the current gesture trajectory set, the current gesture trajectory that matches the subject.
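As one possible form of this distance screening (a sketch only; the trajectory layout, the Euclidean distance and the threshold value are assumptions), trajectories whose last gesture position lies too far from the subject are dropped:

```python
import math

def select_subject_trajectories(trajectories, subject_pos, max_dist=120.0):
    """Keep only trajectories whose last added gesture position is close to the subject.

    Each trajectory is a list of (timestamp, (x, y)) entries ordered by addition time;
    max_dist plays the role of the preset distance threshold.
    """
    selected = []
    for track in trajectories:
        gx, gy = track[-1][1]
        sx, sy = subject_pos
        if math.hypot(gx - sx, gy - sy) <= max_dist:
            selected.append(track)
    return selected
```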
Step 406: When the current gesture trajectory satisfies a preset detection condition, take the gesture state last added to the current gesture trajectory as the target gesture state.
It should be noted that, after a gesture trajectory has been successfully matched to the subject, it may be further checked whether the matched trajectory is a correct gesture trajectory, so as to ensure that the determined target gesture state is more accurate. Therefore, in this step, it can be further determined whether the current gesture trajectory satisfies the preset detection condition. The preset detection condition is set according to which plausibility conditions the current gesture trajectory needs to satisfy in order to serve as the basis for determining the target gesture state. For example, since the target gesture state is to be determined from the current gesture trajectory, the current gesture trajectory should be stable. This "stability" can be reflected in the time intervals between the addition moments of the different gesture states in the trajectory being uniform; such uniformity reflects "stability" mainly because a subject intending to control the shooting composition will usually hold the same gesture for a period of time so as to produce stable recognition results, which yields a series of gesture states whose addition moments are fairly evenly spaced. Of course, in practice, the preset detection condition may also be set on other bases, which is not specifically limited in the embodiments of the present application.
It should also be noted that the gesture state last added to the current gesture trajectory is selected as the target gesture state mainly because it is the most recent gesture state in the current gesture trajectory and can reflect the subject's latest shooting composition intention, thereby enabling precise shooting composition control.
In the above embodiment, the current gesture trajectory matching the subject is selected from the current gesture trajectory set according to the distance between the subject and the gesture. Since the calculation is relatively simple, processing efficiency can be improved. In addition, after the current gesture trajectory is preliminarily screened by distance, it is further checked against the preset detection condition, so that a more accurate target gesture state can be obtained.
In some embodiments, the preset detection condition includes at least one of the following two conditions: the number of gesture states in the current gesture trajectory is not less than a preset number; and the addition moment sequence corresponding to the last k gesture states added to the current gesture trajectory matches the shooting moment sequence corresponding to the last k frame images captured by the shooting device, where k is a positive integer.
It should be noted that the number of gesture states in a gesture trajectory is used as a basis for setting the preset detection condition mainly because only when the number of gesture states in the trajectory reaches a certain value can the trajectory be regarded as "stable", and a "stable" gesture trajectory is more conducive to accurately determining the target gesture state.
If the addition moment sequence corresponding to the last k gesture states added to the current gesture trajectory matches the shooting moment sequence corresponding to the last k frame images captured by the shooting device, this indicates that, over the period covered by the last k added gesture states, the update progress of the current gesture trajectory is essentially synchronized with the shooting progress of the shooting device, i.e. essentially every time the shooting device captures a frame image, a gesture state derived from that image is added to the current gesture trajectory. It can also be seen from this that a current gesture trajectory satisfying this condition should gradually form a gesture instruction for controlling the shooting composition, i.e. such a trajectory should be "valid". Here, "matching" may mean that the addition moment and the shooting moment at the same position in the two sequences are exactly the same, or that the error is within an acceptable range, which is not specifically limited in the embodiments of the present application.
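A non-authoritative sketch of the two example conditions, with min_states, k and the tolerance chosen purely for illustration; the text allows either condition on its own to constitute the preset detection condition:

```python
def detection_conditions(addition_times, shooting_times, min_states=5, k=3, tol=0.02):
    """Evaluate the two example conditions separately.

    addition_times: addition moments of the gesture states in the trajectory, in order.
    shooting_times: shooting moments of the frames captured by the shooting device.
    Returns (count_ok, timing_ok); either may be used as the preset detection condition.
    """
    count_ok = len(addition_times) >= min_states
    recent_added = addition_times[-k:]
    recent_shot = shooting_times[-k:]
    timing_ok = len(recent_added) == k and all(
        abs(a - s) <= tol for a, s in zip(recent_added, recent_shot))
    return count_ok, timing_ok
```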
In the above embodiment, after the current gesture trajectory has been preliminarily screened by distance, it still needs to be further checked against the preset detection condition, so that a more accurate target gesture state can be obtained.
In some embodiments, the target gesture state includes a target gesture type; before obtaining, according to the target gesture state, the target working parameter of at least one device in the shooting composition system, the method further includes:
determining whether the target gesture type matches a specified gesture type; and, if the target gesture type matches the specified gesture type, performing the step of obtaining, according to the target gesture state, the target working parameter of at least one device in the shooting composition system.
The embodiments of the present application here mainly concern the process of enabling the automatic shooting composition mode based on a specified gesture. Specifically, the specified gesture can serve as the trigger condition for entering the gesture-controlled automatic shooting composition mode; that is, the step of "obtaining, according to the target gesture state, the target working parameter of at least one device in the shooting composition system" is performed only when the specified gesture is recognized, after which the shooting composition is controlled according to the target working parameter.
It should be noted that, in practice, the shooting composition system may indicate whether the automatic shooting composition mode has been entered by means of a visible indicator light, so as to inform the user whether the system is currently in the automatic shooting composition mode; the colour of the indicator light may be used to distinguish the automatic shooting composition mode from the non-automatic shooting composition mode. In addition, since the automatic shooting composition mode can be entered through a specified gesture, it can of course also be exited through a specified gesture. The specified gestures preset for entering and for exiting the automatic shooting composition mode may be the same or different, which is not specifically limited in the embodiments of the present application.
In the above embodiment, determining whether the specified gesture is recognized serves as the trigger condition for entering the automatic shooting composition mode, so that the automatic shooting composition mode is first entered through the specified gesture and the shooting composition is then controlled. These two layers of control logic improve operation accuracy and avoid accidental gesture triggering.
In some embodiments, referring to Figure 5, the target subject state includes the position of the subject, and the target gesture state includes a target gesture position; obtaining, according to the target gesture state, the target working parameter of at least one device in the shooting composition system includes:
Step 502: Based on the position of the subject and the target gesture position, determine the relative positional relationship between the subject and the gesture corresponding to the target gesture state in the current frame image.
The relative positional relationship between the subject and the gesture refers to the relative positional relationship presented in the two-dimensional image. It may be determined from the relationship between the centre coordinates (x_head, y_head) of the subject's bounding box and the centre coordinates (x_hand, y_hand) of the gesture's bounding box. Since these are coordinate values, the relative positional relationship can take several forms. Taking the subject as the user's head as an example, the relative positional relationship may include the hand being above the head, below the head, or above and to the left of the head, and so on.
Step 504: Determine the target working parameter of at least one device in the shooting composition system according to the relative positional relationship.
It can be understood that this step is mainly a process of interpreting the meaning of the gesture. The interpretation involves two aspects handled in sequence: the first is "what to adjust" for the shooting composition, and the second is "how to adjust" it. For "what to adjust", reference can be made to the descriptions in the above embodiments; the optional adjustment object may be the zoom coefficient or the optical axis orientation of the shooting device, etc., which is not specifically limited in the embodiments of the present application. Which adjustment object is selected may be a default, or may be indicated by the target gesture type in the target gesture state, such as a zoom control gesture or an optical axis orientation control gesture.
As for "how to adjust", it can be determined from an adjustment manner preset for the relative positional relationship. For example, taking the subject as the user's head, if the relative positional relationship is "hand above head", the adjustment object may be the zoom coefficient and the adjustment manner may be increasing the zoom coefficient; the zoom coefficient may be adjusted by keeping the subject's proportions fixed while enlarging or reducing its size. It should be noted that, in practice, both "what to adjust" and "how to adjust" can be customized to meet different needs, which is not specifically limited in the embodiments of the present application.
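As an illustrative sketch only (the "hand above head means zoom in" mapping and the step size are example presets, not the disclosed mapping), interpreting the relative position might look like this:

```python
def interpret_relative_position(head_center, hand_center, zoom_step=0.1):
    """Map the 2-D relative position of hand and head to (parameter, signed change)."""
    x_head, y_head = head_center
    x_hand, y_hand = hand_center
    if y_hand < y_head:   # image y grows downwards, so a smaller y means the hand is above the head
        return "zoom", +zoom_step
    if y_hand > y_head:   # hand below the head
        return "zoom", -zoom_step
    return "zoom", 0.0
```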
In the above embodiment, automatic shooting composition can be completed from the relative positional relationship between the subject and the gesture corresponding to the target gesture state, so the composition efficiency is higher and the composition result is more accurate; and since both the adjustment object and the adjustment manner of the shooting composition can be customized through gestures, the operation is more flexible and convenient.
In some embodiments, the shooting composition system further includes a control device for controlling the movement of the shooting device; the target working parameter includes at least one of a zoom parameter of the shooting device, an optical axis orientation of the shooting device, or a position of the shooting device.
The control device refers to a mechanical device that can change the shooting range or viewing angle of the shooting device by adjusting its own position or form. For example, the control device may be a gimbal that includes a mechanical arm, with the shooting device placed on the carrying portion of the arm. The carrying portion can extend, retract and translate along with the arm, and can also rotate, so that the shooting device placed on it extends, retracts, translates or rotates together with the carrying portion. Clearly, if the arm extends or retracts, the framing range of the shooting device changes; if the arm translates, the framing area changes; and if the arm rotates, the viewing angle changes. In the embodiments of the present application, extension, retraction and translation change the position of the shooting device; rotation changes its optical axis orientation; and the focusing function of the gimbal changes its zoom parameter.
In the above embodiment, at least the zoom parameter, the optical axis orientation or the position of the shooting device can be determined from the relative positional relationship between the subject and the gesture corresponding to the target gesture state, so the composition efficiency is higher and the composition result is more accurate; and since both the adjustment object and the adjustment manner of the shooting composition can be customized through gestures, the operation is more flexible and convenient.
In some embodiments, referring to Figure 6, the target subject state includes the position of the subject, and the target gesture state includes a target gesture position; obtaining, according to the target gesture state, the target working parameter of at least one device in the shooting composition system includes:
Step 602: Obtain the historical subject state of the subject in a historical image captured by the shooting device and a historical gesture state triggered by the subject.
As can be seen from the above embodiments, the target gesture state is determined within the current gesture trajectory, the gesture states in the current gesture trajectory are ordered by addition moment, and the historical gesture state triggered by the subject may refer to a gesture state in the current gesture trajectory that precedes the target gesture state. Since the historical gesture state is obtained from a historical image, the historical subject state of the subject can also be obtained from that historical image.
For example, denote the current gesture trajectory as {R_1, R_2, R_3, R_4, R_5}, where R_5 is the target gesture state determined in the current frame image and R_1, R_2, R_3 and R_4 are the historical gesture states in the four consecutive historical images acquired before the current frame image. The historical subject states in those four consecutive historical images may be W_1, W_2, W_3 and W_4 respectively, and the target subject state in the current frame image may be W_5.
Step 604: Based on the position of the subject and the target gesture position, calculate a first distance between the subject and the gesture corresponding to the target gesture state in the current frame image.
Specifically, from the position of the subject in W_5 and the target gesture position in R_5, the first distance between the subject and the gesture in the current frame image can be calculated as bias_5 = (x_hand5, y_hand5) - (x_head5, y_head5), where (x_hand5, y_hand5) denotes the target gesture position and (x_head5, y_head5) denotes the position of the subject.
Step 606: Based on the historical position of the subject in the historical subject state and the historical gesture position in the historical gesture state, calculate a second distance between the subject and the gesture corresponding to the historical gesture state in the historical image.
As can be seen from the above steps, since there is more than one historical gesture state in the current gesture trajectory, there is also more than one historical gesture position. In this step, when calculating the second distance, a single second distance may be calculated from just one historical gesture position. For example, in the above example, R_1, R_2, R_3 and R_4 are the historical gesture states in the four consecutive images acquired before the current frame image, and the second distance may be calculated only from the historical gesture position in R_4 and the historical position in the historical subject state W_4 corresponding to R_4: bias_4 = (x_hand4, y_hand4) - (x_head4, y_head4), where (x_hand4, y_hand4) denotes the historical gesture position in the fourth frame image and (x_head4, y_head4) denotes the historical position of the subject in the fourth frame image. It should be noted that, in practice, the second distance need not be calculated from the fourth frame image; it may also be calculated from another historical frame image, such as the first frame image serving as the initial frame, which is not specifically limited in the embodiments of the present application.
Step 608: Determine the target working parameter of at least one device in the shooting composition system according to the difference between the first distance and the second distance.
Specifically, the difference between the first distance and the second distance can be expressed as Δbias_c = bias_5 - bias_4. Taking the subject as the head, this difference indicates how much the distance between the hand and the head at the moment of the current frame image has changed compared with before. This change, i.e. the difference calculated above, can be negative. It can be understood that the sign of the difference indicates whether the target working parameter should be increased or decreased, while its magnitude indicates by how much the target working parameter should change. The above therefore addresses the question of "how to adjust".
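A minimal sketch of this offset-difference computation (the frame indices and the vector layout follow the example above; the function itself is illustrative):

```python
import numpy as np

def bias_difference(head_prev, hand_prev, head_cur, hand_cur):
    """Delta-bias between two frames, where bias = hand centre minus head centre.

    head_prev/hand_prev come from the reference frame (e.g. the 4th), head_cur/hand_cur
    from the current frame; all arguments are (x, y) coordinates.
    """
    bias_prev = np.subtract(hand_prev, head_prev)
    bias_cur = np.subtract(hand_cur, head_cur)
    # The sign of the result gives the direction of adjustment, its magnitude the amount.
    return bias_cur - bias_prev
```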
Combining the contents of the above embodiments, not only the question of "how to adjust" but also the question of "what to adjust" needs to be addressed. Referring to the descriptions in the above embodiments, the optional adjustment object may be the zoom coefficient or the optical axis orientation of the shooting device, etc., which is not specifically limited in the embodiments of the present application. Which adjustment object is selected may also be a default, or likewise be indicated by the target gesture type in the target gesture state, such as a zoom control gesture or an optical axis orientation control gesture. It should be noted that when the optical axis orientation of the shooting device needs to be adjusted, the difference may be converted into an angle.
In the above embodiment, since the control device itself may move and carry the shooting device with it, when the movement of the shooting device driven by the control device is synchronously coupled with the movement of the hand, the gesture coordinates in the images captured by the shooting device remain at the same position within the image. This would cause the real-world movement of the hand to be misjudged as stationary, and because shooting composition relies on a series of changing gesture positions, the stability of the shooting composition would be reduced. By using the change in the distance between the subject and the gesture at different moments to reflect the real-world movement of the hand, the problem of the shooting device's movement being synchronously coupled with the hand's movement can be bypassed, thereby improving the stability of the shooting composition.
In some embodiments, the shooting composition system further includes a control device for controlling the movement of the shooting device; the target working parameter includes at least one of a zoom parameter of the shooting device, an optical axis orientation of the shooting device, or a position of the shooting device.
For a detailed explanation, reference can be made to the contents of the above embodiments, which will not be repeated here.
In the above embodiment, at least the zoom parameter, the optical axis orientation or the position of the shooting device can be determined from the relative positional relationship between the subject and the gesture corresponding to the target gesture state, so the composition efficiency is higher and the composition result is more accurate; and since both the adjustment object and the adjustment manner of the shooting composition can be customized through gestures, the operation is more flexible and convenient.
In some embodiments, the target gesture state includes a target gesture type; obtaining, according to the target gesture state, the target working parameter of at least one device in the shooting composition system includes:
when the target gesture type matches a preset gesture type, obtaining the preset working parameter corresponding to the preset gesture type and using it as the target working parameter, the preset working parameter including a preset zoom coefficient of the shooting device.
As mentioned in the above explanation of the specified gesture type, a specified gesture type can serve as the trigger condition for entering the gesture-controlled automatic shooting composition mode. The embodiments of the present application here mainly concern the process of obtaining default values of the target working parameter of at least one device in the shooting composition system when it is determined that the target gesture type matches a preset gesture type.
Specifically, with the preset gesture type set in advance, after the target gesture state has been obtained through the above process, it can be determined whether the target gesture type in the target gesture state matches the preset gesture type. If it does, the target gesture state has triggered the preset gesture. Since default values of the target working parameter of at least one device in the shooting composition system can be set in advance for the preset gesture, when it is determined that the target gesture state has triggered the preset gesture, the step of "obtaining, according to the target gesture state, the target working parameter of at least one device in the shooting composition system" may directly adopt those default values as the target working parameter.
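One way such defaults might be organized, shown purely as an illustration (the gesture names and parameter values are hypothetical):

```python
# Hypothetical defaults keyed by preset gesture type; values are illustrative only.
PRESET_WORKING_PARAMS = {
    "open_palm": {"zoom": 1.0},                           # e.g. reset the framing
    "ok_sign":   {"zoom": 2.0, "optical_axis_deg": 0.0},  # e.g. a tighter default framing
}

def preset_working_params(target_gesture_type):
    """Return the default working parameters for a recognized preset gesture, if any."""
    return PRESET_WORKING_PARAMS.get(target_gesture_type)
```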
In the above embodiment, since the target working parameter of at least one device in the shooting composition system can be set to the default value corresponding to a specified gesture once that gesture is recognized, the shooting composition can be controlled simply and conveniently.
In one embodiment, the shooting composition system further includes a control device for controlling the movement of the shooting device; accordingly, the preset working parameter further includes at least one of a preset optical axis orientation of the shooting device or a preset position of the shooting device.
For a detailed explanation, reference can be made to the contents of the above embodiments, which will not be repeated here. It should be noted that, since default values are in fact used here, the preset working parameter may include at least one of the "preset" optical axis orientation of the shooting device or the "preset" position of the shooting device.
In the above embodiment, since at least the optical axis orientation or the position of the shooting device can be determined from the relative positional relationship between the subject and the gesture corresponding to the target gesture state, the composition efficiency is higher and the composition result is more accurate; and since both the adjustment object and the adjustment manner of the shooting composition can be customized through gestures, the operation is more flexible and convenient.
It should be understood that, although the steps in the flowcharts involved in the above embodiments are displayed in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts involved in the above embodiments may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; the execution order of these sub-steps or stages is likewise not necessarily sequential, and they may be executed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.
Based on the same inventive concept, embodiments of the present application further provide a shooting composition apparatus for implementing the shooting composition method involved above. The solution to the problem provided by the apparatus is similar to that described in the above method; therefore, for the specific limitations in one or more embodiments of the shooting composition apparatus provided below, reference can be made to the limitations on the shooting composition method above, which will not be repeated here.
In one embodiment, as shown in Figure 7, a shooting composition apparatus is provided, including a data acquisition module 701, a gesture update module 702, a gesture determination module 703 and a device adjustment module 704, wherein:
the data acquisition module 701 is configured to obtain the current frame image captured by the shooting device, and to obtain the target subject state of the subject in the current frame image and the current gesture state of at least one gesture;
the gesture update module 702 is configured to obtain a historical gesture trajectory set, and to update the historical gesture trajectory set based on the current gesture state of the at least one gesture to obtain a current gesture trajectory set;
the gesture determination module 703 is configured to determine, according to the target subject state, the current gesture trajectory triggered by the subject in the current gesture trajectory set, and to determine, in the current gesture trajectory, the target gesture state triggered by the subject;
the device adjustment module 704 is configured to obtain, according to the target gesture state, the target working parameter of at least one device in the shooting composition system, and to adjust the corresponding device in the shooting composition system according to the target working parameter.
In one embodiment, the data acquisition module 701 is further configured to:
obtain the historical subject state of the subject in the first frame image of the image frame group;
perform target tracking on the subject based on the image frame group to obtain the predicted subject state of the subject in the current frame image;
integrate the historical subject state and the predicted subject state to obtain the target subject state of the subject in the current frame image.
In one embodiment, the gesture update module 702 is further configured to:
match the current gesture state of each gesture against each historical gesture trajectory in the historical gesture trajectory set to determine current gesture states and historical gesture trajectories that match each other;
add each current gesture state that has a matching historical gesture trajectory to the matching historical gesture trajectory to obtain the current gesture trajectory set.
In one embodiment, the gesture update module 702 is further configured to:
when there is a current gesture state that matches no historical gesture trajectory, create a new gesture trajectory based on that current gesture state and add it to the current gesture trajectory set.
In one embodiment, the gesture update module 702 is further configured to:
for each gesture trajectory in the current gesture trajectory set, obtain the addition moment corresponding to the last gesture state added to that trajectory;
calculate the time interval between each addition moment and the acquisition moment of the current frame image, and delete from the current gesture trajectory set the gesture trajectories whose time interval is greater than a preset duration.
In one embodiment, the gesture determination module 703 is further configured to:
for each gesture trajectory in the current gesture trajectory set, determine the gesture state last added to that trajectory, the gesture state including a gesture position;
filter the current gesture trajectory set according to the distance between the position of the subject and the gesture position in each last added gesture state, to obtain the current gesture trajectory;
when the current gesture trajectory satisfies a preset detection condition, take the gesture state last added to the current gesture trajectory as the target gesture state.
In one embodiment, the gesture determination module 703 is further configured to determine that the preset detection condition includes at least one of the following two conditions: the number of gesture states in the current gesture trajectory is not less than a preset number; and the addition moment sequence corresponding to the last k gesture states added to the current gesture trajectory matches the shooting moment sequence corresponding to the last k frame images captured by the shooting device, where k is a positive integer.
In one embodiment, the device adjustment module 704 is further configured to:
determine whether the target gesture type matches a specified gesture type;
if the target gesture type matches the specified gesture type, perform the step of obtaining, according to the target gesture state, the target working parameter of at least one device in the shooting composition system.
In one of the embodiments, the device adjustment module 704 is further configured to:
Determine, based on the position of the subject and the target gesture position, the relative positional relationship in the current frame image between the subject and the gesture corresponding to the target gesture state;
Determine, according to the relative positional relationship, the target working parameter of at least one device in the shooting composition system.
In one of the embodiments, the device adjustment module 704 is further configured to determine that the shooting composition system further includes a control device for controlling the movement of the shooting device; the target working parameter includes at least one of a zoom parameter of the shooting device, an optical axis orientation of the shooting device, or a position of the shooting device.
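The application leaves open how a relative positional relationship is converted into concrete working parameters; purely for illustration, the sketch below maps the pixel offset of the subject-gesture midpoint from the image centre to hypothetical pan/tilt corrections of the optical axis, with made-up field-of-view values.

```python
def params_from_relative_position(subject_pos, gesture_pos, frame_size,
                                  hfov_deg=78.0, vfov_deg=48.0):
    """Illustration: convert the offset of the subject-gesture midpoint from the image
    centre into pan/tilt corrections of the optical axis (all numbers hypothetical)."""
    w, h = frame_size
    mid_x = (subject_pos[0] + gesture_pos[0]) / 2.0
    mid_y = (subject_pos[1] + gesture_pos[1]) / 2.0
    pan_deg = (mid_x - w / 2.0) / w * hfov_deg     # horizontal optical-axis correction
    tilt_deg = (mid_y - h / 2.0) / h * vfov_deg    # vertical optical-axis correction
    return {"pan_deg": pan_deg, "tilt_deg": tilt_deg}
```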
In one of the embodiments, the device adjustment module 704 is further configured to:
Obtain the historical subject state of the subject and the historical gesture state triggered by the subject in a historical image captured by the shooting device;
Calculate, based on the position of the subject and the target gesture position, a first distance in the current frame image between the subject and the gesture corresponding to the target gesture state;
Calculate, based on the historical position of the subject in the historical subject state and the historical gesture position in the historical gesture state, a second distance in the historical image between the subject and the gesture corresponding to the historical gesture state;
Determine, according to the difference between the first distance and the second distance, the target working parameter of at least one device in the shooting composition system.
In one of the embodiments, the device adjustment module 704 is further configured to determine that the shooting composition system further includes a control device for controlling the movement of the shooting device; the target working parameter includes at least one of a zoom parameter of the shooting device, an optical axis orientation of the shooting device, or a position of the shooting device.
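Likewise, the mapping from the first/second distance difference to a working parameter is not fixed by the application; one plausible, purely illustrative rule is to scale the zoom so the subject-gesture separation stays roughly constant on screen, as sketched below with hypothetical zoom limits.

```python
def zoom_from_distances(first_distance: float, second_distance: float,
                        current_zoom: float,
                        min_zoom: float = 1.0, max_zoom: float = 5.0) -> float:
    """Illustration: if the subject-gesture distance grew relative to the historical
    frame, zoom out proportionally; if it shrank, zoom in (rule and limits hypothetical)."""
    if first_distance <= 0.0:
        return current_zoom
    target = current_zoom * (second_distance / first_distance)
    return max(min_zoom, min(max_zoom, target))
```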
In one of the embodiments, the device adjustment module 704 is further configured to:
Where the target gesture type matches a preset gesture type, obtain the preset working parameter corresponding to the preset gesture type and use it as the target working parameter, the preset working parameter including a preset zoom coefficient of the shooting device.
In one of the embodiments, the device adjustment module 704 is further configured to determine that the shooting composition system further includes a control device for controlling the movement of the shooting device; correspondingly, the preset working parameter further includes at least one of a preset optical axis orientation of the shooting device or a preset position of the shooting device.
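For the preset-parameter path, a simple lookup keyed by gesture type is enough to illustrate the idea; the gesture names and parameter values below are placeholders rather than types or values defined by the application.

```python
# Hypothetical lookup of preset working parameters, keyed by preset gesture type.
PRESET_PARAMS = {
    "palm_open": {"zoom": 1.0},                  # e.g. reset framing
    "fist":      {"zoom": 2.0, "pan_deg": 0.0},  # e.g. tighter crop, recentre
    "v_sign":    {"zoom": 1.5},
}

def preset_params_for(gesture_type: str):
    """Return the preset working parameters for a recognised preset gesture type, else None."""
    return PRESET_PARAMS.get(gesture_type)
```

The returned dictionary stands in for the target working parameter that would then be applied to the shooting device or control device.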
Each module in the above shooting composition apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. Each of the above modules may be embedded in or independent of a processor of a computer device in the form of hardware, or may be stored in a memory of the computer device in the form of software, so that the processor can invoke and execute the operations corresponding to each of the above modules.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in Figure 8. The computer device includes a processor, a memory, and a network interface connected through a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store gesture trajectory data and subject state data. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a photographing composition method.
Those skilled in the art can understand that the structure shown in Figure 8 is only a block diagram of a partial structure related to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, including a memory and a processor. A computer program is stored in the memory, and the processor, when executing the computer program, implements the steps in each of the above method embodiments.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. The computer program, when executed by a processor, implements the steps in each of the above method embodiments.
In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the steps in each of the above method embodiments.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by instructing relevant hardware through a computer program. The computer program can be stored in a non-volatile computer-readable storage medium, and the computer program, when executed, may include the processes of the above method embodiments. Any reference to memory, database, or other media used in the embodiments provided in this application may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and so on. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases involved in the embodiments provided in this application may include at least one of a relational database and a non-relational database. Non-relational databases may include blockchain-based distributed databases, among others, without being limited thereto. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum-computing-based data processing logic devices, and so on, without being limited thereto.
The technical features of the above embodiments can be combined in any way. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered within the scope of this specification.
The above embodiments only express several implementations of the present application, and their descriptions are relatively specific and detailed, but they should not be construed as limiting the scope of the patent application. It should be noted that, for those of ordinary skill in the art, several modifications and improvements can be made without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of the present application should be determined by the appended claims.

Claims (18)

  1. A photographing composition method, characterized in that the method is applied to a photographing composition system including a photographing device; the method comprising:
    obtaining a current frame image captured by the photographing device, and obtaining a target subject state of a subject in the current frame image and a current gesture state of at least one gesture;
    obtaining a historical gesture trajectory set, and updating the historical gesture trajectory set based on the current gesture state of the at least one gesture to obtain a current gesture trajectory set;
    determining, according to the target subject state, a current gesture trajectory triggered by the subject in the current gesture trajectory set, and determining, in the current gesture trajectory, a target gesture state triggered by the subject;
    obtaining, according to the target gesture state, a target working parameter of at least one device in the photographing composition system, and adjusting the corresponding device in the photographing composition system according to the target working parameter.
  2. The method according to claim 1, characterized in that the current frame image belongs to an image frame group, the images in the image frame group are sorted according to the shooting sequence of the photographing device, and the current frame image is the last frame image; the obtaining the target subject state of the subject in the current frame image comprises:
    obtaining a historical subject state of the subject in the first frame image in the image frame group;
    performing target tracking on the subject based on the image frame group to obtain a predicted subject state of the subject in the current frame image;
    integrating the historical subject state and the predicted subject state to obtain the target subject state of the subject in the current frame image.
  3. The method according to claim 1, characterized in that the updating the historical gesture trajectory set based on the current gesture state of the at least one gesture to obtain the current gesture trajectory set comprises:
    matching the current gesture state of each gesture with each historical gesture trajectory in the historical gesture trajectory set, and determining the current gesture states and historical gesture trajectories that match each other;
    adding each current gesture state that has a matching historical gesture trajectory to the matching historical gesture trajectory, to obtain the current gesture trajectory set.
  4. The method according to claim 3, characterized in that, after the matching the current gesture state of each gesture with each historical gesture trajectory in the historical gesture trajectory set, the method further comprises:
    where there is a current gesture state that does not match any historical gesture trajectory, creating a new gesture trajectory based on the current gesture state that does not match any historical gesture trajectory, and adding it to the current gesture trajectory set.
  5. The method according to claim 3, characterized in that the current gesture trajectory set further includes historical gesture trajectories in the historical gesture trajectory set to which no current gesture state has been added; before the determining, according to the target subject state, the current gesture trajectory triggered by the subject in the current gesture trajectory set, the method further comprises:
    for each gesture trajectory in the current gesture trajectory set, obtaining an addition moment corresponding to the gesture state last added to that gesture trajectory;
    calculating the time interval between each addition moment and the acquisition moment of the current frame image, and deleting from the current gesture trajectory set the gesture trajectories whose time interval exceeds a preset duration.
  6. The method according to claim 1, characterized in that the target subject state includes a position of the subject; and the determining, according to the target subject state, the current gesture trajectory triggered by the subject in the current gesture trajectory set, and determining, in the current gesture trajectory, the target gesture state triggered by the subject comprises:
    for each gesture trajectory in the current gesture trajectory set, determining the gesture state last added to that gesture trajectory, the gesture state including a gesture position;
    filtering the current gesture trajectory set according to the distances between the position of the subject and the gesture position in each last added gesture state, to obtain the current gesture trajectory;
    where the current gesture trajectory satisfies a preset detection condition, taking the gesture state last added to the current gesture trajectory as the target gesture state.
  7. The method according to claim 6, characterized in that the preset detection condition includes at least one of the following two conditions: the number of gesture states in the current gesture trajectory is not less than a preset number; and the sequence of addition moments corresponding to the last k gesture states added to the current gesture trajectory matches the sequence of shooting moments corresponding to the last k frame images captured by the photographing device, wherein k is a positive integer.
  8. The method according to claim 1, characterized in that the target gesture state includes a target gesture type; before the obtaining, according to the target gesture state, the target working parameter of at least one device in the photographing composition system, the method further comprises:
    determining whether the target gesture type matches a specified gesture type;
    where the target gesture type matches the specified gesture type, performing the step of obtaining, according to the target gesture state, the target working parameter of at least one device in the photographing composition system.
  9. The method according to any one of claims 1 to 8, characterized in that the target subject state includes a position of the subject, and the target gesture state includes a target gesture position; and the obtaining, according to the target gesture state, the target working parameter of at least one device in the photographing composition system comprises:
    determining, based on the position of the subject and the target gesture position, a relative positional relationship in the current frame image between the subject and the gesture corresponding to the target gesture state;
    determining, according to the relative positional relationship, the target working parameter of at least one device in the photographing composition system.
  10. The method according to claim 9, characterized in that the photographing composition system further includes a control device for controlling the movement of the photographing device; and the target working parameter includes at least one of a zoom parameter of the photographing device, an optical axis orientation of the photographing device, or a position of the photographing device.
  11. The method according to any one of claims 1 to 8, characterized in that the target subject state includes a position of the subject, and the target gesture state includes a target gesture position; and the obtaining, according to the target gesture state, the target working parameter of at least one device in the photographing composition system comprises:
    obtaining a historical subject state of the subject and a historical gesture state triggered by the subject in a historical image captured by the photographing device;
    calculating, based on the position of the subject and the target gesture position, a first distance in the current frame image between the subject and the gesture corresponding to the target gesture state;
    calculating, based on the historical position of the subject in the historical subject state and the historical gesture position in the historical gesture state, a second distance in the historical image between the subject and the gesture corresponding to the historical gesture state;
    determining, according to the difference between the first distance and the second distance, the target working parameter of at least one device in the photographing composition system.
  12. The method according to claim 11, characterized in that the photographing composition system further includes a control device for controlling the movement of the photographing device; and the target working parameter includes at least one of a zoom parameter of the photographing device, an optical axis orientation of the photographing device, or a position of the photographing device.
  13. The method according to any one of claims 1 to 7, characterized in that the target gesture state includes a target gesture type; and the obtaining, according to the target gesture state, the target working parameter of at least one device in the photographing composition system comprises:
    where the target gesture type matches a preset gesture type, obtaining a preset working parameter corresponding to the preset gesture type and using it as the target working parameter, the preset working parameter including a preset zoom coefficient of the photographing device.
  14. The method according to claim 13, characterized in that the photographing composition system further includes a control device for controlling the movement of the photographing device; and correspondingly, the preset working parameter further includes at least one of a preset optical axis orientation of the photographing device or a preset position of the photographing device.
  15. A photographing composition apparatus, characterized in that the apparatus comprises:
    a data acquisition module, configured to obtain a current frame image captured by a photographing device, and obtain a target subject state of a subject in the current frame image and a current gesture state of at least one gesture;
    a gesture update module, configured to obtain a historical gesture trajectory set, and update the historical gesture trajectory set based on the current gesture state of the at least one gesture to obtain a current gesture trajectory set;
    a gesture determination module, configured to determine, according to the target subject state, a current gesture trajectory triggered by the subject in the current gesture trajectory set, and determine, in the current gesture trajectory, a target gesture state triggered by the subject;
    a device adjustment module, configured to obtain, according to the target gesture state, a target working parameter of at least one device in the photographing composition system, and adjust the corresponding device in the photographing composition system according to the target working parameter.
  16. A computer device, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 14.
  17. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 14.
  18. A computer program product, comprising a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 14.
PCT/CN2023/102488 2022-06-27 2023-06-26 Photographing composition method and apparatus, and computer device and storage medium WO2024002022A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210733732.6A CN115022549B (en) 2022-06-27 2022-06-27 Shooting composition method, shooting composition device, computer equipment and storage medium
CN202210733732.6 2022-06-27

Publications (1)

Publication Number Publication Date
WO2024002022A1 true WO2024002022A1 (en) 2024-01-04

Family

ID=83077543

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/102488 WO2024002022A1 (en) 2022-06-27 2023-06-26 Photographing composition method and apparatus, and computer device and storage medium

Country Status (2)

Country Link
CN (1) CN115022549B (en)
WO (1) WO2024002022A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022549B (en) * 2022-06-27 2024-04-16 影石创新科技股份有限公司 Shooting composition method, shooting composition device, computer equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8400532B2 (en) * 2010-02-01 2013-03-19 Samsung Electronics Co., Ltd. Digital image capturing device providing photographing composition and method thereof
CN104935810A (en) * 2015-05-29 2015-09-23 努比亚技术有限公司 Photographing guiding method and device
CN111367415A (en) * 2020-03-17 2020-07-03 北京明略软件系统有限公司 Equipment control method and device, computer equipment and medium
CN112766191A (en) * 2021-01-25 2021-05-07 睿魔智能科技(深圳)有限公司 Camera view finding method and system
CN115022549A (en) * 2022-06-27 2022-09-06 影石创新科技股份有限公司 Shooting composition method, shooting composition device, computer equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150201124A1 (en) * 2014-01-15 2015-07-16 Samsung Electronics Co., Ltd. Camera system and method for remotely controlling compositions of self-portrait pictures using hand gestures
CN107105093A (en) * 2017-04-18 2017-08-29 广东欧珀移动通信有限公司 Camera control method, device and terminal based on hand track
CN107257439B (en) * 2017-07-26 2019-05-17 维沃移动通信有限公司 A kind of image pickup method and mobile terminal
CN111652017B (en) * 2019-03-27 2023-06-23 上海铼锶信息技术有限公司 Dynamic gesture recognition method and system

Also Published As

Publication number Publication date
CN115022549A (en) 2022-09-06
CN115022549B (en) 2024-04-16

Similar Documents

Publication Publication Date Title
TWI677252B (en) Vehicle damage image acquisition method, device, server and terminal device
US10547790B2 (en) Camera area locking
WO2020098076A1 (en) Method and apparatus for positioning tracking target, device, and storage medium
WO2018133666A1 (en) Method and apparatus for tracking video target
US10489917B2 (en) Technique for automatically tracking an object in a defined tracking window by a camera based on identification of an object
CN103916587B (en) For generating the filming apparatus of composograph and using the method for described device
TW201839666A (en) Vehicle loss assessment image obtaining method and apparatus, server and terminal device
CN109299703B (en) Method and device for carrying out statistics on mouse conditions and image acquisition equipment
WO2024002022A1 (en) Photographing composition method and apparatus, and computer device and storage medium
EP3084577A1 (en) Selection and tracking of objects for display partitioning and clustering of video frames
WO2018228413A1 (en) Method and device for capturing target object and video monitoring device
CN110555377B (en) Pedestrian detection and tracking method based on fish eye camera overlooking shooting
CN104243796B (en) Camera, method for imaging, template creating device and template establishment method
WO2023221790A1 (en) Image encoder training method and apparatus, device, and medium
CN115278014B (en) Target tracking method, system, computer equipment and readable medium
CN113362441A (en) Three-dimensional reconstruction method and device, computer equipment and storage medium
WO2023077754A1 (en) Target tracking method and apparatus, and storage medium
CN112524069B (en) Fan and fan head-shaking control method
US11790483B2 (en) Method, apparatus, and device for identifying human body and computer readable storage medium
CN116095462B (en) Visual field tracking point position determining method, device, equipment, medium and product
CN117294831B (en) Time calibration method, time calibration device, computer equipment and storage medium
CN115294508B (en) Focus following method and system based on static space three-dimensional reconstruction and camera system
JP2019185541A (en) Image processing apparatus, image processing method, and program
CN115550552A (en) Image acquisition method and device, computer equipment and storage medium
CN116055898A (en) Recovery terminal camera control system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23830199

Country of ref document: EP

Kind code of ref document: A1