WO2020056903A1 - Information generating method and device - Google Patents

Information generating method and device

Info

Publication number
WO2020056903A1
Authority
WO
WIPO (PCT)
Prior art keywords
face detection
detection frame
position information
weight
target
Prior art date
Application number
PCT/CN2018/115974
Other languages
French (fr)
Chinese (zh)
Inventor
吴兴龙 (Wu Xinglong)
Original Assignee
北京字节跳动网络技术有限公司 (Beijing ByteDance Network Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co., Ltd. (北京字节跳动网络技术有限公司)
Publication of WO2020056903A1 publication Critical patent/WO2020056903A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/168 Feature extraction; Face representation
    • G06V40/172 Classification, e.g. identification

Definitions

  • Embodiments of the present application relate to the field of computer technology, and in particular, to a method and an apparatus for generating information.
  • Face detection refers to the process of searching a given image using a certain strategy to determine whether it contains a face object and, if so, returning the position and size of the face object. The returned result can be presented in the image in the form of a face detection frame.
  • A related method is to directly perform face detection on each frame to obtain a face detection frame indicating the face object in that frame.
  • the embodiments of the present application provide a method and device for generating information.
  • an embodiment of the present application provides a method for generating information.
  • The method includes: obtaining position information of a first face detection frame, obtained in advance by performing face detection on a current frame of a target video, and obtaining pre-stored position information of a second face detection frame in the frame preceding the current frame; determining, based on the obtained position information, the intersection ratio of the first face detection frame and the second face detection frame; determining, based on the intersection ratio, the weight of the obtained position information of each face detection frame; and determining, based on the determined weights and the obtained position information, target position information of the first face detection frame so as to update the position of the first face detection frame.
  • In some embodiments, determining the weight of the obtained position information of each face detection frame based on the intersection ratio includes: performing a power operation with the intersection ratio as the base and a first preset value as the exponent; determining the calculation result of the power operation as the weight of the position information of the second face detection frame; and determining the difference between a second preset value and that weight as the weight of the position information of the first face detection frame.
  • In some embodiments, determining the weight of the obtained position information of each face detection frame based on the intersection ratio includes: performing a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and a second preset value as the exponent; determining the reciprocal of the calculation result of the power operation as the weight of the position information of the second face detection frame; and determining the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
  • In some embodiments, the position information of the first face detection frame includes designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame includes designated diagonal vertex coordinates of the second face detection frame. Determining the target position information of the first face detection frame based on the determined weights and the obtained position information to update the position of the first face detection frame includes: using the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, using the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determining the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  • In some embodiments, the designated diagonal vertex coordinates of the first face detection frame include a first vertex coordinate and a second vertex coordinate, and the designated diagonal vertex coordinates of the second face detection frame include a third vertex coordinate and a fourth vertex coordinate. Determining the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame includes: determining the weighted calculation result of the abscissa of the first vertex coordinate and the abscissa of the third vertex coordinate as a first target abscissa; determining the weighted calculation result of the ordinate of the first vertex coordinate and the ordinate of the third vertex coordinate as a first target ordinate; determining the weighted calculation result of the abscissa of the second vertex coordinate and the abscissa of the fourth vertex coordinate as a second target abscissa; determining the weighted calculation result of the ordinate of the second vertex coordinate and the ordinate of the fourth vertex coordinate as a second target ordinate; and determining the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
  • an embodiment of the present application provides an apparatus for generating information.
  • The apparatus includes: an obtaining unit configured to obtain position information of a first face detection frame obtained in advance by performing face detection on a current frame of a target video, and to obtain pre-stored position information of a second face detection frame in the frame preceding the current frame; a first determining unit configured to determine, based on the obtained position information, the intersection ratio of the first face detection frame and the second face detection frame; a second determining unit configured to determine, based on the intersection ratio, the weight of the obtained position information of each face detection frame; and an updating unit configured to determine, based on the determined weights and the obtained position information, target position information of the first face detection frame so as to update the position of the first face detection frame.
  • In some embodiments, the second determining unit includes: a first operation module configured to perform a power operation with the intersection ratio as the base and a first preset value as the exponent; and a first determination module configured to determine the calculation result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between a second preset value and that weight as the weight of the position information of the first face detection frame.
  • In some embodiments, the second determining unit includes: a second operation module configured to perform a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and a second preset value as the exponent; and a second determination module configured to determine the reciprocal of the calculation result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
  • In some embodiments, the position information of the first face detection frame includes designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame includes designated diagonal vertex coordinates of the second face detection frame. The updating unit is further configured to: use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, use the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determine the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  • In some embodiments, the designated diagonal vertex coordinates of the first face detection frame include a first vertex coordinate and a second vertex coordinate, and the designated diagonal vertex coordinates of the second face detection frame include a third vertex coordinate and a fourth vertex coordinate. The updating unit is further configured to: determine the weighted calculation result of the abscissa of the first vertex coordinate and the abscissa of the third vertex coordinate as a first target abscissa; determine the weighted calculation result of the ordinate of the first vertex coordinate and the ordinate of the third vertex coordinate as a first target ordinate; determine the weighted calculation result of the abscissa of the second vertex coordinate and the abscissa of the fourth vertex coordinate as a second target abscissa; determine the weighted calculation result of the ordinate of the second vertex coordinate and the ordinate of the fourth vertex coordinate as a second target ordinate; and determine the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
  • An embodiment of the present application provides an electronic device including: one or more processors; and a storage device on which one or more programs are stored. When the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method of any one of the embodiments of the first aspect described above.
  • An embodiment of the present application provides a computer-readable medium on which a computer program is stored; when the program is executed by a processor, the method of any one of the embodiments of the first aspect described above is implemented.
  • The method and apparatus for generating information provided in the embodiments of the present application obtain the position information of the first face detection frame of the current frame of a target video and the position information of the second face detection frame of the previous frame, so that the intersection ratio of the first face detection frame and the second face detection frame can be determined based on the obtained position information. After that, the weight of the obtained position information of each face detection frame is determined based on the intersection ratio. Finally, the target position information of the first face detection frame may be determined based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
  • In this way, the position of the face detection frame in the later frame can be adjusted based on the intersection ratio of the face detection frames in the two adjacent frames. Because the position of the face detection frame in the later frame takes into account the position of the face detection frame in the earlier frame, and the entire area of the earlier frame's detection frame is considered rather than a single coordinate, the jitter of the face detection frame is reduced and the smoothness and stability of its movement in the video are improved.
  • FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present application can be applied;
  • FIG. 2 is a flowchart of an embodiment of a method for generating information according to the present application
  • FIG. 3 is a schematic diagram of an application scenario of a method for generating information according to the present application.
  • FIG. 4 is a flowchart of still another embodiment of a method for generating information according to the present application.
  • FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for generating information according to the present application.
  • FIG. 6 is a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application.
  • FIG. 1 illustrates an exemplary system architecture 100 to which the method for generating information or the apparatus for generating information of the present application can be applied.
  • the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105.
  • the network 104 is a medium for providing a communication link between the terminal devices 101, 102, 103 and the server 105.
  • The network 104 may include various types of connections, such as wired or wireless communication links, fiber optic cables, and so on.
  • the user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like.
  • Various communication client applications can be installed on the terminal devices 101, 102, and 103, such as voice interaction applications, shopping applications, search applications, instant communication tools, email clients, social platform software, and the like.
  • the terminal devices 101, 102, and 103 may be hardware or software.
  • If the terminal devices 101, 102, and 103 are hardware, they can be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablets, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers, desktop computers, and so on.
  • If the terminal devices 101, 102, and 103 are software, they can be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. This is not specifically limited here.
  • an image acquisition device may also be installed thereon.
  • the image acquisition device can be various devices that can implement the function of acquiring images, such as cameras, sensors, and so on. Users can use the image capture device on the terminal devices 101, 102, 103 to capture video.
  • The terminal devices 101, 102, and 103 can perform face detection and other processing on videos they play or on frames recorded by the user; they can also analyze the face detection results (such as the position information of the face detection frame) and update the position of the face detection frame accordingly.
  • the server 105 may be a server providing various services, such as a video processing server for storing, managing, or analyzing videos uploaded by the terminal devices 101, 102, and 103.
  • the video processing server can store a large number of videos, and can send videos to the terminal devices 101, 102, and 103.
  • the server 105 may be hardware or software.
  • the server can be implemented as a distributed server cluster consisting of multiple servers or as a single server.
  • the server can be implemented as multiple software or software modules (for example, to provide distributed services), or it can be implemented as a single software or software module. It is not specifically limited here.
  • the methods for generating information provided by the embodiments of the present application are generally executed by the terminal devices 101, 102, and 103. Accordingly, the devices for generating information are generally provided in the terminal devices 101, 102, and 103.
  • the server 105 may not be provided in the system architecture 100.
  • the server 105 may also perform face detection and other processing on its stored videos or videos uploaded by the terminal devices 101, 102, and 103, and return the processing results to the terminal devices 101, 102, and 103.
  • the method for generating information provided in the embodiment of the present application may also be executed by the server 105, and accordingly, the apparatus for generating information may also be set in the server 105.
  • terminal devices, networks, and servers in FIG. 1 are merely exemplary. According to implementation needs, there can be any number of terminal devices, networks, and servers.
  • a flowchart 200 of one embodiment of a method for generating information according to the present application is shown.
  • the method for generating information includes the following steps:
  • Step 201: Obtain position information of a first face detection frame obtained in advance by performing face detection on a current frame of a target video, and obtain pre-stored position information of a second face detection frame in the frame preceding the current frame.
  • an execution subject of the method for generating information may record or play a video.
  • The video it plays may be a video stored locally in advance, or a video obtained from a server (such as the server 105 shown in FIG. 1) through a wired or wireless connection.
  • The above-mentioned execution subject may have an image acquisition device (for example, a camera) installed on it or connected to it.
  • The wireless connection methods may include, but are not limited to, 3G/4G connections, WiFi connections, Bluetooth connections, WiMAX connections, Zigbee connections, UWB (ultra wideband) connections, and other wireless connection methods now known or developed in the future.
  • In this embodiment, the execution subject may obtain position information of the first face detection frame obtained in advance by performing face detection on the current frame of the target video, and obtain pre-stored position information of the second face detection frame in the frame preceding the current frame.
  • the target video may be a video currently being played or a video being recorded by a user. It is not limited here.
  • The current frame of the target video may be the frame in the target video whose face detection frame position is to be updated.
  • In practice, the execution subject may perform face detection on each frame of the target video in sequence according to the timestamp order of the frames. After performing face detection on each frame except the first, position correction is applied to the obtained face detection frame. At any moment, the frame whose face detection frame position is to be corrected can be referred to as the current frame of the target video. Take the following two scenarios as examples:
  • the target video may be a video being played by the execution subject.
  • the execution subject may perform face detection on each frame to be played one by one to obtain the position information of the face detection frame of the frame.
  • Then, the position information of the face detection frame of that frame may be corrected, and the frame may then be played.
  • The frame whose face detection frame position is about to be corrected at the current moment may be regarded as the current frame.
  • the target video may be a video being recorded by the above-mentioned execution subject.
  • The execution subject may perform face detection on each captured frame one by one to obtain the position information of the face detection frame of that frame. For each frame captured after the first, the obtained face detection frame can be position-corrected after face detection, and the frame is then displayed.
  • The latest captured frame whose face detection frame has not yet been position-corrected may be regarded as the current frame.
  • a pre-trained face detection model may be stored in the execution subject.
  • The execution subject may input a frame of the target video into the pre-trained face detection model to obtain the position information of the face detection frame of that frame.
  • The above-mentioned face detection model may be used to detect the area where a face object is located in an image (the area may be represented by a face detection frame; here, the face detection frame may be a rectangular frame). That is, the face detection model can output the position information of the face detection frame.
  • the face detection model may be obtained by performing supervised training on an existing convolutional neural network based on a sample set (including a face image and a label for indicating the position of a face object region) using a machine learning method.
  • the convolutional neural network can use various existing structures, such as DenseBox, VGGNet, ResNet, SegNet, and so on. It should be noted that the above-mentioned machine learning method and supervised training method are well-known technologies that are widely studied and applied at present, and will not be repeated here.
  • the position information of the face detection frame may be information for indicating and uniquely determining the position of the face detection frame in the frame.
  • the position information of the face detection frame may include coordinates of four vertices of the face detection frame.
  • the position information of the face detection frame may include the coordinates of any set of diagonal vertices of the face detection frame. For example, the coordinates of the upper left vertex and the coordinates of the lower right vertex.
  • the position information of the face detection frame may include the coordinates of any vertex of the face detection frame and the length and width of the face detection frame.
  • position information is not limited to the above list, and may also include other information that can be used to indicate and uniquely determine the position of the face detection frame.
  • Step 202: Determine the intersection ratio of the first face detection frame and the second face detection frame based on the obtained position information.
  • In this embodiment, the execution subject may determine the intersection ratio of the first face detection frame and the second face detection frame based on the obtained position information of the first face detection frame and position information of the second face detection frame.
  • In practice, the intersection ratio (also known as Intersection over Union, IoU) of two rectangles is the ratio of the area of the region where the two rectangles intersect to the area of the region covered by their union.
  • The area of the union of the two rectangles is equal to the sum of the areas of the two rectangles minus the area of the region where they intersect.
  • the intersection ratio is a number in the interval [0,1].
  • the position of the face detection frame in the frame can be determined. Therefore, by using the position information of the first face detection frame, the coordinates of each vertex of the first face detection frame in the current frame can be determined. Based on the position information of the second face detection frame, the coordinates of each vertex of the second face detection frame in the previous frame of the current frame can be determined.
  • As an example, the position information of a face detection frame may include the coordinates of one vertex (such as the upper-left vertex) together with the length and width of the frame. In this case, the length can be added to the abscissa of the upper-left vertex, and the width added to its ordinate, to obtain the coordinates of the remaining vertices (the upper-right, lower-left, and lower-right vertices), respectively.
  • In this way, the coordinates of each vertex of the first face detection frame and of the second face detection frame can be obtained. The vertex coordinates of the two frames can then be used to determine the length and width of the rectangle in which the first face detection frame and the second face detection frame intersect, and further the area of that rectangle (the intersection area). After that, the sum of the areas of the first face detection frame and the second face detection frame (the total area) can be calculated, and the difference between the total area and the intersection area (the union area) can be computed. Finally, the ratio of the intersection area to the union area is determined as the intersection ratio of the first face detection frame and the second face detection frame.
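  • The intersection-ratio computation just described can be sketched as follows (an illustrative Python snippet; the function name and the convention that boxes are given as upper-left/lower-right corner coordinates are our own assumptions, not the patent's):

```python
def intersection_ratio(box_a, box_b):
    """Intersection over union of two axis-aligned rectangles.

    Each box is (x1, y1, x2, y2): upper-left and lower-right corners.
    Returns a value in the interval [0, 1].
    """
    # Length and width of the intersecting rectangle (clamped at zero).
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter_area = inter_w * inter_h  # intersection area

    # Total area of both boxes, minus the intersection, gives the union.
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union_area = area_a + area_b - inter_area

    return inter_area / union_area if union_area > 0 else 0.0
```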
  • Step 203: Determine the weight of the obtained position information of each face detection frame based on the intersection ratio.
  • In this embodiment, the execution subject may determine the weight of the position information of the first face detection frame and the weight of the position information of the second face detection frame based on the intersection ratio determined in step 202.
  • As an example, the intersection ratio can be substituted into a pre-established formula, and the calculation result determined as the weight of the position information of the second face detection frame.
  • Here, the pre-established formula may be any of various formulas satisfying preset conditions, which are not limited here.
  • The preset conditions include: the larger the intersection ratio, the larger the calculation result of the formula; the smaller the intersection ratio, the smaller the calculation result.
  • the difference between the preset value (for example, 1) and the weight of the position information of the second face detection frame may be determined as the weight of the position information of the first face detection frame.
  • the order of determining the weight of the position information of the first face detection frame and the weight of the position information of the second face detection frame is not limited herein.
  • the execution subject may modify the pre-established formula so as to first determine the weight of the position information of the first face detection frame, and then determine the weight of the position information of the second face detection frame.
  • In some optional implementations of this embodiment, the execution subject may perform a power operation with the intersection ratio as the base and a first preset value (for example, 6 or 3) as the exponent.
  • the first preset value may be determined by a technician based on a large amount of data statistics and experiments.
  • Then, the execution subject may determine the calculation result of the power operation as the weight of the position information of the second face detection frame, and determine the difference between a second preset value (for example, 1) and the determined weight as the weight of the position information of the first face detection frame.
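  • A minimal sketch of this first weighting scheme (Python; the function and parameter names are our own illustration):

```python
def weights_power(iou, first_preset=6, second_preset=1.0):
    """Previous-frame weight: iou ** first_preset.

    A high intersection ratio (little movement) gives the previous
    frame's box a large weight, suppressing jitter; a low ratio lets
    the new detection dominate, reducing the dragging feel.
    """
    w_prev = iou ** first_preset     # weight of the second (previous-frame) box
    w_curr = second_preset - w_prev  # weight of the first (current-frame) box
    return w_curr, w_prev
```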
  • In some optional implementations of this embodiment, the execution subject may perform a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and a second preset value (for example, 1) as the exponent. Then, the reciprocal of the calculation result of the power operation may be determined as the weight of the position information of the second face detection frame, and the difference between the second preset value and the determined weight of the position information of the second face detection frame determined as the weight of the position information of the first face detection frame.
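  • A hedged sketch of this second weighting scheme follows (Python; the names and the zero-division guard are our own additions). Note that the reciprocal of e**(1/iou - 1) equals e**(1 - 1/iou), which is used below to avoid overflow for tiny intersection ratios:

```python
import math

def weights_exponential(iou, second_preset=1.0, eps=1e-6):
    """Previous-frame weight: 1 / e**(1/iou - 1) == e**(1 - 1/iou).

    At iou == 1 (no movement) the previous box keeps full weight;
    as iou -> 0 (fast movement) its weight decays toward 0.
    """
    w_prev = math.exp(second_preset - 1.0 / max(iou, eps))
    w_curr = second_preset - w_prev
    return w_curr, w_prev
```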
  • It should be noted that the execution subject may also determine the weight of each piece of obtained position information in other ways; it is not limited to the implementations above. For example, a certain preset value (for example, 2 or 3) can be used as the base, and the difference between the reciprocal of the intersection ratio and the second preset value (for example, 1) as the exponent, to perform a power operation. Then, the reciprocal of the calculation result of the power operation may be determined as the weight of the position information of the second face detection frame, and the difference between the second preset value and that weight determined as the weight of the position information of the first face detection frame.
  • In the conventional method, the average of the coordinates of corresponding vertices (for example, the upper-left vertices) of the face detection frames in the previous frame and the current frame is usually used as the corrected coordinate of that vertex (the upper-left vertex) in the current frame, and the corrected coordinates of each vertex of the current frame are obtained in this way.
  • In contrast, in the present application the position of the face detection frame in the current frame is corrected using weights determined from the intersection ratio. The larger the intersection ratio, the slower the face object is moving; the smaller the intersection ratio, the faster it is moving. Different weights can therefore be computed for different intersection ratios, which reduces the dragging feel and improves the timeliness and accuracy of the face detection frame.
  • Among conventional methods there is also one that determines the weight of each vertex coordinate from the distance between the coordinates of corresponding vertices (for example, the upper-left vertices) of the face detection frames in the previous frame and the current frame. In that method the weights of the coordinates of each vertex are independent, so the face detection frame cannot be considered as a whole, and the smoothing effect is poor. In the present application, the entire area of the face detection frame is taken into account when determining the intersection ratio, and the weights of the coordinates of all vertices in the same face detection frame are identical, so the face detection frame is considered as a whole and the smoothing effect is improved.
  • Step 204: Determine target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
  • the above-mentioned execution subject may determine the target position information of the first face detection frame based on the determined weight and the obtained position information to update the position of the first face detection frame.
  • the execution subject may modify the position information of the first face detection frame based on the determined weight. That is, the vertex coordinates of the first face detection frame are corrected.
  • the position information of the face detection frame may include coordinates of four vertices of the face detection frame.
  • In this case, the execution subject may correct the coordinates of each vertex of the first face detection frame respectively. Specifically, for each vertex the following steps can be performed (here, the upper-left vertex is used as an example; the remaining vertices are handled in the same way):
  • First, the abscissa of the upper-left vertex of the first face detection frame and the abscissa of the upper-left vertex of the second face detection frame are weighted. That is, the abscissa of the upper-left vertex of the first face detection frame is multiplied by the weight of the position information of the first face detection frame to obtain a first value, and the abscissa of the upper-left vertex of the second face detection frame is multiplied by the weight of the position information of the second face detection frame to obtain a second value. The sum of the first value and the second value is determined as the corrected abscissa of the upper-left vertex of the first face detection frame.
  • Second, the ordinate of the upper-left vertex of the first face detection frame and the ordinate of the upper-left vertex of the second face detection frame are weighted. That is, the ordinate of the upper-left vertex of the first face detection frame is multiplied by the weight of the position information of the first face detection frame to obtain a third value, and the ordinate of the upper-left vertex of the second face detection frame is multiplied by the weight of the position information of the second face detection frame to obtain a fourth value. The sum of the third value and the fourth value is determined as the corrected ordinate of the upper-left vertex of the first face detection frame.
  • Finally, the abscissa obtained in the first step and the ordinate obtained in the second step are combined into the corrected coordinates of the upper-left vertex of the first face detection frame.
  • The execution subject may then aggregate the corrected coordinates of the vertices into the target position information, so that the position of the first face detection frame can be updated.
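  • As an illustration of the correction just described, a minimal sketch for a single vertex follows (Python; the function name is ours, assuming the weights were computed from the intersection ratio as above):

```python
def correct_vertex(curr_xy, prev_xy, w_curr, w_prev):
    """Weighted correction of one vertex of the current frame's box.

    curr_xy / prev_xy: (x, y) of corresponding vertices (e.g. upper-left)
    in the current and previous frames; w_curr + w_prev == 1.
    """
    x = curr_xy[0] * w_curr + prev_xy[0] * w_prev  # first value + second value
    y = curr_xy[1] * w_curr + prev_xy[1] * w_prev  # third value + fourth value
    return (x, y)
```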
  • In some optional implementations of this embodiment, the position information of the first face detection frame includes the designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame includes the designated diagonal vertex coordinates of the second face detection frame.
  • the designated diagonal vertex coordinates of the first face detection frame may include coordinates of a first vertex (for example, an upper left vertex) and coordinates of a second vertex (for example, a lower right vertex).
  • the designated diagonal vertex coordinates of the second face detection frame may include coordinates of a third vertex (for example, an upper left vertex) and coordinates of a fourth vertex (for example, a lower right vertex).
  • At this time, the execution subject may use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, and the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame. The weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame is then determined as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  • the target diagonal vertex coordinates of the first face detection frame can be calculated in the following sequence of operations:
  • a weighted calculation result of the abscissa of the first vertex coordinate and the abscissa of the third vertex coordinate may be determined as the first target abscissa;
  • a weighted calculation result of the ordinate of the first vertex coordinate and the ordinate of the third vertex coordinate may be determined as the first target ordinate;
  • a weighted calculation result of the abscissa of the second vertex coordinate and the abscissa of the fourth vertex coordinate may be determined as the second target abscissa;
  • the weighted calculation result of the ordinate of the second vertex coordinate and the ordinate of the fourth vertex coordinate may be determined as the second target ordinate;
  • Finally, the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, can be determined as the target diagonal vertex coordinates of the first face detection frame. Since the coordinates of a pair of diagonal vertices uniquely determine the position of a rectangular frame, the position of the first face detection frame can thereby be updated.
  • another set of diagonal vertex coordinates of the first face detection frame may be calculated according to the target diagonal vertex coordinates. Thereby, the coordinates of the four vertices of the first face detection frame are obtained.
  • the position information of the face detection frame may include the coordinates of any vertex of the face detection frame and the length and width of the face detection frame.
  • In this case, the execution subject may first determine the coordinates of the vertex diagonal to the known vertex based on the vertex coordinates, the length, and the width (or determine the coordinates of the remaining three vertices). Then, the target position information of the first face detection frame can be determined using the operation steps described in the above two implementations, thereby updating the position of the first face detection frame.
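  • For the position-information form just mentioned (one vertex plus length and width), a hypothetical conversion to designated diagonal vertex coordinates might look like this (assuming the known vertex is the upper-left one and coordinates grow rightward and downward; this convention is ours, not stated in the text):

```python
def xywh_to_corners(x, y, length, width):
    """Derive the diagonal vertex (lower-right) from the upper-left vertex
    plus the frame's length (horizontal extent) and width (vertical extent)."""
    upper_left = (x, y)
    lower_right = (x + length, y + width)
    return upper_left, lower_right
```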
  • FIG. 3 is a schematic diagram of an application scenario of the method for generating information according to this embodiment.
  • a user uses a self-timer mode of the terminal device 301 to record a target video.
  • After capturing the first frame, the terminal device uses the stored face detection model to perform face detection on the first frame and obtains the position information 302 of the face detection frame in the first frame.
  • After the terminal device captures the second frame, it uses the stored face detection model to perform face detection on the second frame and obtains the position information 303 of the face detection frame of the second frame. At the same time, the position information 302 of the face detection frame in the first frame is obtained. Then, based on the position information 302 and the position information 303, the intersection ratio of the two face detection frames may be determined. After that, the weight of the position information 302 and the weight of the position information 303 can be determined based on the intersection ratio. Finally, the target position information 304 of the face detection frame of the second frame (that is, the final position information of the face detection frame of the second frame) may be determined based on the determined weights and the obtained position information 302 and position information 303.
  • After capturing the third frame, the terminal device uses the stored face detection model to perform face detection on the third frame and acquires the position information 305 of the face detection frame of the third frame. At the same time, the updated position information of the face detection frame in the second frame (that is, the target position information 304) is acquired. Then, based on the target position information 304 and the position information 305, the intersection ratio of the two face detection frames may be determined. After that, the weight of the target position information 304 and the weight of the position information 305 can be determined based on the intersection ratio. Finally, the target position information 306 of the face detection frame of the third frame (that is, the final position information of the face detection frame of the third frame) may be determined based on the determined weights and the obtained target position information 304 and position information 305.
  • By analogy, the terminal device 301 can obtain the position information of the face detection frame in each frame of the recorded video.
  • The method provided by the foregoing embodiment of the present application obtains the position information of the first face detection frame of the current frame of the target video and the position information of the second face detection frame of the previous frame, so that the intersection ratio of the first face detection frame and the second face detection frame can be determined based on the obtained position information. After that, the weight of the obtained position information of each face detection frame is determined based on the intersection ratio. Finally, the target position information of the first face detection frame may be determined based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
  • In this way, the position of the face detection frame in the later frame can be adjusted based on the intersection ratio of the face detection frames in the two adjacent frames. Because the position of the face detection frame in the later frame takes into account the position of the face detection frame in the earlier frame, and the entire area of that detection frame is considered rather than a single coordinate, the jitter of the face detection frame is reduced and the smoothing effect and movement stability of the face detection frame in the video are improved.
  • a flowchart 400 of yet another embodiment of a method for generating information is shown.
  • the process 400 of the method for generating information includes the following steps:
  • Step 401: Obtain position information of a first face detection frame obtained in advance by performing face detection on a current frame of a target video, and obtain pre-stored position information of a second face detection frame in the frame preceding the current frame.
  • In this embodiment, the execution subject of the method for generating information may obtain the position information of the first face detection frame obtained in advance by performing face detection on the current frame of the target video, and the position information of the second face detection frame obtained in advance by performing face detection on the frame preceding the current frame.
  • Here, the position information of the first face detection frame may include the designated diagonal vertex coordinates (such as the coordinates of the upper-left and lower-right vertices) of the first face detection frame, and the position information of the second face detection frame may include the designated diagonal vertex coordinates of the second face detection frame.
  • Step 402: Determine the intersection ratio of the first face detection frame and the second face detection frame based on the obtained position information.
  • In this embodiment, the execution subject can determine the coordinates of the remaining vertices of the first face detection frame in the current frame from the position information of the first face detection frame, so that the coordinates of each vertex of the first face detection frame can be obtained; the coordinates of each vertex of the second face detection frame can be obtained in the same way.
  • Then, the vertex coordinates of the first face detection frame and the vertex coordinates of the second face detection frame can be used to determine the length and width of the rectangle in which the two frames intersect.
  • Further, the area of the intersecting rectangle (the intersection area) can be obtained.
  • After that, the sum of the areas of the first face detection frame and the second face detection frame (the total area) can be calculated.
  • Then, the difference between the total area and the intersection area (the union area) can be calculated.
  • Finally, the ratio of the intersection area to the union area can be determined as the intersection ratio of the first face detection frame and the second face detection frame.
  • Step 403: Perform a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and a second preset value as the exponent.
  • In this embodiment, the execution subject may perform a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and a second preset value (for example, 1) as the exponent.
  • Step 404: Determine the reciprocal of the calculation result of the power operation as the weight of the position information of the second face detection frame, and determine the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
  • In this embodiment, the execution subject may determine the reciprocal of the calculation result of the power operation as the weight of the position information of the second face detection frame, and determine the difference between the second preset value (for example, 1) and the determined weight as the weight of the position information of the first face detection frame.
  • Step 405: Use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, use the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determine the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  • In this embodiment, the execution subject may use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, and the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame. The weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame is then determined as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  • the designated diagonal vertex coordinates of the first face detection frame may include coordinates of a first vertex (for example, an upper left vertex) and coordinates of a second vertex (for example, a lower right vertex).
  • the designated diagonal vertex coordinates of the second face detection frame may include coordinates of a third vertex (for example, an upper left vertex) and coordinates of a fourth vertex (for example, a lower right vertex).
  • Specifically, the weighted calculation result of the abscissa of the first vertex coordinate and the abscissa of the third vertex coordinate may first be determined as the first target abscissa. Then, the weighted calculation result of the ordinate of the first vertex coordinate and the ordinate of the third vertex coordinate may be determined as the first target ordinate. Then, the weighted calculation result of the abscissa of the second vertex coordinate and the abscissa of the fourth vertex coordinate may be determined as the second target abscissa. Then, the weighted calculation result of the ordinate of the second vertex coordinate and the ordinate of the fourth vertex coordinate may be determined as the second target ordinate.
  • Finally, the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, can be determined as the target diagonal vertex coordinates of the first face detection frame. Since the coordinates of a pair of diagonal vertices uniquely determine the position of a rectangular frame, the position of the first face detection frame can thereby be updated.
  • The process 400 of the method for generating information in this embodiment highlights the step of determining the weights for the face detection frames of the current frame and the previous frame, respectively.
  • In the solution described in this embodiment, when the face object moves quickly (the intersection ratio is small), the weight of the position information of the first face detection frame (the face detection frame of the current frame) is larger and the weight of the position information of the second face detection frame (the face detection frame of the previous frame) is smaller; when the face object moves slowly (the intersection ratio is large), the weight of the position information of the first face detection frame is smaller and the weight of the position information of the second face detection frame is larger. Thereby, the face detection frame moves smoothly, the jitter of the face detection frame in the video is further reduced, and the smoothness and movement stability of the face detection frame in the video are further improved.
  • this application provides an embodiment of an apparatus for generating information.
  • the apparatus embodiment corresponds to the method embodiment shown in FIG. 2.
  • the device can be specifically applied to various electronic devices.
  • The apparatus 500 for generating information includes: an obtaining unit 501 configured to obtain position information of a first face detection frame obtained in advance by performing face detection on a current frame of a target video, and to obtain pre-stored position information of a second face detection frame in the frame preceding the current frame; a first determining unit 502 configured to determine, based on the obtained position information, the intersection ratio of the first face detection frame and the second face detection frame; a second determining unit 503 configured to determine, based on the intersection ratio, the weight of the obtained position information of each face detection frame; and an updating unit 504 configured to determine, based on the determined weights and the obtained position information, target position information of the first face detection frame so as to update the position of the first face detection frame.
  • the foregoing second determination unit 503 may include a first operation module and a first determination module (not shown in the figure).
  • the first operation module may be configured to perform the power operation using the intersection ratio as a base and a first preset value as an exponent.
  • The first determination module may be configured to determine the calculation result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
  • the foregoing second determination unit 503 may include a second operation module and a second determination module (not shown in the figure).
  • the second operation module may be configured to perform a power operation using a natural constant as a base and a difference between a reciprocal of the intersection ratio and a second preset value as an index.
  • The second determining module may be configured to determine the reciprocal of the calculation result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
  • In some optional implementations of this embodiment, the position information of the first face detection frame may include the designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame may include the designated diagonal vertex coordinates of the second face detection frame.
  • At this time, the updating unit 504 may be further configured to: use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, use the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determine the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  • In some optional implementations of this embodiment, the designated diagonal vertex coordinates of the first face detection frame may include a first vertex coordinate and a second vertex coordinate, and the designated diagonal vertex coordinates of the second face detection frame may include a third vertex coordinate and a fourth vertex coordinate.
  • At this time, the updating unit 504 may be further configured to: determine the weighted calculation result of the abscissa of the first vertex coordinate and the abscissa of the third vertex coordinate as the first target abscissa; determine the weighted calculation result of the ordinate of the first vertex coordinate and the ordinate of the third vertex coordinate as the first target ordinate; determine the weighted calculation result of the abscissa of the second vertex coordinate and the abscissa of the fourth vertex coordinate as the second target abscissa; determine the weighted calculation result of the ordinate of the second vertex coordinate and the ordinate of the fourth vertex coordinate as the second target ordinate; and determine the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
  • the device provided by the foregoing embodiment of the present application obtains, through the obtaining unit 501, the position information of the first face detection frame of the current frame of the target video and the position information of the second face detection frame of the previous frame, so that the first determining unit 502 may determine the intersection-over-union ratio of the first face detection frame and the second face detection frame based on the obtained position information. After that, the second determining unit 503 determines the weight of the obtained position information of each face detection frame based on the intersection ratio. Finally, the updating unit 504 may determine the target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
  • in this way, the position of the face detection frame in the later frame can be adjusted based on the intersection-over-union ratio of the face detection frames in two consecutive frames. Because the position of the face detection frame in the later frame takes into account the position of the face detection frame in the earlier frame, and considers the whole area of that detection frame rather than any single coordinate, the jitter of the face detection frame is reduced, improving the smoothness and stability of its movement in the video.
  • FIG. 6 illustrates a schematic structural diagram of a computer system 600 suitable for implementing an electronic device according to an embodiment of the present application.
  • the electronic device shown in FIG. 6 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
  • the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600.
  • the CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • an input/output (I/O) interface 605 is also connected to the bus 604.
  • the following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, and the like.
  • the communication section 609 performs communication processing via a network such as the Internet.
  • the drive 610 is also connected to the I/O interface 605 as needed.
  • a removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage portion 608 as necessary.
  • the process described above with reference to the flowchart may be implemented as a computer software program.
  • embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart.
  • the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611.
  • the computer-readable medium described in this application may be a computer-readable signal medium or a computer-readable storage medium or any combination of the foregoing.
  • the computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; it may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • each block in the flowchart or block diagrams may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function.
  • the functions noted in the blocks may also occur in an order different from that marked in the drawings. For example, two blocks represented in succession may actually be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified function or operation, or by a combination of dedicated hardware and computer instructions.
  • the units described in the embodiments of the present application may be implemented by software or hardware.
  • the described units may also be provided in a processor; for example, a processor may be described as including an acquisition unit, a first determination unit, a second determination unit, and an update unit.
  • the names of these units do not, in some cases, constitute a limitation on the units themselves. For example, the update unit may also be described as "a unit that updates the position of the first face detection frame".
  • the present application also provides a computer-readable medium, which may be included in the device described in the foregoing embodiments; or may exist alone without being assembled into the device.
  • the computer-readable medium carries one or more programs, and when the one or more programs are executed by the device, the device is caused to: obtain position information of a first face detection frame obtained by performing face detection on a current frame of a target video in advance, and obtain position information of a second face detection frame obtained by performing face detection on a previous frame of the current frame in advance; determine, based on the obtained position information, the intersection-over-union ratio of the first face detection frame and the second face detection frame; determine, based on the intersection ratio, the weight of the obtained position information of each face detection frame; and determine, based on the determined weights and the obtained position information, target position information of the first face detection frame, so as to update the position of the first face detection frame.

Abstract

An information generating method and device. The method comprises: acquiring position information of a first face bounding box obtained by performing face detection on a current frame of a target video in advance, and acquiring position information of a second face bounding box obtained by performing face detection on a frame previous to the current frame in advance (201); determining the intersection over union of the first face bounding box and the second face bounding box on the basis of the obtained position information (202); on the basis of the intersection over union, determining the weight of the obtained position information of each face bounding box (203); and on the basis of the determined weight and the obtained position information, determining target position information of the first face bounding box and updating the position of the first face bounding box (204). The present method improves the smoothing effect of a face bounding box.

Description

Method and device for generating information

This patent application claims priority to Chinese Patent Application No. 201811110674.1, filed on September 21, 2018 by the applicant Beijing ByteDance Network Technology Co., Ltd. and entitled "Method and Device for Generating Information"; the entire content of that application is incorporated herein by reference.
Technical Field

Embodiments of the present application relate to the field of computer technology, and in particular, to a method and an apparatus for generating information.
Background

Face detection refers to the process of searching a given image according to a certain strategy to determine whether it contains a face object and, if so, returning the position and size of the face object. The returned result can be presented in the image in the form of a face detection frame.

When face detection is performed on the face objects in a video, a face detection frame is generated for every frame. A related approach is to perform face detection directly on each frame, obtaining the face detection frame that indicates the face object in that frame.
Summary of the Invention

The embodiments of the present application provide a method and device for generating information.

In a first aspect, an embodiment of the present application provides a method for generating information. The method includes: obtaining position information of a first face detection frame obtained by performing face detection on a current frame of a target video in advance, and obtaining pre-stored position information of a second face detection frame in a previous frame of the current frame; determining, based on the obtained position information, the intersection-over-union ratio of the first face detection frame and the second face detection frame; determining, based on the intersection ratio, the weight of the obtained position information of each face detection frame; and determining, based on the determined weights and the obtained position information, target position information of the first face detection frame, so as to update the position of the first face detection frame.
In some embodiments, determining the weight of the obtained position information of each face detection frame based on the intersection ratio includes: performing a power operation with the intersection ratio as the base and a first preset value as the exponent; determining the result of the power operation as the weight of the position information of the second face detection frame; and determining the difference between a second preset value and that weight as the weight of the position information of the first face detection frame.

In some embodiments, determining the weight of the obtained position information of each face detection frame based on the intersection ratio includes: performing a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and the second preset value as the exponent; determining the reciprocal of the result of the power operation as the weight of the position information of the second face detection frame; and determining the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.

In some embodiments, the position information of the first face detection frame includes designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame includes designated diagonal vertex coordinates of the second face detection frame; and determining the target position information of the first face detection frame based on the determined weights and the obtained position information to update the position of the first face detection frame includes: using the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, using the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determining the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.

In some embodiments, the designated diagonal vertex coordinates of the first face detection frame include first vertex coordinates and second vertex coordinates, and the designated diagonal vertex coordinates of the second face detection frame include third vertex coordinates and fourth vertex coordinates; and determining the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame includes: determining the weighted calculation result of the abscissa of the first vertex coordinates and the abscissa of the third vertex coordinates as the first target abscissa; determining the weighted calculation result of the ordinate of the first vertex coordinates and the ordinate of the third vertex coordinates as the first target ordinate; determining the weighted calculation result of the abscissa of the second vertex coordinates and the abscissa of the fourth vertex coordinates as the second target abscissa; determining the weighted calculation result of the ordinate of the second vertex coordinates and the ordinate of the fourth vertex coordinates as the second target ordinate; and determining the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
In a second aspect, an embodiment of the present application provides an apparatus for generating information. The apparatus includes: an obtaining unit configured to obtain position information of a first face detection frame obtained by performing face detection on a current frame of a target video in advance, and to obtain pre-stored position information of a second face detection frame in a previous frame of the current frame; a first determining unit configured to determine, based on the obtained position information, the intersection-over-union ratio of the first face detection frame and the second face detection frame; a second determining unit configured to determine, based on the intersection ratio, the weight of the obtained position information of each face detection frame; and an updating unit configured to determine, based on the determined weights and the obtained position information, target position information of the first face detection frame, so as to update the position of the first face detection frame.

In some embodiments, the second determining unit includes: a first operation module configured to perform a power operation with the intersection ratio as the base and a first preset value as the exponent; and a first determination module configured to determine the result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between a second preset value and that weight as the weight of the position information of the first face detection frame.

In some embodiments, the second determining unit includes: a second operation module configured to perform a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and the second preset value as the exponent; and a second determination module configured to determine the reciprocal of the result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.

In some embodiments, the position information of the first face detection frame includes designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame includes designated diagonal vertex coordinates of the second face detection frame; and the updating unit is further configured to: use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, use the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determine the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.

In some embodiments, the designated diagonal vertex coordinates of the first face detection frame include first vertex coordinates and second vertex coordinates, and the designated diagonal vertex coordinates of the second face detection frame include third vertex coordinates and fourth vertex coordinates; and the updating unit is further configured to: determine the weighted calculation result of the abscissa of the first vertex coordinates and the abscissa of the third vertex coordinates as the first target abscissa; determine the weighted calculation result of the ordinate of the first vertex coordinates and the ordinate of the third vertex coordinates as the first target ordinate; determine the weighted calculation result of the abscissa of the second vertex coordinates and the abscissa of the fourth vertex coordinates as the second target abscissa; determine the weighted calculation result of the ordinate of the second vertex coordinates and the ordinate of the fourth vertex coordinates as the second target ordinate; and determine the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; and a storage device storing one or more programs thereon, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any embodiment of the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements the method according to any embodiment of the first aspect.

The method and device for generating information provided in the embodiments of the present application obtain the position information of the first face detection frame of the current frame of the target video and the position information of the second face detection frame of the previous frame, both generated in advance, so that the intersection-over-union ratio of the first face detection frame and the second face detection frame can be determined based on the obtained position information. After that, the weight of the obtained position information of each face detection frame is determined based on the intersection ratio. Finally, the target position information of the first face detection frame can be determined based on the determined weights and the obtained position information, so as to update the position of the first face detection frame. The position of the face detection frame in the later frame can thus be adjusted based on the intersection-over-union ratio of the face detection frames in two consecutive frames. Because the position of the face detection frame in the later frame takes into account the position of the face detection frame in the earlier frame, and considers the whole area of that detection frame rather than any single coordinate, the jitter of the face detection frame in the video is reduced, and the smoothness and stability of the movement of the face detection frame in the video are improved.
Brief Description of the Drawings

Other features, objects, and advantages of the present application will become more apparent by reading the detailed description of non-limiting embodiments made with reference to the following drawings:

FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present application can be applied;

FIG. 2 is a flowchart of an embodiment of a method for generating information according to the present application;

FIG. 3 is a schematic diagram of an application scenario of the method for generating information according to the present application;

FIG. 4 is a flowchart of still another embodiment of the method for generating information according to the present application;

FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for generating information according to the present application;

FIG. 6 is a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application.
Detailed Description

The present application will be further described in detail below with reference to the accompanying drawings and embodiments. It can be understood that the specific embodiments described herein are only used to explain the related invention, rather than to limit it. It should also be noted that, for ease of description, only the parts related to the invention are shown in the drawings.

It should be noted that, where there is no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The application will be described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 shows an exemplary system architecture 100 to which the method for generating information or the apparatus for generating information of the present application can be applied.

As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

Users may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications, such as voice interaction applications, shopping applications, search applications, instant messaging tools, email clients, and social platform software, may be installed on the terminal devices 101, 102, and 103.

The terminal devices 101, 102, and 103 may be hardware or software. When they are hardware, they may be various electronic devices with a display screen and support for web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers, desktop computers, and so on. When the terminal devices 101, 102, and 103 are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
When the terminal devices 101, 102, and 103 are hardware, an image acquisition device may also be installed on them. The image acquisition device may be any device capable of capturing images, such as a camera or a sensor. Users can use the image acquisition device on the terminal devices 101, 102, 103 to capture video.

The terminal devices 101, 102, and 103 may perform processing such as face detection on frames of the videos they play or the videos recorded by users; they may also analyze the face detection results (for example, the position information of the face detection frame) and update the position of the face detection frame.

The server 105 may be a server providing various services, for example a video processing server for storing, managing, or analyzing the videos uploaded by the terminal devices 101, 102, and 103. The video processing server may store a large number of videos and may send videos to the terminal devices 101, 102, and 103.

It should be noted that the server 105 may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
It should be noted that the method for generating information provided by the embodiments of the present application is generally executed by the terminal devices 101, 102, and 103; accordingly, the apparatus for generating information is generally provided in the terminal devices 101, 102, and 103.

It should be pointed out that, in the case where the terminal devices 101, 102, and 103 can implement the relevant functions of the server 105, the server 105 may be omitted from the system architecture 100.

It should also be pointed out that the server 105 may likewise perform processing such as face detection on the videos it stores or the videos uploaded by the terminal devices 101, 102, and 103, and return the processing results to the terminal devices 101, 102, and 103. In this case, the method for generating information provided by the embodiments of the present application may also be executed by the server 105, and accordingly the apparatus for generating information may also be provided in the server 105.

It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers according to implementation needs.
With continued reference to FIG. 2, a flow 200 of one embodiment of the method for generating information according to the present application is shown. The method for generating information includes the following steps:

Step 201: obtain position information of a first face detection frame obtained by performing face detection on the current frame of a target video in advance, and obtain pre-stored position information of a second face detection frame in the previous frame of the current frame.
In this embodiment, the execution body of the method for generating information (for example, the terminal devices 101, 102, and 103 shown in FIG. 1) can record or play videos. The video it plays may be a video stored locally in advance, or a video obtained from a server (for example, the server 105 shown in FIG. 1) through a wired or wireless connection. Here, when recording a video, the execution body may be equipped with or connected to an image acquisition device (for example, a camera). It should be pointed out that the wireless connection may include, but is not limited to, 3G/4G connections, WiFi connections, Bluetooth connections, WiMAX connections, Zigbee connections, UWB (ultra wideband) connections, and other wireless connection methods now known or developed in the future.

In this embodiment, the execution body may obtain the position information of the first face detection frame obtained by performing face detection on the current frame of the target video in advance, and obtain the pre-stored position information of the second face detection frame in the previous frame of the current frame. The target video may be the video currently being played, or the video being recorded by the user; this is not limited here.

Here, the current frame of the target video may be the frame of the target video whose face detection frame is to have its position updated. As an example, the execution body may perform face detection on each frame of the target video in sequence according to the timestamps of the frames; after performing face detection on every frame other than the first, it may correct the position of the resulting face detection frame. The frame whose face detection frame position is currently to be corrected may then be called the current frame of the target video. Take the following two scenarios as examples:
In one scenario, the target video may be the video that the execution body is playing. During playback of the target video, the execution body may perform face detection on each frame to be played, one by one, to obtain the position information of that frame's face detection frame. When the frame is not the first frame, after the position information of the face detection frame is obtained, it may be corrected before the frame is played. The frame whose face detection frame position is about to be corrected at the current moment may be the current frame.

In another scenario, the target video may be the video being recorded by the execution body. During recording of the target video, the execution body may perform face detection on each captured frame, one by one, to obtain the position information of that frame's face detection frame. After the first frame has been captured, for each subsequently captured frame, once face detection is performed, the resulting face detection frame may be position-corrected and the frame then displayed. The latest frame acquired at the current moment whose face detection frame has not yet been position-corrected may be the current frame.
It should be noted that the execution body may perform face detection on the frames of the target video in various ways. As an example, a pre-trained face detection model may be stored in the execution body, which may input a frame of the target video into the model to obtain the position information of that frame's face detection frame. The face detection model may be used to detect the region where a face object is located in an image (represented by a face detection frame; here, the face detection frame may be a rectangular frame). In practice, the face detection model can output the position information of the face detection frame. The face detection model may be obtained by supervised training of an existing convolutional neural network based on a sample set (containing face images and annotations indicating the positions of the face object regions) using machine learning methods. The convolutional neural network may use various existing structures, such as DenseBox, VGGNet, ResNet, or SegNet. It should be noted that these machine learning and supervised training methods are well-known technologies that are currently widely studied and applied, and will not be described again here.
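The embodiments leave the concrete detector open. Purely as an illustrative stand-in (not the supervised CNN the embodiments describe), a classical Haar-cascade detector from the opencv-python package could supply per-frame boxes in the vertex-plus-size form discussed below:

```python
import cv2

# Stand-in detector for illustration only; the embodiments use a
# supervised CNN (e.g. DenseBox, VGGNet, ResNet, SegNet) instead.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face_box(frame_bgr):
    """Return position information (x, y, w, h) for one detected face
    in a BGR frame, or None if no face is found."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return tuple(faces[0]) if len(faces) else None
```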
It should be pointed out that the position information of a face detection frame may be any information that indicates and uniquely determines the position of the face detection frame in the frame.

Optionally, the position information of the face detection frame may include the coordinates of the four vertices of the face detection frame.

Optionally, the position information of the face detection frame may include the coordinates of any pair of diagonal vertices of the face detection frame, for example the coordinates of the upper-left vertex and the coordinates of the lower-right vertex.

Optionally, the position information of the face detection frame may include the coordinates of any one vertex of the face detection frame together with the length and width of the face detection frame.

It should be noted that the position information is not limited to the forms listed above and may include other information that can indicate and uniquely determine the position of the face detection frame.
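For illustration, converting the third form above (one vertex plus length and width) into the four vertex coordinates is straightforward; this sketch assumes image coordinates (the ordinate grows downward), with length as the horizontal extent and width as the vertical extent, matching the expansion example given for step 202 below:

```python
def corners_from_anchor(x, y, length, width):
    """Expand (upper-left vertex, length, width) into the four vertex
    coordinates of a rectangular face detection frame."""
    return [
        (x, y),                    # upper-left
        (x + length, y),           # upper-right
        (x, y + width),            # lower-left
        (x + length, y + width),   # lower-right
    ]
```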
Step 202: determine the intersection-over-union ratio of the first face detection frame and the second face detection frame based on the obtained position information.

In this embodiment, the execution body may determine the intersection-over-union ratio (IOU) of the first face detection frame and the second face detection frame based on the obtained position information of the first face detection frame and of the second face detection frame.

In practice, the intersection-over-union ratio of two rectangles may be the ratio of the area of the region where the two rectangles intersect to the area of the region of their union. Here, the area of the union of the two rectangles equals the sum of the areas of the two rectangles minus the area of their intersection. In practice, the intersection ratio is a number in the interval [0, 1].

In this embodiment, since the position of a face detection frame in its frame can be determined from its position information, the coordinates of each vertex of the first face detection frame in the current frame can be determined from the position information of the first face detection frame, and the coordinates of each vertex of the second face detection frame in the previous frame can be determined from the position information of the second face detection frame. As an example, if the position information of a face detection frame includes the coordinates of one vertex (for example, the upper-left vertex) together with the length and width of the frame, the coordinates of the upper-right, lower-left, and lower-right vertices can be obtained by adding the length to the abscissa of the upper-left vertex and adding the width to its ordinate, respectively.

In this embodiment, since the vertex coordinates of the first face detection frame and of the second face detection frame are available, they can be used to determine the length and width of the rectangle where the first face detection frame and the second face detection frame intersect, and hence the area of that rectangle (the intersection area). After that, the sum of the areas of the first face detection frame and the second face detection frame (the total area) can be calculated, and then the difference between the total area and the intersection area (the union area). Finally, the ratio of the intersection area to the union area can be determined as the intersection-over-union ratio of the first face detection frame and the second face detection frame.

It should be noted that this method of calculating the intersection-over-union ratio is a well-known technology that is currently widely studied and applied, and will not be described again here.
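A minimal sketch of this computation, assuming each box is given as (x1, y1, x2, y2) with the upper-left vertex first:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as
    (x1, y1, x2, y2), (x1, y1) upper-left and (x2, y2) lower-right."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)   # intersection area
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter                      # union area
    return inter / union if union > 0 else 0.0
```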
Step 203: determine the weight of the obtained position information of each face detection frame based on the intersection ratio.

In this embodiment, the execution body may determine the weight of the position information of the first face detection frame and the weight of the position information of the second face detection frame based on the intersection ratio determined in step 202, as in the following steps:

In the first step, the intersection ratio may be substituted into a pre-established formula, and the calculation result determined as the weight of the position information of the second face detection frame. The pre-established formula may be any formula satisfying the following preset conditions, and is not otherwise limited here: the larger the intersection ratio, the larger the result of the formula; the smaller the intersection ratio, the smaller the result; when the intersection ratio is 0, the result is 0; and when the intersection ratio is 1, the result is 1.

In the second step, the difference between a preset value (for example, 1) and the weight of the position information of the second face detection frame may be determined as the weight of the position information of the first face detection frame.

It should be noted that the order in which the weights of the position information of the first face detection frame and of the second face detection frame are determined is not limited here. The execution body may modify the pre-established formula so as to determine the weight of the position information of the first face detection frame first, and then the weight of the position information of the second face detection frame.
In some optional implementations of this embodiment, the execution body may perform a power operation with the intersection ratio as the base and a first preset value (for example, 6 or 3) as the exponent. Here, the first preset value may be determined by technicians based on extensive statistics and experiments. The execution body may then determine the result of the power operation as the weight of the position information of the second face detection frame, and determine the difference between a second preset value (for example, 1) and the determined weight as the weight of the position information of the first face detection frame.

In some optional implementations of this embodiment, the execution body may perform a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and the second preset value (for example, 1) as the exponent. The reciprocal of the result of the power operation may then be determined as the weight of the position information of the second face detection frame, and the difference between the second preset value and the determined weight as the weight of the position information of the first face detection frame.

It should be noted that the execution body may also determine the weights of the obtained position information in other ways, not limited to the implementations above. For example, a power operation may be performed with some preset value (for example, 2 or 3) as the base and the difference between the reciprocal of the intersection ratio and the second preset value (for example, 1) as the exponent; the reciprocal of the result may then be determined as the weight of the position information of the second face detection frame, and the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
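As a worked illustration of the power-based variant, assuming a first preset value of 6: when the face barely moves and the intersection ratio is 0.95, the previous frame's weight is 0.95^6 ≈ 0.74, so the detection frame stays nearly still; when the face moves quickly and the intersection ratio drops to 0.5, that weight falls to 0.5^6 ≈ 0.016, so the fresh detection dominates and the frame keeps up with the motion.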
In previous approaches, the average of the coordinates of corresponding vertices (for example, both upper-left vertices) of the face detection frames in the previous frame and the current frame is usually taken as the corrected coordinate of that vertex (the upper-left vertex) in the current frame, and the corrected coordinates of all vertices of the current frame are obtained in this way. With this approach, when the face object moves quickly, the face detection frame cannot keep up with its motion; the dragging effect is strong and the accuracy is low. In contrast, the present application corrects the position of the face detection frame in the current frame using weights determined from the intersection ratio. The larger the intersection ratio, the slower the face object is moving; the smaller the intersection ratio, the faster it is moving. Different weights can therefore be computed for different intersection ratios, which reduces the dragging effect and improves the timeliness and accuracy of the face detection frame.

Previous approaches also include determining the weight of a vertex coordinate of the face detection frame from the distance between the coordinates of corresponding vertices (for example, both upper-left vertices) in the previous frame and the current frame. In that approach, however, the weights of the coordinates of the individual vertices are independent, and the face detection frame cannot be considered as a whole, so the smoothing effect is poor. By contrast, with the weights determined from the intersection ratio in the present application, the whole area of the face detection frame is taken into account in determining the intersection ratio, and the weights of all vertex coordinates within the same face detection frame are identical, so the face detection frame is considered as a whole, which improves the smoothing effect.
Step 204: determine target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.

In this embodiment, the execution body may determine the target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame. Here, the execution body may correct the position information of the first face detection frame, that is, correct the vertex coordinates of the first face detection frame, based on the determined weights.
In some optional implementations of this embodiment, the position information of a face detection frame may include the coordinates of its four vertices. In this case, the execution body may correct the coordinates of the first face detection frame vertex by vertex. Specifically, for each vertex the following steps may be performed (the upper-left vertex is taken as an example; the remaining vertices are handled in the same way):

In the first step, the abscissa of the upper-left vertex of the first face detection frame and the abscissa of the upper-left vertex of the second face detection frame are weighted. That is, the abscissa of the upper-left vertex of the first face detection frame is multiplied by the weight of the position information of the first face detection frame to obtain a first value; the abscissa of the upper-left vertex of the second face detection frame is multiplied by the weight of the position information of the second face detection frame to obtain a second value; and the sum of the first value and the second value is determined as the corrected abscissa of the upper-left vertex of the first face detection frame.

In the second step, the ordinate of the upper-left vertex of the first face detection frame and the ordinate of the upper-left vertex of the second face detection frame are weighted. That is, the ordinate of the upper-left vertex of the first face detection frame is multiplied by the weight of the position information of the first face detection frame to obtain a third value; the ordinate of the upper-left vertex of the second face detection frame is multiplied by the weight of the position information of the second face detection frame to obtain a fourth value; and the sum of the third value and the fourth value is determined as the corrected ordinate of the upper-left vertex of the first face detection frame.

In the third step, the abscissa and ordinate obtained in the first and second steps are combined into the corrected coordinates of the upper-left vertex of the first face detection frame.

After the coordinates of all vertices of the first face detection frame have been corrected, the electronic device may aggregate the corrected vertex coordinates into the target position information, and the position of the first face detection frame can thus be updated.
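A sketch of this per-vertex correction, reusing the weights from step 203; the vertex lists are assumed to hold the four corners of each box in matching order:

```python
def smooth_vertices(curr_vertices, prev_vertices, w_curr, w_prev):
    """Blend each of the four vertices of the current-frame box with the
    matching vertex of the previous-frame box (weights sum to 1)."""
    return [
        (w_curr * xc + w_prev * xp, w_curr * yc + w_prev * yp)
        for (xc, yc), (xp, yp) in zip(curr_vertices, prev_vertices)
    ]
```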
在本实施例的一些可选的实现方式中,第一人脸检测框的位置信息包括上述第一人脸检测框的指定对角顶点坐标,上述第二人脸检测框的位置信息包括上述第二人脸检测框的指定对角顶点坐标。其中,上述第一人脸检测框的指定对角顶点坐标可以包括第一顶点(例如左上顶点)坐标和第二顶点(例如右下顶点)坐标。上述第二人脸检测框的指定对角顶点坐标可以包括第三顶点(例如左上顶点)坐标和第四顶点(例如右下顶点)坐标。此时,上述执行主体可以将上述第一人脸检测框的位置信息的权重作为上述第一人脸检测框的指定对角顶点坐标的权重,将上述第二人脸检测框的位置信息的权重作为上述第二人脸检测框的指定对角顶点坐标的权重,将上述第一人脸检测框的指定对角顶点坐标与上述第二人脸检测框的指定对角顶点坐标的加权计算结果确定为上述第一人脸检测框的目标对角顶点坐标,以对上述第一人脸检测框的位置进行更新。In some optional implementations of this embodiment, the position information of the first face detection frame includes the specified diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame includes the first The specified diagonal vertex coordinates of the two-face detection frame. The designated diagonal vertex coordinates of the first face detection frame may include coordinates of a first vertex (for example, an upper left vertex) and coordinates of a second vertex (for example, a lower right vertex). The designated diagonal vertex coordinates of the second face detection frame may include coordinates of a third vertex (for example, an upper left vertex) and coordinates of a fourth vertex (for example, a lower right vertex). At this time, the execution subject may use the weight of the position information of the first face detection frame as the weight of the specified diagonal vertex coordinates of the first face detection frame, and the weight of the position information of the second face detection frame. As a weight of the designated diagonal vertex coordinates of the second face detection frame, a weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame is determined. The coordinates of the diagonal vertices of the target of the first face detection frame to update the position of the first face detection frame.
Optionally, the target diagonal vertex coordinates of the first face detection frame may be calculated in the following order of operations:
First, the weighted calculation result of the abscissa of the first vertex and the abscissa of the third vertex may be determined as the first target abscissa;
Next, the weighted calculation result of the ordinate of the first vertex and the ordinate of the third vertex may be determined as the first target ordinate;
Next, the weighted calculation result of the abscissa of the second vertex and the abscissa of the fourth vertex may be determined as the second target abscissa;
Next, the weighted calculation result of the ordinate of the second vertex and the ordinate of the fourth vertex may be determined as the second target ordinate;
Finally, the coordinates formed by the first target abscissa and the first target ordinate, together with the coordinates formed by the second target abscissa and the second target ordinate, may be determined as the target diagonal vertex coordinates of the first face detection frame. Since the coordinates of a pair of diagonal vertices uniquely determine the position of a rectangular frame, the position of the first face detection frame can thereby be updated.
It should be noted that in this implementation, other orders of operations may also be used to calculate the target diagonal vertex coordinates of the first face detection frame; the details are not repeated here.
It should also be pointed out that in this implementation, after the target diagonal vertex coordinates have been calculated, the other pair of diagonal vertex coordinates of the first face detection frame may be derived from them, thereby yielding the coordinates of all four vertices of the first face detection frame. A sketch of the whole computation follows.
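The following Python sketch puts these operations together, assuming axis-aligned rectangular frames; the function name is our own placeholder:

```python
def target_diagonal_vertices(diag1, diag2, w1, w2):
    """diag1: ((x1, y1), (x2, y2)), the designated diagonal vertices of the
    first (current-frame) box; diag2: ((x3, y3), (x4, y4)) for the second
    (previous-frame) box; the weights are assumed to satisfy w1 + w2 == 1."""
    (x1, y1), (x2, y2) = diag1
    (x3, y3), (x4, y4) = diag2
    tx1 = x1 * w1 + x3 * w2  # first target abscissa
    ty1 = y1 * w1 + y3 * w2  # first target ordinate
    tx2 = x2 * w1 + x4 * w2  # second target abscissa
    ty2 = y2 * w1 + y4 * w2  # second target ordinate
    target = ((tx1, ty1), (tx2, ty2))
    # for an axis-aligned rectangle, the other diagonal pair follows directly
    other_pair = ((tx1, ty2), (tx2, ty1))
    return target, other_pair
```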
In some optional implementations of this embodiment, the position information of a face detection frame may include the coordinates of any one vertex of the face detection frame together with the length and width of the frame. In this case, the execution body may first determine the coordinates of the diagonally opposite vertex (or of the remaining three vertices) based on the vertex coordinates, the length, and the width, and may then determine the target position information of the first face detection frame using the operation steps described in the two implementations above, thereby updating the position of the first face detection frame. A small converter along these lines is sketched below.
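For instance, if the position information carries the upper-left vertex plus the frame's length and width, a converter might look like this; treating the length as the horizontal extent, the width as the vertical extent, and y as increasing downward are all our assumptions, since the patent does not fix a convention:

```python
def to_diagonal(top_left, length, width):
    # top_left: (x, y) of the upper-left corner; length: horizontal extent;
    # width: vertical extent (naming and axis direction are assumptions)
    x, y = top_left
    return (x, y), (x + length, y + width)  # upper-left and lower-right
```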
With continued reference to FIG. 3, FIG. 3 is a schematic diagram of an application scenario of the method for generating information according to this embodiment. In the application scenario of FIG. 3, a user records a target video using the self-portrait mode of the terminal device 301.
After capturing the first frame, the terminal device performs face detection on the first frame using its stored face detection model and obtains the position information 302 of the face detection frame in the first frame.
After capturing the second frame, the terminal device performs face detection on the second frame using its stored face detection model and obtains the position information 303 of the face detection frame in the second frame, while also retrieving the position information 302 of the face detection frame in the first frame. Next, the intersection-over-union (IoU) ratio of the two face detection frames can be determined based on the position information 302 and the position information 303. After that, the weight of the obtained position information of each face detection frame (that is, the weight of the position information 302 and the weight of the position information 303) can be determined based on this ratio. Finally, the target position information 304 of the face detection frame of the second frame (i.e., its final position information) can be determined based on the determined weights and the obtained position information 302 and 303.
After capturing the third frame, the terminal device performs face detection on the third frame using its stored face detection model and obtains the position information 305 of the face detection frame in the third frame, while also retrieving the updated position information of the face detection frame in the second frame (i.e., the target position information 304). Next, the intersection-over-union ratio of the second-frame and third-frame face detection frames can be determined based on the target position information 304 and the position information 305. After that, the weight of the target position information 304 and the weight of the position information 305 can be determined based on this ratio. Finally, the target position information 306 of the face detection frame of the third frame (i.e., its final position information) can be determined based on the determined weights, the target position information 304, and the position information 305.
And so on. In the end, the terminal device 301 obtains the position information of the face detection frame in every frame of the recorded video. The loop below sketches this process.
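A minimal sketch of the frame-by-frame scenario in FIG. 3, where detect, iou, weights, and combine are placeholders for the steps described above rather than APIs defined by the patent:

```python
def smooth_video_boxes(frames, detect, iou, weights, combine):
    """Smooth the face detection frame across all frames of a video."""
    previous = None
    smoothed = []
    for frame in frames:
        box = detect(frame)                  # e.g. position info 303 or 305
        if previous is not None:
            w_cur, w_prev = weights(iou(box, previous))
            box = combine(box, previous, w_cur, w_prev)
        smoothed.append(box)
        previous = box  # the updated box feeds the next frame (304 -> 306)
    return smoothed
```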
The method provided by the above embodiment of the present application obtains the position information of the first face detection frame of the current frame of the target video and the position information of the second face detection frame of the previous frame, so that the intersection-over-union ratio of the first face detection frame and the second face detection frame can be determined based on the obtained position information. Then, based on this ratio, the weight of the obtained position information of each face detection frame is determined. Finally, the target position information of the first face detection frame can be determined based on the determined weights and the obtained position information, so as to update the position of the first face detection frame. The position of the face detection frame in the later frame can thus be adjusted based on the intersection-over-union ratio of the face detection frames of the two adjacent frames. Because the position of the later frame's face detection frame takes into account the position of the earlier frame's face detection frame, and considers the overall area of that frame rather than any single coordinate, the jitter of the face detection frame in the video is reduced, and the smoothness and stability of its movement are improved.
With further reference to FIG. 4, a flow 400 of yet another embodiment of the method for generating information is shown. The flow 400 of the method for generating information includes the following steps:
Step 401: Obtain the position information of the first face detection frame obtained by performing face detection on the current frame of the target video in advance, and obtain the pre-stored position information of the second face detection frame in the previous frame of the current frame.
In this embodiment, the execution body of the method for generating information (for example, the terminal devices 101, 102, and 103 shown in FIG. 1) may obtain the position information of the first face detection frame obtained by performing face detection on the current frame of the target video in advance, and obtain the position information of the second face detection frame obtained by performing face detection on the previous frame of the current frame in advance.
In this embodiment, the position information of the first face detection frame may include the designated diagonal vertex coordinates of the first face detection frame (for example, the coordinates of the upper-left and lower-right vertices), and the position information of the second face detection frame may include the designated diagonal vertex coordinates of the second face detection frame.
Step 402: Determine the intersection-over-union ratio of the first face detection frame and the second face detection frame based on the obtained position information.
In this embodiment, from the position information of the first face detection frame, the execution body can determine the coordinates of the remaining vertices of the first face detection frame in the current frame, and thus obtain the coordinates of all of its vertices. Likewise, from the position information of the second face detection frame, the coordinates of the vertices of the second face detection frame in the previous frame can be determined. The vertex coordinates of the two frames can then be used to determine the length and width of the rectangle in which the first face detection frame and the second face detection frame intersect, and hence the area of that rectangle (the intersection area). Next, the sum of the areas of the first and second face detection frames (the total area) can be calculated, followed by the difference between the total area and the intersection area (the union area). Finally, the ratio of the intersection area to the union area is determined as the intersection-over-union ratio of the first face detection frame and the second face detection frame. A sketch of this computation follows.
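A direct Python reading of step 402, assuming boxes given as (x_tl, y_tl, x_br, y_br) with y increasing downward (a coordinate convention we assume):

```python
def intersection_over_union(box_a, box_b):
    """IoU of two axis-aligned face detection frames."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy                                    # intersection area
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter                    # total minus overlap
    return inter / union if union > 0 else 0.0
```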
Step 403: Perform a power operation with the natural constant e as the base and the difference between the reciprocal of the intersection-over-union ratio and a second preset value as the exponent.
In this embodiment, the execution body may perform a power operation using the natural constant e as the base and the difference between the reciprocal of the intersection-over-union ratio and a second preset value (for example, 1) as the exponent.
Step 404: Determine the reciprocal of the result of the power operation as the weight of the position information of the second face detection frame, and determine the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
In this embodiment, the execution body may determine the reciprocal of the result of the power operation as the weight of the position information of the second face detection frame, and determine the difference between the second preset value (for example, 1) and the weight so determined as the weight of the position information of the first face detection frame. A sketch of steps 403 and 404 follows.
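The following sketch implements steps 403 and 404, assuming the second preset value is 1 (as in the example given in the text) and a strictly positive IoU:

```python
import math

def frame_weights(iou_value, preset=1.0):
    """Weight of the previous frame's box: 1 / e**(1/IoU - preset);
    the current frame's box receives the remainder."""
    w_prev = 1.0 / math.exp(1.0 / iou_value - preset)  # second-frame weight
    w_cur = preset - w_prev                            # first-frame weight
    return w_cur, w_prev
```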
Step 405: Use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, use the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determine the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
In this embodiment, the execution body may use the weight of the position information of the first face detection frame as the weight of its designated diagonal vertex coordinates, use the weight of the position information of the second face detection frame as the weight of its designated diagonal vertex coordinates, and determine the weighted calculation result of the two sets of designated diagonal vertex coordinates as the target diagonal vertex coordinates of the first face detection frame, so as to update its position. The designated diagonal vertex coordinates of the first face detection frame may include the coordinates of a first vertex (for example, the upper-left vertex) and of a second vertex (for example, the lower-right vertex); those of the second face detection frame may include the coordinates of a third vertex (for example, the upper-left vertex) and of a fourth vertex (for example, the lower-right vertex).
Specifically, the weighted calculation result of the abscissas of the first and third vertices may first be determined as the first target abscissa. Next, the weighted calculation result of the ordinates of the first and third vertices may be determined as the first target ordinate. Next, the weighted calculation result of the abscissas of the second and fourth vertices may be determined as the second target abscissa. Next, the weighted calculation result of the ordinates of the second and fourth vertices may be determined as the second target ordinate. Finally, the coordinates formed by the first target abscissa and the first target ordinate, together with the coordinates formed by the second target abscissa and the second target ordinate, may be determined as the target diagonal vertex coordinates of the first face detection frame. Since the coordinates of a pair of diagonal vertices uniquely determine the position of a rectangular frame, the position of the first face detection frame can thereby be updated.
As can be seen from FIG. 4, compared with the embodiment corresponding to FIG. 2, the flow 400 of the method for generating information in this embodiment highlights the steps of determining the weights of the face detection frames of the current frame and of the previous frame. When the intersection-over-union ratio of the first and second face detection frames is small, the face object has moved a large distance from the previous frame to the current frame. In that case, with the weights determined by the method of this embodiment, the weight of the position information of the first face detection frame (that of the current frame) is large, while the weight of the position information of the second face detection frame (that of the previous frame) is small. When the intersection-over-union ratio is large, the face object has moved only a small distance from the previous frame to the current frame; the weight of the position information of the first face detection frame is then small, while that of the second face detection frame is large. The face detection frame therefore moves smoothly, which further reduces its jitter in the video and further improves the smoothness and stability of its movement.
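Plugging illustrative numbers into the frame_weights sketch above makes this behavior concrete (values rounded to three decimals):

```python
w_cur, w_prev = frame_weights(0.9)  # large overlap: (0.105, 0.895)
w_cur, w_prev = frame_weights(0.5)  # small overlap: (0.632, 0.368)
```

At high overlap the previous frame dominates and the box barely moves; at low overlap the current detection dominates, matching the analysis above.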
With further reference to FIG. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for generating information. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus may be applied to various electronic devices.
As shown in FIG. 5, the apparatus 500 for generating information according to this embodiment includes: an obtaining unit 501, configured to obtain the position information of the first face detection frame obtained by performing face detection on the current frame of the target video in advance, and to obtain the pre-stored position information of the second face detection frame in the previous frame of the current frame; a first determining unit 502, configured to determine the intersection-over-union ratio of the first face detection frame and the second face detection frame based on the obtained position information; a second determining unit 503, configured to determine, based on that ratio, the weight of the obtained position information of each face detection frame; and an updating unit 504, configured to determine the target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
In some optional implementations of this embodiment, the second determining unit 503 may include a first operation module and a first determining module (not shown in the figure). The first operation module may be configured to perform a power operation with the intersection-over-union ratio as the base and a first preset value as the exponent. The first determining module may be configured to determine the result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between a second preset value and that weight as the weight of the position information of the first face detection frame.
In some optional implementations of this embodiment, the second determining unit 503 may include a second operation module and a second determining module (not shown in the figure). The second operation module may be configured to perform a power operation with the natural constant e as the base and the difference between the reciprocal of the intersection-over-union ratio and a second preset value as the exponent. The second determining module may be configured to determine the reciprocal of the result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
In some optional implementations of this embodiment, the position information of the first face detection frame may include the designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame may include the designated diagonal vertex coordinates of the second face detection frame. The updating unit 504 may further be configured to: use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, use the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determine the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
In some optional implementations of this embodiment, the designated diagonal vertex coordinates of the first face detection frame may include first vertex coordinates and second vertex coordinates, and the designated diagonal vertex coordinates of the second face detection frame may include third vertex coordinates and fourth vertex coordinates. The updating unit 504 may further be configured to: determine the weighted calculation result of the abscissas of the first and third vertices as the first target abscissa; determine the weighted calculation result of the ordinates of the first and third vertices as the first target ordinate; determine the weighted calculation result of the abscissas of the second and fourth vertices as the second target abscissa; determine the weighted calculation result of the ordinates of the second and fourth vertices as the second target ordinate; and determine the coordinates formed by the first target abscissa and the first target ordinate, together with the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
In the apparatus provided by the above embodiment of the present application, the obtaining unit 501 obtains the position information of the first face detection frame of the current frame of the target video and the position information of the second face detection frame of the previous frame, so that the first determining unit 502 can determine the intersection-over-union ratio of the first and second face detection frames based on the obtained position information. The second determining unit 503 then determines, based on that ratio, the weight of the obtained position information of each face detection frame. Finally, the updating unit 504 can determine the target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame. The position of the face detection frame in the later frame can thus be adjusted based on the intersection-over-union ratio of the face detection frames of the two adjacent frames. Because the position of the later frame's face detection frame takes into account the position of the earlier frame's face detection frame, and considers the overall area of that frame rather than any single coordinate, the jitter of the face detection frame in the video is reduced, and the smoothness and stability of its movement are improved.
Referring now to FIG. 6, a schematic structural diagram of a computer system 600 suitable for implementing an electronic device according to the embodiments of the present application is shown. The electronic device shown in FIG. 6 is merely an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card such as a LAN card or a modem. The communication portion 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage portion 608 as needed.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609 and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above-mentioned functions defined in the method of the present application are performed. It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wireline, optical-fiber cable, RF, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the architectures, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions noted in the blocks may occur in a different order from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks therein, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor; for example, a processor may be described as including an obtaining unit, a first determining unit, a second determining unit, and an updating unit. The names of these units do not in all cases limit the units themselves; for example, the updating unit may also be described as "a unit that updates the position of the first face detection frame".
As another aspect, the present application further provides a computer-readable medium, which may be included in the apparatus described in the above embodiments or may exist alone without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: obtain the position information of the first face detection frame obtained by performing face detection on the current frame of the target video in advance, and obtain the position information of the second face detection frame obtained by performing face detection on the previous frame of the current frame in advance; determine the intersection-over-union ratio of the first face detection frame and the second face detection frame based on the obtained position information; determine, based on that ratio, the weight of the obtained position information of each face detection frame; and determine the target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
The above description is merely a preferred embodiment of the present application and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features; it should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.

Claims (12)

  1. A method for generating information, comprising:
    obtaining position information of a first face detection frame obtained by performing face detection on a current frame of a target video in advance, and obtaining pre-stored position information of a second face detection frame in a previous frame of the current frame;
    determining an intersection-over-union ratio of the first face detection frame and the second face detection frame based on the obtained position information;
    determining, based on the intersection-over-union ratio, a weight of the obtained position information of each face detection frame; and
    determining target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
  2. The method for generating information according to claim 1, wherein the determining, based on the intersection-over-union ratio, a weight of the obtained position information of each face detection frame comprises:
    performing a power operation with the intersection-over-union ratio as the base and a first preset value as the exponent; and
    determining the result of the power operation as the weight of the position information of the second face detection frame, and determining the difference between a second preset value and the weight as the weight of the position information of the first face detection frame.
  3. The method for generating information according to claim 1, wherein the determining, based on the intersection-over-union ratio, a weight of the obtained position information of each face detection frame comprises:
    performing a power operation with the natural constant e as the base and the difference between the reciprocal of the intersection-over-union ratio and a second preset value as the exponent; and
    determining the reciprocal of the result of the power operation as the weight of the position information of the second face detection frame, and determining the difference between the second preset value and the weight as the weight of the position information of the first face detection frame.
  4. The method for generating information according to claim 1, wherein the position information of the first face detection frame comprises designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame comprises designated diagonal vertex coordinates of the second face detection frame; and
    the determining target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame, comprises:
    using the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, using the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determining the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  5. The method for generating information according to claim 4, wherein the designated diagonal vertex coordinates of the first face detection frame comprise first vertex coordinates and second vertex coordinates, and the designated diagonal vertex coordinates of the second face detection frame comprise third vertex coordinates and fourth vertex coordinates; and
    the determining the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame comprises:
    determining the weighted calculation result of the abscissa of the first vertex coordinates and the abscissa of the third vertex coordinates as a first target abscissa;
    determining the weighted calculation result of the ordinate of the first vertex coordinates and the ordinate of the third vertex coordinates as a first target ordinate;
    determining the weighted calculation result of the abscissa of the second vertex coordinates and the abscissa of the fourth vertex coordinates as a second target abscissa;
    determining the weighted calculation result of the ordinate of the second vertex coordinates and the ordinate of the fourth vertex coordinates as a second target ordinate; and
    determining the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
  6. An apparatus for generating information, comprising:
    an obtaining unit, configured to obtain position information of a first face detection frame obtained by performing face detection on a current frame of a target video in advance, and to obtain pre-stored position information of a second face detection frame in a previous frame of the current frame;
    a first determining unit, configured to determine an intersection-over-union ratio of the first face detection frame and the second face detection frame based on the obtained position information;
    a second determining unit, configured to determine, based on the intersection-over-union ratio, a weight of the obtained position information of each face detection frame; and
    an updating unit, configured to determine target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
  7. The apparatus for generating information according to claim 6, wherein the second determining unit comprises:
    a first operation module, configured to perform a power operation with the intersection-over-union ratio as the base and a first preset value as the exponent; and
    a first determining module, configured to determine the result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between a second preset value and the weight as the weight of the position information of the first face detection frame.
  8. The apparatus for generating information according to claim 6, wherein the second determining unit comprises:
    a second operation module, configured to perform a power operation with the natural constant e as the base and the difference between the reciprocal of the intersection-over-union ratio and a second preset value as the exponent; and
    a second determining module, configured to determine the reciprocal of the result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between the second preset value and the weight as the weight of the position information of the first face detection frame.
  9. The apparatus for generating information according to claim 6, wherein the position information of the first face detection frame comprises designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame comprises designated diagonal vertex coordinates of the second face detection frame; and
    the updating unit is further configured to:
    use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, use the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determine the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  10. The apparatus for generating information according to claim 9, wherein the designated diagonal vertex coordinates of the first face detection frame comprise first vertex coordinates and second vertex coordinates, and the designated diagonal vertex coordinates of the second face detection frame comprise third vertex coordinates and fourth vertex coordinates; and
    the updating unit is further configured to:
    determine the weighted calculation result of the abscissa of the first vertex coordinates and the abscissa of the third vertex coordinates as a first target abscissa;
    determine the weighted calculation result of the ordinate of the first vertex coordinates and the ordinate of the third vertex coordinates as a first target ordinate;
    determine the weighted calculation result of the abscissa of the second vertex coordinates and the abscissa of the fourth vertex coordinates as a second target abscissa;
    determine the weighted calculation result of the ordinate of the second vertex coordinates and the ordinate of the fourth vertex coordinates as a second target ordinate; and
    determine the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
  11. An electronic device, comprising:
    one or more processors; and
    a storage apparatus on which one or more programs are stored,
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-5.
  12. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-5.
PCT/CN2018/115974 2018-09-21 2018-11-16 Information generating method and device WO2020056903A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811110674.1A CN109308469B (en) 2018-09-21 2018-09-21 Method and apparatus for generating information
CN201811110674.1 2018-09-21

Publications (1)

Publication Number Publication Date
WO2020056903A1
