WO2020056903A1 - Information generating method and device - Google Patents

Information generating method and device

Info

Publication number
WO2020056903A1
Authority
WO
WIPO (PCT)
Prior art keywords
face detection
detection frame
position information
weight
target
Prior art date
Application number
PCT/CN2018/115974
Other languages
French (fr)
Chinese (zh)
Inventor
吴兴龙 (Wu Xinglong)
Original Assignee
北京字节跳动网络技术有限公司 (Beijing ByteDance Network Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co., Ltd. (北京字节跳动网络技术有限公司)
Publication of WO2020056903A1 publication Critical patent/WO2020056903A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/168 Feature extraction; Face representation
    • G06V40/172 Classification, e.g. identification

Definitions

  • Embodiments of the present application relate to the field of computer technology, and in particular, to a method and an apparatus for generating information.
  • Face detection refers to the process of searching a given image using a certain strategy to determine whether it contains a face object and, if so, returning the position and size of the face object. The returned result can be presented in the image in the form of a face detection frame.
  • A related method is to directly perform face detection on each frame to obtain a face detection frame indicating the face object in that frame.
  • the embodiments of the present application provide a method and device for generating information.
  • an embodiment of the present application provides a method for generating information.
  • The method includes: obtaining position information of a first face detection frame, obtained in advance by performing face detection on a current frame of a target video, and obtaining pre-stored position information of a second face detection frame in the frame preceding the current frame; determining, based on the obtained position information, the intersection ratio of the first face detection frame and the second face detection frame; determining, based on the intersection ratio, the weight of the obtained position information of each face detection frame; and determining, based on the determined weights and the obtained position information, target position information of the first face detection frame so as to update the position of the first face detection frame.
  • In some embodiments, determining the weight of the obtained position information of each face detection frame based on the intersection ratio includes: performing a power operation with the intersection ratio as the base and a first preset value as the exponent; determining the calculation result of the power operation as the weight of the position information of the second face detection frame; and determining the difference between a second preset value and that weight as the weight of the position information of the first face detection frame.
  • In some embodiments, determining the weight of the obtained position information of each face detection frame based on the intersection ratio includes: performing a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and a second preset value as the exponent; determining the reciprocal of the calculation result of the power operation as the weight of the position information of the second face detection frame; and determining the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
  • In some embodiments, the position information of the first face detection frame includes designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame includes designated diagonal vertex coordinates of the second face detection frame. Determining the target position information of the first face detection frame based on the determined weights and the obtained position information to update the position of the first face detection frame includes: using the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, using the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determining the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  • In some embodiments, the designated diagonal vertex coordinates of the first face detection frame include a first vertex coordinate and a second vertex coordinate, and the designated diagonal vertex coordinates of the second face detection frame include a third vertex coordinate and a fourth vertex coordinate. Determining the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame includes: determining the weighted calculation result of the abscissa of the first vertex coordinate and the abscissa of the third vertex coordinate as a first target abscissa; determining the weighted calculation result of the ordinate of the first vertex coordinate and the ordinate of the third vertex coordinate as a first target ordinate; determining the weighted calculation result of the abscissa of the second vertex coordinate and the abscissa of the fourth vertex coordinate as a second target abscissa; determining the weighted calculation result of the ordinate of the second vertex coordinate and the ordinate of the fourth vertex coordinate as a second target ordinate; and determining the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
  • an embodiment of the present application provides an apparatus for generating information.
  • The apparatus includes: an obtaining unit configured to obtain position information of a first face detection frame obtained in advance by performing face detection on a current frame of a target video, and to obtain pre-stored position information of a second face detection frame in the frame preceding the current frame; a first determining unit configured to determine, based on the obtained position information, the intersection ratio of the first face detection frame and the second face detection frame; a second determining unit configured to determine, based on the intersection ratio, the weight of the obtained position information of each face detection frame; and an updating unit configured to determine, based on the determined weights and the obtained position information, target position information of the first face detection frame so as to update the position of the first face detection frame.
  • In some embodiments, the second determining unit includes: a first operation module configured to perform a power operation with the intersection ratio as the base and a first preset value as the exponent; and a first determination module configured to determine the calculation result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between a second preset value and that weight as the weight of the position information of the first face detection frame.
  • In some embodiments, the second determining unit includes: a second operation module configured to perform a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and a second preset value as the exponent; and a second determination module configured to determine the reciprocal of the calculation result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
  • In some embodiments, the position information of the first face detection frame includes designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame includes designated diagonal vertex coordinates of the second face detection frame. The updating unit is further configured to: use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, use the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determine the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  • In some embodiments, the designated diagonal vertex coordinates of the first face detection frame include a first vertex coordinate and a second vertex coordinate, and the designated diagonal vertex coordinates of the second face detection frame include a third vertex coordinate and a fourth vertex coordinate. The updating unit is further configured to: determine the weighted calculation result of the abscissa of the first vertex coordinate and the abscissa of the third vertex coordinate as a first target abscissa; determine the weighted calculation result of the ordinate of the first vertex coordinate and the ordinate of the third vertex coordinate as a first target ordinate; determine the weighted calculation result of the abscissa of the second vertex coordinate and the abscissa of the fourth vertex coordinate as a second target abscissa; determine the weighted calculation result of the ordinate of the second vertex coordinate and the ordinate of the fourth vertex coordinate as a second target ordinate; and determine the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
  • An embodiment of the present application provides an electronic device including: one or more processors; and a storage device on which one or more programs are stored. When the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method of any one of the embodiments of the first aspect described above.
  • An embodiment of the present application provides a computer-readable medium on which a computer program is stored; when the program is executed by a processor, the method of any one of the embodiments of the first aspect described above is implemented.
  • The method and apparatus for generating information provided in the embodiments of the present application obtain the position information of the first face detection frame of the current frame of a target video and the position information of the second face detection frame of the previous frame, so that the intersection ratio of the first face detection frame and the second face detection frame can be determined based on the obtained position information. After that, the weight of the obtained position information of each face detection frame is determined based on the intersection ratio. Finally, the target position information of the first face detection frame may be determined based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
  • In this way, the position of the face detection frame in the later frame can be adjusted based on the intersection ratio of the face detection frames in the two adjacent frames. Because the position of the face detection frame in the later frame takes into account the position of the face detection frame in the earlier frame, and the entire area of the earlier frame's detection frame is considered rather than a single coordinate, the jitter of the face detection frame is reduced and the smoothness and stability of its movement in the video are improved.
  • FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present application can be applied;
  • FIG. 2 is a flowchart of an embodiment of a method for generating information according to the present application
  • FIG. 3 is a schematic diagram of an application scenario of a method for generating information according to the present application.
  • FIG. 4 is a flowchart of still another embodiment of a method for generating information according to the present application.
  • FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for generating information according to the present application.
  • FIG. 6 is a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application.
  • FIG. 1 illustrates an exemplary system architecture 100 to which the method for generating information or the apparatus for generating information of the present application can be applied.
  • the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105.
  • the network 104 is a medium for providing a communication link between the terminal devices 101, 102, 103 and the server 105.
  • The network 104 may include various types of connections, such as wired or wireless communication links, fiber optic cables, and so on.
  • the user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like.
  • Various communication client applications can be installed on the terminal devices 101, 102, and 103, such as voice interaction applications, shopping applications, search applications, instant communication tools, email clients, social platform software, and the like.
  • the terminal devices 101, 102, and 103 may be hardware or software.
  • If the terminal devices 101, 102, and 103 are hardware, they can be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablets, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers, desktop computers, and so on.
  • If the terminal devices 101, 102, and 103 are software, they can be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. This is not specifically limited here.
  • an image acquisition device may also be installed thereon.
  • the image acquisition device can be various devices that can implement the function of acquiring images, such as cameras, sensors, and so on. Users can use the image capture device on the terminal devices 101, 102, 103 to capture video.
  • The terminal devices 101, 102, and 103 can perform face detection and other processing on videos they play or on frames recorded by the user; they can also analyze the face detection results (such as the position information of the face detection frame) and update the position of the face detection frame accordingly.
  • the server 105 may be a server providing various services, such as a video processing server for storing, managing, or analyzing videos uploaded by the terminal devices 101, 102, and 103.
  • the video processing server can store a large number of videos, and can send videos to the terminal devices 101, 102, and 103.
  • the server 105 may be hardware or software.
  • the server can be implemented as a distributed server cluster consisting of multiple servers or as a single server.
  • the server can be implemented as multiple software or software modules (for example, to provide distributed services), or it can be implemented as a single software or software module. It is not specifically limited here.
  • the methods for generating information provided by the embodiments of the present application are generally executed by the terminal devices 101, 102, and 103. Accordingly, the devices for generating information are generally provided in the terminal devices 101, 102, and 103.
  • the server 105 may not be provided in the system architecture 100.
  • the server 105 may also perform face detection and other processing on its stored videos or videos uploaded by the terminal devices 101, 102, and 103, and return the processing results to the terminal devices 101, 102, and 103.
  • the method for generating information provided in the embodiment of the present application may also be executed by the server 105, and accordingly, the apparatus for generating information may also be set in the server 105.
  • terminal devices, networks, and servers in FIG. 1 are merely exemplary. According to implementation needs, there can be any number of terminal devices, networks, and servers.
  • a flowchart 200 of one embodiment of a method for generating information according to the present application is shown.
  • the method for generating information includes the following steps:
  • Step 201: Obtain position information of a first face detection frame obtained in advance by performing face detection on a current frame of a target video, and obtain pre-stored position information of a second face detection frame in the frame preceding the current frame.
  • an execution subject of the method for generating information may record or play a video.
  • The video it plays may be a video stored locally in advance, or a video obtained from a server (such as the server 105 shown in FIG. 1) through a wired or wireless connection.
  • The above-mentioned execution subject may have an image acquisition device (for example, a camera) installed on it or connected to it.
  • The wireless connection methods may include, but are not limited to, 3G/4G connections, WiFi connections, Bluetooth connections, WiMAX connections, Zigbee connections, UWB (ultra wideband) connections, and other wireless connection methods now known or developed in the future.
  • In this embodiment, the execution subject may obtain position information of the first face detection frame obtained in advance by performing face detection on the current frame of the target video, and obtain pre-stored position information of the second face detection frame in the frame preceding the current frame.
  • the target video may be a video currently being played or a video being recorded by a user. It is not limited here.
  • The current frame of the target video may be the frame in the target video whose face detection frame position is to be updated.
  • In practice, the execution subject may perform face detection on each frame of the target video in sequence according to the timestamp order of the frames. After performing face detection on each frame except the first, position correction is applied to the obtained face detection frame. At any moment, the frame whose face detection frame position is to be corrected can be referred to as the current frame of the target video. Take the following two scenarios as examples:
  • the target video may be a video being played by the execution subject.
  • the execution subject may perform face detection on each frame to be played one by one to obtain the position information of the face detection frame of the frame.
  • Then, the position information of the face detection frame of that frame may be corrected, and the frame may then be played.
  • The frame whose face detection frame position is about to be corrected at the current moment may be regarded as the current frame.
  • the target video may be a video being recorded by the above-mentioned execution subject.
  • The execution subject may perform face detection on each captured frame one by one to obtain the position information of the face detection frame of that frame. For each frame captured after the first, the obtained face detection frame can be position-corrected after face detection, and the frame is then displayed.
  • The latest captured frame whose face detection frame has not yet been position-corrected may be regarded as the current frame.
  • a pre-trained face detection model may be stored in the execution subject.
  • The execution subject may input a frame of the target video into the pre-trained face detection model to obtain the position information of the face detection frame of that frame.
  • The above-mentioned face detection model may be used to detect the area where a face object is located in an image (the area may be represented by a face detection frame; here, the face detection frame may be a rectangular frame). That is, the face detection model can output the position information of the face detection frame.
  • the face detection model may be obtained by performing supervised training on an existing convolutional neural network based on a sample set (including a face image and a label for indicating the position of a face object region) using a machine learning method.
  • the convolutional neural network can use various existing structures, such as DenseBox, VGGNet, ResNet, SegNet, and so on. It should be noted that the above-mentioned machine learning method and supervised training method are well-known technologies that are widely studied and applied at present, and will not be repeated here.
  • the position information of the face detection frame may be information for indicating and uniquely determining the position of the face detection frame in the frame.
  • the position information of the face detection frame may include coordinates of four vertices of the face detection frame.
  • the position information of the face detection frame may include the coordinates of any set of diagonal vertices of the face detection frame. For example, the coordinates of the upper left vertex and the coordinates of the lower right vertex.
  • the position information of the face detection frame may include the coordinates of any vertex of the face detection frame and the length and width of the face detection frame.
  • position information is not limited to the above list, and may also include other information that can be used to indicate and uniquely determine the position of the face detection frame.
  • Step 202: Determine the intersection ratio of the first face detection frame and the second face detection frame based on the obtained position information.
  • In this embodiment, the execution subject may determine the intersection ratio of the first face detection frame and the second face detection frame based on the obtained position information of the first face detection frame and position information of the second face detection frame.
  • In practice, the intersection ratio (also known as Intersection over Union, IoU) of two rectangles is the ratio of the area of the region where the two rectangles intersect to the area of the region covered by their union.
  • The area of the union of the two rectangles is equal to the sum of the areas of the two rectangles minus the area of the region where they intersect.
  • the intersection ratio is a number in the interval [0,1].
  • the position of the face detection frame in the frame can be determined. Therefore, by using the position information of the first face detection frame, the coordinates of each vertex of the first face detection frame in the current frame can be determined. Based on the position information of the second face detection frame, the coordinates of each vertex of the second face detection frame in the previous frame of the current frame can be determined.
  • As an example, the position information of a face detection frame may include the coordinates of one vertex (such as the upper-left vertex) together with the length and width of the frame. In this case, the length can be added to the abscissa of the upper-left vertex, and the width added to its ordinate, to obtain the coordinates of the remaining vertices (the upper-right, lower-left, and lower-right vertices), respectively.
  • In this way, the coordinates of each vertex of the first face detection frame and of the second face detection frame can be obtained. The vertex coordinates of the two frames can then be used to determine the length and width of the rectangle in which the first face detection frame and the second face detection frame intersect, and further the area of that rectangle (the intersection area). After that, the sum of the areas of the first face detection frame and the second face detection frame (the total area) can be calculated, and the difference between the total area and the intersection area (the union area) can be computed. Finally, the ratio of the intersection area to the union area is determined as the intersection ratio of the first face detection frame and the second face detection frame.
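  • The intersection-ratio computation just described can be sketched as follows (an illustrative Python snippet; the function name and the convention that boxes are given as upper-left/lower-right corner coordinates are our own assumptions, not the patent's):

```python
def intersection_ratio(box_a, box_b):
    """Intersection over union of two axis-aligned rectangles.

    Each box is (x1, y1, x2, y2): upper-left and lower-right corners.
    Returns a value in the interval [0, 1].
    """
    # Length and width of the intersecting rectangle (clamped at zero).
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter_area = inter_w * inter_h  # intersection area

    # Total area of both boxes, minus the intersection, gives the union.
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union_area = area_a + area_b - inter_area

    return inter_area / union_area if union_area > 0 else 0.0
```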
  • Step 203: Determine the weight of the obtained position information of each face detection frame based on the intersection ratio.
  • In this embodiment, the execution subject may determine the weight of the position information of the first face detection frame and the weight of the position information of the second face detection frame based on the intersection ratio determined in step 202.
  • As an example, the intersection ratio can be substituted into a pre-established formula, and the calculation result determined as the weight of the position information of the second face detection frame.
  • Here, the pre-established formula may be any of various formulas satisfying preset conditions, which are not limited here.
  • The preset conditions include: the larger the intersection ratio, the larger the calculation result of the formula; the smaller the intersection ratio, the smaller the calculation result.
  • the difference between the preset value (for example, 1) and the weight of the position information of the second face detection frame may be determined as the weight of the position information of the first face detection frame.
  • the order of determining the weight of the position information of the first face detection frame and the weight of the position information of the second face detection frame is not limited herein.
  • the execution subject may modify the pre-established formula so as to first determine the weight of the position information of the first face detection frame, and then determine the weight of the position information of the second face detection frame.
  • In some optional implementations of this embodiment, the execution subject may perform a power operation with the intersection ratio as the base and a first preset value (for example, 6 or 3) as the exponent.
  • the first preset value may be determined by a technician based on a large amount of data statistics and experiments.
  • Then, the execution subject may determine the calculation result of the power operation as the weight of the position information of the second face detection frame, and determine the difference between a second preset value (for example, 1) and the determined weight as the weight of the position information of the first face detection frame.
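  • A minimal sketch of this first weighting scheme (Python; the function and parameter names are our own illustration):

```python
def weights_power(iou, first_preset=6, second_preset=1.0):
    """Previous-frame weight: iou ** first_preset.

    A high intersection ratio (little movement) gives the previous
    frame's box a large weight, suppressing jitter; a low ratio lets
    the new detection dominate, reducing the dragging feel.
    """
    w_prev = iou ** first_preset     # weight of the second (previous-frame) box
    w_curr = second_preset - w_prev  # weight of the first (current-frame) box
    return w_curr, w_prev
```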
  • In some optional implementations of this embodiment, the execution subject may perform a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and a second preset value (for example, 1) as the exponent. Then, the reciprocal of the calculation result of the power operation may be determined as the weight of the position information of the second face detection frame, and the difference between the second preset value and the determined weight of the position information of the second face detection frame determined as the weight of the position information of the first face detection frame.
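  • A hedged sketch of this second weighting scheme follows (Python; the names and the zero-division guard are our own additions). Note that the reciprocal of e**(1/iou - 1) equals e**(1 - 1/iou), which is used below to avoid overflow for tiny intersection ratios:

```python
import math

def weights_exponential(iou, second_preset=1.0, eps=1e-6):
    """Previous-frame weight: 1 / e**(1/iou - 1) == e**(1 - 1/iou).

    At iou == 1 (no movement) the previous box keeps full weight;
    as iou -> 0 (fast movement) its weight decays toward 0.
    """
    w_prev = math.exp(second_preset - 1.0 / max(iou, eps))
    w_curr = second_preset - w_prev
    return w_curr, w_prev
```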
  • It should be noted that the execution subject may also determine the weight of each piece of obtained position information in other ways; it is not limited to the implementations above. For example, a certain preset value (for example, 2 or 3) can be used as the base, and the difference between the reciprocal of the intersection ratio and the second preset value (for example, 1) as the exponent, to perform a power operation. Then, the reciprocal of the calculation result of the power operation may be determined as the weight of the position information of the second face detection frame, and the difference between the second preset value and that weight determined as the weight of the position information of the first face detection frame.
  • In the conventional method, the average of the coordinates of corresponding vertices (for example, the upper-left vertices) of the face detection frames in the previous frame and the current frame is usually used as the corrected coordinate of that vertex (the upper-left vertex) in the current frame, and the corrected coordinates of each vertex of the current frame are obtained in this way.
  • In contrast, in the present application the position of the face detection frame in the current frame is corrected using weights determined from the intersection ratio. The larger the intersection ratio, the slower the face object is moving; the smaller the intersection ratio, the faster it is moving. Different weights can therefore be computed for different intersection ratios, which reduces the dragging feel and improves the timeliness and accuracy of the face detection frame.
  • Among conventional methods there is also one that determines the weight of each vertex coordinate from the distance between the coordinates of corresponding vertices (for example, the upper-left vertices) of the face detection frames in the previous frame and the current frame. In that method the weights of the coordinates of each vertex are independent, so the face detection frame cannot be considered as a whole, and the smoothing effect is poor. In the present application, the entire area of the face detection frame is taken into account when determining the intersection ratio, and the weights of the coordinates of all vertices in the same face detection frame are identical, so the face detection frame is considered as a whole and the smoothing effect is improved.
  • Step 204: Determine target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
  • the above-mentioned execution subject may determine the target position information of the first face detection frame based on the determined weight and the obtained position information to update the position of the first face detection frame.
  • the execution subject may modify the position information of the first face detection frame based on the determined weight. That is, the vertex coordinates of the first face detection frame are corrected.
  • the position information of the face detection frame may include coordinates of four vertices of the face detection frame.
  • In this case, the execution subject may correct the coordinates of each vertex of the first face detection frame respectively. Specifically, for each vertex the following steps can be performed (here, the upper-left vertex is used as an example; the remaining vertices are handled in the same way):
  • First, the abscissa of the upper-left vertex of the first face detection frame and the abscissa of the upper-left vertex of the second face detection frame are weighted. That is, the abscissa of the upper-left vertex of the first face detection frame is multiplied by the weight of the position information of the first face detection frame to obtain a first value, and the abscissa of the upper-left vertex of the second face detection frame is multiplied by the weight of the position information of the second face detection frame to obtain a second value. The sum of the first value and the second value is determined as the corrected abscissa of the upper-left vertex of the first face detection frame.
  • Second, the ordinate of the upper-left vertex of the first face detection frame and the ordinate of the upper-left vertex of the second face detection frame are weighted. That is, the ordinate of the upper-left vertex of the first face detection frame is multiplied by the weight of the position information of the first face detection frame to obtain a third value, and the ordinate of the upper-left vertex of the second face detection frame is multiplied by the weight of the position information of the second face detection frame to obtain a fourth value. The sum of the third value and the fourth value is determined as the corrected ordinate of the upper-left vertex of the first face detection frame.
  • Finally, the abscissa obtained in the first step and the ordinate obtained in the second step are combined into the corrected coordinates of the upper-left vertex of the first face detection frame.
  • The execution subject may then aggregate the corrected coordinates of the vertices into the target position information, so that the position of the first face detection frame can be updated.
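  • As an illustration of the correction just described, a minimal sketch for a single vertex follows (Python; the function name is ours, assuming the weights were computed from the intersection ratio as above):

```python
def correct_vertex(curr_xy, prev_xy, w_curr, w_prev):
    """Weighted correction of one vertex of the current frame's box.

    curr_xy / prev_xy: (x, y) of corresponding vertices (e.g. upper-left)
    in the current and previous frames; w_curr + w_prev == 1.
    """
    x = curr_xy[0] * w_curr + prev_xy[0] * w_prev  # first value + second value
    y = curr_xy[1] * w_curr + prev_xy[1] * w_prev  # third value + fourth value
    return (x, y)
```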
  • In some optional implementations of this embodiment, the position information of the first face detection frame includes the designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame includes the designated diagonal vertex coordinates of the second face detection frame.
  • the designated diagonal vertex coordinates of the first face detection frame may include coordinates of a first vertex (for example, an upper left vertex) and coordinates of a second vertex (for example, a lower right vertex).
  • the designated diagonal vertex coordinates of the second face detection frame may include coordinates of a third vertex (for example, an upper left vertex) and coordinates of a fourth vertex (for example, a lower right vertex).
  • At this time, the execution subject may use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, and the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame. The weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame is then determined as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  • the target diagonal vertex coordinates of the first face detection frame can be calculated in the following sequence of operations:
  • a weighted calculation result of the abscissa of the first vertex coordinate and the abscissa of the third vertex coordinate may be determined as the first target abscissa;
  • a weighted calculation result of the ordinate of the first vertex coordinate and the ordinate of the third vertex coordinate may be determined as the first target ordinate;
  • a weighted calculation result of the abscissa of the second vertex coordinate and the abscissa of the fourth vertex coordinate may be determined as the second target abscissa;
  • the weighted calculation result of the ordinate of the second vertex coordinate and the ordinate of the fourth vertex coordinate may be determined as the second target ordinate;
  • Finally, the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, can be determined as the target diagonal vertex coordinates of the first face detection frame. Since the coordinates of a pair of diagonal vertices uniquely determine the position of a rectangular frame, the position of the first face detection frame can thereby be updated.
  • another set of diagonal vertex coordinates of the first face detection frame may be calculated according to the target diagonal vertex coordinates. Thereby, the coordinates of the four vertices of the first face detection frame are obtained.
  • the position information of the face detection frame may include the coordinates of any vertex of the face detection frame and the length and width of the face detection frame.
  • In this case, the execution subject may first determine the coordinates of the vertex diagonal to the known vertex based on the vertex coordinates, the length, and the width (or determine the coordinates of the remaining three vertices). Then, the target position information of the first face detection frame can be determined using the operation steps described in the above two implementations, thereby updating the position of the first face detection frame.
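  • For the position-information form just mentioned (one vertex plus length and width), a hypothetical conversion to designated diagonal vertex coordinates might look like this (assuming the known vertex is the upper-left one and coordinates grow rightward and downward; this convention is ours, not stated in the text):

```python
def xywh_to_corners(x, y, length, width):
    """Derive the diagonal vertex (lower-right) from the upper-left vertex
    plus the frame's length (horizontal extent) and width (vertical extent)."""
    upper_left = (x, y)
    lower_right = (x + length, y + width)
    return upper_left, lower_right
```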
  • FIG. 3 is a schematic diagram of an application scenario of the method for generating information according to this embodiment.
  • a user uses a self-timer mode of the terminal device 301 to record a target video.
  • After capturing the first frame, the terminal device uses the stored face detection model to perform face detection on the first frame and obtains the position information 302 of the face detection frame in the first frame.
  • After the terminal device captures the second frame, it uses the stored face detection model to perform face detection on the second frame and obtains the position information 303 of the face detection frame of the second frame. At the same time, the position information 302 of the face detection frame in the first frame is obtained. Then, based on the position information 302 and the position information 303, the intersection ratio of the two face detection frames may be determined. After that, the weight of the position information 302 and the weight of the position information 303 can be determined based on the intersection ratio. Finally, the target position information 304 of the face detection frame of the second frame (that is, the final position information of the face detection frame of the second frame) may be determined based on the determined weights and the obtained position information 302 and position information 303.
  • After capturing the third frame, the terminal device uses the stored face detection model to perform face detection on the third frame and acquires the position information 305 of the face detection frame of the third frame. At the same time, the updated position information of the face detection frame in the second frame (that is, the target position information 304) is acquired. Then, based on the target position information 304 and the position information 305, the intersection ratio of the two face detection frames may be determined. After that, the weight of the target position information 304 and the weight of the position information 305 can be determined based on the intersection ratio. Finally, the target position information 306 of the face detection frame of the third frame (that is, the final position information of the face detection frame of the third frame) may be determined based on the determined weights and the obtained target position information 304 and position information 305.
  • By analogy, the terminal device 301 can obtain the position information of the face detection frame in each frame of the recorded video.
  • The method provided by the foregoing embodiment of the present application obtains the position information of the first face detection frame of the current frame of the target video and the position information of the second face detection frame of the previous frame, so that the intersection ratio of the first face detection frame and the second face detection frame can be determined based on the obtained position information. After that, the weight of the obtained position information of each face detection frame is determined based on the intersection ratio. Finally, the target position information of the first face detection frame may be determined based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
  • In this way, the position of the face detection frame in the later frame can be adjusted based on the intersection ratio of the face detection frames in the two adjacent frames. Because the position of the face detection frame in the later frame takes into account the position of the face detection frame in the earlier frame, and the entire area of that detection frame is considered rather than a single coordinate, the jitter of the face detection frame is reduced and the smoothing effect and movement stability of the face detection frame in the video are improved.
  • a flowchart 400 of yet another embodiment of a method for generating information is shown.
  • the process 400 of the method for generating information includes the following steps:
  • Step 401: Obtain position information of a first face detection frame obtained in advance by performing face detection on a current frame of a target video, and obtain pre-stored position information of a second face detection frame in the frame preceding the current frame.
  • In this embodiment, the execution subject of the method for generating information may obtain the position information of the first face detection frame obtained in advance by performing face detection on the current frame of the target video, and the position information of the second face detection frame obtained in advance by performing face detection on the frame preceding the current frame.
  • Here, the position information of the first face detection frame may include the designated diagonal vertex coordinates (such as the coordinates of the upper-left and lower-right vertices) of the first face detection frame, and the position information of the second face detection frame may include the designated diagonal vertex coordinates of the second face detection frame.
  • Step 402: Determine the intersection ratio of the first face detection frame and the second face detection frame based on the obtained position information.
  • In this embodiment, the execution subject can determine the coordinates of the remaining vertices of the first face detection frame in the current frame from the position information of the first face detection frame, so that the coordinates of each vertex of the first face detection frame can be obtained; the coordinates of each vertex of the second face detection frame can be obtained in the same way.
  • Then, the vertex coordinates of the first face detection frame and the vertex coordinates of the second face detection frame can be used to determine the length and width of the rectangle in which the two frames intersect.
  • Further, the area of the intersecting rectangle (the intersection area) can be obtained.
  • After that, the sum of the areas of the first face detection frame and the second face detection frame (the total area) can be calculated.
  • Then, the difference between the total area and the intersection area (the union area) can be calculated.
  • Finally, the ratio of the intersection area to the union area can be determined as the intersection ratio of the first face detection frame and the second face detection frame.
  • Step 403: Perform a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and a second preset value as the exponent.
  • In this embodiment, the execution subject may perform a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and a second preset value (for example, 1) as the exponent.
  • Step 404: Determine the reciprocal of the calculation result of the power operation as the weight of the position information of the second face detection frame, and determine the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
  • In this embodiment, the execution subject may determine the reciprocal of the calculation result of the power operation as the weight of the position information of the second face detection frame, and determine the difference between the second preset value (for example, 1) and the determined weight as the weight of the position information of the first face detection frame.
  • Step 405: Use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, use the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determine the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  • In this embodiment, the execution subject may use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, and the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame. The weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame is then determined as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  • the designated diagonal vertex coordinates of the first face detection frame may include coordinates of a first vertex (for example, an upper left vertex) and coordinates of a second vertex (for example, a lower right vertex).
  • the designated diagonal vertex coordinates of the second face detection frame may include coordinates of a third vertex (for example, an upper left vertex) and coordinates of a fourth vertex (for example, a lower right vertex).
  • Specifically, the weighted calculation result of the abscissa of the first vertex coordinate and the abscissa of the third vertex coordinate may first be determined as the first target abscissa. Then, the weighted calculation result of the ordinate of the first vertex coordinate and the ordinate of the third vertex coordinate may be determined as the first target ordinate. Then, the weighted calculation result of the abscissa of the second vertex coordinate and the abscissa of the fourth vertex coordinate may be determined as the second target abscissa. Then, the weighted calculation result of the ordinate of the second vertex coordinate and the ordinate of the fourth vertex coordinate may be determined as the second target ordinate.
  • Finally, the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, can be determined as the target diagonal vertex coordinates of the first face detection frame. Since the coordinates of a pair of diagonal vertices uniquely determine the position of a rectangular frame, the position of the first face detection frame can thereby be updated.
  • The process 400 of the method for generating information in this embodiment highlights the step of determining the weights for the face detection frames of the current frame and the previous frame, respectively.
  • In the solution described in this embodiment, when the face object moves quickly (the intersection ratio is small), the weight of the position information of the first face detection frame (the face detection frame of the current frame) is larger and the weight of the position information of the second face detection frame (the face detection frame of the previous frame) is smaller; when the face object moves slowly (the intersection ratio is large), the weight of the position information of the first face detection frame is smaller and the weight of the position information of the second face detection frame is larger. Thereby, the face detection frame moves smoothly, the jitter of the face detection frame in the video is further reduced, and the smoothness and movement stability of the face detection frame in the video are further improved.
  • this application provides an embodiment of an apparatus for generating information.
  • the apparatus embodiment corresponds to the method embodiment shown in FIG. 2.
  • the device can be specifically applied to various electronic devices.
  • The apparatus 500 for generating information includes: an obtaining unit 501 configured to obtain position information of a first face detection frame obtained in advance by performing face detection on a current frame of a target video, and to obtain pre-stored position information of a second face detection frame in the frame preceding the current frame; a first determining unit 502 configured to determine, based on the obtained position information, the intersection ratio of the first face detection frame and the second face detection frame; a second determining unit 503 configured to determine, based on the intersection ratio, the weight of the obtained position information of each face detection frame; and an updating unit 504 configured to determine, based on the determined weights and the obtained position information, target position information of the first face detection frame so as to update the position of the first face detection frame.
  • the foregoing second determination unit 503 may include a first operation module and a first determination module (not shown in the figure).
  • the first operation module may be configured to perform the power operation using the intersection ratio as a base and a first preset value as an exponent.
  • The first determination module may be configured to determine the calculation result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
  • the foregoing second determination unit 503 may include a second operation module and a second determination module (not shown in the figure).
  • the second operation module may be configured to perform a power operation using a natural constant as a base and a difference between a reciprocal of the intersection ratio and a second preset value as an index.
  • The second determining module may be configured to determine the reciprocal of the calculation result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
  • In some optional implementations of this embodiment, the position information of the first face detection frame may include the designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame may include the designated diagonal vertex coordinates of the second face detection frame.
  • At this time, the updating unit 504 may be further configured to: use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, use the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determine the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  • In some optional implementations of this embodiment, the designated diagonal vertex coordinates of the first face detection frame may include a first vertex coordinate and a second vertex coordinate, and the designated diagonal vertex coordinates of the second face detection frame may include a third vertex coordinate and a fourth vertex coordinate.
  • At this time, the updating unit 504 may be further configured to: determine the weighted calculation result of the abscissa of the first vertex coordinate and the abscissa of the third vertex coordinate as the first target abscissa; determine the weighted calculation result of the ordinate of the first vertex coordinate and the ordinate of the third vertex coordinate as the first target ordinate; determine the weighted calculation result of the abscissa of the second vertex coordinate and the abscissa of the fourth vertex coordinate as the second target abscissa; determine the weighted calculation result of the ordinate of the second vertex coordinate and the ordinate of the fourth vertex coordinate as the second target ordinate; and determine the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
  • the device provided by the foregoing embodiment of the present application obtains, through the obtaining unit 501, the position information of the first face detection frame of the current frame of the target video and the position information of the second face detection frame of the previous frame, so that the first determining unit 502 may determine the intersection-over-union ratio of the first face detection frame and the second face detection frame based on the obtained position information. After that, the second determining unit 503 determines the weight of the obtained position information of each face detection frame based on the intersection ratio. Finally, the updating unit 504 may determine the target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
  • in this way, the position of the face detection frame in the later frame can be adjusted based on the intersection-over-union ratio of the face detection frames in two consecutive frames. Because the position of the face detection frame in the later frame takes into account the position of the face detection frame in the earlier frame, and considers the whole area of that detection frame rather than any single coordinate, the jitter of the face detection frame is reduced, improving the smoothness and stability of its movement in the video.
  • FIG. 6 illustrates a schematic structural diagram of a computer system 600 suitable for implementing an electronic device according to an embodiment of the present application.
  • the electronic device shown in FIG. 6 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
  • the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600.
  • the CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • an input/output (I/O) interface 605 is also connected to the bus 604.
  • the following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, and the like.
  • the communication section 609 performs communication processing via a network such as the Internet.
  • the drive 610 is also connected to the I/O interface 605 as needed.
  • a removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage portion 608 as necessary.
  • the process described above with reference to the flowchart may be implemented as a computer software program.
  • embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart.
  • the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611.
  • the computer-readable medium described in this application may be a computer-readable signal medium or a computer-readable storage medium or any combination of the foregoing.
  • the computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; it may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • each block in the flowchart or block diagrams may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function.
  • the functions noted in the blocks may also occur in an order different from that marked in the drawings. For example, two blocks represented in succession may actually be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified function or operation, or by a combination of dedicated hardware and computer instructions.
  • the units described in the embodiments of the present application may be implemented by software or hardware.
  • the described units may also be provided in a processor; for example, a processor may be described as including an acquisition unit, a first determination unit, a second determination unit, and an update unit.
  • the names of these units do not, in some cases, constitute a limitation on the units themselves. For example, the update unit may also be described as "a unit that updates the position of the first face detection frame".
  • the present application also provides a computer-readable medium, which may be included in the device described in the foregoing embodiments; or may exist alone without being assembled into the device.
  • the computer-readable medium carries one or more programs, and when the one or more programs are executed by the device, the device is caused to: obtain position information of a first face detection frame obtained by performing face detection on a current frame of a target video in advance, and obtain position information of a second face detection frame obtained by performing face detection on a previous frame of the current frame in advance; determine, based on the obtained position information, the intersection-over-union ratio of the first face detection frame and the second face detection frame; determine, based on the intersection ratio, the weight of the obtained position information of each face detection frame; and determine, based on the determined weights and the obtained position information, target position information of the first face detection frame, so as to update the position of the first face detection frame.

Abstract

An information generating method and device. The method comprises: acquiring position information of a first face bounding box obtained by performing face detection on a current frame of a target video in advance, and acquiring position information of a second face bounding box obtained by performing face detection on a frame previous to the current frame in advance (201); determining the intersection over union of the first face bounding box and the second face bounding box on the basis of the obtained position information (202); on the basis of the intersection over union, determining the weight of the obtained position information of each face bounding box (203); and on the basis of the determined weight and the obtained position information, determining target position information of the first face bounding box and updating the position of the first face bounding box (204). The present method improves the smoothing effect of a face bounding box.

Description

Method and device for generating information

This patent application claims priority to Chinese Patent Application No. 201811110674.1, filed on September 21, 2018 by the applicant Beijing ByteDance Network Technology Co., Ltd. and entitled "Method and Device for Generating Information"; the entire content of that application is incorporated herein by reference.
Technical Field

Embodiments of the present application relate to the field of computer technology, and in particular, to a method and an apparatus for generating information.
Background

Face detection refers to the process of searching a given image according to a certain strategy to determine whether it contains a face object and, if so, returning the position and size of the face object. The returned result can be presented in the image in the form of a face detection frame.

When face detection is performed on the face objects in a video, a face detection frame is generated for every frame. A related approach is to perform face detection directly on each frame, obtaining the face detection frame that indicates the face object in that frame.
Summary of the Invention

The embodiments of the present application provide a method and device for generating information.

In a first aspect, an embodiment of the present application provides a method for generating information. The method includes: obtaining position information of a first face detection frame obtained by performing face detection on a current frame of a target video in advance, and obtaining pre-stored position information of a second face detection frame in a previous frame of the current frame; determining, based on the obtained position information, the intersection-over-union ratio of the first face detection frame and the second face detection frame; determining, based on the intersection ratio, the weight of the obtained position information of each face detection frame; and determining, based on the determined weights and the obtained position information, target position information of the first face detection frame, so as to update the position of the first face detection frame.
In some embodiments, determining the weight of the obtained position information of each face detection frame based on the intersection ratio includes: performing a power operation with the intersection ratio as the base and a first preset value as the exponent; determining the result of the power operation as the weight of the position information of the second face detection frame; and determining the difference between a second preset value and that weight as the weight of the position information of the first face detection frame.

In some embodiments, determining the weight of the obtained position information of each face detection frame based on the intersection ratio includes: performing a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and the second preset value as the exponent; determining the reciprocal of the result of the power operation as the weight of the position information of the second face detection frame; and determining the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.

In some embodiments, the position information of the first face detection frame includes designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame includes designated diagonal vertex coordinates of the second face detection frame; and determining the target position information of the first face detection frame based on the determined weights and the obtained position information to update the position of the first face detection frame includes: using the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, using the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determining the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.

In some embodiments, the designated diagonal vertex coordinates of the first face detection frame include first vertex coordinates and second vertex coordinates, and the designated diagonal vertex coordinates of the second face detection frame include third vertex coordinates and fourth vertex coordinates; and determining the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame includes: determining the weighted calculation result of the abscissa of the first vertex coordinates and the abscissa of the third vertex coordinates as the first target abscissa; determining the weighted calculation result of the ordinate of the first vertex coordinates and the ordinate of the third vertex coordinates as the first target ordinate; determining the weighted calculation result of the abscissa of the second vertex coordinates and the abscissa of the fourth vertex coordinates as the second target abscissa; determining the weighted calculation result of the ordinate of the second vertex coordinates and the ordinate of the fourth vertex coordinates as the second target ordinate; and determining the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
In a second aspect, an embodiment of the present application provides an apparatus for generating information. The apparatus includes: an obtaining unit configured to obtain position information of a first face detection frame obtained by performing face detection on a current frame of a target video in advance, and to obtain pre-stored position information of a second face detection frame in a previous frame of the current frame; a first determining unit configured to determine, based on the obtained position information, the intersection-over-union ratio of the first face detection frame and the second face detection frame; a second determining unit configured to determine, based on the intersection ratio, the weight of the obtained position information of each face detection frame; and an updating unit configured to determine, based on the determined weights and the obtained position information, target position information of the first face detection frame, so as to update the position of the first face detection frame.

In some embodiments, the second determining unit includes: a first operation module configured to perform a power operation with the intersection ratio as the base and a first preset value as the exponent; and a first determination module configured to determine the result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between a second preset value and that weight as the weight of the position information of the first face detection frame.

In some embodiments, the second determining unit includes: a second operation module configured to perform a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and the second preset value as the exponent; and a second determination module configured to determine the reciprocal of the result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.

In some embodiments, the position information of the first face detection frame includes designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame includes designated diagonal vertex coordinates of the second face detection frame; and the updating unit is further configured to: use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, use the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determine the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.

In some embodiments, the designated diagonal vertex coordinates of the first face detection frame include first vertex coordinates and second vertex coordinates, and the designated diagonal vertex coordinates of the second face detection frame include third vertex coordinates and fourth vertex coordinates; and the updating unit is further configured to: determine the weighted calculation result of the abscissa of the first vertex coordinates and the abscissa of the third vertex coordinates as the first target abscissa; determine the weighted calculation result of the ordinate of the first vertex coordinates and the ordinate of the third vertex coordinates as the first target ordinate; determine the weighted calculation result of the abscissa of the second vertex coordinates and the abscissa of the fourth vertex coordinates as the second target abscissa; determine the weighted calculation result of the ordinate of the second vertex coordinates and the ordinate of the fourth vertex coordinates as the second target ordinate; and determine the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; and a storage device storing one or more programs thereon, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any embodiment of the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements the method according to any embodiment of the first aspect.

The method and device for generating information provided in the embodiments of the present application obtain the position information of the first face detection frame of the current frame of the target video and the position information of the second face detection frame of the previous frame, both generated in advance, so that the intersection-over-union ratio of the first face detection frame and the second face detection frame can be determined based on the obtained position information. After that, the weight of the obtained position information of each face detection frame is determined based on the intersection ratio. Finally, the target position information of the first face detection frame can be determined based on the determined weights and the obtained position information, so as to update the position of the first face detection frame. The position of the face detection frame in the later frame can thus be adjusted based on the intersection-over-union ratio of the face detection frames in two consecutive frames. Because the position of the face detection frame in the later frame takes into account the position of the face detection frame in the earlier frame, and considers the whole area of that detection frame rather than any single coordinate, the jitter of the face detection frame in the video is reduced, and the smoothness and stability of the movement of the face detection frame in the video are improved.
Brief Description of the Drawings

Other features, objects, and advantages of the present application will become more apparent by reading the detailed description of non-limiting embodiments made with reference to the following drawings:

FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present application can be applied;

FIG. 2 is a flowchart of an embodiment of a method for generating information according to the present application;

FIG. 3 is a schematic diagram of an application scenario of the method for generating information according to the present application;

FIG. 4 is a flowchart of still another embodiment of the method for generating information according to the present application;

FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for generating information according to the present application;

FIG. 6 is a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application.
Detailed Description

The present application will be further described in detail below with reference to the accompanying drawings and embodiments. It can be understood that the specific embodiments described herein are only used to explain the related invention, rather than to limit it. It should also be noted that, for ease of description, only the parts related to the invention are shown in the drawings.

It should be noted that, where there is no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The application will be described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 shows an exemplary system architecture 100 to which the method for generating information or the apparatus for generating information of the present application can be applied.

As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

Users may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications, such as voice interaction applications, shopping applications, search applications, instant messaging tools, email clients, and social platform software, may be installed on the terminal devices 101, 102, and 103.

The terminal devices 101, 102, and 103 may be hardware or software. When they are hardware, they may be various electronic devices with a display screen and support for web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers, desktop computers, and so on. When the terminal devices 101, 102, and 103 are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
When the terminal devices 101, 102, and 103 are hardware, an image acquisition device may also be installed on them. The image acquisition device may be any device capable of capturing images, such as a camera or a sensor. Users can use the image acquisition device on the terminal devices 101, 102, 103 to capture video.

The terminal devices 101, 102, and 103 may perform processing such as face detection on frames of the videos they play or the videos recorded by users; they may also analyze the face detection results (for example, the position information of the face detection frame) and update the position of the face detection frame.

The server 105 may be a server providing various services, for example a video processing server for storing, managing, or analyzing the videos uploaded by the terminal devices 101, 102, and 103. The video processing server may store a large number of videos and may send videos to the terminal devices 101, 102, and 103.

It should be noted that the server 105 may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
It should be noted that the method for generating information provided by the embodiments of the present application is generally executed by the terminal devices 101, 102, and 103; accordingly, the apparatus for generating information is generally provided in the terminal devices 101, 102, and 103.

It should be pointed out that, in the case where the terminal devices 101, 102, and 103 can implement the relevant functions of the server 105, the server 105 may be omitted from the system architecture 100.

It should also be pointed out that the server 105 may likewise perform processing such as face detection on the videos it stores or the videos uploaded by the terminal devices 101, 102, and 103, and return the processing results to the terminal devices 101, 102, and 103. In this case, the method for generating information provided by the embodiments of the present application may also be executed by the server 105, and accordingly the apparatus for generating information may also be provided in the server 105.

It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers according to implementation needs.
With continued reference to FIG. 2, a flow 200 of one embodiment of the method for generating information according to the present application is shown. The method for generating information includes the following steps:

Step 201: obtain position information of a first face detection frame obtained by performing face detection on the current frame of a target video in advance, and obtain pre-stored position information of a second face detection frame in the previous frame of the current frame.
In this embodiment, the execution body of the method for generating information (for example, the terminal devices 101, 102, and 103 shown in FIG. 1) can record or play videos. The video it plays may be a video stored locally in advance, or a video obtained from a server (for example, the server 105 shown in FIG. 1) through a wired or wireless connection. Here, when recording a video, the execution body may be equipped with or connected to an image acquisition device (for example, a camera). It should be pointed out that the wireless connection may include, but is not limited to, 3G/4G connections, WiFi connections, Bluetooth connections, WiMAX connections, Zigbee connections, UWB (ultra wideband) connections, and other wireless connection methods now known or developed in the future.

In this embodiment, the execution body may obtain the position information of the first face detection frame obtained by performing face detection on the current frame of the target video in advance, and obtain the pre-stored position information of the second face detection frame in the previous frame of the current frame. The target video may be the video currently being played, or the video being recorded by the user; this is not limited here.

Here, the current frame of the target video may be the frame of the target video whose face detection frame is to have its position updated. As an example, the execution body may perform face detection on each frame of the target video in sequence according to the timestamps of the frames; after performing face detection on every frame other than the first, it may correct the position of the resulting face detection frame. The frame whose face detection frame position is currently to be corrected may then be called the current frame of the target video. Take the following two scenarios as examples:
In one scenario, the target video may be the video that the execution body is playing. During playback of the target video, the execution body may perform face detection on each frame to be played, one by one, to obtain the position information of that frame's face detection frame. When the frame is not the first frame, after the position information of the face detection frame is obtained, it may be corrected before the frame is played. The frame whose face detection frame position is about to be corrected at the current moment may be the current frame.

In another scenario, the target video may be the video being recorded by the execution body. During recording of the target video, the execution body may perform face detection on each captured frame, one by one, to obtain the position information of that frame's face detection frame. After the first frame has been captured, for each subsequently captured frame, once face detection is performed, the resulting face detection frame may be position-corrected and the frame then displayed. The latest frame acquired at the current moment whose face detection frame has not yet been position-corrected may be the current frame.
It should be noted that the execution body may perform face detection on the frames of the target video in various ways. As an example, a pre-trained face detection model may be stored in the execution body, which may input a frame of the target video into the model to obtain the position information of that frame's face detection frame. The face detection model may be used to detect the region where a face object is located in an image (represented by a face detection frame; here, the face detection frame may be a rectangular frame). In practice, the face detection model can output the position information of the face detection frame. The face detection model may be obtained by supervised training of an existing convolutional neural network based on a sample set (containing face images and annotations indicating the positions of the face object regions) using machine learning methods. The convolutional neural network may use various existing structures, such as DenseBox, VGGNet, ResNet, or SegNet. It should be noted that these machine learning and supervised training methods are well-known technologies that are currently widely studied and applied, and will not be described again here.
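The embodiments leave the concrete detector open. Purely as an illustrative stand-in (not the supervised CNN the embodiments describe), a classical Haar-cascade detector from the opencv-python package could supply per-frame boxes in the vertex-plus-size form discussed below:

```python
import cv2

# Stand-in detector for illustration only; the embodiments use a
# supervised CNN (e.g. DenseBox, VGGNet, ResNet, SegNet) instead.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face_box(frame_bgr):
    """Return position information (x, y, w, h) for one detected face
    in a BGR frame, or None if no face is found."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return tuple(faces[0]) if len(faces) else None
```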
It should be pointed out that the position information of a face detection frame may be any information that indicates and uniquely determines the position of the face detection frame in the frame.

Optionally, the position information of the face detection frame may include the coordinates of the four vertices of the face detection frame.

Optionally, the position information of the face detection frame may include the coordinates of any pair of diagonal vertices of the face detection frame, for example the coordinates of the upper-left vertex and the coordinates of the lower-right vertex.

Optionally, the position information of the face detection frame may include the coordinates of any one vertex of the face detection frame together with the length and width of the face detection frame.

It should be noted that the position information is not limited to the forms listed above and may include other information that can indicate and uniquely determine the position of the face detection frame.
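For illustration, converting the third form above (one vertex plus length and width) into the four vertex coordinates is straightforward; this sketch assumes image coordinates (the ordinate grows downward), with length as the horizontal extent and width as the vertical extent, matching the expansion example given for step 202 below:

```python
def corners_from_anchor(x, y, length, width):
    """Expand (upper-left vertex, length, width) into the four vertex
    coordinates of a rectangular face detection frame."""
    return [
        (x, y),                    # upper-left
        (x + length, y),           # upper-right
        (x, y + width),            # lower-left
        (x + length, y + width),   # lower-right
    ]
```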
Step 202: determine the intersection-over-union ratio of the first face detection frame and the second face detection frame based on the obtained position information.

In this embodiment, the execution body may determine the intersection-over-union ratio (IOU) of the first face detection frame and the second face detection frame based on the obtained position information of the first face detection frame and of the second face detection frame.

In practice, the intersection-over-union ratio of two rectangles may be the ratio of the area of the region where the two rectangles intersect to the area of the region of their union. Here, the area of the union of the two rectangles equals the sum of the areas of the two rectangles minus the area of their intersection. In practice, the intersection ratio is a number in the interval [0, 1].

In this embodiment, since the position of a face detection frame in its frame can be determined from its position information, the coordinates of each vertex of the first face detection frame in the current frame can be determined from the position information of the first face detection frame, and the coordinates of each vertex of the second face detection frame in the previous frame can be determined from the position information of the second face detection frame. As an example, if the position information of a face detection frame includes the coordinates of one vertex (for example, the upper-left vertex) together with the length and width of the frame, the coordinates of the upper-right, lower-left, and lower-right vertices can be obtained by adding the length to the abscissa of the upper-left vertex and adding the width to its ordinate, respectively.

In this embodiment, since the vertex coordinates of the first face detection frame and of the second face detection frame are available, they can be used to determine the length and width of the rectangle where the first face detection frame and the second face detection frame intersect, and hence the area of that rectangle (the intersection area). After that, the sum of the areas of the first face detection frame and the second face detection frame (the total area) can be calculated, and then the difference between the total area and the intersection area (the union area). Finally, the ratio of the intersection area to the union area can be determined as the intersection-over-union ratio of the first face detection frame and the second face detection frame.

It should be noted that this method of calculating the intersection-over-union ratio is a well-known technology that is currently widely studied and applied, and will not be described again here.
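A minimal sketch of this computation, assuming each box is given as (x1, y1, x2, y2) with the upper-left vertex first:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as
    (x1, y1, x2, y2), (x1, y1) upper-left and (x2, y2) lower-right."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)   # intersection area
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter                      # union area
    return inter / union if union > 0 else 0.0
```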
Step 203: determine the weight of the obtained position information of each face detection frame based on the intersection ratio.

In this embodiment, the execution body may determine the weight of the position information of the first face detection frame and the weight of the position information of the second face detection frame based on the intersection ratio determined in step 202, as in the following steps:

In the first step, the intersection ratio may be substituted into a pre-established formula, and the calculation result determined as the weight of the position information of the second face detection frame. The pre-established formula may be any formula satisfying the following preset conditions, and is not otherwise limited here: the larger the intersection ratio, the larger the result of the formula; the smaller the intersection ratio, the smaller the result; when the intersection ratio is 0, the result is 0; and when the intersection ratio is 1, the result is 1.

In the second step, the difference between a preset value (for example, 1) and the weight of the position information of the second face detection frame may be determined as the weight of the position information of the first face detection frame.

It should be noted that the order in which the weights of the position information of the first face detection frame and of the second face detection frame are determined is not limited here. The execution body may modify the pre-established formula so as to determine the weight of the position information of the first face detection frame first, and then the weight of the position information of the second face detection frame.
In some optional implementations of this embodiment, the execution body may perform a power operation with the intersection ratio as the base and a first preset value (for example, 6 or 3) as the exponent. Here, the first preset value may be determined by technicians based on extensive statistics and experiments. The execution body may then determine the result of the power operation as the weight of the position information of the second face detection frame, and determine the difference between a second preset value (for example, 1) and the determined weight as the weight of the position information of the first face detection frame.

In some optional implementations of this embodiment, the execution body may perform a power operation with the natural constant as the base and the difference between the reciprocal of the intersection ratio and the second preset value (for example, 1) as the exponent. The reciprocal of the result of the power operation may then be determined as the weight of the position information of the second face detection frame, and the difference between the second preset value and the determined weight as the weight of the position information of the first face detection frame.

It should be noted that the execution body may also determine the weights of the obtained position information in other ways, not limited to the implementations above. For example, a power operation may be performed with some preset value (for example, 2 or 3) as the base and the difference between the reciprocal of the intersection ratio and the second preset value (for example, 1) as the exponent; the reciprocal of the result may then be determined as the weight of the position information of the second face detection frame, and the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
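As a worked illustration of the power-based variant, assuming a first preset value of 6: when the face barely moves and the intersection ratio is 0.95, the previous frame's weight is 0.95^6 ≈ 0.74, so the detection frame stays nearly still; when the face moves quickly and the intersection ratio drops to 0.5, that weight falls to 0.5^6 ≈ 0.016, so the fresh detection dominates and the frame keeps up with the motion.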
In previous approaches, the average of the coordinates of corresponding vertices (for example, both upper-left vertices) of the face detection frames in the previous frame and the current frame is usually taken as the corrected coordinate of that vertex (the upper-left vertex) in the current frame, and the corrected coordinates of all vertices of the current frame are obtained in this way. With this approach, when the face object moves quickly, the face detection frame cannot keep up with its motion; the dragging effect is strong and the accuracy is low. In contrast, the present application corrects the position of the face detection frame in the current frame using weights determined from the intersection ratio. The larger the intersection ratio, the slower the face object is moving; the smaller the intersection ratio, the faster it is moving. Different weights can therefore be computed for different intersection ratios, which reduces the dragging effect and improves the timeliness and accuracy of the face detection frame.

Previous approaches also include determining the weight of a vertex coordinate of the face detection frame from the distance between the coordinates of corresponding vertices (for example, both upper-left vertices) in the previous frame and the current frame. In that approach, however, the weights of the coordinates of the individual vertices are independent, and the face detection frame cannot be considered as a whole, so the smoothing effect is poor. By contrast, with the weights determined from the intersection ratio in the present application, the whole area of the face detection frame is taken into account in determining the intersection ratio, and the weights of all vertex coordinates within the same face detection frame are identical, so the face detection frame is considered as a whole, which improves the smoothing effect.
Step 204: determine target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.

In this embodiment, the execution body may determine the target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame. Here, the execution body may correct the position information of the first face detection frame, that is, correct the vertex coordinates of the first face detection frame, based on the determined weights.
In some optional implementations of this embodiment, the position information of a face detection frame may include the coordinates of its four vertices. In this case, the execution body may correct the coordinates of the first face detection frame vertex by vertex. Specifically, for each vertex the following steps may be performed (the upper-left vertex is taken as an example; the remaining vertices are handled in the same way):

In the first step, the abscissa of the upper-left vertex of the first face detection frame and the abscissa of the upper-left vertex of the second face detection frame are weighted. That is, the abscissa of the upper-left vertex of the first face detection frame is multiplied by the weight of the position information of the first face detection frame to obtain a first value; the abscissa of the upper-left vertex of the second face detection frame is multiplied by the weight of the position information of the second face detection frame to obtain a second value; and the sum of the first value and the second value is determined as the corrected abscissa of the upper-left vertex of the first face detection frame.

In the second step, the ordinate of the upper-left vertex of the first face detection frame and the ordinate of the upper-left vertex of the second face detection frame are weighted. That is, the ordinate of the upper-left vertex of the first face detection frame is multiplied by the weight of the position information of the first face detection frame to obtain a third value; the ordinate of the upper-left vertex of the second face detection frame is multiplied by the weight of the position information of the second face detection frame to obtain a fourth value; and the sum of the third value and the fourth value is determined as the corrected ordinate of the upper-left vertex of the first face detection frame.

In the third step, the abscissa and ordinate obtained in the first and second steps are combined into the corrected coordinates of the upper-left vertex of the first face detection frame.

After the coordinates of all vertices of the first face detection frame have been corrected, the electronic device may aggregate the corrected vertex coordinates into the target position information, and the position of the first face detection frame can thus be updated.
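A sketch of this per-vertex correction, reusing the weights from step 203; the vertex lists are assumed to hold the four corners of each box in matching order:

```python
def smooth_vertices(curr_vertices, prev_vertices, w_curr, w_prev):
    """Blend each of the four vertices of the current-frame box with the
    matching vertex of the previous-frame box (weights sum to 1)."""
    return [
        (w_curr * xc + w_prev * xp, w_curr * yc + w_prev * yp)
        for (xc, yc), (xp, yp) in zip(curr_vertices, prev_vertices)
    ]
```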
在本实施例的一些可选的实现方式中,第一人脸检测框的位置信息包括上述第一人脸检测框的指定对角顶点坐标,上述第二人脸检测框的位置信息包括上述第二人脸检测框的指定对角顶点坐标。其中,上述第一人脸检测框的指定对角顶点坐标可以包括第一顶点(例如左上顶点)坐标和第二顶点(例如右下顶点)坐标。上述第二人脸检测框的指定对角顶点坐标可以包括第三顶点(例如左上顶点)坐标和第四顶点(例如右下顶点)坐标。此时,上述执行主体可以将上述第一人脸检测框的位置信息的权重作为上述第一人脸检测框的指定对角顶点坐标的权重,将上述第二人脸检测框的位置信息的权重作为上述第二人脸检测框的指定对角顶点坐标的权重,将上述第一人脸检测框的指定对角顶点坐标与上述第二人脸检测框的指定对角顶点坐标的加权计算结果确定为上述第一人脸检测框的目标对角顶点坐标,以对上述第一人脸检测框的位置进行更新。In some optional implementations of this embodiment, the position information of the first face detection frame includes the specified diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame includes the first The specified diagonal vertex coordinates of the two-face detection frame. The designated diagonal vertex coordinates of the first face detection frame may include coordinates of a first vertex (for example, an upper left vertex) and coordinates of a second vertex (for example, a lower right vertex). The designated diagonal vertex coordinates of the second face detection frame may include coordinates of a third vertex (for example, an upper left vertex) and coordinates of a fourth vertex (for example, a lower right vertex). At this time, the execution subject may use the weight of the position information of the first face detection frame as the weight of the specified diagonal vertex coordinates of the first face detection frame, and the weight of the position information of the second face detection frame. As a weight of the designated diagonal vertex coordinates of the second face detection frame, a weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame is determined. The coordinates of the diagonal vertices of the target of the first face detection frame to update the position of the first face detection frame.
Optionally, the target diagonal vertex coordinates of the first face detection frame may be calculated in the following order of operations:
First, the weighted calculation result of the abscissa of the first vertex and the abscissa of the third vertex may be determined as the first target abscissa;
Next, the weighted calculation result of the ordinate of the first vertex and the ordinate of the third vertex may be determined as the first target ordinate;
Next, the weighted calculation result of the abscissa of the second vertex and the abscissa of the fourth vertex may be determined as the second target abscissa;
Next, the weighted calculation result of the ordinate of the second vertex and the ordinate of the fourth vertex may be determined as the second target ordinate;
Finally, the coordinates formed by the first target abscissa and the first target ordinate, together with the coordinates formed by the second target abscissa and the second target ordinate, may be determined as the target diagonal vertex coordinates of the first face detection frame. Since the coordinates of a pair of diagonal vertices uniquely determine the position of a rectangular frame, the position of the first face detection frame can thereby be updated.
It should be noted that in this implementation, other orders of operations may also be used to calculate the target diagonal vertex coordinates of the first face detection frame; the details are not repeated here.
It should also be pointed out that in this implementation, after the target diagonal vertex coordinates have been calculated, the other pair of diagonal vertex coordinates of the first face detection frame may be derived from them, thereby yielding the coordinates of all four vertices of the first face detection frame. A sketch of the whole computation follows.
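The following Python sketch puts these operations together, assuming axis-aligned rectangular frames; the function name is our own placeholder:

```python
def target_diagonal_vertices(diag1, diag2, w1, w2):
    """diag1: ((x1, y1), (x2, y2)), the designated diagonal vertices of the
    first (current-frame) box; diag2: ((x3, y3), (x4, y4)) for the second
    (previous-frame) box; the weights are assumed to satisfy w1 + w2 == 1."""
    (x1, y1), (x2, y2) = diag1
    (x3, y3), (x4, y4) = diag2
    tx1 = x1 * w1 + x3 * w2  # first target abscissa
    ty1 = y1 * w1 + y3 * w2  # first target ordinate
    tx2 = x2 * w1 + x4 * w2  # second target abscissa
    ty2 = y2 * w1 + y4 * w2  # second target ordinate
    target = ((tx1, ty1), (tx2, ty2))
    # for an axis-aligned rectangle, the other diagonal pair follows directly
    other_pair = ((tx1, ty2), (tx2, ty1))
    return target, other_pair
```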
In some optional implementations of this embodiment, the position information of a face detection frame may include the coordinates of any one vertex of the face detection frame together with the length and width of the frame. In this case, the execution body may first determine the coordinates of the diagonally opposite vertex (or of the remaining three vertices) based on the vertex coordinates, the length, and the width, and may then determine the target position information of the first face detection frame using the operation steps described in the two implementations above, thereby updating the position of the first face detection frame. A small converter along these lines is sketched below.
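For instance, if the position information carries the upper-left vertex plus the frame's length and width, a converter might look like this; treating the length as the horizontal extent, the width as the vertical extent, and y as increasing downward are all our assumptions, since the patent does not fix a convention:

```python
def to_diagonal(top_left, length, width):
    # top_left: (x, y) of the upper-left corner; length: horizontal extent;
    # width: vertical extent (naming and axis direction are assumptions)
    x, y = top_left
    return (x, y), (x + length, y + width)  # upper-left and lower-right
```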
With continued reference to FIG. 3, FIG. 3 is a schematic diagram of an application scenario of the method for generating information according to this embodiment. In the application scenario of FIG. 3, a user records a target video using the self-portrait mode of the terminal device 301.
After capturing the first frame, the terminal device performs face detection on the first frame using its stored face detection model and obtains the position information 302 of the face detection frame in the first frame.
After capturing the second frame, the terminal device performs face detection on the second frame using its stored face detection model and obtains the position information 303 of the face detection frame in the second frame, while also retrieving the position information 302 of the face detection frame in the first frame. Next, the intersection-over-union (IoU) ratio of the two face detection frames can be determined based on the position information 302 and the position information 303. After that, the weight of the obtained position information of each face detection frame (that is, the weight of the position information 302 and the weight of the position information 303) can be determined based on this ratio. Finally, the target position information 304 of the face detection frame of the second frame (i.e., its final position information) can be determined based on the determined weights and the obtained position information 302 and 303.
After capturing the third frame, the terminal device performs face detection on the third frame using its stored face detection model and obtains the position information 305 of the face detection frame in the third frame, while also retrieving the updated position information of the face detection frame in the second frame (i.e., the target position information 304). Next, the intersection-over-union ratio of the second-frame and third-frame face detection frames can be determined based on the target position information 304 and the position information 305. After that, the weight of the target position information 304 and the weight of the position information 305 can be determined based on this ratio. Finally, the target position information 306 of the face detection frame of the third frame (i.e., its final position information) can be determined based on the determined weights, the target position information 304, and the position information 305.
And so on. In the end, the terminal device 301 obtains the position information of the face detection frame in every frame of the recorded video. The loop below sketches this process.
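A minimal sketch of the frame-by-frame scenario in FIG. 3, where detect, iou, weights, and combine are placeholders for the steps described above rather than APIs defined by the patent:

```python
def smooth_video_boxes(frames, detect, iou, weights, combine):
    """Smooth the face detection frame across all frames of a video."""
    previous = None
    smoothed = []
    for frame in frames:
        box = detect(frame)                  # e.g. position info 303 or 305
        if previous is not None:
            w_cur, w_prev = weights(iou(box, previous))
            box = combine(box, previous, w_cur, w_prev)
        smoothed.append(box)
        previous = box  # the updated box feeds the next frame (304 -> 306)
    return smoothed
```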
The method provided by the above embodiment of the present application obtains the position information of the first face detection frame of the current frame of the target video and the position information of the second face detection frame of the previous frame, so that the intersection-over-union ratio of the first face detection frame and the second face detection frame can be determined based on the obtained position information. Then, based on this ratio, the weight of the obtained position information of each face detection frame is determined. Finally, the target position information of the first face detection frame can be determined based on the determined weights and the obtained position information, so as to update the position of the first face detection frame. The position of the face detection frame in the later frame can thus be adjusted based on the intersection-over-union ratio of the face detection frames of the two adjacent frames. Because the position of the later frame's face detection frame takes into account the position of the earlier frame's face detection frame, and considers the overall area of that frame rather than any single coordinate, the jitter of the face detection frame in the video is reduced, and the smoothness and stability of its movement are improved.
With further reference to FIG. 4, a flow 400 of yet another embodiment of the method for generating information is shown. The flow 400 of the method for generating information includes the following steps:
Step 401: Obtain the position information of the first face detection frame obtained by performing face detection on the current frame of the target video in advance, and obtain the pre-stored position information of the second face detection frame in the previous frame of the current frame.
In this embodiment, the execution body of the method for generating information (for example, the terminal devices 101, 102, and 103 shown in FIG. 1) may obtain the position information of the first face detection frame obtained by performing face detection on the current frame of the target video in advance, and obtain the position information of the second face detection frame obtained by performing face detection on the previous frame of the current frame in advance.
In this embodiment, the position information of the first face detection frame may include the designated diagonal vertex coordinates of the first face detection frame (for example, the coordinates of the upper-left and lower-right vertices), and the position information of the second face detection frame may include the designated diagonal vertex coordinates of the second face detection frame.
Step 402: Determine the intersection-over-union ratio of the first face detection frame and the second face detection frame based on the obtained position information.
In this embodiment, from the position information of the first face detection frame, the execution body can determine the coordinates of the remaining vertices of the first face detection frame in the current frame, and thus obtain the coordinates of all of its vertices. Likewise, from the position information of the second face detection frame, the coordinates of the vertices of the second face detection frame in the previous frame can be determined. The vertex coordinates of the two frames can then be used to determine the length and width of the rectangle in which the first face detection frame and the second face detection frame intersect, and hence the area of that rectangle (the intersection area). Next, the sum of the areas of the first and second face detection frames (the total area) can be calculated, followed by the difference between the total area and the intersection area (the union area). Finally, the ratio of the intersection area to the union area is determined as the intersection-over-union ratio of the first face detection frame and the second face detection frame. A sketch of this computation follows.
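A direct Python reading of step 402, assuming boxes given as (x_tl, y_tl, x_br, y_br) with y increasing downward (a coordinate convention we assume):

```python
def intersection_over_union(box_a, box_b):
    """IoU of two axis-aligned face detection frames."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy                                    # intersection area
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter                    # total minus overlap
    return inter / union if union > 0 else 0.0
```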
Step 403: Perform a power operation with the natural constant e as the base and the difference between the reciprocal of the intersection-over-union ratio and a second preset value as the exponent.
In this embodiment, the execution body may perform a power operation using the natural constant e as the base and the difference between the reciprocal of the intersection-over-union ratio and a second preset value (for example, 1) as the exponent.
Step 404: Determine the reciprocal of the result of the power operation as the weight of the position information of the second face detection frame, and determine the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
In this embodiment, the execution body may determine the reciprocal of the result of the power operation as the weight of the position information of the second face detection frame, and determine the difference between the second preset value (for example, 1) and the weight so determined as the weight of the position information of the first face detection frame. A sketch of steps 403 and 404 follows.
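The following sketch implements steps 403 and 404, assuming the second preset value is 1 (as in the example given in the text) and a strictly positive IoU:

```python
import math

def frame_weights(iou_value, preset=1.0):
    """Weight of the previous frame's box: 1 / e**(1/IoU - preset);
    the current frame's box receives the remainder."""
    w_prev = 1.0 / math.exp(1.0 / iou_value - preset)  # second-frame weight
    w_cur = preset - w_prev                            # first-frame weight
    return w_cur, w_prev
```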
Step 405: Use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, use the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determine the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
In this embodiment, the execution body may use the weight of the position information of the first face detection frame as the weight of its designated diagonal vertex coordinates, use the weight of the position information of the second face detection frame as the weight of its designated diagonal vertex coordinates, and determine the weighted calculation result of the two sets of designated diagonal vertex coordinates as the target diagonal vertex coordinates of the first face detection frame, so as to update its position. The designated diagonal vertex coordinates of the first face detection frame may include the coordinates of a first vertex (for example, the upper-left vertex) and of a second vertex (for example, the lower-right vertex); those of the second face detection frame may include the coordinates of a third vertex (for example, the upper-left vertex) and of a fourth vertex (for example, the lower-right vertex).
Specifically, the weighted calculation result of the abscissas of the first and third vertices may first be determined as the first target abscissa. Next, the weighted calculation result of the ordinates of the first and third vertices may be determined as the first target ordinate. Next, the weighted calculation result of the abscissas of the second and fourth vertices may be determined as the second target abscissa. Next, the weighted calculation result of the ordinates of the second and fourth vertices may be determined as the second target ordinate. Finally, the coordinates formed by the first target abscissa and the first target ordinate, together with the coordinates formed by the second target abscissa and the second target ordinate, may be determined as the target diagonal vertex coordinates of the first face detection frame. Since the coordinates of a pair of diagonal vertices uniquely determine the position of a rectangular frame, the position of the first face detection frame can thereby be updated.
As can be seen from FIG. 4, compared with the embodiment corresponding to FIG. 2, the flow 400 of the method for generating information in this embodiment highlights the steps of determining the weights of the face detection frames of the current frame and of the previous frame. When the intersection-over-union ratio of the first and second face detection frames is small, the face object has moved a large distance from the previous frame to the current frame. In that case, with the weights determined by the method of this embodiment, the weight of the position information of the first face detection frame (that of the current frame) is large, while the weight of the position information of the second face detection frame (that of the previous frame) is small. When the intersection-over-union ratio is large, the face object has moved only a small distance from the previous frame to the current frame; the weight of the position information of the first face detection frame is then small, while that of the second face detection frame is large. The face detection frame therefore moves smoothly, which further reduces its jitter in the video and further improves the smoothness and stability of its movement.
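Plugging illustrative numbers into the frame_weights sketch above makes this behavior concrete (values rounded to three decimals):

```python
w_cur, w_prev = frame_weights(0.9)  # large overlap: (0.105, 0.895)
w_cur, w_prev = frame_weights(0.5)  # small overlap: (0.632, 0.368)
```

At high overlap the previous frame dominates and the box barely moves; at low overlap the current detection dominates, matching the analysis above.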
With further reference to FIG. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for generating information. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus may be applied to various electronic devices.
As shown in FIG. 5, the apparatus 500 for generating information according to this embodiment includes: an obtaining unit 501, configured to obtain the position information of the first face detection frame obtained by performing face detection on the current frame of the target video in advance, and to obtain the pre-stored position information of the second face detection frame in the previous frame of the current frame; a first determining unit 502, configured to determine the intersection-over-union ratio of the first face detection frame and the second face detection frame based on the obtained position information; a second determining unit 503, configured to determine, based on that ratio, the weight of the obtained position information of each face detection frame; and an updating unit 504, configured to determine the target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
In some optional implementations of this embodiment, the second determining unit 503 may include a first operation module and a first determining module (not shown in the figure). The first operation module may be configured to perform a power operation with the intersection-over-union ratio as the base and a first preset value as the exponent. The first determining module may be configured to determine the result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between a second preset value and that weight as the weight of the position information of the first face detection frame.
In some optional implementations of this embodiment, the second determining unit 503 may include a second operation module and a second determining module (not shown in the figure). The second operation module may be configured to perform a power operation with the natural constant e as the base and the difference between the reciprocal of the intersection-over-union ratio and a second preset value as the exponent. The second determining module may be configured to determine the reciprocal of the result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between the second preset value and that weight as the weight of the position information of the first face detection frame.
In some optional implementations of this embodiment, the position information of the first face detection frame may include the designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame may include the designated diagonal vertex coordinates of the second face detection frame. The updating unit 504 may further be configured to: use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, use the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determine the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
In some optional implementations of this embodiment, the designated diagonal vertex coordinates of the first face detection frame may include first vertex coordinates and second vertex coordinates, and the designated diagonal vertex coordinates of the second face detection frame may include third vertex coordinates and fourth vertex coordinates. The updating unit 504 may further be configured to: determine the weighted calculation result of the abscissas of the first and third vertices as the first target abscissa; determine the weighted calculation result of the ordinates of the first and third vertices as the first target ordinate; determine the weighted calculation result of the abscissas of the second and fourth vertices as the second target abscissa; determine the weighted calculation result of the ordinates of the second and fourth vertices as the second target ordinate; and determine the coordinates formed by the first target abscissa and the first target ordinate, together with the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
In the apparatus provided by the above embodiment of the present application, the obtaining unit 501 obtains the position information of the first face detection frame of the current frame of the target video and the position information of the second face detection frame of the previous frame, so that the first determining unit 502 can determine the intersection-over-union ratio of the first and second face detection frames based on the obtained position information. The second determining unit 503 then determines, based on that ratio, the weight of the obtained position information of each face detection frame. Finally, the updating unit 504 can determine the target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame. The position of the face detection frame in the later frame can thus be adjusted based on the intersection-over-union ratio of the face detection frames of the two adjacent frames. Because the position of the later frame's face detection frame takes into account the position of the earlier frame's face detection frame, and considers the overall area of that frame rather than any single coordinate, the jitter of the face detection frame in the video is reduced, and the smoothness and stability of its movement are improved.
Referring now to FIG. 6, a schematic structural diagram of a computer system 600 suitable for implementing an electronic device according to the embodiments of the present application is shown. The electronic device shown in FIG. 6 is merely an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card such as a LAN card or a modem. The communication portion 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage portion 608 as needed.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609 and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above-mentioned functions defined in the method of the present application are performed. It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wireline, optical-fiber cable, RF, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the architectures, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions noted in the blocks may occur in a different order from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks therein, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor; for example, a processor may be described as including an obtaining unit, a first determining unit, a second determining unit, and an updating unit. The names of these units do not in all cases limit the units themselves; for example, the updating unit may also be described as "a unit that updates the position of the first face detection frame".
As another aspect, the present application further provides a computer-readable medium, which may be included in the apparatus described in the above embodiments or may exist alone without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: obtain the position information of the first face detection frame obtained by performing face detection on the current frame of the target video in advance, and obtain the position information of the second face detection frame obtained by performing face detection on the previous frame of the current frame in advance; determine the intersection-over-union ratio of the first face detection frame and the second face detection frame based on the obtained position information; determine, based on that ratio, the weight of the obtained position information of each face detection frame; and determine the target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
The above description is merely a preferred embodiment of the present application and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features; it should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.

Claims (12)

  1. A method for generating information, comprising:
    obtaining position information of a first face detection frame obtained by performing face detection on a current frame of a target video in advance, and obtaining pre-stored position information of a second face detection frame in a previous frame of the current frame;
    determining an intersection-over-union ratio of the first face detection frame and the second face detection frame based on the obtained position information;
    determining, based on the intersection-over-union ratio, a weight of the obtained position information of each face detection frame; and
    determining target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
  2. The method for generating information according to claim 1, wherein the determining, based on the intersection-over-union ratio, a weight of the obtained position information of each face detection frame comprises:
    performing a power operation with the intersection-over-union ratio as the base and a first preset value as the exponent; and
    determining the result of the power operation as the weight of the position information of the second face detection frame, and determining the difference between a second preset value and the weight as the weight of the position information of the first face detection frame.
  3. The method for generating information according to claim 1, wherein the determining, based on the intersection-over-union ratio, a weight of the obtained position information of each face detection frame comprises:
    performing a power operation with the natural constant e as the base and the difference between the reciprocal of the intersection-over-union ratio and a second preset value as the exponent; and
    determining the reciprocal of the result of the power operation as the weight of the position information of the second face detection frame, and determining the difference between the second preset value and the weight as the weight of the position information of the first face detection frame.
  4. The method for generating information according to claim 1, wherein the position information of the first face detection frame comprises designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame comprises designated diagonal vertex coordinates of the second face detection frame; and
    the determining target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame, comprises:
    using the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, using the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determining the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  5. The method for generating information according to claim 4, wherein the designated diagonal vertex coordinates of the first face detection frame comprise first vertex coordinates and second vertex coordinates, and the designated diagonal vertex coordinates of the second face detection frame comprise third vertex coordinates and fourth vertex coordinates; and
    the determining the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as the target diagonal vertex coordinates of the first face detection frame comprises:
    determining the weighted calculation result of the abscissa of the first vertex coordinates and the abscissa of the third vertex coordinates as a first target abscissa;
    determining the weighted calculation result of the ordinate of the first vertex coordinates and the ordinate of the third vertex coordinates as a first target ordinate;
    determining the weighted calculation result of the abscissa of the second vertex coordinates and the abscissa of the fourth vertex coordinates as a second target abscissa;
    determining the weighted calculation result of the ordinate of the second vertex coordinates and the ordinate of the fourth vertex coordinates as a second target ordinate; and
    determining the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
  6. An apparatus for generating information, comprising:
    an obtaining unit, configured to obtain position information of a first face detection frame obtained by performing face detection on a current frame of a target video in advance, and to obtain pre-stored position information of a second face detection frame in a previous frame of the current frame;
    a first determining unit, configured to determine an intersection-over-union ratio of the first face detection frame and the second face detection frame based on the obtained position information;
    a second determining unit, configured to determine, based on the intersection-over-union ratio, a weight of the obtained position information of each face detection frame; and
    an updating unit, configured to determine target position information of the first face detection frame based on the determined weights and the obtained position information, so as to update the position of the first face detection frame.
  7. The apparatus for generating information according to claim 6, wherein the second determining unit comprises:
    a first operation module, configured to perform a power operation with the intersection-over-union ratio as the base and a first preset value as the exponent; and
    a first determining module, configured to determine the result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between a second preset value and the weight as the weight of the position information of the first face detection frame.
  8. The apparatus for generating information according to claim 6, wherein the second determining unit comprises:
    a second operation module, configured to perform a power operation with the natural constant e as the base and the difference between the reciprocal of the intersection-over-union ratio and a second preset value as the exponent; and
    a second determining module, configured to determine the reciprocal of the result of the power operation as the weight of the position information of the second face detection frame, and to determine the difference between the second preset value and the weight as the weight of the position information of the first face detection frame.
  9. The apparatus for generating information according to claim 6, wherein the position information of the first face detection frame comprises designated diagonal vertex coordinates of the first face detection frame, and the position information of the second face detection frame comprises designated diagonal vertex coordinates of the second face detection frame; and
    the updating unit is further configured to:
    use the weight of the position information of the first face detection frame as the weight of the designated diagonal vertex coordinates of the first face detection frame, use the weight of the position information of the second face detection frame as the weight of the designated diagonal vertex coordinates of the second face detection frame, and determine the weighted calculation result of the designated diagonal vertex coordinates of the first face detection frame and the designated diagonal vertex coordinates of the second face detection frame as target diagonal vertex coordinates of the first face detection frame, so as to update the position of the first face detection frame.
  10. The apparatus for generating information according to claim 9, wherein the designated diagonal vertex coordinates of the first face detection frame comprise first vertex coordinates and second vertex coordinates, and the designated diagonal vertex coordinates of the second face detection frame comprise third vertex coordinates and fourth vertex coordinates; and
    the updating unit is further configured to:
    determine the weighted calculation result of the abscissa of the first vertex coordinates and the abscissa of the third vertex coordinates as a first target abscissa;
    determine the weighted calculation result of the ordinate of the first vertex coordinates and the ordinate of the third vertex coordinates as a first target ordinate;
    determine the weighted calculation result of the abscissa of the second vertex coordinates and the abscissa of the fourth vertex coordinates as a second target abscissa;
    determine the weighted calculation result of the ordinate of the second vertex coordinates and the ordinate of the fourth vertex coordinates as a second target ordinate; and
    determine the coordinates formed by the first target abscissa and the first target ordinate, and the coordinates formed by the second target abscissa and the second target ordinate, as the target diagonal vertex coordinates of the first face detection frame.
  11. An electronic device, comprising:
    one or more processors; and
    a storage apparatus on which one or more programs are stored,
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-5.
  12. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-5.
PCT/CN2018/115974 2018-09-21 2018-11-16 Information generating method and device WO2020056903A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811110674.1A CN109308469B (en) 2018-09-21 2018-09-21 Method and apparatus for generating information
CN201811110674.1 2018-09-21

Publications (1)

Publication Number Publication Date
WO2020056903A1
