WO2023010792A1

WO2023010792A1 - Key point processing method and apparatus, readable storage medium and terminal

Info

Publication number: WO2023010792A1
Application number: PCT/CN2021/142865
Authority: WO
Inventors: 谢富名
Original assignee: 展讯通信（上海）有限公司
Priority date: 2021-08-03
Filing date: 2021-12-30
Publication date: 2023-02-09
Also published as: CN113627306A; CN113627306B

Abstract

A key point processing method and apparatus, a readable storage medium, and a terminal. The method comprises: collecting a previous image frame and a current image frame in a video, and respectively detecting positions of a plurality of key points; for each key point, determining position offsets of the previous image frame and the current image frame, and then filtering the position offsets to obtain a filtered inter-frame offset; and for each key point, adjusting the position of the key point in the current image frame according to the filtered inter-frame offset and the position of the previous image frame. According to the present invention, the face smoothness after key point processing can be improved.

Description

Key point processing method and device, readable storage medium, terminal

This application claims the priority of the Chinese patent application submitted to the China Patent Office on August 3, 2021, with the application number 202110886400.7, and the title of the invention is "key point processing method and device, readable storage medium, terminal", the entire content of which is passed References are incorporated in this application.

technical field

The present invention relates to the technical field of image processing, in particular to a key point processing method and device, a readable storage medium, and a terminal.

Background technique

In video image processing technology, with the popularization of applications such as video beautification and face-lifting, video real-time face changing/makeup changing, and somatosensory games based on video gesture/human bone key point detection, the smoothness and stability of face recognition are required. Higher and higher.

However, due to the complex motion state of the face in the video stream, factors such as changes in face posture, translation of the face in the image plane, and changes in the size of the face due to changes in the distance between the face and the camera may all cause changes in the face size. The problem of insufficient smoothness of key points occurs after changing faces.

There is an urgent need for a key point processing method that can effectively improve the smoothness of face key points.

Contents of the invention

The technical problem solved by the present invention is to provide a key point processing method and device, a readable storage medium, and a terminal, which can improve the smoothness of human faces after key point processing.

In order to solve the above technical problems, the embodiment of the present invention provides a key point processing method, including: collecting the previous frame image and the current frame image in the video, and detecting the positions of multiple key points respectively; for each key point, determine The position offset of the previous frame image and the current frame image, and then filter the position offset to obtain the filtered inter-frame offset; for each key point, according to the filtered inter-frame The offset and the position of the previous frame image, adjust the position of the key point in the current frame image.

Optionally, the following formula is used to determine the position offset between the previous frame image and the current frame image, and then filter the position offset to obtain the filtered inter-frame offset:

diff _offset = Rigid _cur - Rigid _pre

diff _smooth = Filter(diff _offset , radius)

Among them, diff _offset is used to indicate the position offset of the key point in the previous frame image and the current frame image, Rigid _cur is used to indicate the position of the key point in the key point of the current frame image, and Rigid _pre uses To represent the position of the key point in the previous frame image, Filter() is used to represent the filter processing function, diff _smooth is used to represent the post-filter inter-frame offset of the key point, and radius is used to represent the filter radius.

Optionally, the following formula is used to adjust the position of the key point in the current frame image according to the filtered inter-frame offset and the position of the previous frame image:

dst _pts = Rigid _pre + diff _smooth

Among them, dst _pts is used to indicate the adjusted position of the key point in the current frame image, Rigid _pre is used to indicate the position of the key point in the previous frame image, and diff _smooth is used to indicate the filtering of the key point Back frame offset.

Optionally, before adjusting the position of the key point in the current frame image according to the filtered inter-frame offset and the position of the previous frame image, it also includes: determining according to the filtered inter-frame offset The maximum global offset between frames; according to the maximum global offset between frames and the first piecewise linear function, determine the first key point offset regression coefficient; according to the filtered inter-frame offset and the previous frame The position of the image, adjusting the position of the key point in the current frame image includes: according to the product of the filtered inter-frame offset and the first key point offset regression coefficient, and the position of the previous frame image, Adjust the position of the key point in the current frame image.

Optionally, the following formula is used to determine the first key point offset regression coefficient according to the maximum global offset between frames and the first piecewise linear function:

f ₁ (Diff _max ) = 0, if Diff _max ≤ 1;

f ₁ (Diff _max )=(Diff _max -N)/(N-1)+1, else if 1<Diff _max ≤N;

f ₁ (Diff _max )=1, else if Diff _max ≥ N;

Among them, Diff _max is used to represent the maximum global offset between frames, f ₁ (x) is used to represent the first piecewise linear function, f ₁ (Diff _max ) is used to represent the first key point offset regression coefficient, and N is positive rational number.

Optionally, the smaller the jitter average value of one or more key point positions between two adjacent frames of images in the historical video is, the smaller the selected value of N is.

Optionally, the following formula is used to adjust the position of the key point in the current frame according to the product of the filtered inter-frame offset and the first key point offset regression coefficient and the position of the previous frame image. Image location:

dst _pts = Rigid _pre +f ₁ (Diff _max )×diff _smooth

Wherein, dst _pts is used to represent the adjusted position of the key point in the current frame image, Rigid _pre is used to represent the position of the key point in the previous frame image, and f ₁ (Diff _max ) is used to represent the first A key point offset regression coefficient, diff _smooth is used to represent the filtered inter-frame offset of the key point.

Optionally, before adjusting the position of the key point in the current frame image according to the filtered inter-frame offset and the position of the previous frame image, the key point processing method further includes: according to the filtered inter-frame offset shift, determine the maximum global offset between frames; according to the position offset of each key point in the previous frame image and the current frame image, determine the key point acquisition stability calibration parameters; according to the maximum global offset between the frames Shift, the key point acquisition stability calibration parameters and the second piecewise linear function, determine the second key point offset regression coefficient; according to the filtered inter-frame offset and the position of the previous frame image, adjust the The position of the key point in the current frame image includes: adjusting the position of the key point in the current frame image according to the product of the filtered inter-frame offset and the second key point offset regression coefficient, and the position of the previous frame image. Describe the position of the current frame image.

Optionally, the following formula is used to determine the maximum global offset between frames according to the filtered inter-frame offset:

Diff _max = MAX(diff _{smooth_all} )

Among them, Diff _max is used to represent the maximum global offset between frames, Max() is the maximum value function, and diff _{smooth_all} is used to represent the filtered inter-frame offset of each key point.

Optionally, the following formula is used to determine the key point acquisition stability calibration parameters according to the position offset of each key point in the previous frame image and the current frame image:

diff _pts = ABS(Rigid _cur -Rigid _pre )

Stable _thr = Min(M,Max(diff _{pts_all} ))

Among them, diff _pts is used to indicate the absolute value of the position offset of the key point in the previous frame image and the current frame image, ABS() is used to indicate the absolute value function, and diff _{pts_all} is used to indicate that each key point is in the The absolute value of the position offset between the previous frame image and the current frame image, Max() is the maximum value function, Min() is the minimum value function, Stable _thr is used to indicate the key point acquisition stability calibration parameter, M is positive rational numbers.

Optionally, two adjacent frames of images in the historical video are selected, and the smaller the mean value of shaking of one or more key point positions between the two frames of images is, the smaller the selected value of M is.

Optionally, the following formula is used to determine the second key point offset regression coefficient according to the maximum global offset between frames, the key point acquisition stability calibration parameter and the second piecewise linear function:

f ₂ (Diff _max ) = 0, if Diff _max ≤ 1;

f ₂ (Diff _max )＝(Diff _max -Stable _thr )/(Stable _thr -1)+1, else if 1＜Diff _max ≤ Stable _thr ;

f ₂ (Diff _max )=1, else if Diff _max ≥ Stable _thr ;

Among them, Diff _max is used to represent the maximum global offset between frames, f ₂ (x) is used to represent the second piecewise linear function, f ₂ (Diff _max ) is used to represent the second key point offset regression coefficient, Stable _thr Collect stability calibration parameters for key points.

Optionally, the following formula is used to adjust the position of the key point in the current frame according to the product of the filtered inter-frame offset and the second key point offset regression coefficient and the position of the previous frame image. Image location:

dst _pts = Rigid _pre +f ₂ (Diff _max )×diff _smooth

Wherein, dst _pts is used to represent the adjusted position of the key point in the current frame image, Rigid _pre is used to represent the position of the key point in the previous frame image, f ₂ (Diff _max ) is used to represent the first Two key point offset regression coefficients, diff _smooth is used to represent the filtered inter-frame offset of the key point.

Optionally, one or more of the following is satisfied: the key point is a human face key point; the position of the key point is a pixel sequence number of the key point in the image.

In order to solve the above technical problems, an embodiment of the present invention provides a key point processing device, including: a key point position detection module, which is used to collect the previous frame image and the current frame image in the video, and detect the positions of multiple key points respectively ; The filter module is used to determine the position offset of the previous frame image and the current frame image for each key point, and then filter the position offset to obtain the filtered inter-frame offset; The position adjustment module is configured to, for each key point, adjust the position of the key point in the current frame image according to the filtered inter-frame offset and the position of the previous frame image.

In order to solve the above technical problem, an embodiment of the present invention provides a readable storage medium on which a computer program is stored, and when the computer program is run by a processor, the steps of the above key point processing method are executed.

In order to solve the above technical problems, an embodiment of the present invention provides a terminal, including a memory and a processor, the memory stores a computer program that can run on the processor, and when the processor runs the computer program, it executes Steps of the above key point processing method.

Compared with the prior art, the technical solutions of the embodiments of the present invention have the following beneficial effects:

In the embodiment of the present invention, the filtered inter-frame offset is obtained by filtering the position offset of the previous frame image and the current frame image, and then according to the filtered inter-frame offset and the previous frame The position of the image, adjusting the position of the key point in the current frame image, can use the local filtering and smoothing technology to filter out the jitter of the local point. Compared with the existing technology, since the frame-to-frame detection is relatively independent, there is no information interaction, and there are differences in the quality of face images between frames such as noise and brightness, so that even if the face is still in the video, the face key point detection Alignment deviation is unavoidable in the frame-to-frame results, which is manifested as irregular jitter of key points of the face. Using the scheme of the embodiment of the present invention, the local filtering and smoothing technology can be used to obtain the relative motion of the global point and filter out the irregularity of the local point. Dithering to improve the smoothness of the face after key point processing.

Further, according to the maximum inter-frame global offset and the first piecewise linear function, determine the first key point offset regression coefficient; according to the filtered inter-frame offset and the first key point offset regression The product of the coefficients, and the position of the previous frame image, adjust the position of the key point in the current frame image, and the static stabilization and dynamic regression of the video stream face key points can be realized by using a piecewise linear function.

Further, according to the position offset of each key point in the previous frame image and the current frame image, determine the key point acquisition stability calibration parameters; according to the maximum global offset between the frames, the key point acquisition stability The calibration parameter and the second piecewise linear function determine the second key point offset regression coefficient; according to the product of the filtered inter-frame offset and the second key point offset regression coefficient, and the previous frame image Position, adjust the position of the key point in the current frame image, the stability calibration of the face key point detection model can be performed, and the stability of the acquisition model can be known by obtaining the stability calibration parameters, so as to facilitate the realization of the best video stream Face key point stabilization effect. Furthermore, combined with key point acquisition stability calibration parameters, a piecewise linear function is obtained, which can further improve the static stability and dynamic regression of video stream face key points.

Further, the position of the key point is the pixel number of the key point in the image, which effectively reduces the complexity of the algorithm.

Description of drawings

Fig. 1 is the flowchart of the first kind of key point processing method in the embodiment of the present invention;

Fig. 2 is a schematic diagram of key points of a human face in an embodiment of the present invention;

Fig. 3 is a flow chart of the second key point processing method in the embodiment of the present invention;

Fig. 4 is a flow chart of the third key point processing method in the embodiment of the present invention;

Fig. 5 is a flow chart of the fourth key point processing method in the embodiment of the present invention;

Fig. 6 is a schematic structural diagram of a key point processing device in an embodiment of the present invention.

Detailed ways

In the video image processing technology, due to the complex motion state of the face in the video stream, it may lead to the problem of insufficient smoothness of the key points after the face change, and it is necessary to improve the smoothness of the key points of the face.

In order to make the above objects, features and beneficial effects of the present invention more comprehensible, specific embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

Referring to FIG. 1 , FIG. 1 is a flow chart of a first key point processing method in an embodiment of the present invention. The first key point processing method may include steps S11 to S13:

Step S11: collecting the previous frame image and the current frame image in the video, and detecting the positions of multiple key points respectively;

Step S12: For each key point, determine the position offset between the previous frame image and the current frame image, and then filter the position offset to obtain a filtered inter-frame offset;

Step S13: For each key point, adjust the position of the key point in the current frame image according to the filtered inter-frame offset and the position of the previous frame image.

It can be understood that, in a specific implementation, the method may be implemented in the form of a software program, and the software program runs in a processor integrated in a chip or a chip module.

In the specific implementation of step S11, the video can be a video being recorded, such as a video of beautifying and thinning the face, a video of real-time face changing/makeup changing, or a video in the process of recording and broadcasting, such as a face-lifting with an anchor interaction function, Face changing video etc. After face-lifting and face-changing, it is necessary to continue to maintain consistency, and it is necessary to continuously adjust the current frame image.

Further, the key points may be human face key points, so as to adjust the human face in the video and improve the smoothness of human face display.

Before performing face key point detection, face recognition can be performed on the video stream image first, face position information is detected, and the complete face area is intercepted according to the face position information, and then face key point detection is performed, and the detection result is a face position information of key points.

Further, the position of the key point may be a pixel number of the key point in the image.

Taking a frame of image as 996×1024pixel as an example, it can be represented by the number of pixels in different directions. For example, 500×500pixel can represent horizontal (such as X-axis direction) pixel number is 500, vertical (such as Y-axis direction) The only key point with pixel number 500.

In the embodiment of the present invention, the position of the key point is the pixel number of the key point in the image, which effectively reduces the complexity of the algorithm.

Referring to FIG. 2 , FIG. 2 is a schematic diagram of key points of a human face in an embodiment of the present invention.

The key points can be cheeks, eyes, mouth, eyebrows and nose, etc. Specifically, the position information of cheeks, eyes, mouth, eyebrows and nose in the image can be detected by using a face key point detection model.

In the figure, 104 key points are taken as an example. It can be understood that the more key points are extracted, the better the smoothness of the face after key point processing will be improved. However, the processing rate will be affected, and video freezes will occur in severe cases. It also affects user experience.

Continuing to refer to FIG. 1 , in the specific implementation of step S12 , for each key point, filter processing is performed on the position offset of two frames of images.

Further, the following formula is used to determine the position offset between the previous frame image and the current frame image, and then filter the position offset to obtain the filtered inter-frame offset:

diff _offset = Rigid _cur - Rigid _pre

diff _smooth = Filter(diff _offset , radius)

In step S13, the position of the key point is adjusted in the current frame image.

Further, the following formula can be used to adjust the position of the key point in the current frame image according to the filtered inter-frame offset and the position of the previous frame image:

dst _pts = Rigid _pre + diff _smooth

Referring to FIG. 3 , FIG. 3 is a flowchart of a second key point processing method in an embodiment of the present invention. The second key point processing method may include steps S11 to S12 in FIG. 1 , and may also include steps S31 to S33, which will be described below.

It should be noted that step S31 to step S32 may be performed before step S11, or may be performed after step S12. In this embodiment of the present invention, the sequence number of the steps does not represent a restriction on the execution order of the steps.

In step S31, the maximum global offset between frames is determined according to the filtered inter-frame offset.

Further, the following formula can be used to determine the maximum global offset between frames according to the filtered inter-frame offset:

Diff _max = Max(diff _{smooth_all} )

In step S32, a first key point offset regression coefficient is determined according to the maximum inter-frame global offset and the first piecewise linear function.

Further, the following formula can be used to determine the first key point offset regression coefficient according to the maximum global offset between frames and the first piecewise linear function:

f ₁ (Diff _max ) = 0, if Diff _max ≤ 1;

f ₁ (Diff _max )=(Diff _max -N)/(N-1)+1, else if 1<Diff _max ≤N;

f ₁ (Diff _max )=1, else if Diff _max ≥ N;

Further, the selected value of N may be determined according to empirical data, for example, determined according to an average jitter value of one or more key point positions between two adjacent frames of images in a historical video. The smaller the mean value of the jitter at the key point position is, the smaller the selected value of N is.

Wherein, the two adjacent frames of images in the historical video can be the adjacent two frames of images when the face remains still. The key point detection model detects the position offset of the key points of two adjacent frames of images, and determines the jitter mean value of the key point positions.

Wherein, the mean value of key point position jitter can be the mean value of the position offsets of the key points of two adjacent frames of images, wherein the position offset can be the pixel point difference in the X-axis direction and the pixel point difference in the Y-axis direction. Pixel difference.

In step S33, the position of the key point in the current frame image is adjusted according to the product of the filtered inter-frame offset and the first key point offset regression coefficient, and the position of the previous frame image.

Further, the following formula can be used to adjust the position of the key point in the current frame according to the product of the filtered inter-frame offset and the first key point offset regression coefficient and the position of the previous frame image. Image location:

dst _pts = Rigid _pre +f ₁ (Diff _max )×diff _smooth

In the embodiment of the present invention, the first key point offset regression coefficient is determined according to the maximum inter-frame global offset and the first piecewise linear function; according to the filtered inter-frame offset and the first The product of the key point offset regression coefficient, and the position of the previous frame image, adjust the position of the key point in the current frame image, and the static stability and dynamics of the video stream face key points can be realized by using a piecewise linear function return. That is to say, while improving the smoothness, the imaging stability is improved.

Referring to FIG. 4 , FIG. 4 is a flowchart of a third key point processing method in an embodiment of the present invention. The third key point processing method may include steps S11 to S12 in FIG. 1 , and may also include steps S41 to S44, which will be described below.

It should be noted that step S41 to step S43 may be performed before step S11, or may be performed after step S12. In this embodiment of the present invention, the sequence number of the steps does not represent a restriction on the execution order of the steps.

In step S41, the maximum global offset between frames is determined according to the filtered inter-frame offset.

Diff _max = Max(diff _{smooth_all} )

In step S42, according to the position offset of each key point in the previous frame image and the current frame image, the key point acquisition stability calibration parameters are determined.

Further, the following formula can be used to determine the key point acquisition stability calibration parameters according to the position offset of each key point in the previous frame image and the current frame image:

diff _pts = ABS(Rigid _cur -Rigid _pre )

Stable _thr = Min(M,Max(diff _{pts_all} ))

Furthermore, the selected value of M may be determined according to empirical data, for example, determined according to an average value of shaking of one or more key point positions between two adjacent frames of images in a historical video. The smaller the mean value of the jitter at the key point position is, the smaller the selected value of M is.

It is understandable that the value of M should not be too large, otherwise it will cause too insensitivity, and the jitter will not be suppressed if it is too large. For example, the person does move but mistakenly thinks it is normal jitter; the value of M should not be too small , otherwise it will be too sensitive, and small jitters will be judged as personnel movement.

As a non-limiting example, M may be 1.5-5, such as M=2.

In the embodiment of the present invention, by determining the maximum value of the absolute value of the position offset of each key point in the previous frame image and the current frame image, and then comparing with M, and retaining the smaller value, it can be used in When the offset between two frames of images is too large, the M value can be retained to avoid the influence of the extreme value of jitter and further effectively improve the imaging stability.

In step S43, a second key point offset regression coefficient is determined according to the maximum inter-frame global offset, the key point acquisition stability calibration parameter and the second piecewise linear function.

Further, the following formula can be used to determine the second key point offset regression coefficient according to the maximum global offset between frames, the key point acquisition stability calibration parameter and the second piecewise linear function:

f ₂ (Diff _max ) = 0, if Diff _max ≤ 1;

f ₂ (Diff _max )=(Diff _max -Stable _thr )/(Stable _thr -1)+1, else if 1<Diff _max ≤Stable _thr ;

f ₂ (Diff _max )=1, else if Diff _max ≥ Stable _thr ;

In the embodiment of the present invention, the second piecewise linear function is obtained by citing the key point acquisition stability calibration parameters, which can avoid the influence of the extremum value of the jitter, and perform dynamic regression processing to further improve the stability.

In step S44, the position of the key point in the current frame image is adjusted according to the product of the filtered inter-frame offset and the second key point offset regression coefficient, and the position of the previous frame image.

Further, the following formula can be used to adjust the position of the key point in the current frame according to the product of the filtered inter-frame offset and the second key point offset regression coefficient, and the position of the previous frame image. Image location:

dst _pts = Rigid _pre +f ₂ (Diff _max )×diff _smooth

In the embodiment of the present invention, according to the position offset of each key point in the previous frame image and the current frame image, the key point acquisition stability calibration parameters are determined; according to the maximum global offset between frames, the The key point acquisition stability calibration parameter and the second piecewise linear function determine the second key point offset regression coefficient; according to the product of the filtered inter-frame offset and the second key point offset regression coefficient, and The position of the previous frame image, adjusting the position of the key point in the current frame image, can perform stability calibration on the face key point detection model, and the stability of the acquisition model can be known by obtaining the stability calibration parameters, which is convenient for implementation The best facial key point stabilization effect for video streaming. Furthermore, combined with key point acquisition stability calibration parameters, a piecewise linear function is obtained, which can further improve the static stability and dynamic regression of video stream face key points.

Referring to FIG. 5 , FIG. 5 is a flowchart of a fourth key point processing method in an embodiment of the present invention. The fourth key point processing method may include step S51 to step S57, each step will be described below.

In step S51, facial recognition.

Specifically, facial recognition is performed on the faces in the video to detect the location information of the faces. In this embodiment of the present invention, no limitation is imposed on a specific face recognition method.

In step S52, the pixel sequence number of the key point of the human face is detected.

Specifically, the complete face area can be intercepted according to the face position information, and then the key point detection of the face is performed. The detection result is the position information of the key point of the face, and the position of the key point can be the pixel point of the key point in the image The serial number can be represented by the serial number of pixels in different directions.

Further, after detecting the pixel number of the key point of the human face, Rigid _cur can be used to represent the position of the key point in the key point of the current frame image, and Rigid _pre can be used to represent the position of the key point in the previous frame image. Location.

In step S53, filter processing.

Specifically, the position offset between the previous frame image and the current frame image may be determined, and then filter processing is performed on the position offset to obtain a filtered inter-frame offset.

In step S54, the maximum global offset between frames is determined.

Specifically, the maximum inter-frame global offset may be determined according to the filtered inter-frame offset determined in the previous step.

In step S55, the key point acquisition stability calibration parameters are determined.

Specifically, key point acquisition stability calibration parameters may be determined according to position offsets of each key point between the previous frame image and the current frame image.

In step S56, a second piecewise linear function is determined.

Specifically, the stability calibration parameter can be introduced into the second piecewise linear function, and the second key point is determined according to the maximum global offset between frames, the key point acquisition stability calibration parameter and the second piecewise linear function Offset regression coefficients.

In step S57, the position of the key point is adjusted.

Specifically, the position of the key point in the current frame image may be adjusted according to the product of the filtered inter-frame offset and the second key point offset regression coefficient, and the position of the previous frame image.

In specific implementation, for more detailed content about steps S51 to S57, please refer to the step descriptions in FIG. 1 to FIG. 4 for execution, and details will not be repeated here.

Referring to FIG. 6 , FIG. 6 is a schematic structural diagram of a key point processing device in an embodiment of the present invention. The key point processing means may include:

Key point position detection module 61, is used for collecting previous frame image and current frame image in video, and detects the position of a plurality of key points respectively;

Filtering module 62, is used for, for each key point, determines the position offset of described previous frame image and current frame image, then carries out filtering process to described position offset, obtains the inter-frame offset after filtering;

The position adjustment module 63 is configured to, for each key point, adjust the position of the key point in the current frame image according to the filtered inter-frame offset and the position of the previous frame image.

In a specific implementation, the above means may correspond to a chip with a data processing function in the user equipment; or correspond to a chip module including a chip with a data processing function in the user equipment, or correspond to the user equipment.

An embodiment of the present invention also provides a readable storage medium on which a computer program is stored, and the computer program executes the steps of the above method when the computer program is run by a processor. The readable storage medium may be a computer-readable storage medium, for example, may include a non-volatile memory (non-volatile) or a non-transitory (non-transitory) memory, and may also include an optical disk, a mechanical hard disk, a solid-state hard disk, and the like.

Specifically, in the embodiment of the present invention, the processor may be a central processing unit (Central Processing Unit, referred to as CPU), and the processor may also be other general-purpose processors, digital signal processors (digital signal processor, referred to as DSP) ), application specific integrated circuit (ASIC for short), off-the-shelf programmable gate array (field programmable gate array, FPGA for short) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.

It should also be understood that the memory in the embodiments of the present application may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memories. Among them, the non-volatile memory can be read-only memory (read-only memory, referred to as ROM), programmable read-only memory (programmable ROM, referred to as PROM), erasable programmable read-only memory (erasable PROM, referred to as EPROM) , Electrically Erasable Programmable Read-Only Memory (electrically EPROM, referred to as EEPROM) or flash memory. The volatile memory can be random access memory (RAM), which acts as external cache memory. By way of illustration and not limitation, many forms of random access memory (RAM) are available, such as static random access memory (static RAM (SRAM), dynamic random access memory (DRAM), synchronous Dynamic random access memory (synchronous DRAM, referred to as SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, referred to as DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, referred to as ESDRAM), Synchronously connect dynamic random access memory (synchlink DRAM, referred to as SLDRAM) and direct memory bus random access memory (direct rambus RAM, referred to as DR RAM).

An embodiment of the present invention also provides a terminal, including a memory and a processor, the memory stores a computer program that can run on the processor, and the processor executes the steps of the above method when running the computer program . The terminals include but are not limited to terminal devices such as mobile phones, computers, and tablet computers.

It should be understood that the term "and/or" in this article is only an association relationship describing associated objects, which means that there may be three relationships, for example, A and/or B may mean: A exists alone, and A and B exist at the same time , there are three cases of B alone. In addition, the character "/" in this article indicates that the contextual objects are an "or" relationship.

"Multiple" appearing in the embodiments of the present application means two or more.

The first, second, etc. descriptions that appear in the embodiments of this application are only for illustration and to distinguish the description objects. Any limitations of the examples.

Regarding each device described in the above embodiments, each module/unit contained in the product may be a software module/unit, or a hardware module/unit, or may be partly a software module/unit and partly a hardware module/unit. . For example, for each device or product applied to or integrated into a chip, each module/unit contained therein may be realized by hardware such as a circuit, or at least some modules/units may be realized by a software program, and the software program Running on the integrated processor inside the chip, the remaining (if any) modules/units can be realized by means of hardware such as circuits; They are all realized by means of hardware such as circuits, and different modules/units can be located in the same component (such as chips, circuit modules, etc.) or different components of the chip module, or at least some modules/units can be realized by means of software programs, The software program runs on the processor integrated in the chip module, and the remaining (if any) modules/units can be realized by hardware such as circuits; /Units can be realized by means of hardware such as circuits, and different modules/units can be located in the same component (such as chips, circuit modules, etc.) or different components in the terminal, or at least some modules/units can be implemented in the form of software programs Realization, the software program runs on the processor integrated in the terminal, and the remaining (if any) modules/units can be implemented by means of hardware such as circuits.

Although the present invention is disclosed above, the present invention is not limited thereto. Any person skilled in the art can make various changes and modifications without departing from the spirit and scope of the present invention, so the protection scope of the present invention should be based on the scope defined in the claims.

Claims

A key point processing method, characterized in that, comprising:

Collect the previous frame image and the current frame image in the video, and detect the positions of multiple key points respectively;

For each key point, determine the position offset of the previous frame image and the current frame image, and then filter the position offset to obtain the filtered inter-frame offset;

For each key point, the position of the key point in the current frame image is adjusted according to the filtered inter-frame offset and the position of the previous frame image.
The key point processing method according to claim 1, wherein the following formula is used to determine the position offset between the previous frame image and the current frame image, and then filter the position offset, Get the inter-frame offset after filtering:

diff offset = Rigid cur - Rigid pre

diff smooth = Filter(diff offset , radius)

Among them, diff offset is used to indicate the position offset of the key point in the previous frame image and the current frame image, Rigid cur is used to indicate the position of the key point in the key point of the current frame image, and Rigid pre uses To represent the position of the key point in the previous frame image, Filter() is used to represent the filter processing function, diff smooth is used to represent the inter-frame offset after filtering of the key point, and radius is used to represent the filtering radius.
The key point processing method according to claim 2, wherein the following formula is used to adjust the position of the key point in the current frame image according to the filtered inter-frame offset and the position of the previous frame image. Location:

dst pts = Rigid pre + diff smooth

Among them, dst pts is used to indicate the adjusted position of the key point in the current frame image, Rigid pre is used to indicate the position of the key point in the previous frame image, and diff smooth is used to indicate the filtering of the key point Back frame offset.
The key point processing method according to claim 2, wherein, before adjusting the position of the key point in the current frame image according to the filtered inter-frame offset and the position of the previous frame image, further include:

Determine the maximum global offset between frames according to the filtered inter-frame offset;

Determine the first key point offset regression coefficient according to the maximum global offset between frames and the first piecewise linear function;

According to the position of the inter-frame offset after the filtering and the previous frame image, adjusting the key point at the position of the current frame image includes:

Adjust the position of the key point in the current frame image according to the product of the filtered inter-frame offset and the first key point offset regression coefficient, and the position of the previous frame image.
The key point processing method according to claim 4, wherein the following formula is used to determine the first key point offset regression coefficient according to the maximum global offset between the frames and the first piecewise linear function:

f 1 (Diff max ) = 0, if Diff max ≤ 1;

f 1 (Diff max )=(Diff max -N)/(N-1)+1, else if 1<Diff max ≤N;

f 1 (Diff max )=1, else if Diff max ≥ N;

Among them, Diff max is used to represent the maximum global offset between frames, f 1 (x) is used to represent the first piecewise linear function, f 1 (Diff max ) is used to represent the first key point offset regression coefficient, and N is positive rational number.
The key point processing method according to claim 5, wherein,

The smaller the jitter average value of one or more key point positions between two adjacent frames of images in the historical video is, the smaller the selected value of N is.
The key point processing method according to claim 4, wherein the following formula is used, according to the product of the filtered inter-frame offset and the first key point offset regression coefficient, and the previous frame image position, adjust the position of the key point in the current frame image:

dst pts = Rigid pre +f 1 (Diff max )×diff smooth

Wherein, dst pts is used to represent the adjusted position of the key point in the current frame image, Rigid pre is used to represent the position of the key point in the previous frame image, and f 1 (Diff max ) is used to represent the first A key point offset regression coefficient, diff smooth is used to represent the filtered inter-frame offset of the key point.
The key point processing method according to claim 2, wherein, before adjusting the position of the key point in the current frame image according to the filtered inter-frame offset and the position of the previous frame image, further include:

Determine the maximum global offset between frames according to the filtered inter-frame offset;

Determine key point acquisition stability calibration parameters according to the position offset of each key point in the previous frame image and the current frame image;

Determine a second key point offset regression coefficient according to the maximum global offset between frames, the key point acquisition stability calibration parameter, and the second piecewise linear function;

Adjusting the position of the key point in the current frame image according to the filtered inter-frame offset and the position of the previous frame image includes:

Adjust the position of the key point in the current frame image according to the product of the filtered inter-frame offset and the second key point offset regression coefficient, and the position of the previous frame image.
The key point processing method according to claim 4 or 8, wherein the following formula is used to determine the maximum global offset between frames according to the filtered inter-frame offset:

Diff max = MAX(diff smooth_all )

Among them, Diff max is used to indicate the maximum global offset between frames, MAX() is the maximum value function, and diff smooth_all is used to indicate the filtered inter-frame offset of each key point.
The key point processing method according to claim 8, wherein the following formula is used to determine the key point acquisition stability calibration parameters according to the position offset of each key point in the previous frame image and the current frame image :

diff pts = ABS(Rigid cur -Rigid pre )

Stable thr = Min(M,Max(diff pts_all ))

Among them, diff pts is used to represent the absolute value of the position offset of the key point in the previous frame image and the current frame image, ABS uses () to represent the absolute value function, diff pts_all is used to represent each key point in the The absolute value of the position offset between the previous frame image and the current frame image, Max() is the maximum value function, Min() is the minimum value function, Stable thr is used to indicate the key point acquisition stability calibration parameter, M is positive rational numbers.
The key point processing method according to claim 10, wherein,

Selecting two adjacent frames of images in the historical video, the smaller the mean value of the jitter of one or more key point positions between the two frames of images, the smaller the selected value of M.
The key point processing method according to claim 8, characterized in that, the following formula is used to determine according to the maximum global offset between frames, the key point acquisition stability calibration parameters and the second piecewise linear function The second key point offset regression coefficient:

f 2 (Diff max ) = 0, if Diff max ≤ 1;

f 2 (Diff max )=(Diff max -Stable thr )/(Stable thr -1)+1, else if 1<Diff max ≤Stable thr ;

f 2 (Diff max )=1, else if Diff max ≥ Stable thr ;

Among them, Diff max is used to represent the maximum global offset between frames, f 2 (x) is used to represent the second piecewise linear function, f 2 (Diff max ) is used to represent the second key point offset regression coefficient, Stable thr Collect stability calibration parameters for key points.
The key point processing method according to claim 8, wherein the following formula is used, according to the product of the filtered inter-frame offset and the second key point offset regression coefficient, and the previous frame image position, adjust the position of the key point in the current frame image:

dst pts = Rigid pre +f 2 (Diff max )×diff smooth

Wherein, dst pts is used to indicate the adjusted position of the key point in the current frame image, Rigid pre is used to indicate the position of the key point in the previous frame image, and f 2 (Diff max ) is used to indicate the first Two key point offset regression coefficients, diff smooth is used to represent the filtered inter-frame offset of the key point.
The key point processing method according to claim 1, wherein one or more of the following are satisfied:

The key points are human face key points;

The position of the key point is the sequence number of the pixel of the key point in the image.
A key point processing device is characterized in that it comprises:

The key point position detection module is used to collect the previous frame image and the current frame image in the video, and detect the positions of multiple key points respectively;

A filtering module, for each key point, determine the position offset of the previous frame image and the current frame image, and then filter the position offset to obtain the filtered inter-frame offset;

The position adjustment module is configured to, for each key point, adjust the position of the key point in the current frame image according to the filtered inter-frame offset and the position of the previous frame image.
A readable storage medium on which a computer program is stored, wherein the computer program executes the steps of the key point processing method according to any one of claims 1 to 14 when the computer program is run by a processor.
A terminal, comprising a memory and a processor, the memory stores a computer program that can run on the processor, wherein the processor executes any one of claims 1 to 14 when running the computer program. The steps of the key point processing method described in the item.