WO2022121577A1 - Image processing method and apparatus

Image processing method and apparatus

Info

Publication number
WO2022121577A1
Authority
WO
WIPO (PCT)
Prior art keywords
face
key point
target
image
relative distance
Prior art date
2020-12-10
Application number
PCT/CN2021/128769
Other languages
French (fr)
Chinese (zh)
Inventor
刘易周 (Liu Yizhou)
Original Assignee
北京达佳互联信息技术有限公司 (Beijing Dajia Internet Information Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co., Ltd. (北京达佳互联信息技术有限公司)
Publication of WO2022121577A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/20 - Analysis of motion
    • G06T7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformation in the plane of the image
    • G06T3/40 - Scaling the whole image or part thereof
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 - Feature extraction; Face representation
    • G06V40/171 - Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10016 - Video; Image sequence
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30196 - Human being; Person
    • G06T2207/30201 - Face
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present disclosure relates to the technical field of image processing, and in particular, to an image processing method, an apparatus, an electronic device, and a storage medium.
  • a deep neural network can be used to perform semantic segmentation on images captured statically or in real time by the terminal, to obtain image processing results such as face key points, a hairstyle-area mask map, and a facial-features mask map.
  • the processing results are used to achieve many creative effects, such as enlargement and dislocation of facial features, face stickers, and virtual makeup.
  • an image processing method is provided, including: collecting face video data, and using each frame of face image in the face video data as an image to be processed; performing face recognition on the image to be processed to obtain face key points; extracting reference key points of a preset area from the face key points, and determining an amplification factor according to the position information of the reference key points; and enlarging the face according to the amplification factor, and performing face tracking on the enlarged face according to the face motion information obtained by the face recognition.
  • the method further includes: obtaining a target key point in the central area of the face from the reference key points; determining, according to a preset correspondence between the position of the target key point and the target position, the target position corresponding to the target key point; and moving the face in the direction of the target position relative to the target key point until the target key point reaches the target position.
  • the preset correspondence between the position of the target key point and the target position includes: the position of the target key point includes a plurality of value intervals, each value interval corresponding to a rate of change; the rate of change is the degree to which the target position changes as the position of the target key point changes, and the value intervals are determined based on the size of the image capture page in a preset direction.
  • the size of the image capture page in the preset direction is divided to obtain a first value interval, a second value interval, and a third value interval connected in sequence; the rate of change is determined as follows: a coordinate value in the preset direction is obtained from the position of the target key point; in response to the coordinate value being located in the first value interval or the third value interval, the rate of change of the target position is a first rate of change; in response to the coordinate value being located in the second value interval, the rate of change of the target position is a second rate of change; the first rate of change is greater than the second rate of change.
  • the target key point is a nose tip key point.
  • the determining the magnification factor according to the position information of the reference key points includes: determining a first relative distance of a horizontal area and a second relative distance of a vertical area according to the position information of the reference key points; obtaining a three-dimensional face angle, the three-dimensional face angle including a pitch angle and a yaw angle; determining, according to the pitch angle and the yaw angle, a first weight corresponding to the first relative distance and a second weight corresponding to the second relative distance; obtaining the sum of the product of the first relative distance and the first weight and the product of the second relative distance and the second weight; and determining the magnification factor as the ratio of the width of the image capture page to the sum of the products.
  • the determining, according to the pitch angle and the yaw angle, the first weight corresponding to the first relative distance and the second weight corresponding to the second relative distance includes: determining the first weight as the ratio of the pitch angle to the sum of the pitch angle and the yaw angle; and determining the second weight as the ratio of the yaw angle to the sum of the pitch angle and the yaw angle.
  • the preset area is a T-shaped area of the human face, and the T-shaped area includes the central area of the forehead and the central area of the face;
  • the reference key points include a left eye key point, a right eye key point, a key point between the eyebrows, and a nose tip key point;
  • the determining of the first relative distance of the horizontal area and the second relative distance of the vertical area according to the position information of the reference key points includes: determining the first relative distance as the distance between the left eye key point and the right eye key point; and determining the second relative distance as the distance between the key point between the eyebrows and the nose tip key point.
  • an image processing apparatus is provided, including: an image acquisition module configured to collect face video data and use each frame of face image in the face video data as an image to be processed; a face recognition module configured to perform face recognition on the to-be-processed image to obtain face key points; a coefficient determination module configured to extract reference key points of a preset area from the face key points and determine an amplification factor according to the position information of the reference key points; and a face processing module configured to enlarge the face according to the amplification factor and perform face tracking on the enlarged face according to the face motion information obtained by the face recognition.
  • the apparatus further includes: a key point acquisition module configured to acquire a target key point in the central area of the face from the reference key points; a position determination module configured to determine, according to a preset correspondence between the position of the target key point and the target position, the target position corresponding to the target key point; and a moving module configured to move the face in the direction of the target position relative to the target key point until the target key point reaches the target position.
  • the preset correspondence between the position of the target key point and the target position includes: the position of the target key point includes a plurality of value intervals, each value interval corresponding to a rate of change; the rate of change is the degree to which the target position changes as the position of the target key point changes, and the value intervals are determined based on the size of the image capture page in a preset direction.
  • the size of the image capture page in the preset direction is divided to obtain a first value interval, a second value interval, and a third value interval connected in sequence; the rate of change is determined as follows: a coordinate value in the preset direction is obtained from the position of the target key point; when the coordinate value is located in the first value interval or the third value interval, the rate of change of the target position is a first rate of change; when the coordinate value is located in the second value interval, the rate of change of the target position is a second rate of change; the first rate of change is greater than the second rate of change.
  • the target key point is a nose tip key point.
  • the coefficient determination module includes: a distance determination unit configured to determine a first relative distance of a horizontal area and a second relative distance of a vertical area according to the position information of the reference key points; an angle acquisition unit configured to obtain a three-dimensional face angle, the three-dimensional face angle including a pitch angle and a yaw angle; a weight determination unit configured to determine, according to the pitch angle and the yaw angle, a first weight corresponding to the first relative distance and a second weight corresponding to the second relative distance; a computing unit configured to obtain the sum of the product of the first relative distance and the first weight and the product of the second relative distance and the second weight; and a coefficient determination unit configured to determine the enlargement coefficient as the ratio of the width of the image capture page to the sum of the products.
  • the weight determination unit is configured to determine the first weight as the ratio of the pitch angle to the sum of the pitch angle and the yaw angle, and to determine the second weight as the ratio of the yaw angle to the sum of the pitch angle and the yaw angle.
  • the preset area is a T-shaped area of the human face, and the T-shaped area includes the central area of the forehead and the central area of the face;
  • the reference key points include a left eye key point, a right eye key point, a key point between the eyebrows, and a nose tip key point;
  • the distance determination unit is configured to determine the first relative distance as the distance between the left eye key point and the right eye key point, and to determine the second relative distance as the distance between the key point between the eyebrows and the nose tip key point.
  • an electronic device is provided, including: a processor; and a memory for storing instructions executable by the processor; where the processor is configured to execute the instructions to implement the image processing method described in any one of the embodiments of the first aspect.
  • a storage medium is provided, where, when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the image processing method described in any one of the embodiments of the first aspect.
  • a computer program product is provided, including a computer program, the computer program being stored in a readable storage medium; at least one processor of a device reads and executes the computer program from the readable storage medium, so that the device executes the image processing method described in any one of the embodiments of the first aspect.
  • Each frame of face image in the collected face video data is used as an image to be processed, and face recognition is performed on the image to be processed to obtain face key points.
  • the reference key points of the preset area are extracted from the face key points.
  • the magnification factor is determined based on the position information of the reference key point in the preset area.
  • the face is enlarged according to the amplification factor, and face tracking is performed on the enlarged face according to the face motion information obtained by face recognition.
  • FIG. 1 is an application environment diagram of an image processing method according to an exemplary embodiment.
  • Fig. 2 is a flowchart of an image processing method according to an exemplary embodiment.
  • Fig. 3 is a flowchart showing a step of determining a target position according to an exemplary embodiment.
  • Fig. 4 is a schematic diagram of a piecewise function according to an exemplary embodiment.
  • Fig. 5 is a flowchart showing a step of determining an amplification factor according to an exemplary embodiment.
  • Fig. 6 is a flowchart of an image processing method according to an exemplary embodiment.
  • Fig. 7 is a schematic diagram of processing an image according to an exemplary embodiment.
  • Fig. 8 is a schematic diagram of processing an image according to an exemplary embodiment.
  • Fig. 9 is a block diagram of an image processing apparatus according to an exemplary embodiment.
  • Fig. 10 is an internal structure diagram of an electronic device according to an exemplary embodiment.
  • a deep neural network can be used to perform semantic segmentation on images captured statically or in real time by the terminal, to obtain image processing results such as face key points, a hairstyle-area mask map, and a facial-features mask map.
  • the processing results are used to achieve many creative effects, such as enlargement and dislocation of facial features, face stickers, and virtual makeup.
  • the face in the image can be focused, so that the face can be displayed at the center of the screen with a larger area.
  • the following two implementations are often used to focus on the human face:
  • the image processing method provided by the present disclosure can be applied to the application environment shown in FIG. 1 .
  • the terminal 110 is pre-deployed with a face pose estimation method and with image processing logic that supports image processing based on the face recognition result.
  • the face pose estimation method can be a deep learning model-based method, an appearance-based method, a classification-based method, and the like. The face pose estimation method and the image processing logic can be embedded in an application. Such applications include, but are not limited to, social applications, instant messaging applications, short video applications, and the like.
  • the terminal 110 collects face video data and uses each frame of face image in the face video data as an image to be processed. Face recognition is performed on the image to be processed to obtain face key points.
  • the reference key points of the preset area are extracted from the face key points, and the amplification factor is determined according to the position information of the reference key points; the face is enlarged according to the amplification factor, and face tracking is performed on the enlarged face according to the face motion information obtained by the face recognition.
  • the terminal 110 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • FIG. 2 is a flowchart of an image processing method according to an exemplary embodiment. As shown in FIG. 2 , the image processing method used in the terminal 110 includes the following steps.
  • step S210 face video data is collected, and each frame of face image in the face video data is used as an image to be processed.
  • the face video data can be acquired by the image acquisition device.
  • the image acquisition device may be a device provided in the terminal; it may also be an independent device, such as a camera, a video camera, and the like.
  • the client may automatically control the image acquisition device to acquire the user's face video data after receiving the image processing instruction.
  • the image processing instruction may be triggered by the user by clicking on a preset face processing control or the like.
  • the client takes the face image of the current frame in the face video data as the image to be processed, and processes the face image of the current frame in real time according to the content described in steps S220 to S240 while collecting the face video data.
  • the object to be processed may also be a human body part other than the face, such as a hand or a limb; it may even be another type of object, such as an animal, a building, or a star.
  • the to-be-processed image may also be a pre-shot still image stored in a local database or a server, or a still image captured in real time.
  • step S220 face recognition is performed on the image to be processed to obtain face key points.
  • a method based on a deep learning model can be used for face recognition of the image to be processed.
  • the deep learning model can be any model that can be used for face key point recognition, for example, a DCNN (deep convolutional neural network) and the like.
  • the face key points can be predefined, and there is at least one of them.
  • each sample image is annotated according to pre-defined key point related information (such as key point ranking, key point positions, etc.).
  • the labeled sample images are used to train the deep learning model to obtain a deep learning model that can output the position information of the key points of the face.
  • the client inputs the acquired image to be processed into the trained deep learning model in real time to obtain the face key points.
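
As a concrete illustration of this step, the sketch below runs a key point model on each frame with OpenCV's DNN module. It is a minimal sketch only: the model file name, input size, and output layout are assumptions for illustration, not part of the disclosure.

```python
import cv2
import numpy as np

# Hypothetical exported keypoint model; the disclosure only requires a deep
# learning model that outputs the position information of face key points.
net = cv2.dnn.readNetFromONNX("face_keypoints.onnx")

def detect_face_keypoints(frame_bgr: np.ndarray) -> np.ndarray:
    """Return an (N, 2) array of face key point pixel coordinates."""
    h, w = frame_bgr.shape[:2]
    # Normalize and resize to the model's expected input (assumed 128x128).
    blob = cv2.dnn.blobFromImage(frame_bgr, scalefactor=1.0 / 255, size=(128, 128))
    net.setInput(blob)
    out = net.forward()        # assumed shape (1, N*2), values in [0, 1]
    pts = out.reshape(-1, 2).astype(np.float32)
    pts[:, 0] *= w             # scale back to pixel coordinates
    pts[:, 1] *= h
    return pts
```
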
  • step S230 the reference key points of the preset area are extracted from the face key points, and the magnification coefficient is determined according to the position information of the reference key points.
  • the preset area refers to an area that can roughly represent the position of the face in the image to be processed, for example, it may be a contour area, a center area, and the like of the human face.
  • there are typically multiple reference key points in the preset area, although the case of a single reference key point is not excluded.
  • the magnification factor is used to amplify the human face, so as to increase the proportion of the human face in the image capture page, so as to achieve the effect of highlighting the human face.
  • the key point related information of the reference key point is pre-configured in the client.
  • the client can extract the reference key points from the face key points according to the key point related information of the reference key points.
  • the client obtains the location information of the reference key point.
  • the size of the preset area is calculated according to the position information of the reference key points.
  • the size of the preset area can be characterized by parameters such as the frame size of the preset area and the distance between key points in the preset area.
  • the amplification factor is obtained by a preset algorithm, which can be chosen according to the specific situation. For example, the size of the preset area can be compared with a preset constant to obtain the magnification factor; a matching magnification factor can be looked up from a preset correspondence; or a correlation function can be established by summarizing multiple experiments, and the magnification factor can be calculated from the size of the preset area through the correlation function.
  • step S240 the human face is enlarged according to the enlargement coefficient, and face tracking is performed on the enlarged human face according to the face motion information obtained by the face recognition.
  • the face motion information includes, but is not limited to, the face motion angle, the face motion trajectory, and the like; the face motion information can be output synchronously when the face key points are output by the deep learning model.
  • after obtaining the magnification factor, the client displays the face in the current to-be-processed image enlarged by the magnification factor. At the same time, the client obtains the face motion information of the current frame and controls the enlarged face in the to-be-processed image to move according to the face motion information, so as to achieve a real-time face tracking effect.
  • the process of face enlargement can be displayed through animation special effects.
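
Enlarging the face about a center point by the magnification factor can be expressed as a single affine warp. The sketch below is one possible realization with OpenCV; the function name and the use of warpAffine are illustrative choices, not prescribed by the disclosure.

```python
import cv2
import numpy as np

def enlarge_about_point(frame: np.ndarray, center: tuple, scale: float) -> np.ndarray:
    """Scale the frame by `scale` about `center`, so the face grows in place.

    For a point p, the warp computes p' = scale * p + (1 - scale) * center,
    which leaves `center` fixed while everything else moves away from it.
    """
    h, w = frame.shape[:2]
    cx, cy = center
    M = np.float32([[scale, 0.0, (1.0 - scale) * cx],
                    [0.0, scale, (1.0 - scale) * cy]])
    return cv2.warpAffine(frame, M, (w, h))
```
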
  • each frame of face image in the collected face video data is used as an image to be processed, and face recognition is performed on the image to be processed to obtain face key points.
  • the reference key points of the preset area are extracted from the face key points.
  • the magnification factor is determined based on the position information of the reference key point in the preset area.
  • the face is enlarged according to the amplification factor, and face tracking is performed on the enlarged face according to the face motion information obtained by face recognition.
  • the processing of the human face in the image may further include a process of moving the human face. This can be achieved by the following steps:
  • step S310 the target key point in the central area of the face is obtained from the reference key point.
  • step S320 the target position corresponding to the target key point is determined according to the preset correspondence between the position of the target key point and the target position.
  • step S330 the face is moved in the direction of the target position relative to the target key point until the target key point reaches the target position.
  • the target key point refers to the key point that can roughly represent the central position of the face.
  • the preset area can be the facial features area (including the areas of the eyes, nose, and mouth), the T-shaped area of the face (including the central area of the forehead and the central area of the face), the nose bridge area (the area between the forehead and the nose tip), and the like; the target key point can be selected correspondingly within the preset area, for example, a nose tip key point.
  • the target position refers to the position where the target key point is moved, and is used to display the face near the center of the image capture page without affecting the rendering effect of the face.
  • keypoint-related information of target keypoints may be pre-configured in the client.
  • the target key point is extracted from the face key points according to the key point related information of the target key point.
  • the target position corresponding to the target key point of the current frame is determined from the preset correspondence between the position of the target key point and the target position. The client then moves the face in the direction of the target position relative to the target key point until the target key point reaches the target position.
  • after or before the face is moved, the face can also be enlarged by the enlargement factor obtained in the above embodiments, with the target key point as the center.
  • the moving and enlarging process of the human face can be displayed through animation special effects.
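
One simple way to animate the movement toward the target position is to cover a fixed fraction of the remaining distance on each frame, which produces a smooth ease-out glide instead of a jump. The easing fraction below is an assumption for illustration; the disclosure only requires that the face keeps moving until the target key point reaches the target position.

```python
def step_toward_target(current: tuple, target: tuple, fraction: float = 0.3) -> tuple:
    """Advance the key point a fraction of the remaining distance per frame."""
    x, y = current
    tx, ty = target
    return (x + (tx - x) * fraction, y + (ty - y) * fraction)

# Per frame, the whole face is translated by the same offset that was applied
# to the target key point, so the face and the key point move together.
```
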
  • the target key points are extracted from the reference key points, and the target position is obtained based on the target key points.
  • on the one hand, the efficiency of face processing can be improved; on the other hand, since the target key point can roughly represent the central position of the face, using the target key point can also improve the accuracy of face processing.
  • the preset correspondence between the position of the target key point and the target position includes: the position of the target key point includes a plurality of value intervals, each value interval corresponds to a corresponding change rate, and the change rate is The degree to which the target position changes with the position of the target key point, and the value interval is determined based on the size of the image capture page in the preset direction.
  • the image collection page may be an image page displayed by the client.
  • the preset direction can be determined according to the shooting angle of the image to be processed. For example, when shooting in portrait (vertical-screen) mode, the preset direction may be the horizontal direction when the terminal device is held in portrait orientation.
  • the size in the preset direction can be characterized by the pixel size.
  • the rate of change is used to measure the degree to which the target position changes as the position of the target keypoints within the image to be processed changes.
  • the position of the target key point relative to the image to be processed may fall in an edge region or a central region of the image to be processed.
  • the value intervals of the edge area and the central area can be predefined.
  • the rate of change corresponding to the value interval of the edge area is different from the rate of change corresponding to the value interval of the central area.
  • the image processing process can present a better face tracking effect, and the presentation effect of the face will not be distorted.
  • the size of the image capture page in the preset direction is divided to obtain a first value interval, a second value interval, and a third value interval connected in sequence; the rate of change is determined in the following manner: the coordinate value in the preset direction is obtained from the position of the target key point; when the coordinate value is in the first value interval or the third value interval, the rate of change of the target position is the first rate of change; when the coordinate value is in the second value interval, the rate of change of the target position is the second rate of change; the first rate of change is greater than the second rate of change.
  • the first rate of change and/or the second rate of change may be constant or variable.
  • sequential connection means that the first value interval and the second value interval are connected end to end, and the second value interval and the third value interval are connected end to end.
  • the first value interval and the third value interval can be used to represent the edge areas of the image capture page.
  • the second value interval can be used to represent the central area of the image capture page.
  • for example, the resolution of the image acquisition device is 720*1280 px (pixels), where 720 is the horizontal pixel width and 1280 is the vertical pixel height when the terminal device is held in portrait orientation.
  • the preset direction is the horizontal direction when the terminal device is held in portrait orientation.
  • the pixel width 720 in the horizontal direction can be divided to obtain three value intervals connected in sequence, for example, divided into three value intervals of 0-200, 200-520, and 520-720.
  • the locations of target keypoints may be characterized by pixel coordinates.
  • the client obtains the coordinate value in the preset direction from the position of the target key point and determines which value interval the coordinate value belongs to. If it belongs to the first value interval or the third value interval, the target key point is located in the edge area of the image capture page, and the client obtains the target position according to the first rate of change; if the coordinate value is in the second value interval, the target key point is located in the central area of the image capture page, and the client obtains the target position according to the second rate of change. Since the first rate of change is greater than the second rate of change, the change range of the target position in the edge area is greater than that in the central area.
  • the pixels of the image to be processed are 720*1280px, and 720 is the horizontal pixel width when the terminal device is placed in a vertical screen.
  • the horizontal pixels are divided into three value ranges of 0-200, 200-520, and 520-720 which are connected in sequence.
  • the pixel coordinates of the target position in the horizontal direction can be obtained by the following piecewise function:
  • offset and centerPosX are the pixel coordinates of the target position in the horizontal direction; curPixelPosX is the pixel coordinates of the target key point in the horizontal direction.
  • the target position can also be obtained with reference to the piecewise function shown in FIG. 4 .
  • the pixel coordinates of the target position can also be mapped to the space of 0-1.
  • the target position changes rapidly from (0, y) to (0.5, y) with the position of the target key point;
  • the target position changes smoothly around (0.5, y) with the position change of the target key point, or even does not change.
  • y represents the coordinate value of the target key point in the vertical direction. It can be understood that although the rate of change of the piecewise function shown in FIG. 4 (which can be represented by a slope) is a constant, the rate of change of the piecewise function may also be an indefinite number in practical applications.
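
Since the piecewise function itself is not reproduced above, the sketch below shows one continuous mapping consistent with the described behavior for the 0-200, 200-520, and 520-720 intervals: a larger slope (the first rate of change) in the two edge intervals and a smaller slope (the second rate of change) in the central interval. All slope and anchor values are assumptions for illustration.

```python
def target_pos_x(cur_pixel_pos_x: float) -> float:
    """Map the target key point's x coordinate to the target position's x."""
    if cur_pixel_pos_x < 200:                 # first value interval (left edge)
        return cur_pixel_pos_x * 1.5          # assumed first rate of change
    elif cur_pixel_pos_x <= 520:              # second value interval (center)
        return 300 + (cur_pixel_pos_x - 200) * 0.375  # assumed second rate
    else:                                     # third value interval (right edge)
        return 420 + (cur_pixel_pos_x - 520) * 1.5
```

The branch constants are chosen so the mapping is continuous (the edge branches meet the central branch at x = 200 and x = 520) and covers the full 0-720 range, with the edge slope 1.5 greater than the central slope 0.375.
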
  • in this way, the efficiency of image processing can be improved, so that the image processing process can present a better face tracking effect, and the presentation effect of the face will not be distorted.
  • the amplification factor is determined according to the position information of the reference key point, which can be achieved by the following steps:
  • step S510 the first relative distance of the horizontal area and the second relative distance of the vertical area are determined according to the position information of the reference key point.
  • the horizontal area and the vertical area are obtained by dividing the preset area.
  • for the horizontal area, a group of key points located at the two ends of the horizontal area is acquired, and the first relative distance of the horizontal area is calculated from the position information of this group of key points.
  • similarly, for the vertical area, a group of key points located at the two ends of the vertical area is acquired, and the second relative distance of the vertical area is calculated from the position information of this group of key points.
  • the preset area is a T-shaped area of the human face, and the T-shaped area of the human face includes the central area of the forehead and the central area of the human face; the reference key points include the left eye key point, the right eye key point, the eyebrow key point and the tip of the nose key point.
  • the first relative distance may be the distance between the left eye key point and the right eye key point, which may be calculated according to the position information of the left eye key point and the right eye key point.
  • the second relative distance may be the distance between the key point between the eyebrows and the key point of the tip of the nose, which may be calculated according to the position information of the key point between the eyebrows and the key point of the tip of the nose.
  • step S520 a three-dimensional angle of the face is obtained, and the three-dimensional angle of the face includes a pitch angle and a yaw angle.
  • the three-dimensional angle of the face can be represented by Euler angles.
  • Euler angle refers to the rotation angle of an object around the three coordinate axes (x, y, z axis) of the coordinate system.
  • the Euler angles can be obtained by performing pose estimation on the face key points, for example, with the pose estimation algorithm of OpenCV (an open source computer vision library).
  • the Euler angle includes a pitch angle and a yaw angle.
  • the pitch angle (pitch) represents the angle that the object rotates around the x-axis
  • the yaw angle (yaw) represents the angle that the object rotates around the y-axis.
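
As one way to obtain the pitch and yaw, the sketch below feeds a handful of 2D key points and generic 3D face-model coordinates into OpenCV's solvePnP, then decomposes the resulting rotation into Euler angles. The 3D model values and the uncalibrated-camera approximation are assumptions commonly used in pose estimation examples, not values taken from the disclosure.

```python
import cv2
import numpy as np

# Generic 3D reference positions (arbitrary face-model units) for six key
# points; illustrative values, not from the disclosure.
MODEL_POINTS = np.float32([
    [0.0, 0.0, 0.0],        # nose tip
    [0.0, -63.6, -12.5],    # chin
    [-43.3, 32.7, -26.0],   # left eye outer corner
    [43.3, 32.7, -26.0],    # right eye outer corner
    [-28.9, -28.9, -24.1],  # left mouth corner
    [28.9, -28.9, -24.1],   # right mouth corner
])

def face_pitch_yaw(image_points: np.ndarray, frame_size: tuple) -> tuple:
    """Estimate (pitch, yaw) in degrees from six matching 2D key points."""
    h, w = frame_size
    focal = w  # common approximation when the camera is uncalibrated
    camera_matrix = np.float32([[focal, 0, w / 2],
                                [0, focal, h / 2],
                                [0, 0, 1]])
    dist_coeffs = np.zeros((4, 1))  # assume no lens distortion
    ok, rvec, _ = cv2.solvePnP(MODEL_POINTS, image_points.astype(np.float32),
                               camera_matrix, dist_coeffs)
    rot, _ = cv2.Rodrigues(rvec)
    # Euler decomposition: pitch is rotation about x, yaw about y.
    sy = np.sqrt(rot[0, 0] ** 2 + rot[1, 0] ** 2)
    pitch = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))
    yaw = np.degrees(np.arctan2(-rot[2, 0], sy))
    return pitch, yaw
```
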
  • step S530 a first weight corresponding to the first relative distance and a second weight corresponding to the second relative distance are determined according to the pitch angle and the yaw angle.
  • the ratio of the pitch angle to the sum of the pitch angle and the yaw angle can be used as the first weight A, that is: A = pitch / (pitch + yaw).
  • the ratio of the yaw angle to the sum of the pitch angle and the yaw angle can be used as the second weight B, that is: B = yaw / (pitch + yaw).
  • step S540 the sum of the product of the first relative distance and the first weight and the product of the second relative distance and the second weight is obtained.
  • step S550 the magnification factor is determined as the ratio of the width of the image capture page to the sum of the products.
  • after the client obtains the first relative distance of the horizontal area, the second relative distance of the vertical area, the first weight corresponding to the first relative distance, and the second weight corresponding to the second relative distance, it calculates the sum of the product of the first relative distance and the first weight and the product of the second relative distance and the second weight.
  • the sum of the products can be obtained by the following formula: scaleHelpValue = ewidth × A + nHeight × B, where scaleHelpValue represents the sum of the products, ewidth represents the first relative distance of the horizontal area, nHeight represents the second relative distance of the vertical area, A represents the first weight, and B represents the second weight.
  • the magnification factor can be obtained by the following formula: scaleValue = width / scaleHelpValue, where scaleValue represents the magnification factor and width represents the width of the image capture page in the preset direction.
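
Putting steps S510 to S550 together, a compact sketch of the magnification factor computation might look as follows. The identifiers mirror the formulas above (ewidth, nHeight, A, B, scaleHelpValue, scaleValue); taking absolute values of the angles is an added assumption so the weights stay well-defined when pitch or yaw is negative.

```python
import numpy as np

def magnification_factor(left_eye, right_eye, brow_center, nose_tip,
                         pitch, yaw, width):
    """Compute scaleValue = width / (ewidth * A + nHeight * B)."""
    ewidth = np.hypot(right_eye[0] - left_eye[0],
                      right_eye[1] - left_eye[1])      # first relative distance
    n_height = np.hypot(nose_tip[0] - brow_center[0],
                        nose_tip[1] - brow_center[1])  # second relative distance
    p, y = abs(pitch), abs(yaw)   # assumption: use magnitudes of the angles
    a = p / (p + y)               # first weight  A = pitch / (pitch + yaw)
    b = y / (p + y)               # second weight B = yaw  / (pitch + yaw)
    scale_help_value = ewidth * a + n_height * b
    return width / scale_help_value
```
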
  • the amplification factor can be quickly obtained according to the preconfigured calculation formula, which avoids the performance bottleneck caused by many key points, and at the same time speeds up the acquisition efficiency of the amplification factor.
  • Fig. 6 is a flowchart of an image processing method according to some embodiments.
  • the terminal is a user's handheld device with a built-in image acquisition device, such as a smart phone, a tablet computer, a portable wearable device, and the like.
  • the image to be processed is the face image of the current frame in the face video data collected by the user's handheld device. As shown in Figure 6, the following steps are included.
  • step S602 face video data is collected through the user's handheld device.
  • step S604 face recognition is performed on the face image of the current frame in the face video data by using a deep learning model to obtain face key points.
  • step S606 the three-dimensional face angle is obtained by performing face pose estimation according to the face key points.
  • the three-dimensional angle of the face is represented by Euler angles, including pitch angle and yaw angle.
  • step S608 the reference key points of the T-shaped area are extracted from the face key points.
  • the reference key points include the left eye key point, the right eye key point, the nose tip key point, and the key point between the eyebrows.
  • step S610 the first relative distance of the horizontal area is obtained by calculating according to the position information of the left eye key point and the right eye key point.
  • the second relative distance of the vertical area is calculated according to the position information of the nose tip key point and the key point between the eyebrows.
  • step S612 the ratio of the pitch angle to the sum of the pitch angle and the yaw angle is used as the first weight; the ratio of the yaw angle to the sum of the pitch angle and the yaw angle is used as the second weight.
  • step S614 the product of the first relative distance and the first weight and the sum of the product of the second relative distance and the second weight are obtained.
  • step S616 the magnification factor is determined as the ratio of the width of the image capture page to the sum of the products.
  • step S618 the target position corresponding to the position of the nose tip key point in the current frame is determined from the preset correspondence between the position of the nose tip key point and the target position.
  • the correspondence between the position of the key point of the nose tip and the target position can be represented by a piecewise function, and the specific implementation of the piecewise function can refer to the above-mentioned embodiments, which will not be described in detail here.
  • step S620 the human face is moved until the nose tip key point reaches the target position; then, taking the nose tip key point as the center, the face is enlarged by the magnification factor.
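
Steps S618 and S620 can be realized as one combined affine transform: first translate the image so the nose tip lands on the target position, then scale about that position. The composition below is a hedged sketch under those assumptions, not the disclosed implementation.

```python
import cv2
import numpy as np

def move_and_enlarge(frame, nose_tip, target_pos, scale):
    """Translate so the nose tip reaches target_pos, then scale about it."""
    h, w = frame.shape[:2]
    tx = target_pos[0] - nose_tip[0]   # translation that moves the nose tip
    ty = target_pos[1] - nose_tip[1]
    cx, cy = target_pos                # nose tip position after translation
    # Composition of translate-then-scale-about-(cx, cy):
    # p' = scale * (p + t) + (1 - scale) * c
    M = np.float32([[scale, 0.0, scale * tx + (1.0 - scale) * cx],
                    [0.0, scale, scale * ty + (1.0 - scale) * cy]])
    return cv2.warpAffine(frame, M, (w, h))
```
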
  • FIG. 7 is a schematic diagram obtained by processing a human face by the method in this embodiment
  • FIG. 8 is a schematic diagram obtained by using a single-key-point processing method in the related art. Comparing FIG. 7 and FIG. 8, it can be seen that for the same original image, the single-key-point processing method in the related art may not be stable enough and is prone to distortion (for example, the left ear is overly stretched). By means of the present disclosure, the operating pressure of the device can be reduced, and a better image processing effect can be obtained.
  • although the steps in the above flowcharts are displayed in sequence according to the arrows, these steps are not necessarily executed in the sequence indicated by the arrows. Unless explicitly stated herein, there is no strict order for the execution of these steps, and they may be performed in other orders. Moreover, at least some of the steps in the above flowcharts may include multiple sub-steps or stages; these sub-steps or stages are not necessarily executed at the same time and may be executed at different times, and their execution order is not necessarily sequential: they may be performed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.
  • FIG. 9 is a block diagram of an image processing apparatus 900 according to some embodiments.
  • the apparatus 900 includes an image acquisition module 901 , a face recognition module 902 , a coefficient determination module 903 and a face processing module 904 .
  • the image acquisition module 901 is configured to collect face video data and use each frame of face image in the face video data as an image to be processed; the face recognition module 902 is configured to perform face recognition on the image to be processed to obtain face key points; the coefficient determination module 903 is configured to extract the reference key points of the preset area from the face key points and determine the amplification factor according to the position information of the reference key points; the face processing module 904 is configured to enlarge the face according to the amplification factor and perform face tracking on the enlarged face according to the face motion information obtained by the face recognition.
  • the apparatus further includes: a key point acquisition module configured to acquire a target key point in the central area of the face from the reference key points; a position determination module configured to determine, according to the preset correspondence between the position of the target key point and the target position, the target position corresponding to the target key point; and a moving module configured to move the face in the direction of the target position relative to the target key point until the target key point reaches the target position.
  • the preset correspondence between the position of the target key point and the target position includes: the position of the target key point includes a plurality of value intervals, each value interval corresponds to a corresponding change rate, and the change rate is The degree to which the target position changes with the position of the target key point, and the value interval is determined based on the size of the image capture page in the preset direction.
  • the size of the image capture page in the preset direction is divided to obtain a first value interval, a second value interval, and a third value interval connected in sequence; the rate of change is determined in the following manner: the coordinate value in the preset direction is obtained from the position of the target key point; when the coordinate value is in the first value interval or the third value interval, the rate of change of the target position is the first rate of change; when the coordinate value is in the second value interval, the rate of change of the target position is the second rate of change; the first rate of change is greater than the second rate of change.
  • the target keypoint is a nose tip keypoint.
  • the coefficient determination module 903 includes: a distance determination unit configured to determine a first relative distance of the horizontal area and a second relative distance of the vertical area according to the position information of the reference key points; an angle acquisition unit configured to obtain the three-dimensional face angle, the three-dimensional face angle including a pitch angle and a yaw angle; a weight determination unit configured to determine, according to the pitch angle and the yaw angle, a first weight corresponding to the first relative distance and a second weight corresponding to the second relative distance; a calculation unit configured to obtain the sum of the product of the first relative distance and the first weight and the product of the second relative distance and the second weight; and a coefficient determination unit configured to determine the magnification factor as the ratio of the width of the image capture page to the sum of the products.
  • the weight determination unit is configured to determine the first weight as the ratio of the pitch angle to the sum of the pitch angle and the yaw angle, and to determine the second weight as the ratio of the yaw angle to the sum of the pitch angle and the yaw angle.
  • the preset area is a T-shaped area of the human face, and the T-shaped area of the human face includes the central area of the forehead and the central area of the human face;
  • the reference key points include the left eye key point, the right eye key point, the key point between the eyebrows, and the nose tip key point;
  • the distance determination unit is configured to determine the first relative distance as the distance between the left eye key point and the right eye key point, and to determine the second relative distance as the distance between the key point between the eyebrows and the nose tip key point.
  • Fig. 10 shows a block diagram of a device 1000 for image processing according to some embodiments.
  • device 1000 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, fitness device, personal digital assistant, or the like.
  • a device 1000 may include one or more of the following components: a processing component 1002, a memory 1004, a power supply component 1006, a multimedia component 1008, an audio component 1010, an input/output (I/O) interface 1012, a sensor component 1014, and Communication component 1016.
  • the processing component 1002 generally controls the overall operation of the device 1000, such as operations associated with display, phone calls, data communications, camera operations, and recording operations.
  • the processing component 1002 can include one or more processors 1020 to execute instructions to perform all or some of the steps of the methods described above.
  • processing component 1002 may include one or more modules that facilitate interaction between processing component 1002 and other components.
  • processing component 1002 may include a multimedia module to facilitate interaction between multimedia component 1008 and processing component 1002.
  • Memory 1004 is configured to store various types of data to support operation at device 1000. Examples of such data include instructions for any application or method operating on device 1000, contact data, phonebook data, messages, pictures, videos, and the like. Memory 1004 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
  • Power supply assembly 1006 provides power to various components of device 1000 .
  • Power supply components 1006 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to device 1000 .
  • Multimedia component 1008 includes a screen that provides an output interface between the device 1000 and the user.
  • the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user.
  • the touch panel includes one or more touch sensors to sense touch, swipe, and gestures on the touch panel. The touch sensor may not only sense the boundaries of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe action.
  • the multimedia component 1008 includes a front-facing camera and/or a rear-facing camera. When the device 1000 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each of the front and rear cameras can be a fixed optical lens system or have focal length and optical zoom capability.
  • Audio component 1010 is configured to output and/or input audio signals.
  • audio component 1010 includes a microphone (MIC) that is configured to receive external audio signals when device 1000 is in operating modes, such as call mode, recording mode, and voice recognition mode.
  • the received audio signal may be further stored in memory 1004 or transmitted via communication component 1016 .
  • audio component 1010 also includes a speaker for outputting audio signals.
  • the I/O interface 1012 provides an interface between the processing component 1002 and a peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to: home button, volume buttons, start button, and lock button.
  • Sensor assembly 1014 includes one or more sensors for providing status assessment of various aspects of device 1000 .
  • the sensor component 1014 can detect the open/closed state of the device 1000 and the relative positioning of components, such as the display and keypad of the device 1000; the sensor component 1014 can also detect a change in the position of the device 1000 or of a component of the device 1000, the presence or absence of user contact with the device 1000, the orientation or acceleration/deceleration of the device 1000, and temperature changes of the device 1000.
  • Sensor assembly 1014 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact.
  • Sensor assembly 1014 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor assembly 1014 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 1016 is configured to facilitate wired or wireless communication between device 1000 and other devices.
  • Device 1000 may access wireless networks based on communication standards, such as WiFi, carrier networks (such as 2G, 3G, 4G, or 5G), or a combination thereof.
  • the communication component 1016 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component 1016 also includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • device 1000 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
  • non-transitory computer-readable storage medium including instructions, such as memory 1004 including instructions, executable by the processor 1020 of the device 1000 to perform the method described above.
  • the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like.

Abstract

An image processing method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring face video data and using each face image frame in the face video data as an image to be processed; performing face recognition on the image to be processed, so as to obtain key points of a face; extracting a reference key point of a preset region from the key points of the face, and determining an amplification coefficient according to position information of the reference key point; and amplifying the face according to the amplification coefficient, and performing face tracking on the amplified face according to face motion information obtained from face recognition.

Description

图像处理方法及装置Image processing method and device
相关申请的交叉引用CROSS-REFERENCE TO RELATED APPLICATIONS
本申请基于申请号为202011434480.4、申请日为2020年12月10日的中国专利申请提出,并要求该中国专利申请的优先权,该中国专利申请的全部内容在此引入本申请作为参考。This application is based on the Chinese patent application with the application number of 202011434480.4 and the filing date of December 10, 2020, and claims the priority of the Chinese patent application. The entire content of the Chinese patent application is incorporated herein by reference.
技术领域technical field
本公开涉及图像处理技术领域,尤其涉及一种图像处理方法、装置、电子设备及存储介质。The present disclosure relates to the technical field of image processing, and in particular, to an image processing method, an apparatus, an electronic device, and a storage medium.
背景技术Background technique
随着智能终端的普及以及图像处理技术的发展,越来越多的应用程序可以对图像中的人脸进行处理,以达到需要的效果,例如,智能美颜、魔法特效、人脸追踪等。而随着智能终端软硬件的飞速发展,实时渲染技术在智能终端的应用变得越来越广,在智能终端进行这些效果的实时展现也变为可能。例如,可以通过深度神经网络对终端静态、亦或实时拍摄的图像进行语义分割,获得人脸的关键点、发型区域掩码图、五官位置的掩码图等图像处理结果,利用所得到的图像处理结果实现很多有创意的效果,例如,五官的放大、错位,人脸的贴纸、贴妆等。With the popularization of smart terminals and the development of image processing technology, more and more applications can process faces in images to achieve desired effects, such as smart beauty, magic special effects, and face tracking. With the rapid development of software and hardware of intelligent terminals, the application of real-time rendering technology in intelligent terminals has become more and more extensive, and it is also possible to display these effects in real time in intelligent terminals. For example, a deep neural network can be used to perform semantic segmentation on the images captured by the terminal or in real time to obtain image processing results such as key points of the face, the mask map of the hairstyle area, and the mask map of the facial features. The processing results achieve many creative effects, such as the enlargement and dislocation of facial features, face stickers, and makeup.
SUMMARY OF THE INVENTION
According to a first aspect of the embodiments of the present disclosure, an image processing method is provided, including: collecting face video data, and using each frame of face image in the face video data as an image to be processed; performing face recognition on the image to be processed to obtain face key points; extracting reference key points of a preset region from the face key points, and determining a magnification factor according to position information of the reference key points; and magnifying the face according to the magnification factor, and performing face tracking on the magnified face according to face motion information obtained by the face recognition.
In one embodiment, the method further includes: obtaining a target key point in a central region of the face from the reference key points; determining a target position corresponding to the target key point according to a preset correspondence between positions of the target key point and target positions; and moving the face in the direction of the target position relative to the target key point until the target key point reaches the target position.
In one embodiment, the preset correspondence between positions of the target key point and target positions includes: the position of the target key point falls into one of a plurality of value intervals, each value interval corresponding to a respective change rate, where the change rate is the degree to which the target position changes as the position of the target key point changes, and the value intervals are determined based on the size of the image capture page in a preset direction.
In one embodiment, the size of the image capture page in the preset direction is divided to obtain a first value interval, a second value interval, and a third value interval that adjoin in sequence, and the change rate is determined as follows: a coordinate value in the preset direction is obtained from the position of the target key point; in response to the coordinate value falling within the first value interval or the third value interval, the change rate of the target position is a first change rate; in response to the coordinate value falling within the second value interval, the change rate of the target position is a second change rate; and the first change rate is greater than the second change rate.
In one embodiment, the target key point is a nose tip key point.
In one embodiment, determining the magnification factor according to the position information of the reference key points includes: determining a first relative distance of a horizontal region and a second relative distance of a vertical region according to the position information of the reference key points; obtaining a three-dimensional face angle, the three-dimensional face angle including a pitch angle and a yaw angle; determining, according to the pitch angle and the yaw angle, a first weight corresponding to the first relative distance and a second weight corresponding to the second relative distance; obtaining the sum of the product of the first relative distance and the first weight and the product of the second relative distance and the second weight; and determining the magnification factor as the ratio of the width of the image capture page to the sum of the products.
In one embodiment, determining the first weight corresponding to the first relative distance and the second weight corresponding to the second relative distance according to the pitch angle and the yaw angle includes: determining the first weight as the ratio of the pitch angle to the sum of the pitch angle and the yaw angle; and determining the second weight as the ratio of the yaw angle to the sum of the pitch angle and the yaw angle.
In one embodiment, the preset region is a T-shaped region of the face, the T-shaped region including a central forehead region and a central face region; the reference key points include a left-eye key point, a right-eye key point, a key point between the eyebrows, and a nose tip key point; and determining the first relative distance of the horizontal region and the second relative distance of the vertical region according to the position information of the reference key points includes: determining the first relative distance as the distance between the left-eye key point and the right-eye key point; and determining the second relative distance as the distance between the key point between the eyebrows and the nose tip key point.
According to a second aspect of the embodiments of the present disclosure, an image processing apparatus is provided, including: an image acquisition module configured to collect face video data and use each frame of face image in the face video data as an image to be processed; a face recognition module configured to perform face recognition on the image to be processed to obtain face key points; a factor determination module configured to extract reference key points of a preset region from the face key points and determine a magnification factor according to position information of the reference key points; and a face processing module configured to magnify the face according to the magnification factor and perform face tracking on the magnified face according to face motion information obtained by the face recognition.
In one embodiment, the apparatus further includes: a key point obtaining module configured to obtain a target key point in a central region of the face from the reference key points; a position determination module configured to determine a target position corresponding to the target key point according to a preset correspondence between positions of the target key point and target positions; and a moving module configured to move the face in the direction of the target position relative to the target key point until the target key point reaches the target position.
In one embodiment, the preset correspondence between positions of the target key point and target positions includes: the position of the target key point falls into one of a plurality of value intervals, each value interval corresponding to a respective change rate, where the change rate is the degree to which the target position changes as the position of the target key point changes, and the value intervals are determined based on the size of the image capture page in a preset direction.
In one embodiment, the size of the image capture page in the preset direction is divided to obtain a first value interval, a second value interval, and a third value interval that adjoin in sequence, and the change rate is determined as follows: a coordinate value in the preset direction is obtained from the position of the target key point; in response to the coordinate value falling within the first value interval or the third value interval, the change rate of the target position is a first change rate; in response to the coordinate value falling within the second value interval, the change rate of the target position is a second change rate; and the first change rate is greater than the second change rate.
In one embodiment, the target key point is a nose tip key point.
In one embodiment, the factor determination module includes: a distance determination unit configured to determine a first relative distance of a horizontal region and a second relative distance of a vertical region according to the position information of the reference key points; an angle obtaining unit configured to obtain a three-dimensional face angle, the three-dimensional face angle including a pitch angle and a yaw angle; a weight determination unit configured to determine, according to the pitch angle and the yaw angle, a first weight corresponding to the first relative distance and a second weight corresponding to the second relative distance; a calculation unit configured to obtain the sum of the product of the first relative distance and the first weight and the product of the second relative distance and the second weight; and a factor determination unit configured to determine the magnification factor as the ratio of the width of the image capture page to the sum of the products.
In one embodiment, the weight determination unit is configured to determine the first weight as the ratio of the pitch angle to the sum of the pitch angle and the yaw angle, and to determine the second weight as the ratio of the yaw angle to the sum of the pitch angle and the yaw angle.
In one embodiment, the preset region is a T-shaped region of the face, the T-shaped region including a central forehead region and a central face region; the reference key points include a left-eye key point, a right-eye key point, a key point between the eyebrows, and a nose tip key point; and the distance determination unit is configured to determine the first relative distance as the distance between the left-eye key point and the right-eye key point, and to determine the second relative distance as the distance between the key point between the eyebrows and the nose tip key point.
According to a third aspect of the embodiments of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing instructions executable by the processor, where the processor is configured to execute the instructions to implement the image processing method described in any one of the embodiments of the first aspect.
According to a fourth aspect of the embodiments of the present disclosure, a storage medium is provided. When instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the image processing method described in any one of the embodiments of the first aspect.
According to a fifth aspect of the embodiments of the present disclosure, a computer program product is provided. The program product includes a computer program stored in a readable storage medium. At least one processor of a device reads and executes the computer program from the readable storage medium, so that the device executes the image processing method described in any one of the embodiments of the first aspect.
Each frame of face image in the collected face video data is used as an image to be processed, and face recognition is performed on the image to be processed to obtain face key points. Then, reference key points of a preset region are extracted from the face key points, and a magnification factor is determined based on the position information of the reference key points in the preset region. Finally, the face is magnified according to the magnification factor, and face tracking is performed on the magnified face according to the face motion information obtained by the face recognition. Using only the reference key points of the preset region avoids the performance bottleneck caused by a large number of key points, while determining the magnification factor with multiple key points of the preset region as a reference also ensures the accuracy of the face magnification.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure; they do not unduly limit the present disclosure.
FIG. 1 is a diagram of an application environment of an image processing method according to an exemplary embodiment.
FIG. 2 is a flowchart of an image processing method according to an exemplary embodiment.
FIG. 3 is a flowchart of a step of determining a target position according to an exemplary embodiment.
FIG. 4 is a schematic diagram of a piecewise function according to an exemplary embodiment.
FIG. 5 is a flowchart of a step of determining a magnification factor according to an exemplary embodiment.
FIG. 6 is a flowchart of an image processing method according to an exemplary embodiment.
FIG. 7 is a schematic diagram of processing an image according to an exemplary embodiment.
FIG. 8 is a schematic diagram of processing an image according to an exemplary embodiment.
FIG. 9 is a block diagram of an image processing apparatus according to an exemplary embodiment.
FIG. 10 is a diagram of the internal structure of an electronic device according to an exemplary embodiment.
DETAILED DESCRIPTION
In order to enable those of ordinary skill in the art to better understand the technical solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings.
It should be noted that the terms "first", "second", and the like in the description and claims of the present disclosure and in the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific sequence or order. It should be understood that data so used may be interchanged where appropriate, so that the embodiments of the present disclosure described herein can be implemented in sequences other than those illustrated or described herein. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as recited in the appended claims.
With the popularization of smart terminals and the development of image processing technology, more and more applications can process the faces in images to achieve desired effects, such as smart beautification, magic special effects, and face tracking. With the rapid development of the software and hardware of smart terminals, real-time rendering technology is being applied to smart terminals ever more widely, and displaying these effects in real time on a smart terminal has become possible. For example, a deep neural network can perform semantic segmentation on images captured by the terminal, either statically or in real time, to obtain image processing results such as face key points, a hairstyle-region mask map, and a facial-feature-position mask map. These results can be used to achieve many creative effects, such as enlarging or dislocating facial features, and applying stickers or makeup to the face.
In order to improve the face tracking accuracy or improve the presentation of the face, the face in an image can be focused so that it is displayed at the center of the screen with a larger area. In the related art, the following two approaches are commonly used to focus on the face:
(1) Obtain all the key point data of the face; transfer all the face key point data from the CPU (Central Processing Unit) to the GPU (Graphics Processing Unit); create a critical region of a virtual box based on all the key point data; and focus the critical region of the virtual box to the center of the screen. Although this approach yields a good presentation effect, the total number of face key points is large (up to hundreds), and transferring all the face key point data from the CPU to the GPU has a certain impact on the performance of the device.
(2) Obtain the data of a single face key point, and directly perform an image warp operation according to the single face key point to focus the face to the center of the screen. Because few face key points are used, this approach produces a less stable presentation effect.
Therefore, in the related art, it is difficult to achieve both performance and accuracy.
The image processing method provided by the present disclosure can be applied to the application environment shown in FIG. 1. A face pose estimation method for estimating face poses and image processing logic supporting image processing based on face recognition results are pre-deployed in the terminal 110. The face pose estimation method may be a method based on a deep learning model, an appearance-based method, a classification-based method, or the like. The face pose estimation method and the image processing logic may be embedded in an application. The application is not limited to a social application, an instant messaging application, a short video application, or the like. In some embodiments, the terminal 110 collects face video data and uses each frame of face image in the face video data as an image to be processed; performs face recognition on the image to be processed to obtain face key points; extracts reference key points of a preset region from the face key points and determines a magnification factor according to the position information of the reference key points; and magnifies the face according to the magnification factor and performs face tracking on the magnified face according to the face motion information obtained by the face recognition. The terminal 110 may be, but is not limited to, a personal computer, a laptop, a smartphone, a tablet computer, or a portable wearable device.
FIG. 2 is a flowchart of an image processing method according to an exemplary embodiment. As shown in FIG. 2, the image processing method is used in the terminal 110 and includes the following steps.
In step S210, face video data is collected, and each frame of face image in the face video data is used as an image to be processed.
The face video data can be collected by an image acquisition apparatus. The image acquisition apparatus may be an apparatus provided in the terminal, or an independent apparatus such as a camera or a video camera.
In some embodiments, the client may automatically control the image acquisition apparatus to collect the user's face video data after receiving an image processing instruction. The image processing instruction may be triggered by the user, for example by tapping a preset face processing control. The client takes the current frame of face image in the face video data as the image to be processed, and processes the current frame in real time according to steps S220 to S240 while the face video data is being collected.
In some embodiments, the object to be processed may also be a human body part other than the face, such as a hand or a limb, or even another category of object, such as an animal, a building, or a celestial body.
In some embodiments, the image to be processed may also be a pre-captured static image stored in a local database or a server, or a static image captured in real time.
In step S220, face recognition is performed on the image to be processed to obtain face key points.
Face recognition on the image to be processed may adopt a method based on a deep learning model. The deep learning model may be any model that can be used for face key point recognition, for example, a DCNN (Deep Convolutional Neural Network). The face key points may be predefined, and their number is at least one. During training of the deep learning model, each sample image is annotated according to predefined key point information (for example, key point ordering and key point positions). The annotated sample images are used to train the deep learning model to obtain a model that can output the position information of the face key points. In some embodiments, the client inputs the acquired image to be processed into the trained deep learning model in real time to obtain the face key points.
In step S230, reference key points of a preset region are extracted from the face key points, and a magnification factor is determined according to the position information of the reference key points.
The preset region is a region that can roughly represent the position of the face in the image to be processed; for example, it may be the contour region or the central region of the face. The preset region may contain multiple reference key points, where "multiple" does not exclude the case of a single key point.
The magnification factor is used to magnify the face so as to increase the proportion of the face in the image capture page, thereby highlighting the face.
In some embodiments, the key point information of the reference key points is pre-configured in the client. After obtaining the face key points output by the deep learning model, the client can extract the reference key points from the face key points according to the key point information of the reference key points. The client obtains the position information of the reference key points and calculates the size of the preset region from it. The size of the preset region can be characterized by parameters such as the size of the bounding box of the preset region and the distances between key points in the preset region. The magnification factor is then obtained by a preset algorithm, which depends on the specific situation. For example, the size of the preset region may be compared with a preset constant to obtain the magnification factor; or a correspondence between sizes of the preset region and magnification factors may be established in advance and the matching magnification factor retrieved from it; or a correlation function may be established by summarizing multiple experiments and used to calculate the magnification factor from the size of the preset region.
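As an illustration of the correspondence-table option above, the following Python sketch looks up a magnification factor from the computed preset-region size. The breakpoints, factor values, and function names are invented for the example and are not taken from the disclosure.

# Illustrative correspondence table: preset-region size (px) -> magnification
# factor; both the breakpoints and the factors are invented for the example.
SIZE_TO_FACTOR = [(80, 3.0), (160, 2.0), (240, 1.5)]

def factor_from_table(region_size_px):
    # Return the magnification factor matching the preset-region size,
    # mimicking the "pre-established correspondence" option described above.
    for max_size, factor in SIZE_TO_FACTOR:
        if region_size_px <= max_size:
            return factor
    return 1.0  # region already large enough; no magnification needed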
In step S240, the face is magnified according to the magnification factor, and face tracking is performed on the magnified face according to the face motion information obtained by the face recognition.
The face motion information includes, but is not limited to, the face motion angle, the face motion trajectory, and the like, and can be output by the deep learning model synchronously with the face key points.
In some embodiments, after obtaining the magnification factor, the client displays the face in the current frame of the image to be processed magnified by the magnification factor. At the same time, the client obtains the face motion information of the current frame and controls the magnified face in the image to be processed to move according to the face motion information, so as to achieve real-time face tracking.
Further, in order to make the image processing function more comprehensive, the face magnification process can be displayed through animated special effects.
In the above image processing method, each frame of face image in the collected face video data is used as an image to be processed, and face recognition is performed on the image to be processed to obtain face key points. Then, reference key points of a preset region are extracted from the face key points, and a magnification factor is determined based on the position information of the reference key points in the preset region. Finally, the face is magnified according to the magnification factor, and face tracking is performed on the magnified face according to the face motion information obtained by the face recognition. Using only the reference key points of the preset region avoids the performance bottleneck caused by a large number of key points, while determining the magnification factor with multiple key points of the preset region as a reference also ensures the accuracy of the face magnification.
In an exemplary embodiment, as shown in FIG. 3, processing the face in the image may further include a process of moving the face, which can be achieved through the following steps.
In step S310, a target key point in the central region of the face is obtained from the reference key points.
In step S320, a target position corresponding to the target key point is determined according to a preset correspondence between positions of the target key point and target positions.
In step S330, the face is moved in the direction of the target position relative to the target key point until the target key point reaches the target position.
The target key point is a key point that can roughly represent the center of the face. For example, the preset region may be the facial-features region (the region containing the eyes, nose, and mouth), the T-shaped region (including the central region of the forehead and the central region of the face, that is, the region formed by the forehead and the bridge of the nose), or the nose bridge region (the region from the forehead to the tip of the nose); the target key point may then be selected from the preset region accordingly, for example, the nose tip key point, a nose wing key point, or an upper lip key point.
The target position is the position at which the target key point is located after the movement; it is used to display the face near the center of the image capture page without affecting the presentation of the face.
In some embodiments, the key point information of the target key point may be pre-configured in the client. After the client obtains the face key points, it extracts the target key point from the face key points according to the key point information of the target key point. Then, according to the position information of the target key point, the target position corresponding to the target key point of the current frame is determined from the preset correspondence between positions of the target key point and target positions. The client moves the face in the direction of the target position relative to the target key point until the target key point reaches the target position.
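A minimal Python sketch of this moving step follows; the function names and the smoothing fraction are illustrative and not taken from the disclosure.

def move_offset(target_key_point, target_position):
    # Translation that places the target key point exactly on the target
    # position; a renderer would apply this offset to the face region.
    dx = target_position[0] - target_key_point[0]
    dy = target_position[1] - target_key_point[1]
    return dx, dy

def step_towards(current, target, t=0.3):
    # Optional per-frame smoothing: cover a fraction t of the remaining
    # distance each frame instead of jumping straight to the target, which
    # gives the animated transition mentioned below.
    return (current[0] + t * (target[0] - current[0]),
            current[1] + t * (target[1] - current[1]))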
Further, after the face is moved or before the face is moved, the face may also be magnified by the magnification factor obtained in the above embodiments, with the target key point as the center.
Further, in order to make the image processing function more comprehensive, the movement and magnification of the face can be displayed through animated special effects.
In this embodiment, the target key point is extracted from the reference key points and the target position is obtained with the target key point as the reference. On the one hand, this improves the efficiency of face processing; on the other hand, since the target key point is a point that can roughly represent the center of the face, using the target key point also improves the accuracy of face processing.
In some embodiments, the preset correspondence between positions of the target key point and target positions includes: the position of the target key point falls into one of a plurality of value intervals, each value interval corresponding to a respective change rate, where the change rate is the degree to which the target position changes as the position of the target key point changes, and the value intervals are determined based on the size of the image capture page in a preset direction.
The image capture page may be the image page displayed by the client.
The preset direction may be determined according to the shooting orientation of the image to be processed. For example, if the image is shot in portrait orientation, the preset direction may be the horizontal direction when the terminal device is held in portrait orientation. The size in the preset direction can be characterized by a pixel size.
The change rate measures the degree to which the target position changes as the position of the target key point within the image to be processed changes.
In some embodiments, the position of the target key point relative to the image to be processed may fall in an edge region or the central region of the image to be processed. The value intervals of the edge regions and the central region can be predefined. The change rate when the target key point is within a value interval of an edge region differs from the change rate within the value interval corresponding to the central region.
In this embodiment, by configuring the change rate according to the position of the target key point relative to the image to be processed, the image processing can present a good face tracking effect without distorting the presentation of the face.
In some embodiments, the size of the image capture page in the preset direction is divided to obtain a first value interval, a second value interval, and a third value interval that adjoin in sequence. The change rate is determined as follows: a coordinate value in the preset direction is obtained from the position of the target key point; when the coordinate value falls within the first value interval or the third value interval, the change rate of the target position is a first change rate; when the coordinate value falls within the second value interval, the change rate of the target position is a second change rate; the first change rate is greater than the second change rate.
The first change rate and/or the second change rate may be a constant or may vary.
Adjoining in sequence means that the first value interval and the second value interval connect end to end, and the second value interval and the third value interval connect end to end. The first value interval and the third value interval can be used to represent the edge regions of the image capture page, and the second value interval can be used to represent the central region of the image capture page.
In some embodiments, if the resolution of the image acquisition apparatus is 720*1280 px, 720 is the horizontal pixel width when the terminal device is held in portrait orientation and 1280 is the vertical pixel height. The preset direction is the horizontal direction when the terminal device is held in portrait orientation. The horizontal pixel width of 720 can then be divided into three value intervals that adjoin in sequence, for example 0-200, 200-520, and 520-720.
In some embodiments, the position of the target key point can be characterized by pixel coordinates. The client obtains the coordinate value in the preset direction from the position of the target key point and determines which value interval the coordinate value belongs to. If it belongs to the first value interval or the third value interval, the target key point is located in an edge region of the image capture page, and the client obtains the target position according to the first change rate; if the coordinate value falls within the second value interval, the target key point is located in the central region of the image capture page, and the client obtains the target position according to the second change rate. Since the first change rate is greater than the second change rate, the target position will change by a larger amount in the edge regions than in the central region.
In a specific embodiment, the resolution of the image to be processed is 720*1280 px, where 720 is the horizontal pixel width when the terminal device is held in portrait orientation. The horizontal pixel width is divided into three value intervals that adjoin in sequence: 0-200, 200-520, and 520-720. The pixel coordinate of the target position in the horizontal direction can be obtained by the following piecewise function:
[Formula shown as an image in the original: a piecewise function giving the target position coordinate (offset/centerPosX) as a function of curPixelPosX over the three value intervals 0-200, 200-520, and 520-720.]
where offset and centerPosX denote the pixel coordinate of the target position in the horizontal direction, and curPixelPosX denotes the pixel coordinate of the target key point in the horizontal direction.
The target position can also be obtained with reference to the piecewise function shown in FIG. 4. To improve the accuracy of the target position, the pixel coordinate of the target position can also be mapped to the range 0-1. As shown in FIG. 4, in the edge regions where the horizontal pixel coordinate is 0-200 or 520-720, the target position changes rapidly from (0, y) to (0.5, y) as the position of the target key point changes; in the central region where the horizontal pixel coordinate is 200-520, the target position changes gently around (0.5, y) as the position of the target key point changes, or even does not change. Here, y denotes the coordinate value of the target key point in the vertical direction. It can be understood that although the change rate of the piecewise function shown in FIG. 4 (which can be represented by its slope) is constant, in practical applications the change rate of the piecewise function may also vary.
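A Python sketch of such a piecewise mapping is given below. Only the fast-at-the-edges, flat-at-the-center shape is taken from the description; the exact endpoint values of the disclosure's function are not recoverable from the text, so the slopes used here are assumptions.

def target_pos_x(cur_pixel_pos_x, width=720.0):
    # Map the target key point's horizontal pixel coordinate to a target
    # position in normalized [0, 1] page coordinates: steep in the edge
    # intervals, flat around the center, as described for FIG. 4.
    left, right = 200.0, 520.0
    if cur_pixel_pos_x < left:
        return 0.5 * cur_pixel_pos_x / left  # edge interval 0-200
    if cur_pixel_pos_x > right:
        return 0.5 + 0.5 * (cur_pixel_pos_x - right) / (width - right)  # 520-720
    return 0.5  # central interval 200-520: hold near the page center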
In this embodiment, by configuring a piecewise function and obtaining the target position according to it, the efficiency of image processing can be improved, and the image processing can present a good face tracking effect without distorting the presentation of the face.
In some embodiments, as shown in FIG. 5, determining the magnification factor according to the position information of the reference key points in step S230 can be achieved through the following steps.
In step S510, a first relative distance of a horizontal region and a second relative distance of a vertical region are determined according to the position information of the reference key points.
The horizontal region and the vertical region are obtained by dividing the preset region. A pair of key points located at the two ends of the horizontal region is obtained, and the first relative distance of the horizontal region is calculated from the position information of this pair of key points. Similarly, for the vertical region, a pair of key points located at the two ends of the vertical region is obtained, and the second relative distance of the vertical region is calculated from their position information.
In some embodiments, the preset region is a T-shaped region of the face, which includes the central region of the forehead and the central region of the face, and the reference key points include a left-eye key point, a right-eye key point, a key point between the eyebrows, and a nose tip key point. The first relative distance may then be the distance between the left-eye key point and the right-eye key point, calculated from their position information, and the second relative distance may be the distance between the key point between the eyebrows and the nose tip key point, calculated from their position information.
In step S520, a three-dimensional face angle is obtained, the three-dimensional face angle including a pitch angle and a yaw angle.
The three-dimensional face angle can be represented by Euler angles, that is, the rotation angles of an object about the three coordinate axes (x, y, and z) of a coordinate system. The Euler angles can be obtained by performing pose estimation on the face key points. In some embodiments, the pose estimation algorithm of OpenCV (an open-source computer vision library) is used to solve for a rotation vector from the face key points and convert the rotation vector into Euler angles. In this embodiment, the Euler angles include the pitch angle and the yaw angle: the pitch angle is the angle of rotation of the object about the x-axis, and the yaw angle is the angle of rotation about the y-axis.
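A minimal sketch of this pose estimation step with OpenCV is given below. The six-point generic head model, the camera intrinsics approximated from the frame size, and the Euler decomposition convention are all assumptions for illustration; the disclosure does not specify them.

import cv2
import numpy as np

# Hypothetical 3D positions of six face key points in a generic head model
# (arbitrary units); a real system would use model points matching its own
# key point definition.
MODEL_POINTS = np.array([
    [0.0, 0.0, 0.0],           # nose tip
    [0.0, -330.0, -65.0],      # chin
    [-225.0, 170.0, -135.0],   # left eye outer corner
    [225.0, 170.0, -135.0],    # right eye outer corner
    [-150.0, -150.0, -125.0],  # left mouth corner
    [150.0, -150.0, -125.0],   # right mouth corner
], dtype=np.float64)

def face_pitch_yaw(image_points, frame_w, frame_h):
    # image_points: (6, 2) float64 array of the matching 2D key points.
    focal = frame_w  # crude intrinsics guess in the absence of calibration
    camera_matrix = np.array([[focal, 0.0, frame_w / 2.0],
                              [0.0, focal, frame_h / 2.0],
                              [0.0, 0.0, 1.0]], dtype=np.float64)
    dist_coeffs = np.zeros((4, 1))  # assume no lens distortion
    ok, rvec, tvec = cv2.solvePnP(MODEL_POINTS, image_points,
                                  camera_matrix, dist_coeffs)
    rot, _ = cv2.Rodrigues(rvec)  # rotation vector -> rotation matrix
    sy = np.hypot(rot[0, 0], rot[1, 0])
    pitch = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))  # rotation about x
    yaw = np.degrees(np.arctan2(-rot[2, 0], sy))          # rotation about y
    return pitch, yaw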
In step S530, a first weight corresponding to the first relative distance and a second weight corresponding to the second relative distance are determined according to the pitch angle and the yaw angle.
In some embodiments, for the horizontal region, the ratio of the pitch angle to the sum of the pitch angle and the yaw angle can be used as the first weight A, that is:
A = pitch / (pitch + yaw)
For the vertical region, the ratio of the yaw angle to the sum of the pitch angle and the yaw angle can be used as the second weight B, that is:
B = yaw / (pitch + yaw)
In step S540, the sum of the product of the first relative distance and the first weight and the product of the second relative distance and the second weight is obtained.
In step S550, the magnification factor is determined as the ratio of the width of the image capture page to the sum of the products.
In some embodiments, after obtaining the first relative distance of the horizontal region, the second relative distance of the vertical region, the first weight corresponding to the first relative distance, and the second weight corresponding to the second relative distance, the client calculates the sum of the product of the first relative distance and the first weight and the product of the second relative distance and the second weight. The sum of the products can be obtained by the following formula:
scaleHelpValue = ewidth * A + nHeight * B
where scaleHelpValue denotes the sum of the products, ewidth denotes the first relative distance of the horizontal region, nHeight denotes the second relative distance of the vertical region, A denotes the first weight, and B denotes the second weight.
Finally, the ratio of the width of the image capture page in the preset direction to the sum of the products is calculated as the magnification factor. The magnification factor can be obtained by the following formula:
scaleValue = width / scaleHelpValue
where scaleValue denotes the magnification factor and width denotes the width of the image capture page in the preset direction.
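Putting steps S510 to S550 together, a Python sketch follows. It assumes, as one reasonable reading, that the pitch and yaw enter the weights as non-negative magnitudes, and it guards the degenerate pitch + yaw = 0 case; the disclosure spells out neither point.

import math

def magnification_factor(left_eye, right_eye, glabella, nose_tip,
                         pitch, yaw, page_width):
    # First relative distance: horizontal region (eye-to-eye distance).
    ewidth = math.dist(left_eye, right_eye)
    # Second relative distance: vertical region (key point between the
    # eyebrows to nose tip distance).
    n_height = math.dist(glabella, nose_tip)
    pitch, yaw = abs(pitch), abs(yaw)  # assumption: use angle magnitudes
    angle_sum = pitch + yaw
    a = pitch / angle_sum if angle_sum else 0.5  # first weight A
    b = yaw / angle_sum if angle_sum else 0.5    # second weight B
    scale_help_value = ewidth * a + n_height * b  # sum of the products
    return page_width / scale_help_value          # scaleValue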
In this embodiment, the magnification factor can be obtained quickly according to the pre-configured calculation formulas, which avoids the performance bottleneck caused by a large number of key points and speeds up the acquisition of the magnification factor. By combining the three-dimensional face angle to obtain the respective weights of the horizontal region and the vertical region, and then deriving a reasonable magnification factor based on the weights, the accuracy of the face magnification can be ensured.
FIG. 6 is a flowchart of an image processing method according to some embodiments. In this embodiment, the terminal is a user handheld device with a built-in image acquisition apparatus, for example, a smartphone, a tablet computer, or a portable wearable device. The image to be processed is the current frame of face image in the face video data collected by the user handheld device. As shown in FIG. 6, the method includes the following steps.
In step S602, face video data is collected through the user handheld device.
In step S604, face recognition is performed, through the deep learning model, on the current frame of face image in the face video data to obtain face key points.
In step S606, the three-dimensional face angle obtained by performing face pose estimation according to the face key points is obtained. The three-dimensional face angle is represented by Euler angles, including the pitch angle and the yaw angle.
In step S608, the reference key points of the T-shaped region are extracted from the face key points. The reference key points include the left-eye key point, the right-eye key point, the nose tip key point, and the key point between the eyebrows.
In step S610, the first relative distance of the horizontal region is calculated from the position information of the left-eye key point and the right-eye key point, and the second relative distance of the vertical region is calculated from the position information of the nose tip key point and the key point between the eyebrows.
In step S612, the ratio of the pitch angle to the sum of the pitch angle and the yaw angle is used as the first weight, and the ratio of the yaw angle to the sum of the pitch angle and the yaw angle is used as the second weight.
In step S614, the sum of the product of the first relative distance and the first weight and the product of the second relative distance and the second weight is obtained.
In step S616, the magnification factor is determined as the ratio of the width of the image capture page to the sum of the products.
In step S618, the target position corresponding to the position of the nose tip key point in the current frame is determined from the preset correspondence between positions of the nose tip key point and target positions. The correspondence can be represented by a piecewise function; for its specific implementation, reference may be made to the above embodiments, so it is not elaborated here.
In step S620, the face is moved until the nose tip key point reaches the target position, and then, with the nose tip key point as the center, the face is magnified by the magnification factor.
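One way to realize the "magnify about the nose tip" operation of step S620 with OpenCV is an affine warp whose pivot is the nose tip key point; this is an illustrative choice, as the disclosure does not prescribe a rendering API.

import cv2

def magnify_about_point(frame, pivot, scale):
    # A rotation angle of 0 turns getRotationMatrix2D into a pure scaling
    # about the pivot, so the pivot (the nose tip) stays fixed on screen.
    h, w = frame.shape[:2]
    m = cv2.getRotationMatrix2D(pivot, 0.0, scale)
    return cv2.warpAffine(frame, m, (w, h))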
图7为通过本实施例中的方式对人脸进行处理得到的示意图;图8为采用相关技术中单一关键点的处理方式得到的示意图。对比图7和图8可知,对于相同的原始图像,通过相关技术中单一关键点的处理方式,可能不够稳定、易发生畸变(左耳处过于拉伸)。而通过本公开的方式即可以减轻设备的运行压力,也可以得到较佳的图像处理效果。FIG. 7 is a schematic diagram obtained by processing a human face by the method in this embodiment; FIG. 8 is a schematic diagram obtained by using a processing method of a single key point in the related art. Comparing Fig. 7 and Fig. 8, it can be seen that for the same original image, the processing method of a single key point in the related art may not be stable enough and prone to distortion (the left ear is too stretched). By means of the present disclosure, the operating pressure of the device can be reduced, and a better image processing effect can also be obtained.
应该理解的是,虽然上述流程图中的各个步骤按照箭头的指示依次显示,但是这些步骤并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明,这些步骤的执行 并没有严格的顺序限制,这些步骤可以以其它的顺序执行。而且,上述流程图中的至少一部分步骤可以包括多个步骤或者多个阶段,这些步骤或者阶段并不必然是在同一时刻执行完成,而是可以在不同的时刻执行,这些步骤或者阶段的执行顺序也不必然是依次进行,而是可以与其它步骤或者其它步骤中的步骤或者阶段的至少一部分轮流或者交替地执行。It should be understood that although the steps in the above flow charts are displayed in sequence according to the arrows, these steps are not necessarily executed in the sequence indicated by the arrows. Unless explicitly stated herein, there is no strict order in the execution of these steps, and these steps may be performed in other orders. Moreover, at least a part of the steps in the above flow chart may include multiple steps or multiple stages. These steps or stages are not necessarily executed at the same time, but may be executed at different times. The execution sequence of these steps or stages It is also not necessarily performed sequentially, but may be performed alternately or alternately with other steps or at least a portion of a step or phase within the other steps.
图9是根据一些实施例示出的一种图像处理装置900框图。参照图9,该装置900包括图像采集模块901、人脸识别模块902、系数确定模块903和人脸处理模块904。FIG. 9 is a block diagram of an image processing apparatus 900 according to some embodiments. Referring to FIG. 9 , the apparatus 900 includes an image acquisition module 901 , a face recognition module 902 , a coefficient determination module 903 and a face processing module 904 .
图像采集模块901,被配置为采集人脸视频数据,将人脸视频数据中的每帧人脸图像作为待处理图像;人脸识别模块902,被配置为对待处理图像进行人脸识别,得到人脸关键点;系数确定模块903,被配置为从人脸关键点中提取预设区域的参考关键点,根据参考关键点的位置信息确定放大系数;人脸处理模块904,被配置为根据放大系数对人脸进行放大,并根据人脸识别得到的人脸运动信息对放大后的人脸进行人脸跟踪。The image acquisition module 901 is configured to collect face video data, and use each frame of face image in the face video data as an image to be processed; the face recognition module 902 is configured to perform face recognition on the image to be processed, and obtain a human face. face key points; the coefficient determination module 903 is configured to extract the reference key points of the preset area from the face key points, and determine the amplification factor according to the position information of the reference key points; the face processing module 904 is configured to be based on the amplification factor The face is enlarged, and face tracking is performed on the enlarged face according to the face motion information obtained by face recognition.
在一些实施例中,所述装置还包括:关键点获取模块,被配置为从参考关键点中获取人脸中心区域中的目标关键点;位置确定模块,被配置为根据预设的目标关键点的位置和目标位置的对应关系,确定与目标关键点对应的目标位置;移动模块,被配置为以目标位置相对目标关键点的方向对人脸进行移动,直至目标关键点到达目标位置。In some embodiments, the apparatus further includes: a key point acquisition module configured to acquire target key points in the central area of the face from reference key points; a position determination module configured to obtain target key points according to preset target key points The corresponding relationship between the position and the target position is determined, and the target position corresponding to the target key point is determined; the moving module is configured to move the face in the direction of the target position relative to the target key point until the target key point reaches the target position.
在一些实施例中,预设的目标关键点的位置和目标位置的对应关系,包括:目标关键点的位置包括多个取值区间,每个取值区间与相应的变化率对应,变化率为目标位置随着目标关键点的位置变化而变化的程度,取值区间基于图像采集页面在预设方向上的尺寸确定。In some embodiments, the preset correspondence between the position of the target key point and the target position includes: the position of the target key point includes a plurality of value intervals, each value interval corresponds to a corresponding change rate, and the change rate is The degree to which the target position changes with the position of the target key point, and the value interval is determined based on the size of the image capture page in the preset direction.
在一些实施例中,将图像采集页面在预设方向上的尺寸进行划分,得到依次衔接的第一取值空间、第二取值空间和第三取值空间;变化率按照以下方式确定:从目标关键点的位置中获取预设方向上的坐标值;当坐标值位于第一取值空间或第三取值空间时,目标位置的变化率为第一变化率;当坐标值位于第二取值空间时,目标位置的变化率为第二变化率;第一变化率大于第二变化率。In some embodiments, the size of the image capture page in the preset direction is divided to obtain a first value space, a second value space and a third value space connected in sequence; the rate of change is determined in the following manner: from The coordinate value in the preset direction is obtained from the position of the target key point; when the coordinate value is in the first value space or the third value space, the change rate of the target position is the first change rate; when the coordinate value is in the second value space, the change rate is the first change rate; In the value space, the rate of change of the target position is the second rate of change; the first rate of change is greater than the second rate of change.
In some embodiments, the target key point is the nose tip key point.
In some embodiments, the coefficient determination module 903 includes: a distance determination unit configured to determine a first relative distance of a horizontal region and a second relative distance of a vertical region according to the position information of the reference key points; an angle acquisition unit configured to obtain three-dimensional angles of the face, the three-dimensional angles including a pitch angle and a yaw angle; a weight determination unit configured to determine, according to the pitch angle and the yaw angle, a first weight corresponding to the first relative distance and a second weight corresponding to the second relative distance; a calculation unit configured to obtain the sum of the product of the first relative distance and the first weight and the product of the second relative distance and the second weight; and a coefficient determination unit configured to determine the magnification factor as the ratio of the width of the image capture page to the sum of the products.
In some embodiments, the weight determination unit is configured to determine the first weight as the ratio of the pitch angle to the sum of the pitch angle and the yaw angle, and to determine the second weight as the ratio of the yaw angle to the sum of the pitch angle and the yaw angle.
In some embodiments, the preset region is the T-shaped region of the face, which includes the central forehead region and the central face region; the reference key points include a left-eye key point, a right-eye key point, a glabella (between-the-eyebrows) key point, and a nose tip key point. The distance determination unit is configured to determine the first relative distance as the distance between the left-eye key point and the right-eye key point, and the second relative distance as the distance between the glabella key point and the nose tip key point.
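Combining the units above, the following Python sketch derives the magnification factor from the T-zone key points and the pitch and yaw angles as just described: the weighted sum of the two relative distances is divided into the page width. The coordinates, the use of absolute angle values, and the equal-weight fallback for a perfectly frontal face (pitch = yaw = 0) are illustrative assumptions, not values fixed by the disclosure.

    import math

    # Minimal sketch of the coefficient determination module.
    def magnification_factor(left_eye, right_eye, glabella, nose_tip,
                             pitch, yaw, page_width):
        d1 = math.dist(left_eye, right_eye)   # first relative distance (horizontal)
        d2 = math.dist(glabella, nose_tip)    # second relative distance (vertical)
        total = abs(pitch) + abs(yaw)
        if total == 0:                        # assumed fallback for a frontal face
            w1 = w2 = 0.5
        else:
            w1 = abs(pitch) / total           # first weight  = pitch / (pitch + yaw)
            w2 = abs(yaw) / total             # second weight = yaw / (pitch + yaw)
        return page_width / (d1 * w1 + d2 * w2)

    # Made-up key points on a 720-px-wide image capture page
    f = magnification_factor(left_eye=(300, 340), right_eye=(420, 340),
                             glabella=(360, 335), nose_tip=(360, 430),
                             pitch=10.0, yaw=5.0, page_width=720)
    print(round(f, 2))  # about 6.45 with these made-up numbers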
As for the apparatus in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method, and will not be elaborated here.
FIG. 10 is a block diagram of a device 1000 for image processing according to some embodiments. For example, the device 1000 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to FIG. 10, the device 1000 may include one or more of the following components: a processing component 1002, a memory 1004, a power supply component 1006, a multimedia component 1008, an audio component 1010, an input/output (I/O) interface 1012, a sensor component 1014, and a communication component 1016.
The processing component 1002 generally controls the overall operation of the device 1000, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 1002 may include one or more processors 1020 to execute instructions so as to perform all or part of the steps of the methods described above. In addition, the processing component 1002 may include one or more modules that facilitate interaction between the processing component 1002 and the other components. For example, the processing component 1002 may include a multimedia module to facilitate interaction between the multimedia component 1008 and the processing component 1002.
The memory 1004 is configured to store various types of data to support operation of the device 1000. Examples of such data include instructions for any application or method operated on the device 1000, contact data, phone book data, messages, pictures, videos, and the like. The memory 1004 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The power supply component 1006 provides power to the various components of the device 1000. The power supply component 1006 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 1000.
The multimedia component 1008 includes a screen that provides an output interface between the device 1000 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, it may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action, but also the duration and pressure associated with the touch or swipe. In some embodiments, the multimedia component 1008 includes a front camera and/or a rear camera. When the device 1000 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front or rear camera may be a fixed optical lens system or may have focal length and optical zoom capability.
The audio component 1010 is configured to output and/or input audio signals. For example, the audio component 1010 includes a microphone (MIC), which is configured to receive external audio signals when the device 1000 is in an operating mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signal may be further stored in the memory 1004 or transmitted via the communication component 1016. In some embodiments, the audio component 1010 further includes a speaker for outputting audio signals.
The I/O interface 1012 provides an interface between the processing component 1002 and peripheral interface modules, such as a keyboard, a click wheel, or buttons. The buttons may include, but are not limited to, a home button, volume buttons, a start button, and a lock button.
The sensor component 1014 includes one or more sensors for providing status assessments of various aspects of the device 1000. For example, the sensor component 1014 may detect the open/closed state of the device 1000 and the relative positioning of components (for example, the display and keypad of the device 1000); the sensor component 1014 may also detect a change in position of the device 1000 or of one of its components, the presence or absence of user contact with the device 1000, the orientation or acceleration/deceleration of the device 1000, and changes in the temperature of the device 1000. The sensor component 1014 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 1014 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 1014 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 1016 is configured to facilitate wired or wireless communication between the device 1000 and other devices. The device 1000 may access a wireless network based on a communication standard, such as WiFi, a carrier network (such as 2G, 3G, 4G, or 5G), or a combination thereof. In some embodiments, the communication component 1016 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In some embodiments, the communication component 1016 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In some embodiments, the device 1000 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the methods described above.
In some embodiments, a non-transitory computer-readable storage medium including instructions, such as the memory 1004 including instructions, is also provided; the instructions are executable by the processor 1020 of the device 1000 to perform the methods described above. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of what is disclosed herein. The present disclosure is intended to cover any variations, uses, or adaptations that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (18)

  1. An image processing method, comprising:
    capturing face video data, and treating each frame of face image in the face video data as an image to be processed;
    performing face recognition on the image to be processed to obtain face key points;
    extracting reference key points of a preset region from the face key points, and determining a magnification factor according to position information of the reference key points;
    magnifying the face according to the magnification factor, and performing face tracking on the magnified face according to face motion information obtained by the face recognition.
  2. The image processing method according to claim 1, wherein the method further comprises:
    obtaining, from the reference key points, a target key point in a central region of the face;
    determining a target position corresponding to the target key point according to a preset correspondence between positions of the target key point and target positions;
    moving the face in the direction of the target position relative to the target key point until the target key point reaches the target position.
  3. The image processing method according to claim 2, wherein the preset correspondence between the position of the target key point and the target position comprises:
    the position of the target key point spans a plurality of value intervals, each value interval corresponding to a respective rate of change, the rate of change being the degree to which the target position varies as the position of the target key point varies, and the value intervals being determined based on a size of an image capture page in a preset direction.
  4. The image processing method according to claim 3, wherein the size of the image capture page in the preset direction is divided into a first value space, a second value space, and a third value space that adjoin in sequence; and the rate of change is determined as follows:
    obtaining a coordinate value in the preset direction from the position of the target key point;
    in response to the coordinate value being located in the first value space or the third value space, the rate of change of the target position is a first rate of change;
    in response to the coordinate value being located in the second value space, the rate of change of the target position is a second rate of change;
    the first rate of change being greater than the second rate of change.
  5. The image processing method according to claim 3, wherein the target key point is a nose tip key point.
  6. The image processing method according to claim 1, wherein determining the magnification factor according to the position information of the reference key points comprises:
    determining a first relative distance of a horizontal region and a second relative distance of a vertical region according to the position information of the reference key points;
    obtaining three-dimensional angles of the face, the three-dimensional angles comprising a pitch angle and a yaw angle;
    determining, according to the pitch angle and the yaw angle, a first weight corresponding to the first relative distance and a second weight corresponding to the second relative distance;
    obtaining a sum of a product of the first relative distance and the first weight and a product of the second relative distance and the second weight;
    determining the magnification factor as a ratio of a width of an image capture page to the sum of the products.
  7. The image processing method according to claim 6, wherein determining, according to the pitch angle and the yaw angle, the first weight corresponding to the first relative distance and the second weight corresponding to the second relative distance comprises:
    determining the first weight as a ratio of the pitch angle to a sum of the pitch angle and the yaw angle;
    determining the second weight as a ratio of the yaw angle to the sum of the pitch angle and the yaw angle.
  8. The image processing method according to claim 6, wherein the preset region is a T-shaped region of the face, the T-shaped region comprising a central forehead region and a central face region; and the reference key points comprise a left-eye key point, a right-eye key point, a glabella key point, and a nose tip key point;
    wherein determining the first relative distance of the horizontal region and the second relative distance of the vertical region according to the position information of the reference key points comprises:
    determining the first relative distance as a distance between the left-eye key point and the right-eye key point;
    determining the second relative distance as a distance between the glabella key point and the nose tip key point.
  9. An image processing apparatus, comprising:
    an image acquisition module configured to capture face video data and to treat each frame of face image in the face video data as an image to be processed;
    a face recognition module configured to perform face recognition on the image to be processed to obtain face key points;
    a coefficient determination module configured to extract reference key points of a preset region from the face key points, and to determine a magnification factor according to position information of the reference key points;
    a face processing module configured to magnify the face according to the magnification factor, and to perform face tracking on the magnified face according to face motion information obtained by the face recognition.
  10. The image processing apparatus according to claim 9, wherein the apparatus further comprises:
    a key point acquisition module configured to obtain, from the reference key points, a target key point in a central region of the face;
    a position determination module configured to determine a target position corresponding to the target key point according to a preset correspondence between positions of the target key point and target positions;
    a moving module configured to move the face in the direction of the target position relative to the target key point until the target key point reaches the target position.
  11. The image processing apparatus according to claim 10, wherein the preset correspondence between the position of the target key point and the target position comprises:
    the position of the target key point spans a plurality of value intervals, each value interval corresponding to a respective rate of change, the rate of change being the degree to which the target position varies as the position of the target key point varies, and the value intervals being determined based on a distance between the position of the target key point and a boundary of an image capture page.
  12. The image processing apparatus according to claim 11, wherein the rate of change is determined as follows:
    in response to a distance between the target key point and the boundary of the image capture page being less than a threshold, the rate of change of the target position is a first rate of change;
    in response to the distance being greater than or equal to the threshold, the rate of change of the target position is a second rate of change;
    the first rate of change being greater than the second rate of change.
  13. The image processing apparatus according to claim 11, wherein the target key point is a nose tip key point.
  14. The image processing apparatus according to claim 9, wherein the coefficient determination module comprises:
    a distance determination unit configured to determine a first relative distance of a horizontal region and a second relative distance of a vertical region according to the position information of the reference key points;
    an angle acquisition unit configured to obtain three-dimensional angles of the face, the three-dimensional angles comprising a pitch angle and a yaw angle;
    a weight determination unit configured to determine, according to the pitch angle and the yaw angle, a first weight corresponding to the first relative distance and a second weight corresponding to the second relative distance;
    a calculation unit configured to obtain a sum of a product of the first relative distance and the first weight and a product of the second relative distance and the second weight;
    a coefficient determination unit configured to determine the magnification factor as a ratio of a width of an image capture page to the sum of the products.
  15. The image processing apparatus according to claim 14, wherein the weight determination unit is configured to determine the first weight as a ratio of the pitch angle to a sum of the pitch angle and the yaw angle, and to determine the second weight as a ratio of the yaw angle to the sum of the pitch angle and the yaw angle.
  16. The image processing apparatus according to claim 14, wherein the preset region is a T-shaped region of the face, the T-shaped region comprising a central forehead region and a central face region; and the reference key points comprise a left-eye key point, a right-eye key point, a glabella key point, and a nose tip key point;
    wherein the distance determination unit is configured to determine the first relative distance as a distance between the left-eye key point and the right-eye key point, and to determine the second relative distance as a distance between the glabella key point and the nose tip key point.
  17. An electronic device, comprising:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to execute the instructions to implement the following processing:
    capturing face video data, and treating each frame of face image in the face video data as an image to be processed;
    performing face recognition on the image to be processed to obtain face key points;
    extracting reference key points of a preset region from the face key points, and determining a magnification factor according to position information of the reference key points;
    magnifying the face according to the magnification factor, and performing face tracking on the magnified face according to face motion information obtained by the face recognition.
  18. A storage medium, wherein, when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the following processing:
    capturing face video data, and treating each frame of face image in the face video data as an image to be processed;
    performing face recognition on the image to be processed to obtain face key points;
    extracting reference key points of a preset region from the face key points, and determining a magnification factor according to position information of the reference key points;
    magnifying the face according to the magnification factor, and performing face tracking on the magnified face according to face motion information obtained by the face recognition.
PCT/CN2021/128769 2020-12-10 2021-11-04 Image processing method and apparatus WO2022121577A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011434480.4 2020-12-10
CN202011434480.4A CN112509005B (en) 2020-12-10 2020-12-10 Image processing method, image processing device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2022121577A1 true WO2022121577A1 (en) 2022-06-16

Family

ID=74970472

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/128769 WO2022121577A1 (en) 2020-12-10 2021-11-04 Image processing method and apparatus

Country Status (2)

Country Link
CN (1) CN112509005B (en)
WO (1) WO2022121577A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112509005B (en) * 2020-12-10 2023-01-20 北京达佳互联信息技术有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN113778233B (en) * 2021-09-16 2022-04-05 广东魅视科技股份有限公司 Method and device for controlling display equipment and readable medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108460343A (en) * 2018-02-06 2018-08-28 北京达佳互联信息技术有限公司 Image processing method, system and server
CN110415164A (en) * 2018-04-27 2019-11-05 武汉斗鱼网络科技有限公司 Facial metamorphosis processing method, storage medium, electronic equipment and system
CN108550185A (en) * 2018-05-31 2018-09-18 Oppo广东移动通信有限公司 Beautifying faces treating method and apparatus
CN110175558A (en) * 2019-05-24 2019-08-27 北京达佳互联信息技术有限公司 A kind of detection method of face key point, calculates equipment and storage medium at device
US20200335136A1 (en) * 2019-07-02 2020-10-22 Beijing Dajia Internet Information Technology Co., Ltd. Method and device for processing video
CN112509005A (en) * 2020-12-10 2021-03-16 北京达佳互联信息技术有限公司 Image processing method, image processing device, electronic equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116306733A (en) * 2023-02-27 2023-06-23 荣耀终端有限公司 Method for amplifying two-dimensional code and electronic equipment
CN116306733B (en) * 2023-02-27 2024-03-19 荣耀终端有限公司 Method for amplifying two-dimensional code and electronic equipment

Also Published As

Publication number Publication date
CN112509005B (en) 2023-01-20
CN112509005A (en) 2021-03-16

Similar Documents

Publication Publication Date Title
US11114130B2 (en) Method and device for processing video
WO2022121577A1 (en) Image processing method and apparatus
WO2019134516A1 (en) Method and device for generating panoramic image, storage medium, and electronic apparatus
US11030733B2 (en) Method, electronic device and storage medium for processing image
CN109087238B (en) Image processing method and apparatus, electronic device, and computer-readable storage medium
CN109242765B (en) Face image processing method and device and storage medium
JP2016531362A (en) Skin color adjustment method, skin color adjustment device, program, and recording medium
CN112348933B (en) Animation generation method, device, electronic equipment and storage medium
US11308692B2 (en) Method and device for processing image, and storage medium
CN109325908B (en) Image processing method and device, electronic equipment and storage medium
EP3975046B1 (en) Method and apparatus for detecting occluded image and medium
WO2023273499A1 (en) Depth measurement method and apparatus, electronic device, and storage medium
WO2023273498A1 (en) Depth detection method and apparatus, electronic device, and storage medium
CN107977636B (en) Face detection method and device, terminal and storage medium
CN112541400A (en) Behavior recognition method and device based on sight estimation, electronic equipment and storage medium
CN111144266B (en) Facial expression recognition method and device
CN110807769B (en) Image display control method and device
WO2020114097A1 (en) Boundary box determining method and apparatus, electronic device, and storage medium
CN111340691A (en) Image processing method, image processing device, electronic equipment and storage medium
WO2021189927A1 (en) Image processing method and apparatus, electronic device, and storage medium
CN107239758B (en) Method and device for positioning key points of human face
CN111489284B (en) Image processing method and device for image processing
CN113642551A (en) Nail key point detection method and device, electronic equipment and storage medium
CN110110742B (en) Multi-feature fusion method and device, electronic equipment and storage medium
CN116320721A (en) Shooting method, shooting device, terminal and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21902284

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25.09.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21902284

Country of ref document: EP

Kind code of ref document: A1