CN110610171A - Image processing method and device, electronic equipment and computer readable storage medium - Google Patents


Info

Publication number
CN110610171A
CN110610171A (application CN201910905113.9A)
Authority
CN
China
Prior art keywords
face
image
region
portrait
angle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910905113.9A
Other languages
Chinese (zh)
Inventor
黄海东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201910905113.9A
Publication of CN110610171A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The application relates to an image processing method and apparatus, an electronic device and a computer-readable storage medium. The method comprises: acquiring an image to be recognized; detecting whether a human face exists in the image to be recognized; when a face exists in the image to be recognized, acquiring a candidate region containing a portrait according to the face, the portrait including the face; inputting the candidate region into a portrait segmentation network to obtain a portrait region, and taking the portrait region as the subject region of the image to be recognized; and when no face exists in the image to be recognized, inputting the image to be recognized into a subject recognition network to obtain the subject region of the image to be recognized. The method and apparatus, the electronic device and the computer-readable storage medium can improve the accuracy of subject identification.

Description

Image processing method and device, electronic equipment and computer readable storage medium
Technical Field
The present application relates to the field of image technologies, and in particular, to an image processing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
With the development of imaging technology, people are increasingly accustomed to shooting images or videos and recording all kinds of information through image acquisition devices such as the camera on an electronic device. After the electronic device acquires an image, the subject of the image often needs to be identified so that a clearer image of the subject can be obtained. However, when traditional subject recognition technology is used to recognize a portrait, the most salient region is often taken as the subject, the portrait cannot be recognized accurately, and image processing is therefore inaccurate.
Disclosure of Invention
The embodiment of the application provides an image processing method, an image processing device, electronic equipment and a computer readable storage medium, which can improve the accuracy of subject identification.
An image processing method comprising:
acquiring an image to be identified;
detecting whether a human face exists in the image to be recognized;
when a face exists in the image to be recognized, acquiring a candidate region containing a portrait according to the face; the portrait includes the face;
inputting the candidate area into a portrait segmentation network to obtain a portrait area, and taking the portrait area as a main area of the image to be identified;
and when the face does not exist in the image to be recognized, inputting the image to be recognized into a main body recognition network to obtain a main body area of the image to be recognized.
An image processing apparatus comprising:
the image acquisition module is used for acquiring an image to be identified;
the face detection module is used for detecting whether a face exists in the image to be identified;
the candidate region acquisition module is used for acquiring a candidate region containing a portrait according to the face when the face exists in the image to be identified; the portrait includes the face;
the portrait segmentation module is used for inputting the candidate region into a portrait segmentation network to obtain a portrait region, and the portrait region is used as a main region of the image to be identified;
and the main body identification module is used for inputting the image to be identified into a main body identification network when the human face does not exist in the image to be identified, so as to obtain a main body area of the image to be identified.
An electronic device comprises a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, causes the processor to perform the steps of the image processing method.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method.
According to the image processing method and apparatus, the electronic device and the computer-readable storage medium, when a face is detected in the image to be recognized, a candidate region containing the portrait is acquired according to the face, the candidate region is input into the portrait segmentation network to obtain the portrait region, and the portrait region is taken as the subject region of the image to be recognized; when no face exists in the image to be recognized, the image to be recognized is input into the subject recognition network to obtain the subject region of the image to be recognized. A two-branch network for obtaining the subject region of the image to be recognized is thus designed: when no face exists in the image to be recognized, the subject region is obtained through the subject recognition network; when a face exists in the image to be recognized, a candidate region containing the portrait is determined from the image to be recognized, and a more accurate portrait region can be obtained as the subject region through the portrait segmentation network.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present application, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an image processing circuit in one embodiment;
FIG. 2 is a flow diagram of a method of image processing in one embodiment;
FIG. 3 is a flow chart of an image processing method in another embodiment;
FIG. 4 is a flow diagram of steps in one embodiment for obtaining candidate regions;
FIG. 5 is a flow diagram of steps in an embodiment for determining an angle of a face;
FIG. 6 is a schematic diagram of image processing in another embodiment;
FIG. 7 is a block diagram showing the configuration of an image processing apparatus according to an embodiment;
fig. 8 is a schematic diagram of an internal structure of an electronic device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It will be understood that, as used herein, the terms "first," "second," and the like may be used herein to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish one element from another. For example, a first portrait area may be referred to as a second portrait area, and similarly, a second portrait area may be referred to as a first portrait area, without departing from the scope of the present application. The first portrait area and the second portrait area are both portrait areas, but they are not the same portrait area.
The embodiment of the application provides electronic equipment. The electronic device includes therein an Image Processing circuit, which may be implemented using hardware and/or software components, and may include various Processing units defining an ISP (Image Signal Processing) pipeline. FIG. 1 is a schematic diagram of an image processing circuit in one embodiment. As shown in fig. 1, for convenience of explanation, only aspects of the image processing technology related to the embodiments of the present application are shown.
As shown in fig. 1, the image processing circuit includes an ISP processor 140 and control logic 150. Image data captured by the imaging device 110 is first processed by the ISP processor 140, which analyzes the image data to collect image statistics that may be used to determine and/or control one or more parameters of the imaging device 110. The imaging device 110 may include a camera having one or more lenses 112 and an image sensor 114. The image sensor 114 may include an array of color filters (e.g., a Bayer filter); the image sensor 114 may acquire the light intensity and wavelength information captured by each imaging pixel and provide a set of raw image data that can be processed by the ISP processor 140. The attitude sensor 120 (e.g., a three-axis gyroscope, Hall sensor or accelerometer) may provide image processing parameters (e.g., anti-shake parameters) to the ISP processor 140 based on the interface type of the attitude sensor 120. The attitude sensor 120 interface may be an SMIA (Standard Mobile Imaging Architecture) interface, another serial or parallel camera interface, or a combination of the above.
In addition, the image sensor 114 may also send raw image data to the attitude sensor 120, the sensor 120 may provide the raw image data to the ISP processor 140 based on the type of interface of the attitude sensor 120, or the attitude sensor 120 may store the raw image data in the image memory 130.
The ISP processor 140 processes the raw image data pixel by pixel in a variety of formats. For example, each image pixel may have a bit depth of 8, 10, 12, or 14 bits, and the ISP processor 140 may perform one or more image processing operations on the raw image data, gathering statistical information about the image data. Wherein the image processing operations may be performed with the same or different bit depth precision.
The ISP processor 140 may also receive image data from the image memory 130. For example, the attitude sensor 120 interface sends raw image data to the image memory 130, and the raw image data in the image memory 130 is then provided to the ISP processor 140 for processing. The image Memory 130 may be a portion of a Memory device, a storage device, or a separate dedicated Memory within an electronic device, and may include a DMA (Direct Memory Access) feature.
Upon receiving raw image data from the image sensor 114 interface or from the attitude sensor 120 interface or from the image memory 130, the ISP processor 140 may perform one or more image processing operations, such as temporal filtering. The processed image data may be sent to image memory 130 for additional processing before being displayed. ISP processor 140 receives processed data from image memory 130 and performs image data processing on the processed data in the raw domain and in the RGB and YCbCr color spaces. The image data processed by ISP processor 140 may be output to display 160 for viewing by a user and/or further processed by a Graphics Processing Unit (GPU). Further, the output of the ISP processor 140 may also be sent to the image memory 130, and the display 160 may read image data from the image memory 130. In one embodiment, image memory 130 may be configured to implement one or more frame buffers.
The statistical data determined by the ISP processor 140 may be transmitted to the control logic 150 unit. For example, the statistical data may include image sensor 114 statistics such as gyroscope vibration frequency, auto-exposure, auto-white balance, auto-focus, flicker detection, black level compensation, lens 112 shading correction, and the like. The control logic 150 may include a processor and/or microcontroller that executes one or more routines (e.g., firmware) that may determine control parameters of the imaging device 110 and control parameters of the ISP processor 140 based on the received statistical data. For example, the control parameters of the imaging device 110 may include attitude sensor 120 control parameters (e.g., gain, integration time of exposure control, anti-shake parameters, etc.), camera flash control parameters, camera anti-shake displacement parameters, lens 112 control parameters (e.g., focal length for focusing or zooming), or a combination of these parameters. The ISP control parameters may include gain levels and color correction matrices for automatic white balance and color adjustment (e.g., during RGB processing), as well as lens 112 shading correction parameters.
In one embodiment, the image to be recognized is acquired through the lens 112 and the image sensor 114 in the imaging device (camera) 110 and sent to the ISP processor 140. After receiving the image to be recognized, the ISP processor 140 detects whether a face exists in the image to be recognized; when a face exists in the image to be recognized, a candidate region containing the portrait is acquired according to the face, the portrait including the face; portrait segmentation is then performed on the candidate region, that is, the candidate region is input into the portrait segmentation network to obtain the portrait region, and the portrait region is taken as the subject region of the image to be recognized.
When the face does not exist in the image to be recognized, the ISP processor 140 performs main body recognition on the image to be recognized, that is, the image to be recognized is input into a main body recognition network, so as to obtain a main body area of the image to be recognized.
In one embodiment, the identified subject region may be sent to control logic 150. After receiving the subject area, the control logic 150 controls the lens 112 in the imaging device 110 to move, so as to focus on the subject corresponding to the subject area, and obtain a clearer image of the subject.
FIG. 2 is a flow diagram of a method of image processing in one embodiment. As shown in fig. 2, the image processing method includes steps 202 to 210.
Step 202, acquiring an image to be identified.
The image to be recognized refers to an image on which the subject region is to be identified. By recognizing the image to be recognized, the subject in the image can be obtained. The image to be recognized may be an RGB (Red, Green, Blue) image, a grayscale image, or the like. An RGB image can be captured by a color camera; a grayscale image can be captured by a black-and-white camera. The image to be recognized may be stored locally on the electronic device, stored on another device, obtained from a network, or captured in real time by the electronic device, without being limited thereto.
Specifically, an ISP processor or a central processing unit of the electronic device may obtain an image to be recognized from a local or other device or a network, or obtain the image to be recognized by shooting a scene through a camera.
And 204, detecting whether a human face exists in the image to be recognized.
Generally, a human face includes features such as eyes, a nose, a mouth, ears and eyebrows, and there are corresponding positional relationships between these features: for example, the left eye and the right eye are symmetrical, the mouth is symmetrical about the center line, the nose is located in the middle, the ears are located at both sides of the face, and the eyebrows are located above the eyes. A face recognition platform detects whether the image to be recognized contains features such as eyes, nose, mouth, ears and eyebrows and whether their positional relationships hold, and can thus determine whether a face exists in the image to be recognized. A large amount of face information is stored in the face recognition platform in advance, so using the face recognition platform for face detection reduces additional overhead and saves computing resources.
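The application does not prescribe a particular face detector. Purely as an illustrative sketch (not part of the patent text), the check for whether a face exists could be implemented with an off-the-shelf detector such as OpenCV's Haar cascade; the cascade file and the detection parameters below are assumptions.

```python
import cv2

def detect_faces(image_bgr):
    """Return a list of (x, y, w, h) face boxes; an empty list means no face."""
    # Assumption: opencv-python ships this frontal-face cascade file.
    cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    detector = cv2.CascadeClassifier(cascade_path)
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return [tuple(f) for f in faces]
```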
Step 206, when a face exists in the image to be recognized, acquiring a candidate region containing the portrait according to the face; the portrait includes a human face.
The portrait refers to an area containing a human face, that is, the portrait may contain a human face, a neck, an arm, two legs, and the like. The candidate region refers to a region containing a portrait.
It can be understood that, the face is generally at the top of the portrait, and the region containing the portrait may be selected downwards as the candidate region according to the position of the face. The candidate region can be obtained by frame selection, that is, the candidate region is a rectangular region. The candidate region may also be a circular region, a square region, a triangular region, or the like, without being limited thereto.
In one embodiment, when one face exists in the image to be recognized, the candidate region containing the portrait is acquired according to that face. In another embodiment, when at least two faces exist in the image to be recognized, one face is determined from the at least two faces, and the candidate region containing the portrait is acquired according to the determined face. The one face may be determined from the at least two faces by comparing the areas of the faces and taking the face with the largest area, or by comparing the position of each face in the image to be recognized and taking the face closest to the center of the image to be recognized.
In one embodiment, when the image to be recognized has a face, the position of each face is obtained, and when the position of each face is not within the preset range of the image to be recognized, the step of inputting the image to be recognized into a subject recognition network is performed to obtain a subject region of the image to be recognized.
It is understandable that when a user takes a landscape photograph, some passers-by may be captured at the edge of the photograph, while the subject the user wants is the landscape rather than the passers-by. Therefore, when faces exist in the image to be recognized but none of their positions falls within the preset range of the image to be recognized, the step of inputting the image to be recognized into the subject recognition network to obtain the subject region of the image to be recognized is executed, that is, the image to be recognized is treated as an image without a face for subject recognition. The preset range may be the central region of the image to be recognized.
In one embodiment, when the image to be recognized has a face, the area of each face is obtained, and when the area of each face is smaller than an area threshold, the step of inputting the image to be recognized into a subject recognition network is performed to obtain a subject region of the image to be recognized.
It will be appreciated that when a user is capturing a landscape or capturing other objects, there may be some visitors or other people in the background of the photograph, and the subject being captured by the user is the landscape or other objects, not those in the background. Therefore, when the image to be recognized has a face but the area of each face is smaller than the area threshold, which indicates that the face is not the face of the subject to be photographed by the user, the step of inputting the image to be recognized into the subject recognition network to obtain the subject region of the image to be recognized is executed, that is, the image to be recognized is subject-recognized as an image without a face.
And step 208, inputting the candidate area into a portrait segmentation network to obtain a portrait area, and taking the portrait area as a main area of the image to be identified.
The portrait segmentation network refers to a network for segmenting out portrait areas. The divided portrait area may include all parts of the person or may include part of the person. For example, the portrait area may include all parts of a human face, a neck, two hands, two feet, an upper torso, and the like; the portrait area may only include a face, two hands, and a torso of the upper body.
And the candidate area comprises a portrait, and the portrait area can be obtained after the candidate area is input into the portrait segmentation network, and then the portrait area is used as a main area of the image to be identified.
In conventional subject recognition technology, an image to be recognized that contains a face is generally input directly into a subject recognition network. The recognized portrait region is then often inaccurate: the whole portrait may not be recognized, or the portrait together with the region around it may be taken as the subject region, so subject recognition is inaccurate.
In the present application, when a face exists in the image to be recognized, a more accurate portrait region can be obtained through the portrait segmentation network, and the portrait region is taken as the subject region of the image to be recognized, so that the subject region of the image to be recognized is identified more accurately.
Step 210, when no human face exists in the image to be recognized, inputting the image to be recognized into a main body recognition network to obtain a main body area of the image to be recognized.
Subject identification (subject detection) refers to automatically processing the regions of interest in a scene while selectively ignoring the regions of no interest. The region of interest is called the subject region. The subject may be any of various objects, such as a flower, a cat, a dog, a cow, the sky, a cloud, the background, and so on.
And when the face does not exist in the image to be recognized, inputting the image to be recognized into a main body recognition network, and recognizing a main body area of the image through the main body recognition network.
In one embodiment, the main body recognition network performs main body recognition on the image to be recognized to obtain a main body area of the image to be recognized, and includes: step 1, generating a central weight graph corresponding to an image to be identified, wherein the weight value represented by the central weight graph is gradually reduced from the center to the edge; step 2, inputting the image to be recognized and the central weight graph into a main body recognition model to obtain a main body region confidence map, wherein the main body recognition model is a model obtained by training in advance according to the image to be recognized, the central weight graph and a corresponding marked main body mask graph of the same scene; and 3, determining the main body area in the image to be recognized according to the main body area confidence map.
Step 1, generating a central weight map corresponding to the image to be processed, wherein the weight value represented by the central weight map is gradually reduced from the center to the edge.
The central weight map is used to record the weight value of each pixel in the image to be recognized. The weight values recorded in the central weight map gradually decrease from the center to the four edges, that is, the weight is largest at the center and decreases toward the edges. In other words, the central weight map represents weight values that gradually decrease from the center pixel of the image to be recognized to its edge pixels.
The ISP processor or central processor may generate a corresponding central weight map according to the size of the image to be recognized. The weight value represented by the central weight map gradually decreases from the center to the four edges. The central weight map may be generated using a Gaussian function, a first-order equation or a second-order equation. The Gaussian function may be a two-dimensional Gaussian function.
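As a minimal sketch of one of the options mentioned above (a two-dimensional Gaussian centred on the image; not a definitive implementation, and the sigma scale below is an assumed parameter):

```python
import numpy as np

def central_weight_map(height, width, sigma_scale=0.5):
    """Weight map whose values fall off from the image centre to the edges."""
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    sigma_y, sigma_x = sigma_scale * height, sigma_scale * width
    weights = np.exp(-(((ys - cy) ** 2) / (2 * sigma_y ** 2)
                       + ((xs - cx) ** 2) / (2 * sigma_x ** 2)))
    return weights / weights.max()   # largest weight (1.0) at the centre
```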
And 2, inputting the image to be recognized and the central weight map into a main body recognition model to obtain a main body region confidence map, wherein the main body recognition model is a model obtained by training in advance according to the image to be recognized, the depth map, the central weight map and a corresponding marked main body mask map of the same scene.
The subject recognition model is obtained by collecting a large amount of training data in advance and inputting the training data into a subject recognition model containing initial network weights for training. Each group of training data comprises an image to be recognized, a central weight map and a labeled subject mask map corresponding to the same scene. The image to be recognized and the central weight map are used as the input of the subject recognition model being trained, and the labeled subject mask map is used as the expected output (ground truth) of the subject recognition model being trained. The subject mask map is an image filter template used to identify the subject in an image: it can mask out the other parts of the image and screen out the subject. The subject recognition model may be trained to recognize and detect various subjects, such as people, flowers, cats, dogs, backgrounds, etc.
Specifically, the ISP processor or the central processing unit may input the image to be recognized and the central weight map into the subject recognition model and perform detection to obtain the subject region confidence map. The subject region confidence map records, for each pixel, the probability that it belongs to each recognizable subject; for example, the probability that a certain pixel belongs to a person may be 0.8, to a flower 0.1, and to the background 0.1.
And 3, determining the main body region in the image to be recognized according to the main body region confidence map.
The subject refers to various subjects, such as human, flower, cat, dog, cow, blue sky, white cloud, background, etc. The main body region is a desired main body and can be selected as desired.
Specifically, the ISP processor or the central processing unit may select the subject with the highest confidence according to the subject region confidence map as the subject in the image to be recognized. If there is one subject, that subject is taken as the subject region; if there are multiple subjects, one or more of them may be selected as the subject region as needed.
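A sketch of how a subject region might be picked from such a per-pixel confidence map is shown below; treating class 0 as background, taking the most frequent non-background class and thresholding at 0.5 are assumptions, not requirements of the application.

```python
import numpy as np

def pick_subject_region(confidence_map, class_names, background_id=0, threshold=0.5):
    """Select a subject region from a per-pixel confidence map of shape (H, W, C)."""
    class_ids = confidence_map.argmax(axis=-1)                 # per-pixel best class
    counts = np.bincount(class_ids.ravel(), minlength=len(class_names))
    counts[background_id] = 0                                  # ignore background
    best_class = int(counts.argmax())                          # dominant subject class
    mask = (class_ids == best_class) & (confidence_map.max(axis=-1) > threshold)
    return class_names[best_class], mask
```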
In the image processing method of this embodiment, after the image to be recognized is obtained, a central weight map corresponding to the image to be recognized is generated, and the image to be recognized and the central weight map are input into the corresponding subject recognition model for detection to obtain a subject region confidence map, from which the subject region in the image to be recognized can be determined. Using the central weight map makes an object at the center of the image easier to detect, and using a subject recognition model trained with images to be recognized, central weight maps and subject mask maps allows the subject region of the image to be recognized to be identified more accurately.
According to the image processing method above, when a face is detected in the image to be recognized, a candidate region containing the portrait is acquired according to the face, the candidate region is input into the portrait segmentation network to obtain the portrait region, and the portrait region is taken as the subject region of the image to be recognized; when no face exists in the image to be recognized, the image to be recognized is input into the subject recognition network to obtain the subject region. A two-branch network for obtaining the subject region of the image to be recognized is thus designed: when no face exists in the image to be recognized, the subject region is obtained through the subject recognition network; when a face exists, a candidate region containing the portrait is determined from the image to be recognized, and a more accurate portrait region can be obtained as the subject region through the portrait segmentation network.
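The two-branch flow can be summarised in the following sketch. The two networks are placeholders for the trained models described in the application, `detect_faces` is the detector sketch from earlier, and the face choice and candidate-region expansion factors are assumptions for illustration.

```python
def identify_subject_region(image, portrait_segmentation_net, subject_recognition_net):
    """Two-branch subject identification sketch; both networks are callables."""
    faces = detect_faces(image)                          # sketch shown earlier
    if not faces:
        # No face detected: generic subject recognition branch.
        return subject_recognition_net(image)
    # Face branch: pick one face (largest area here, as one of the options),
    # expand it into a candidate region, and segment the portrait.
    x, y, w, h = max(faces, key=lambda b: b[2] * b[3])
    img_h, img_w = image.shape[:2]
    x0, y0 = max(0, x - w), max(0, y - h // 2)           # assumed expansion factors
    x1, y1 = min(img_w, x + 2 * w), min(img_h, y + 4 * h)
    candidate = image[y0:y1, x0:x1]
    return portrait_segmentation_net(candidate)          # portrait region as subject region
```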
In one embodiment, when a face exists in the image to be recognized, acquiring the candidate region containing the portrait according to the face comprises: when at least two faces are detected in the image to be recognized, respectively acquiring face regions containing the faces, one face region containing one face; respectively acquiring the areas of the at least two face regions; comparing the areas of the at least two face regions and acquiring the face region with the largest area as a first face region; and acquiring a first candidate region containing the portrait according to the first face region. Inputting the candidate region into the portrait segmentation network to obtain the portrait region, and taking the portrait region as the subject region of the image to be recognized, comprises: inputting the first candidate region into the portrait segmentation network to obtain a first portrait region, and taking the first portrait region as the subject region of the image to be recognized.
The face region refers to a region containing a face. The face area can be obtained by frame selection, namely the face area is a rectangular area. The face area may also be a circular area, a square area, a triangular area, etc., without being limited thereto. The first face region refers to a face region having the largest area. The first candidate region refers to a candidate region including a portrait corresponding to the first face region.
It can be understood that when the area of a face region in the image to be processed is larger, the face is closer to the camera, and the object closer to the camera is the subject the user wants to shoot. Therefore, the areas of the at least two face regions can be respectively acquired, the areas compared, and the face region with the largest area acquired as the first face region; the first candidate region containing the portrait is then acquired according to the first face region. Generally, the face in the face region with the largest area is closest to the camera.
In this embodiment, when at least two faces are detected in the image to be recognized, face regions containing the faces are respectively acquired; the areas of the at least two face regions are respectively acquired and compared, the face region with the largest area is acquired as the first face region, and the first candidate region containing the portrait is acquired according to the first face region. This improves the accuracy of the first face region and the first candidate region, so that after the first candidate region is input into the portrait segmentation network, a more accurate first portrait region can be obtained and taken as the subject region, which improves the accuracy of subject identification. One face is determined from the multiple faces and the portrait corresponding to that face is finally obtained, which avoids recognizing multiple subjects in a multi-face scene and keeps the focus on a single target.
In one embodiment, a face region with a second largest area may also be acquired as the first face region, but is not limited thereto.
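A minimal sketch of the largest-area selection described above (the `(x, y, w, h)` box representation is an assumption):

```python
def first_face_region(face_boxes):
    """Return the face region with the largest area (the 'first face region').

    face_boxes: iterable of (x, y, w, h) rectangles.
    """
    return max(face_boxes, key=lambda box: box[2] * box[3])
```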
In one embodiment, when a face exists in the image to be recognized, acquiring the candidate region containing the portrait according to the face comprises: when at least two faces are detected in the image to be recognized, respectively acquiring face regions containing the faces, one face region containing one face; respectively acquiring the position information of the at least two face regions; acquiring, according to the position information of the at least two face regions, the face region closest to the center of the image to be recognized as a second face region; and acquiring a second candidate region containing the portrait according to the second face region. Inputting the candidate region into the portrait segmentation network to obtain the portrait region, and taking the portrait region as the subject region of the image to be recognized, comprises: inputting the second candidate region into the portrait segmentation network to obtain a second portrait region, and taking the second portrait region as the subject region of the image to be recognized.
The face region refers to a region containing a face. The face area can be obtained by frame selection, namely the face area is a rectangular area. The face area may also be a circular area, a square area, a triangular area, etc., without being limited thereto. The position information of the face area may be expressed by coordinates of the center of the face area, or may be expressed by coordinates of any point in the face area, but is not limited thereto.
The second face region refers to a face region closest to the center of the image to be recognized. The second candidate region refers to a candidate region including a portrait corresponding to the second face region.
Specifically, the position information of the center of the image to be recognized may be obtained in advance, the position information of at least two face regions may be compared with the position information of the center of the image to be recognized, and the face region closest to the center of the image to be recognized may be obtained according to the comparison result.
For example, the position information of the center of the image to be recognized is (50, 50), the position information of face region A is (40, 30), the position information of face region B is (20, 50), and the position information of face region C is (40, 40). The distance between each face region and the center of the image to be recognized can be calculated with

S = √((a1 - a2)² + (b1 - b2)²)

where S is the distance between the face region and the center of the image to be recognized, a1 and b1 are the abscissa and ordinate of one point, and a2 and b2 are the abscissa and ordinate of the other point.

Thus the distance between face region A and the center of the image to be recognized is √((40 - 50)² + (30 - 50)²) = √500 ≈ 22.4, the distance between face region B and the center is √((20 - 50)² + (50 - 50)²) = 30, and the distance between face region C and the center is √((40 - 50)² + (40 - 50)²) = √200 ≈ 14.1. Face region C is therefore closest to the center of the image to be recognized and is taken as the second face region.
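The same closest-to-center rule can be sketched as follows; using the centre of each face box as its position information is an assumed choice.

```python
import math

def second_face_region(face_boxes, image_width, image_height):
    """Return the face region whose centre is closest to the image centre."""
    cx, cy = image_width / 2.0, image_height / 2.0

    def dist(box):
        x, y, w, h = box
        return math.hypot(x + w / 2.0 - cx, y + h / 2.0 - cy)

    return min(face_boxes, key=dist)
```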
In this embodiment, when at least two faces are detected in the image to be recognized, face regions containing the faces are respectively acquired, and the position information of the at least two face regions is respectively acquired; the face region closest to the center of the image to be recognized is taken as the second face region, and the second candidate region containing the portrait is acquired according to the second face region. A more accurate second face region and second candidate region can thus be obtained, so that after the second candidate region is input into the portrait segmentation network, a more accurate second portrait region can be obtained as the subject region. One face is determined from the multiple faces and the portrait corresponding to that face is finally obtained, which avoids recognizing multiple subjects in a multi-face scene and keeps the focus on a single target.
In one embodiment, when at least two faces are detected to exist in an image to be recognized, face regions containing the faces are respectively obtained; a face region contains a face; respectively acquiring the areas of at least two face areas; acquiring candidate face regions with areas larger than an area threshold value from at least two face regions; respectively acquiring the position information of each candidate face area; acquiring a candidate face area closest to the center of the image to be recognized as a target face area according to the position information of each candidate face area; and acquiring a candidate region containing the portrait according to the target face region.
Specifically, the areas of the face regions can be respectively acquired, and the candidate face regions whose areas are larger than the area threshold are kept; that is, the face regions with smaller areas are screened out so that they do not need to be processed, which improves the efficiency of subject recognition.
The position information of each candidate face area is obtained, and the candidate face area closest to the center of the image to be recognized is obtained as a target face area, so that more accurate face areas can be obtained; and acquiring a candidate region containing the portrait according to the target face region, so that a more accurate candidate region can be acquired.
In one embodiment, as shown in FIG. 3, an image to be identified 302 is acquired; performing face detection on the image 302 to be recognized, and judging whether a face exists in the image 302 to be recognized, namely step 304; when the face does not exist in the image 302 to be recognized, the image 302 to be recognized is input into the main body recognition network 306, and a main body area 308 of the image 302 to be recognized is obtained.
When a human face exists in the image to be recognized 302, a human face region 310 including the human face may be acquired. When at least two faces exist in the image to be recognized 302, face regions including the faces may be respectively obtained, and one face region 310 may be determined from the at least two face regions.
In one embodiment, the areas of at least two face regions may be obtained separately, and the face region with the largest area is obtained as the first face region, i.e. the face region 310. In another embodiment, the position information of at least two face regions may also be obtained respectively, and the face region closest to the center of the image to be recognized is obtained as the second face region, that is, the face region 310.
A candidate region 312 containing the portrait is acquired according to the face region 310; the candidate region 312 is input into the portrait segmentation network 314 to obtain the portrait region, which is taken as the subject region 308 of the image 302 to be recognized.
In one embodiment, as shown in fig. 4, acquiring a candidate region containing a human image according to a human face includes:
step 402, obtaining a face region containing a face.
Step 404, at least two feature points in the face region are obtained.
The feature point refers to a point where the image gray value changes drastically or a point where the curvature is large on the edge of the image (i.e., the intersection of two edges). In the face region, the feature points may be eyes, nose, mouth corners, eyebrows, moles, and the like.
And 406, determining the angle of the face according to the at least two feature points.
The angle of the face refers to the angle at which the face is tilted. The angle of the face may be left-leaning, right-leaning, etc. For example, the angle of the face may be 20 degrees to the left, 10 degrees to the right, etc.
It can be understood that when the face is tilted, the relative positional relationships of the feature points in the face region change accordingly. For example, when the face tilts to the left, the line connecting the left-eye feature point and the right-eye feature point in the face region also tilts to the left. For another example, when the face tilts to the right, the line connecting the left mouth corner feature point and the right mouth corner feature point also tilts to the right. For another example, when the face turns to the left and back, the line between the left-eye feature point and the right-eye feature point becomes shorter, and the nose tip feature point lies to the left of the face region.
Step 408, obtaining a candidate region containing a portrait according to the angle of the face; the difference value between the angle of the candidate area and the angle of the face is within a preset range.
Generally, when the body is in a tilted state, the face of a person is correspondingly tilted. Therefore, the candidate region containing the portrait can be obtained according to the angle of the face, and the difference value between the angle of the candidate region and the angle of the face is within the preset range. The preset range can be set according to the needs of the user. For example, the preset range may be between 5 degrees and 10 degrees.
For example, when the angle of the face is 20 degrees inclined to the left, the angle at which the candidate region including the portrait is obtained from the angle of the face may be 20 degrees, 25 degrees, or the like. More accurate candidate regions can be obtained through the angle of the face.
In this embodiment, a face region including a face is obtained, at least two feature points in the face region are obtained, an angle of the face is determined according to the at least two feature points, and a more accurate candidate region can be obtained according to the angle of the face.
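One way such an angle-aligned candidate region could be built is sketched below: a rotated rectangle whose tilt equals the face angle and whose size grows with the face box. The scale factors and the downward shift are assumptions for illustration; the application only requires the candidate region's angle to stay within a preset range of the face angle and its area to grow with the face area.

```python
import cv2
import numpy as np

def candidate_region_from_face(face_box, face_angle_deg,
                               width_scale=3.0, height_scale=6.0):
    """Rotated rectangle containing the portrait, tilted roughly like the face."""
    x, y, w, h = face_box
    face_cx, face_cy = x + w / 2.0, y + h / 2.0
    cand_w, cand_h = w * width_scale, h * height_scale
    # Shift the candidate centre downward along the face's tilt direction so
    # the face stays near the top of the candidate region.
    shift = cand_h / 2.0 - h / 2.0
    theta = np.deg2rad(face_angle_deg)
    centre = (face_cx - shift * np.sin(theta), face_cy + shift * np.cos(theta))
    rect = (centre, (cand_w, cand_h), face_angle_deg)
    return cv2.boxPoints(rect)   # four corner points of the rotated rectangle
```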
In one embodiment, as shown in fig. 5, determining the angle of the face according to at least two feature points includes:
and 502, connecting at least two characteristic points to obtain each connecting line.
And after at least two characteristic points in the face area are obtained, connecting the at least two characteristic points. For example, when the feature points are left-eye feature points and right-eye feature points, connecting the left-eye feature points and the right-eye feature points to obtain a connecting line; when the characteristic points are the left mouth corner characteristic point and the right mouth corner characteristic point, connecting the left mouth corner characteristic point and the right mouth corner characteristic point to obtain a connecting line; and when the feature points are nose tip feature points, left eye feature points and right eye feature points, connecting the left eye feature points with the right eye feature points to obtain a connecting line, and then connecting the center of the connecting line with the nose tip feature points to obtain another connecting line.
And step 504, obtaining the angle of each connecting line, and determining the angle of the face based on the angle of each connecting line.
The angle of the connecting line can be used to represent the angle of the face. For example, when the angle of the connecting line of the left-eye feature point and the right-eye feature point is deflected to the left by 20 degrees, the angle of the human face may be deflected to the left by 20 degrees; when the angle of the connecting line of the left mouth corner feature point and the right mouth corner feature point is deflected to the right by 10 degrees, the angle of the face may be deflected to the right by 10 degrees.
When a connecting line exists, the angle of the connecting line can be used as the angle of the face. When there are at least two connecting lines, the angle of the face is determined based on the angles of the at least two connecting lines. For example, when there are two connecting lines, one connecting line is a connecting line between the left-eye feature point and the right-eye feature point, and the other connecting line is a connecting line between the left-mouth-angle feature point and the right-mouth-angle feature point, the angles of the two connecting lines may be averaged, and the average value may be used as the angle of the human face.
For another example, when there are two connecting lines, the first connecting line is a connecting line of the left-eye feature point and the right-eye feature point, and the center of the first connecting line is connected with the nose tip feature point to obtain a second connecting line, the angle of the second connecting line may represent the angle of the human face.
In this embodiment, at least two feature points are connected to obtain each connecting line, the angle of each connecting line is obtained, and a more accurate angle of the face can be determined based on the angle of each connecting line.
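As an illustrative sketch of this step (plain averaging of the two connecting-line angles, one of the options mentioned in the text; landmark coordinates are (x, y) tuples):

```python
import math

def line_angle(p1, p2):
    """Signed angle in degrees of the line from p1 to p2, relative to horizontal."""
    return math.degrees(math.atan2(p2[1] - p1[1], p2[0] - p1[0]))

def face_angle_from_lines(left_eye, right_eye, left_mouth, right_mouth):
    """Average the eye-line and mouth-line angles as the face angle."""
    eye_angle = line_angle(left_eye, right_eye)          # first connecting line
    mouth_angle = line_angle(left_mouth, right_mouth)    # second connecting line
    return (eye_angle + mouth_angle) / 2.0
```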
In one embodiment, connecting at least two feature points to obtain each connecting line comprises: when the at least two feature points comprise the left-eye feature point and the right-eye feature point, connecting the left-eye feature point and the right-eye feature point to obtain a first connecting line. Obtaining the angle of each connecting line and determining the angle of the face based on the angle of each connecting line comprises: obtaining the angle of the first connecting line, and determining the angle of the face based on the angle of the first connecting line.
The first connection line refers to a connection line between the left-eye feature point and the right-eye feature point.
It is understood that when the face in the image to be processed is in the tilted state, the line connecting the left eye and the right eye on the face is also in the tilted state. Therefore, the left-eye feature point and the right-eye feature point can be connected to obtain a first connecting line, the angle of the first connecting line is obtained, and the angle of the face is determined based on the angle of the first connecting line.
Further, determining the angle of the face based on the angle of the first connecting line; the difference value between the angle of the face and the angle of the first connecting line is within a preset range.
The preset range can be set according to the needs of the user. In one embodiment, the angle of the first connecting line may be taken as the angle of the face. In another embodiment, the angle of the face may also be determined based on the angle of the first connection line, the angle of the face is different from the angle of the first connection line, and the difference between the angle of the face and the angle of the first connection line is within a preset range. For example, the angle of the first connecting line is 10 degrees to the left, and the angle of the determined face may be 8 degrees to the left.
In this embodiment, the left-eye feature point and the right-eye feature point are connected to obtain a first connection line, an angle of the first connection line is obtained, and a more accurate angle of the human face can be determined based on the angle of the first connection line.
In one embodiment, connecting at least two feature points to obtain each connecting line comprises: when the at least two feature points comprise the left mouth corner feature point and the right mouth corner feature point, connecting the left mouth corner feature point and the right mouth corner feature point to obtain a second connecting line. Obtaining the angle of each connecting line and determining the angle of the face based on the angle of each connecting line comprises: obtaining the angle of the second connecting line, and determining the angle of the face based on the angle of the second connecting line.
The second connecting line refers to a connecting line of the left mouth corner feature point and the right mouth corner feature point.
It can be understood that when the face in the image to be processed is in the inclined state, the connecting line of the left mouth corner and the right mouth corner on the face is also in the inclined state. Therefore, the left mouth corner feature point and the right mouth corner feature point can be connected to obtain a second connecting line, the angle of the second connecting line is obtained, and the angle of the face is determined based on the angle of the second connecting line.
Further, determining the angle of the face based on the angle of the second connecting line; the difference value between the angle of the face and the angle of the second connecting line is within a preset range.
The preset range can be set according to the needs of the user. In one embodiment, the angle of the second connecting line may be taken as the angle of the face. In another embodiment, the angle of the face may also be determined based on the angle of the second connection line, the angle of the face is different from the angle of the second connection line, and the difference between the angle of the face and the angle of the second connection line is within a preset range. For example, the angle of the second connecting line is 8 degrees to the left, and the angle of the determined face may be 9 degrees to the left.
In this embodiment, the left mouth corner feature point and the right mouth corner feature point are connected to obtain a second connecting line, an angle of the second connecting line is obtained, and a more accurate angle of the human face can be determined based on the angle of the second connecting line.
In one embodiment, the method further comprises: when the at least two feature points comprise a left-eye feature point, a right-eye feature point, a left mouth corner feature point and a right mouth corner feature point, connecting the left-eye feature point and the right-eye feature point to obtain a first connecting line, and connecting the left mouth corner feature point and the right mouth corner feature point to obtain a second connecting line; and respectively obtaining the angle of the first connecting line and the angle of the second connecting line, and determining the angle of the face based on the angle of the first connecting line and the angle of the second connecting line.
In one embodiment, the angle of the first connecting line and the angle of the second connecting line may be averaged, and the average may be used as the angle of the human face. In another embodiment, a first weighting factor of the angle of the first connection line and a second weighting factor of the angle of the second connection line may be obtained, a weighted average value is obtained according to the angle of the first connection line, the first weighting factor, the angle of the second connection line and the second weighting factor, and the weighted average value is used as the angle of the human face.
In this embodiment, the first connection line and the second connection line are obtained based on the left-eye feature point, the right-eye feature point, the left-mouth-angle feature point, and the right-mouth-angle feature point, and a more accurate angle of the human face can be obtained according to the angle of the first connection line and the angle of the second connection line.
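The weighted variant described above might look like the following sketch; the weight values are assumptions, since the application leaves them unspecified.

```python
def weighted_face_angle(first_line_angle, second_line_angle,
                        first_weight=0.6, second_weight=0.4):
    """Weighted average of the two connecting-line angles as the face angle."""
    total = first_weight + second_weight
    return (first_weight * first_line_angle + second_weight * second_line_angle) / total
```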
In one embodiment, the method further comprises: and when the at least two feature points further comprise the nose tip feature point, acquiring the position information of the nose tip feature point in the face region. The determining the angle of the human face based on the angle of the first connecting line includes: and determining the angle of the face based on the angle of the first connecting line and the position information of the nose tip characteristic point in the face region.
The position information of the nose tip feature point in the face region may be represented by coordinates, or may be represented by a distance from the nose tip feature point to the center of the face region, but is not limited thereto.
Generally, the tip of the nose is located in the central region of the face. By acquiring the position information of the nose tip feature points in the face region, the front and back directions of the face can be known. For example, when the face turns left and back, the tip of the nose is left in the face region; when the face turns to the right and back, the tip of the nose faces to the right in the face area. Based on the angle of the first connecting line and the position information of the nose tip characteristic point in the face area, the angle of the face can be determined more accurately.
In another embodiment, the method further comprises: and when the at least two feature points further comprise the nose tip feature point, acquiring the position information of the nose tip feature point in the face region. The determining the angle of the face based on the angle of the second connecting line includes: and determining the angle of the face based on the angle of the second connecting line and the position information of the nose tip characteristic point in the face region.
For example, when the face turns left and back, the tip of the nose is left in the face region; when the face turns to the right and back, the tip of the nose faces to the right in the face area. Based on the angle of the second connecting line and the position information of the nose tip characteristic point in the face region, the angle of the face can be determined more accurately.
In one embodiment, when the at least two feature points further include a nose tip feature point, acquiring position information of the nose tip feature point in the face region; and respectively obtaining the angle of the first connecting line and the angle of the second connecting line, and determining the angle of the face based on the angle of the first connecting line, the angle of the second connecting line and the position information of the nose tip characteristic point in the face region.
For example, if the angle of the first connecting line is 8 degrees to the left, the angle of the second connecting line is 10 degrees to the left, and the nose tip feature point is located on the left side in the face region, the average value can be obtained based on the angles of the first connecting line and the second connecting line to obtain 9 degrees to the left of the face; the nose tip characteristic point is positioned on the left side in the face region and represents that the face deflects towards the left and back; therefore, the angle of the human face is deflected 9 degrees to the left and turned to the left and back.
In this embodiment, the angles of the human face can be determined more accurately by combining the left-eye feature points, the right-eye feature points, the left-mouth-corner feature points, the right-mouth-corner feature points, and the nose tip feature points.
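For illustration only, the angle determination described in the above embodiments may be sketched as follows in Python. The function name, the (x, y) landmark format, and the use of the arctangent to obtain a line angle are assumptions made for this sketch and are not part of the disclosed embodiments.

    import math

    def estimate_face_angle(left_eye, right_eye, left_mouth, right_mouth,
                            nose_tip, face_box):
        # Each landmark is an (x, y) point in image coordinates;
        # face_box is the face region as (x, y, width, height).
        def line_angle(p1, p2):
            # Angle of the line from p1 to p2 relative to the horizontal axis.
            return math.degrees(math.atan2(p2[1] - p1[1], p2[0] - p1[0]))

        eye_angle = line_angle(left_eye, right_eye)        # first connecting line
        mouth_angle = line_angle(left_mouth, right_mouth)  # second connecting line
        in_plane_angle = (eye_angle + mouth_angle) / 2.0   # e.g. (8 + 10) / 2 = 9

        # The nose tip offset from the horizontal center of the face region
        # indicates whether the face is turned toward the left or right rear.
        face_center_x = face_box[0] + face_box[2] / 2.0
        if nose_tip[0] < face_center_x:
            turn = "left rear"
        elif nose_tip[0] > face_center_x:
            turn = "right rear"
        else:
            turn = "front"
        return in_plane_angle, turn

With the numbers from the example above, averaging 8 degrees and 10 degrees gives 9 degrees, and a nose tip on the left side of the face region yields a face deflected 9 degrees to the left and turned toward the left rear.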
In one embodiment, the method further comprises: and acquiring the area of the face region. Acquiring a candidate region containing a portrait according to the angle of the face, comprising: acquiring a candidate region containing a portrait according to the angle of the face and the area of the face region; the area of the candidate region is positively correlated with the area of the face region.
It can be understood that a larger face region indicates that the face is closer to the camera, so the portrait containing the face occupies a larger area and the candidate region containing the portrait should also be larger. Therefore, the area of the candidate region is positively correlated with the area of the face region. According to the angle of the face and the area of the face region, a more accurate candidate region can be obtained.
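A minimal sketch of this positive correlation, assuming a simple proportional expansion of the face box into the candidate region (the expansion factor, the downward bias, and the function name are illustrative assumptions, not taken from the disclosure):

    def candidate_region_from_face(face_box, image_size, scale=3.0):
        # face_box is (x, y, width, height); image_size is (width, height).
        # The candidate region grows with the face region, so its area is
        # positively correlated with the face area.
        img_w, img_h = image_size
        x, y, w, h = face_box
        cx = x + w / 2.0
        cand_w, cand_h = w * scale, h * scale
        left = max(0, int(cx - cand_w / 2))
        right = min(img_w, int(cx + cand_w / 2))
        top = max(0, int(y - h / 2))              # a little room above the head
        bottom = min(img_h, int(y + h + cand_h))  # more room below for the body
        return left, top, right, bottom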
In one embodiment, as shown in FIG. 6, an image to be recognized 602 is acquired, and face detection is performed on the image to be recognized 602, detecting two faces. Face regions containing the faces are acquired respectively, resulting in two face regions. The position information of the two face regions is acquired respectively, and the face region closest to the center of the image to be recognized, that is, the face region 604, is obtained.
Five feature points are acquired from the face region 604: a left-eye feature point, a right-eye feature point, a nose tip feature point, a left-mouth-corner feature point, and a right-mouth-corner feature point. The left-eye feature point and the right-eye feature point are connected to obtain a first connecting line, and the left-mouth-corner feature point and the right-mouth-corner feature point are connected to obtain a second connecting line. The angle of the first connecting line and the angle of the second connecting line are acquired respectively, and a third connecting line through the nose tip feature point is obtained based on the angle of the first connecting line and the angle of the second connecting line.
For example, the angle of the first connecting line and the angle of the second connecting line may be averaged; the third connecting line then passes through the nose tip feature point and is perpendicular to a straight line at the averaged angle.
Acquiring the angle of a third connecting line; determining the angle of the face based on the angle of the third connecting line; and acquiring a candidate region 606 containing the portrait according to the angle of the face. And the difference value between the angle of the candidate region and the angle of the human face is within a preset range. Inputting the candidate region 606 into the portrait segmentation network to obtain a portrait region 608, and using the portrait region 608 as a main region of the image to be recognized.
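The construction of the third connecting line in this walkthrough can be expressed as a short sketch. Interpreting the embodiment, the line passes through the nose tip feature point and is perpendicular to a straight line at the averaged angle of the first and second connecting lines; the helper name and the returned direction-vector form are assumptions for illustration.

    import math

    def third_connecting_line(eye_angle_deg, mouth_angle_deg, nose_tip):
        avg = (eye_angle_deg + mouth_angle_deg) / 2.0
        third_angle = avg + 90.0                       # perpendicular to the average
        direction = (math.cos(math.radians(third_angle)),
                     math.sin(math.radians(third_angle)))
        # The third connecting line passes through the nose tip feature point
        # along this direction; its angle is used as the angle of the face.
        return nose_tip, direction, third_angle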
It should be understood that although the steps in the flowcharts of FIGS. 2, 4 and 5 are shown in order as indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not strictly limited in order and may be performed in other orders. Moreover, at least some of the steps in FIGS. 2, 4 and 5 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Fig. 7 is a block diagram of an image processing apparatus according to an embodiment. As shown in fig. 7, there is provided an image processing apparatus 700 including: an image acquisition module 702, a face detection module 704, a candidate region acquisition module 706, a portrait segmentation module 708, and a subject recognition module 710, wherein:
an image obtaining module 702 is configured to obtain an image to be identified.
And the face detection module 704 is configured to detect whether a face exists in the image to be recognized.
A candidate region obtaining module 706, configured to obtain a candidate region including a portrait according to a face when the face exists in the image to be recognized; the portrait includes a human face.
The portrait segmentation module 708 is configured to input the candidate region into a portrait segmentation network to obtain a portrait region, and use the portrait region as a main region of the image to be identified.
The main body recognition module 710 is configured to, when a face does not exist in the image to be recognized, input the image to be recognized into a main body recognition network to obtain a main body region of the image to be recognized.
When detecting that a face exists in the image to be recognized, the image processing apparatus acquires a candidate region containing the portrait according to the face, inputs the candidate region into the portrait segmentation network to obtain the portrait region, and takes the portrait region as a main region of the image to be recognized; when no face exists in the image to be recognized, the image to be recognized is input into a main body recognition network to obtain a main body region of the image to be recognized. A two-branch scheme for obtaining the main body region of the image to be recognized is thus designed: when no face exists in the image to be recognized, the main body region is obtained through the main body recognition network; when a face exists in the image to be recognized, a candidate region containing the portrait is determined from the image to be recognized, and a more accurate portrait region can be obtained as the main body region through the portrait segmentation network.
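A minimal sketch of this two-branch dispatch, assuming hypothetical callables face_detector(image) returning a list of face boxes, portrait_net(crop) returning a portrait mask, and subject_net(image) returning a subject mask (none of these names come from the disclosure):

    def identify_subject_region(image, face_detector, portrait_net, subject_net):
        faces = face_detector(image)
        if not faces:
            # No face present: use the generic main body recognition network.
            return subject_net(image)
        # A face is present: crop a candidate region around the first face and
        # segment the portrait; the portrait region is used as the main region.
        x, y, w, h = faces[0]
        crop = image[y:y + h, x:x + w]   # simplified candidate region (NumPy-style slicing)
        return portrait_net(crop)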
In an embodiment, the candidate region obtaining module 706 is further configured to, when at least two faces are detected in the image to be recognized, respectively acquire face regions containing the faces, one face region containing one face; respectively acquire the areas of the at least two face regions; compare the areas of the at least two face regions to obtain the face region with the largest area as a first face region; and acquire a first candidate region containing a portrait according to the first face region. The inputting the candidate region into the portrait segmentation network to obtain the portrait region and taking the portrait region as the main region of the image to be recognized includes: inputting the first candidate region into the portrait segmentation network to obtain a first portrait region, and taking the first portrait region as the main region of the image to be recognized.
In an embodiment, the candidate region obtaining module 706 is further configured to, when at least two faces are detected in the image to be recognized, respectively acquire face regions containing the faces, one face region containing one face; respectively acquire position information of the at least two face regions; acquire, according to the position information of the at least two face regions, the face region closest to the center of the image to be recognized as a second face region; and acquire a second candidate region containing a portrait according to the second face region. The inputting the candidate region into the portrait segmentation network to obtain the portrait region and taking the portrait region as the main region of the image to be recognized includes: inputting the second candidate region into the portrait segmentation network to obtain a second portrait region, and taking the second portrait region as the main region of the image to be recognized.
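The two face-selection strategies of these module embodiments (largest area, or closest to the image center) might look like the following sketch; the function and parameter names are assumptions for illustration only.

    def select_face(face_boxes, image_size, strategy="closest_to_center"):
        # face_boxes is a list of (x, y, width, height) tuples.
        img_w, img_h = image_size
        if strategy == "largest_area":
            return max(face_boxes, key=lambda b: b[2] * b[3])
        # Default: pick the face whose center is nearest the image center.
        cx, cy = img_w / 2.0, img_h / 2.0
        return min(face_boxes,
                   key=lambda b: (b[0] + b[2] / 2 - cx) ** 2 +
                                 (b[1] + b[3] / 2 - cy) ** 2)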
In an embodiment, the candidate region obtaining module 706 is further configured to obtain a face region including a face; acquiring at least two feature points in a face region; determining the angle of the face according to the at least two feature points; acquiring a candidate region containing a portrait according to the angle of the face; the difference value between the angle of the candidate area and the angle of the face is within a preset range.
In an embodiment, the candidate region obtaining module 706 is further configured to connect at least two feature points to obtain each connection line; and acquiring the angle of each connecting line, and determining the angle of the face based on the angle of each connecting line.
In an embodiment, the candidate region obtaining module 706 is further configured to, when the at least two feature points include a left-eye feature point and a right-eye feature point, connect the left-eye feature point and the right-eye feature point to obtain a first connecting line. The acquiring the angle of each connecting line and determining the angle of the face based on the angle of each connecting line includes: acquiring the angle of the first connecting line, and determining the angle of the face based on the angle of the first connecting line.
In an embodiment, the candidate region obtaining module 706 is further configured to, when the at least two feature points include a left-mouth-corner feature point and a right-mouth-corner feature point, connect the left-mouth-corner feature point and the right-mouth-corner feature point to obtain a second connecting line. The acquiring the angle of each connecting line and determining the angle of the face based on the angle of each connecting line includes: acquiring the angle of the second connecting line, and determining the angle of the face based on the angle of the second connecting line.
In one embodiment, the image processing apparatus 700 further includes a position information obtaining module, configured to obtain position information of the nose tip feature point in the face region when the at least two feature points further include the nose tip feature point. Determining the angle of the face based on the angle of the first connecting line includes: determining the angle of the face based on the angle of the first connecting line and the position information of the nose tip feature point in the face region. Alternatively, determining the angle of the face based on the angle of the second connecting line includes: determining the angle of the face based on the angle of the second connecting line and the position information of the nose tip feature point in the face region.
In one embodiment, the image processing apparatus 700 further includes an area obtaining module, configured to obtain an area of the face region. Acquiring a candidate region containing a portrait according to the angle of the face, comprising: acquiring a candidate region containing a portrait according to the angle of the face and the area of the face region; the area of the candidate region is positively correlated with the area of the face region.
The division of the modules in the image processing apparatus is only for illustration, and in other embodiments, the image processing apparatus may be divided into different modules as needed to complete all or part of the functions of the image processing apparatus.
Fig. 8 is a schematic diagram of an internal structure of an electronic device in one embodiment. As shown in fig. 8, the electronic device includes a processor and a memory connected by a system bus. The processor is used for providing calculation and control capability to support the operation of the whole electronic device. The memory may include a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The computer program can be executed by the processor to implement the image processing method provided in the embodiments of the present application. The internal memory provides a cached execution environment for the operating system and the computer program in the non-volatile storage medium. The electronic device may be a mobile phone, a tablet computer, a personal digital assistant, a wearable device, or the like.
The implementation of each module in the image processing apparatus provided in the embodiments of the present application may be in the form of a computer program. The computer program may run on a terminal or a server. The program modules constituted by the computer program may be stored in the memory of the terminal or the server. When the computer program is executed by a processor, the steps of the methods described in the embodiments of the present application are performed.
The embodiment of the application also provides a computer readable storage medium. One or more non-transitory computer-readable storage media containing computer-executable instructions that, when executed by one or more processors, cause the processors to perform the steps of the image processing method.
A computer program product comprising instructions which, when run on a computer, cause the computer to perform an image processing method.
Any reference to memory, storage, database, or other medium used by embodiments of the present application may include non-volatile and/or volatile memory. Suitable non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM), which acts as external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link (Synchlink) DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
The above-mentioned embodiments express only several implementations of the present application, and the description thereof is specific and detailed, but should not be construed as limiting the scope of the present application. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, and these fall within the scope of protection of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (11)

1. An image processing method, comprising:
acquiring an image to be identified;
detecting whether a human face exists in the image to be recognized;
when a face exists in the image to be recognized, acquiring a candidate region containing a portrait according to the face; the portrait includes the face;
inputting the candidate area into a portrait segmentation network to obtain a portrait area, and taking the portrait area as a main area of the image to be identified;
and when the face does not exist in the image to be recognized, inputting the image to be recognized into a main body recognition network to obtain a main body area of the image to be recognized.
2. The method according to claim 1, wherein when a face exists in the image to be recognized, acquiring a candidate region containing a portrait according to the face comprises:
when detecting that at least two faces exist in the image to be recognized, respectively acquiring face regions containing the faces;
respectively acquiring the areas of at least two face regions;
comparing the areas of at least two face regions to obtain a face region with the largest area as a first face region;
acquiring a first candidate region containing a portrait according to the first face region;
inputting the candidate region into a portrait segmentation network to obtain a portrait region, and using the portrait region as a main region of the image to be recognized, including:
and inputting the first candidate region into a portrait segmentation network to obtain a first portrait region, and taking the first portrait region as a main region of the image to be identified.
3. The method according to claim 1, wherein when a face exists in the image to be recognized, acquiring a candidate region containing a portrait according to the face comprises:
when detecting that at least two faces exist in the image to be recognized, respectively acquiring face regions containing the faces; a face region contains a face;
respectively acquiring the position information of at least two face areas;
acquiring a face area closest to the center of the image to be recognized according to the position information of at least two face areas as a second face area;
acquiring a second candidate region containing a portrait according to the second face region;
inputting the candidate region into a portrait segmentation network to obtain a portrait region, and using the portrait region as a main region of the image to be recognized, including:
and inputting the second candidate area into a portrait segmentation network to obtain a second portrait area, and taking the second portrait area as a main area of the image to be identified.
4. The method of claim 1, wherein acquiring a candidate region containing a portrait according to the face comprises:
acquiring a face area containing the face;
acquiring at least two feature points in the face region;
determining the angle of the face according to the at least two feature points;
acquiring a candidate region containing a portrait according to the angle of the face; and the difference value between the angle of the candidate region and the angle of the human face is within a preset range.
5. The method of claim 4, wherein determining the angle of the face from the at least two feature points comprises:
connecting the at least two characteristic points to obtain each connecting line;
and acquiring the angle of each connecting line, and determining the angle of the face based on the angle of each connecting line.
6. The method of claim 5, wherein said connecting the at least two feature points to obtain each connecting line comprises:
when the at least two feature points comprise a left-eye feature point and a right-eye feature point, connecting the left-eye feature point and the right-eye feature point to obtain a first connecting line; or
When the at least two feature points comprise a left mouth corner feature point and a right mouth corner feature point, connecting the left mouth corner feature point and the right mouth corner feature point to obtain a second connecting line;
the acquiring the angle of each connecting line and determining the angle of the face based on the angle of each connecting line comprises:
acquiring the angle of the first connecting line, and determining the angle of the face based on the angle of the first connecting line; or
And acquiring the angle of the second connecting line, and determining the angle of the face based on the angle of the second connecting line.
7. The method of claim 6, further comprising:
when the at least two feature points further comprise nose tip feature points, acquiring position information of the nose tip feature points in the face region;
the determining the angle of the human face based on the angle of the first connecting line includes: determining the angle of the face based on the angle of the first connecting line and the position information of the nose tip characteristic point in the face region; or
The determining the angle of the face based on the angle of the second connecting line includes: determining the angle of the face based on the angle of the second connecting line and the position information of the nose tip feature point in the face region.
8. The method of claim 4, further comprising:
acquiring the area of the face region;
the obtaining of the candidate region containing the portrait according to the angle of the face includes:
acquiring a candidate region containing a portrait according to the angle of the face and the area of the face region; the area of the candidate region is positively correlated with the area of the face region.
9. An image processing apparatus characterized by comprising:
the image acquisition module is used for acquiring an image to be identified;
the face detection module is used for detecting whether a face exists in the image to be identified;
the candidate region acquisition module is used for acquiring a candidate region containing a portrait according to the face when the face exists in the image to be identified; the portrait includes the face;
the portrait segmentation module is used for inputting the candidate region into a portrait segmentation network to obtain a portrait region, and the portrait region is used as a main region of the image to be identified;
and the main body identification module is used for inputting the image to be identified into a main body identification network when the human face does not exist in the image to be identified, so as to obtain a main body area of the image to be identified.
10. An electronic device comprising a memory and a processor, the memory having stored therein a computer program that, when executed by the processor, causes the processor to perform the steps of the image processing method according to any one of claims 1 to 8.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
CN201910905113.9A 2019-09-24 2019-09-24 Image processing method and device, electronic equipment and computer readable storage medium Pending CN110610171A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910905113.9A CN110610171A (en) 2019-09-24 2019-09-24 Image processing method and device, electronic equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910905113.9A CN110610171A (en) 2019-09-24 2019-09-24 Image processing method and device, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN110610171A true CN110610171A (en) 2019-12-24

Family

ID=68892144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910905113.9A Pending CN110610171A (en) 2019-09-24 2019-09-24 Image processing method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN110610171A (en)

Patent Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101561710A (en) * 2009-05-19 2009-10-21 重庆大学 Man-machine interaction method based on estimation of human face posture
CN107358207A (en) * 2017-07-14 2017-11-17 重庆大学 A kind of method for correcting facial image
CN109389018A (en) * 2017-08-14 2019-02-26 杭州海康威视数字技术股份有限公司 A kind of facial angle recognition methods, device and equipment
CN108875479A (en) * 2017-08-15 2018-11-23 北京旷视科技有限公司 The acquisition methods and device of facial image
CN107454335A (en) * 2017-08-31 2017-12-08 广东欧珀移动通信有限公司 Image processing method, device, computer-readable recording medium and mobile terminal
CN107592473A (en) * 2017-10-31 2018-01-16 广东欧珀移动通信有限公司 Exposure parameter method of adjustment, device, electronic equipment and readable storage medium storing program for executing
CN107820017A (en) * 2017-11-30 2018-03-20 广东欧珀移动通信有限公司 Image capturing method, device, computer-readable recording medium and electronic equipment
CN108009999A (en) * 2017-11-30 2018-05-08 广东欧珀移动通信有限公司 Image processing method, device, computer-readable recording medium and electronic equipment
CN108537155A (en) * 2018-03-29 2018-09-14 广东欧珀移动通信有限公司 Image processing method, device, electronic equipment and computer readable storage medium
CN108717530A (en) * 2018-05-21 2018-10-30 Oppo广东移动通信有限公司 Image processing method, device, computer readable storage medium and electronic equipment
CN109191398A (en) * 2018-08-29 2019-01-11 Oppo广东移动通信有限公司 Image processing method, device, computer readable storage medium and electronic equipment
CN109325905A (en) * 2018-08-29 2019-02-12 Oppo广东移动通信有限公司 Image processing method, device, computer readable storage medium and electronic equipment
CN108921148A (en) * 2018-09-07 2018-11-30 北京相貌空间科技有限公司 Determine the method and device of positive face tilt angle
CN109360254A (en) * 2018-10-15 2019-02-19 Oppo广东移动通信有限公司 Image processing method and device, electronic equipment, computer readable storage medium
CN109461186A (en) * 2018-10-15 2019-03-12 Oppo广东移动通信有限公司 Image processing method, device, computer readable storage medium and electronic equipment
CN109582811A (en) * 2018-12-17 2019-04-05 Oppo广东移动通信有限公司 Image processing method, device, electronic equipment and computer readable storage medium
CN110047053A (en) * 2019-04-26 2019-07-23 腾讯科技(深圳)有限公司 Portrait Picture Generation Method, device and computer equipment
CN110248096A (en) * 2019-06-28 2019-09-17 Oppo广东移动通信有限公司 Focusing method and device, electronic equipment, computer readable storage medium

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113221754A (en) * 2021-05-14 2021-08-06 深圳前海百递网络有限公司 Express waybill image detection method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110149482B (en) Focusing method, focusing device, electronic equipment and computer readable storage medium
CN110248096B (en) Focusing method and device, electronic equipment and computer readable storage medium
US11457138B2 (en) Method and device for image processing, method for training object detection model
EP3477931B1 (en) Image processing method and device, readable storage medium and electronic device
EP3598736B1 (en) Method and apparatus for processing image
CN113766125B (en) Focusing method and device, electronic equipment and computer readable storage medium
CN110493527B (en) Body focusing method and device, electronic equipment and storage medium
CN108537155B (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN110660090B (en) Subject detection method and apparatus, electronic device, and computer-readable storage medium
EP3793188A1 (en) Image processing method, electronic device, and computer readable storage medium
CN111932587B (en) Image processing method and device, electronic equipment and computer readable storage medium
CN109712177B (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN110650288B (en) Focusing control method and device, electronic equipment and computer readable storage medium
CN110191287B (en) Focusing method and device, electronic equipment and computer readable storage medium
CN110349163B (en) Image processing method and device, electronic equipment and computer readable storage medium
CN110490196B (en) Subject detection method and apparatus, electronic device, and computer-readable storage medium
US12039767B2 (en) Subject detection method and apparatus, electronic device, and computer-readable storage medium
CN110881103B (en) Focusing control method and device, electronic equipment and computer readable storage medium
CN113313626A (en) Image processing method, image processing device, electronic equipment and storage medium
CN110365897B (en) Image correction method and device, electronic equipment and computer readable storage medium
CN110399823B (en) Subject tracking method and apparatus, electronic device, and computer-readable storage medium
CN110689007B (en) Subject recognition method and device, electronic equipment and computer-readable storage medium
CN110688926B (en) Subject detection method and apparatus, electronic device, and computer-readable storage medium
CN110610171A (en) Image processing method and device, electronic equipment and computer readable storage medium
CN108737733B (en) Information prompting method and device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191224

RJ01 Rejection of invention patent application after publication