CN115147323A - Image enhancement method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN115147323A
CN115147323A (application CN202210872483.9A)
Authority
CN
China
Prior art keywords
image
posture
limb
preset
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210872483.9A
Other languages
Chinese (zh)
Other versions
CN115147323B (en)
Inventor
姜文韬
金晟
刘文韬
钱晨
刘偲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen TetrasAI Technology Co Ltd
Original Assignee
Shenzhen TetrasAI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen TetrasAI Technology Co Ltd filed Critical Shenzhen TetrasAI Technology Co Ltd
Priority to CN202210872483.9A priority Critical patent/CN115147323B/en
Publication of CN115147323A publication Critical patent/CN115147323A/en
Application granted granted Critical
Publication of CN115147323B publication Critical patent/CN115147323B/en
Legal status: Active (granted)

Classifications

    • G: PHYSICS
        • G06: COMPUTING; CALCULATING OR COUNTING
            • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
                • G06T5/00: Image enhancement or restoration
                    • G06T5/50: using two or more images, e.g. averaging or subtraction
                • G06T7/00: Image analysis
                    • G06T7/10: Segmentation; Edge detection
                • G06T2207/00: Indexing scheme for image analysis or image enhancement
                    • G06T2207/20: Special algorithmic details
                        • G06T2207/20081: Training; Learning
                        • G06T2207/20212: Image combination
                            • G06T2207/20221: Image fusion; Image merging
                    • G06T2207/30: Subject of image; Context of image processing
                        • G06T2207/30196: Human being; Person
            • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
                • G06V10/00: Arrangements for image or video recognition or understanding
                    • G06V10/70: using pattern recognition or machine learning
                        • G06V10/762: using clustering, e.g. of similar faces in social networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides an image enhancement method, apparatus, electronic device and storage medium. The method includes: determining the posture of a first target limb of an object in a first image to obtain a first posture; performing posture transformation on the first target limb based on the first posture, and determining a plurality of candidate images based on the transformed first target limb, where the transformed posture of the first target limb differs across the candidate images; determining prediction data for each candidate image, the prediction data indicating the probability that the transformed posture of the first target limb in the corresponding candidate image is a preset posture; and determining a first enhanced image among the candidate images based on the prediction data. The first target limb of the object in the first enhanced image is in a first preset posture, which is a preset posture for which the number of images in the image set satisfies a preset condition.

Description

Image enhancement method, device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to an image enhancement method and apparatus, an electronic device, and a storage medium.
Background
With the rapid development of artificial intelligence technology, the variety and number of images obtainable through existing image acquisition techniques cannot meet the technology's requirements, which limits its accuracy. For example, during the learning phase of a machine learning system, an existing image set is generally selected for training the system. However, the postures of the objects in the images of an existing set are not all the same, and the number of images belonging to each posture also differs. A posture represented by only a few images degrades how well the machine learning system learns that posture, and therefore degrades the system's posture-detection accuracy.
Disclosure of Invention
The embodiment of the disclosure at least provides an image enhancement method and device, electronic equipment and a storage medium.
In a first aspect, an embodiment of the present disclosure provides an image enhancement method, including: determining the posture of a first target limb of an object in a first image to obtain a first posture; performing posture transformation on the first target limb based on the first posture, and determining a plurality of candidate images based on the transformed first target limb, where the transformed posture of the first target limb differs across the candidate images; determining prediction data for each candidate image, the prediction data indicating the probability that the transformed posture of the first target limb in the corresponding candidate image is a preset posture; and determining a first enhanced image among the candidate images based on the prediction data, where the first target limb of the object in the first enhanced image is in a first preset posture, and the first preset posture is a preset posture for which the number of images in the image set satisfies a preset condition.
In the above embodiment, the posture of the first target limb may be transformed starting from the object's original first posture, and a plurality of candidate images covering multiple postures may be generated from the transformed first target limb, thereby changing the first posture of the object in the first image, automatically generating new postures for the object, and enriching the object's posture types. Further, determining the first enhanced image based on the prediction data of each candidate image makes it possible to screen out, from the plurality of candidate images, the first enhanced image belonging to the first preset posture. Here, the first preset posture may be understood as a preset posture for which the number of images in the corresponding image set satisfies a preset condition. By enhancing the image set corresponding to the first preset posture, the number of images in that set can be increased, thereby improving the quality with which the machine learning system learns the first preset posture.
In an alternative embodiment, the first target limb comprises a sub-limb, and the first pose comprises a sub-pose of the sub-limb; the pose transforming the first target limb based on the first pose and determining a plurality of candidate images based on the transformed first target limb comprises: based on the image segmentation result of the first image, segmenting the first image to obtain a segmented image of the sub-limb; wherein the image segmentation result is used to indicate a limb segmentation result of the subject; performing posture transformation on the sub-limbs in the segmented image based on the sub-postures to obtain a first image to be fused after the posture transformation; determining the candidate image based on the first image and the first image to be fused.
In an alternative embodiment, the determining the candidate image based on the first image and the first image to be fused includes: processing the first image based on the image segmentation result to obtain a second image to be fused, where the second image to be fused contains a second target limb of the object, the second target limb being the remaining limbs other than the sub-limb; and determining the fusion position of each first image to be fused in the second image to be fused, and fusing the first image to be fused with the second image to be fused based on the fusion position to obtain the candidate image.
In the above embodiment, by performing pose transformation on a plurality of sub-limbs of the object in the first image, a new object pose with diversity can be generated, so that a candidate image including the object pose with diversity can be obtained, and the probability of screening a new first image meeting requirements from a plurality of candidate images is further improved.
In an optional embodiment, the first posture is the coordinates of limb key points, and the performing posture transformation on the first target limb based on the first posture includes: determining a posture transformation type for the first target limb; determining a posture transformation matrix for the first target limb based on the posture transformation type; and performing coordinate transformation on the coordinates of the limb key points of the first target limb based on the posture transformation matrix to obtain the transformed first target limb.
In the above embodiment, the pose of the first target limb may be transformed by affine transformation. Through affine transformation of the first target limb, the posture transformation of the first target limb can be more reasonably realized, so that reasonable and rich limb postures are obtained.
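To make the affine-transformation step concrete, the following minimal sketch (not code from the patent; the key-point values, transformation types, and parameters are illustrative assumptions) builds a 3x3 homogeneous posture transformation matrix per transformation type and applies it to limb key-point coordinates:

```python
# Sketch: affine posture transformation of limb keypoints via a 3x3 matrix.
import numpy as np

def make_transform(transform_type, **params):
    """Build a 3x3 homogeneous matrix for a given posture-transformation type."""
    if transform_type == "rotate":
        theta = np.deg2rad(params["degrees"])
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    if transform_type == "translate":
        tx, ty = params["tx"], params["ty"]
        return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])
    if transform_type == "scale":
        k = params["factor"]
        return np.array([[k, 0.0, 0.0], [0.0, k, 0.0], [0.0, 0.0, 1.0]])
    raise ValueError(f"unsupported transform type: {transform_type}")

def transform_keypoints(keypoints, matrix):
    """Apply a 3x3 homogeneous matrix to an (N, 2) array of keypoint coordinates."""
    homo = np.hstack([keypoints, np.ones((len(keypoints), 1))])  # (N, 3)
    return (homo @ matrix.T)[:, :2]

forearm = np.array([[10.0, 20.0], [15.0, 35.0]])   # hypothetical forearm keypoints
m = make_transform("translate", tx=5.0, ty=-5.0)
print(transform_keypoints(forearm, m))              # keypoints shifted by (5, -5)
```

Transform types can be composed by multiplying their matrices, e.g. `rotate @ translate`, which matches the "translation and rotation transformation for the forearm" example given later in the description.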
In an alternative embodiment, there are a plurality of first images, and the method includes determining the preset postures by: clustering the plurality of first images based on the first posture of the object in each first image to obtain a plurality of clusters, where each cluster corresponds to one posture type and different clusters correspond to different posture types; and determining the posture type corresponding to each cluster as a preset posture.
In the above embodiment, by performing cluster analysis on the plurality of first images, the image distribution condition of the first images belonging to each preset posture can be obtained, so that the preset posture corresponding to a smaller number of images in each preset posture is determined as the first preset posture (i.e., the rare preset posture), and thus the enhancement processing on the first image corresponding to the first preset posture can be realized.
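A sketch of how such clustering of first postures might look in practice. The patent does not prescribe a particular algorithm at this point (the description later mentions a Gaussian mixture model); a naive k-means over flattened key-point vectors, with made-up pose vectors and cluster count, stands in here:

```python
# Sketch: grouping first images by posture via clustering of flattened
# keypoint vectors (naive k-means; pose vectors are illustrative).
import numpy as np

def cluster_poses(pose_vectors, k, iters=50, seed=0):
    """Assign each flattened pose vector to one of k clusters."""
    rng = np.random.default_rng(seed)
    centers = pose_vectors[rng.choice(len(pose_vectors), k, replace=False)]
    for _ in range(iters):
        # Distance of every pose to every center, then nearest-center labels.
        dists = np.linalg.norm(pose_vectors[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pose_vectors[labels == j].mean(axis=0)
    return labels

# Two well-separated hypothetical posture groups, e.g. "standing" vs "squatting".
poses = np.array([[0.0, 0.0], [0.1, 0.1], [5.0, 5.0], [5.1, 4.9]])
print(cluster_poses(poses, k=2))  # same label within each pair
```

Each resulting cluster label plays the role of one posture type, i.e. one preset posture.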
In an optional implementation manner, the clustering, based on the first pose of the object in each of the first images, the plurality of first images to obtain a plurality of cluster clusters includes: normalizing each first attitude to obtain a normalized first attitude; and clustering the plurality of first images based on the normalized first posture to obtain a plurality of cluster clusters.
In an optional embodiment, the normalizing each of the first postures to obtain a normalized first posture includes: determining a detection frame of an object in each first image, and acquiring a target image positioned in the detection frame in the first image; and normalizing the image size of the target image to obtain the normalized first posture.
In the above embodiment, the size of the target image is normalized, so that the clustering precision of the gaussian mixture model can be improved, and a more accurate clustering result can be obtained.
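One way to realise this normalisation, sketched under the assumption that normalising the target image's size amounts to expressing key points in detection-box-relative coordinates (the box, key points, and the [0, 1] target range are illustrative, not taken from the patent):

```python
# Sketch: normalising a first posture by its detection box, so that subjects at
# different scales and positions become comparable before clustering.
import numpy as np

def normalize_pose(keypoints, box):
    """Map (N, 2) keypoints into the unit square of detection box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    scale = np.array([x1 - x0, y1 - y0], dtype=float)
    return (keypoints - np.array([x0, y0], dtype=float)) / scale

kps = np.array([[60.0, 40.0], [80.0, 120.0]])       # hypothetical keypoints
print(normalize_pose(kps, box=(40.0, 0.0, 120.0, 160.0)))
```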
In an alternative embodiment, the determining a first enhanced image among a plurality of candidate images based on the prediction data includes: acquiring a posture weight for each preset posture, where the posture weight indicates the proportion, among the plurality of first images, of first images corresponding to that preset posture; performing a weighted summation of the prediction data of each candidate image with the posture weights to obtain a weighted summation result for each candidate image; and determining, as the first enhanced image, the candidate image corresponding to a target weighted summation result among the weighted summation results, where the target weighted summation result indicates that the probability that the first target limb of the object in the corresponding candidate image is in the first preset posture meets the probability requirement.
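The weighted-summation selection above can be sketched as follows. Note an assumption: rare postures are up-weighted by inverse frequency so that candidates likely to show a rare posture score highest; the patent only states that the weight reflects each preset posture's image-count proportion, not the exact weighting formula:

```python
# Sketch: score candidates by combining per-posture prediction probabilities
# with posture weights, then pick the best-scoring candidate image.
import numpy as np

def select_enhanced(prediction, pose_counts):
    """prediction: (num_candidates, num_poses) probabilities;
    pose_counts: number of first images per preset posture."""
    counts = np.asarray(pose_counts, dtype=float)
    inv = 1.0 / counts
    weights = inv / inv.sum()          # rarer posture -> larger weight (assumption)
    scores = prediction @ weights      # weighted summation result per candidate
    return int(scores.argmax()), scores

pred = np.array([[0.9, 0.1],           # candidate 0: likely the common posture
                 [0.2, 0.8]])          # candidate 1: likely the rare posture
best, scores = select_enhanced(pred, pose_counts=[1000, 50])
print(best)                            # index of the first enhanced image
```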
In an alternative embodiment, the determining a first enhanced image in the plurality of candidate images based on the prediction data includes: determining a target candidate image among the plurality of candidate images based on the prediction data; the probability that the transformation posture of the object in the target candidate image is a second preset posture is greater than or equal to a preset probability threshold value, and the second preset posture is a preset posture that the number of images in the image set meets the number requirement; determining the first enhanced image in a candidate image of the plurality of candidate images other than the target candidate image.
In the above embodiment, the preset posture with a small number of corresponding images can be determined as the first preset posture in each preset posture, so that the enhancement processing of the first image corresponding to the first preset posture can be realized.
In a second aspect, an embodiment of the present disclosure provides an image enhancement apparatus, including: the first determining unit is used for determining the posture of a first target limb of the object in the first image to obtain a first posture; the posture transformation unit is used for carrying out posture transformation on the first target limb based on the first posture to obtain a transformed first target limb; and determining a plurality of candidate images based on the transformed first target limb; the postures of the first target limb in the candidate images after transformation are different; a second determination unit configured to determine prediction data of each of the candidate images; the prediction data is used for indicating the probability that the posture of the first target limb in the corresponding candidate image after transformation is a preset posture; a third determination unit configured to determine a first enhanced image among the plurality of candidate images based on the prediction data; the first target limb of the object in the first enhanced image is a first preset posture, and the first preset posture is a preset posture that the number of images in the image set meets a preset condition.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, the machine-readable instructions being executable by the processor to perform the steps of the first aspect or any one of the possible implementations of the first aspect.
In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium having a computer program stored thereon, where the computer program, when executed by a processor, performs the steps in the first aspect or in any one of its possible implementations.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required in the embodiments are briefly described below. The drawings here are incorporated in and form a part of the specification; they illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain its technical solutions. It should be appreciated that the following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope, as those skilled in the art may derive additional related drawings from them without inventive effort.
Fig. 1 illustrates a flowchart of an image enhancement method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram illustrating an effect of a clustering result of a plurality of first images provided by an embodiment of the present disclosure;
fig. 3 is a flowchart illustrating a specific method for performing pose transformation on the first target limb based on the first pose and determining a plurality of candidate images based on the transformed first target limb in the image enhancement method provided by the embodiment of the disclosure;
fig. 4 (a) is a schematic diagram illustrating an effect of a first image provided by an embodiment of the present disclosure;
fig. 4 (b) is a schematic diagram illustrating an effect of an image segmentation result of a first image provided by an embodiment of the present disclosure;
fig. 4 (c) is a schematic diagram illustrating an effect of deleting the first image of each sub-limb provided by the embodiment of the present disclosure;
fig. 4 (d) is a schematic diagram illustrating an effect of a first image to be fused provided by the embodiment of the disclosure;
fig. 4 (e) is a schematic diagram illustrating an effect of a candidate image provided by the embodiment of the disclosure;
fig. 5 is a flowchart illustrating a specific method for determining a first enhanced image in the candidate images based on the prediction data in the image enhancement method provided by the embodiment of the present disclosure;
fig. 6 is a flowchart illustrating another specific method for determining a first enhanced image in the candidate images based on the prediction data in the image enhancement method provided by the embodiment of the present disclosure;
fig. 7 shows a schematic flow chart of an image enhancement method provided by an embodiment of the present disclosure;
fig. 8 shows a schematic diagram of an image enhancement apparatus provided by an embodiment of the present disclosure;
fig. 9 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions are described below clearly and completely with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments, as generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of configurations. Thus, the following detailed description of the embodiments is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments. All other embodiments obtained by a person skilled in the art from the embodiments of the disclosure without creative effort shall fall within the protection scope of the disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined or explained in subsequent figures.
The term "and/or" herein merely describes an associative relationship, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of a, B, and C, and may mean including any one or more elements selected from the group consisting of a, B, and C.
The disclosure provides an image enhancement method, an image enhancement device, an electronic device, and a storage medium, which can change the posture of a first target limb in an original image through image enhancement so as to expand image sets covering different postures of the first target limb, thereby improving the accuracy of a machine learning system's posture detection. Although the postures of first target limbs are diverse, the probability of each posture occurring, and the probability or difficulty of acquiring an image in which the first target limb has that posture, differ; it is therefore inevitable that the number of images for certain posture categories is small. These rare posture categories, referred to in this disclosure as first preset postures, can have their image counts expanded through image enhancement, so that the machine learning system can recognize the first preset postures of an object more accurately.
In a specific implementation, posture transformation can be performed on a first target limb based on an original first posture of an object, and a plurality of candidate images containing multiple postures can be generated based on the transformed first target limb, so that the first posture of the object in the first image is changed, a new posture is automatically generated for the object, and the posture type of the object is enriched. Further, determining the first enhanced image based on the prediction data of each candidate image can realize the screening of the first enhanced image belonging to the first preset posture in the plurality of candidate images. Here, the first preset posture may be understood as a preset posture in which the number of images in the image set in the preset posture satisfies a preset condition. By enhancing the image set corresponding to the first preset posture, the number of images in the image set corresponding to the first preset posture can be increased, and therefore the learning quality of the machine learning system on the first preset posture is improved.
To facilitate understanding of the present embodiment, first, an image enhancement method disclosed in an embodiment of the present disclosure is described in detail, where an execution subject of the image enhancement method provided in an embodiment of the present disclosure is generally an electronic device with certain computing capability, and the electronic device includes, for example: a terminal device, which may be a User Equipment (UE), a mobile device, a User terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle mounted device, a wearable device, or a server or other processing device. In some possible implementations, the image enhancement method may be implemented by a processor invoking computer readable instructions stored in a memory.
Referring to fig. 1, a flowchart of an image enhancement method provided in an embodiment of the present disclosure is shown, where the method includes steps S101 to S107, where:
s101: the posture of a first target limb of the object in the first image is determined, and a first posture is obtained.
Here, the first image may be an image in a known image set (e.g., a known training set), where each first image in the known image set contains the same object. The first target limb of the object may be understood as at least part of a limb of the object, for example, an unoccluded limb (thigh, calf, forearm). The first target limb may also include other limb parts of the object, such as a hand or the head, which the present disclosure does not specifically limit.
Here, the first pose may be determined from coordinates of limb keypoints of the first target limb. If the first target limb comprises a plurality of sub-limbs, the posture of the first posture comprising each sub-limb can be determined according to the coordinates of the limb key points of each sub-limb.
In an embodiment of the present disclosure, the first target limb may be a whole limb of the subject, and may also be a partial limb of the subject. Based on this, a whole-body posture detection may be performed on the object in each first image, so as to obtain a whole-body posture detection result of the object, where the whole-body posture detection result includes coordinates of all the limb key points of the object. Then, the coordinates of the limb key points of the first target limb are determined in the whole body posture detection result. For example, in the case where the first target limb is a full body limb, the full body posture detection result may be determined as the above-described first posture; in the case where the first target limb is a partial limb, the detection result of the partial limb in the detection result of the whole body posture may be determined as the first posture.
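Selecting the first posture from a whole-body detection result can be sketched as below. The key-point names, the thigh/calf grouping, and the dictionary layout are illustrative assumptions; only the thigh/calf/forearm examples come from the description:

```python
# Sketch: pick the partial-limb detection result out of a whole-body
# posture detection result to form the first posture.
def extract_first_pose(whole_body_result, target_limbs):
    """whole_body_result: {keypoint_name: (x, y)};
    target_limbs: {limb_name: [keypoint_names]}."""
    return {
        limb: [whole_body_result[name] for name in names]
        for limb, names in target_limbs.items()
    }

detection = {"hip": (50, 90), "knee": (52, 140), "ankle": (51, 190), "wrist": (80, 100)}
first_pose = extract_first_pose(
    detection, {"thigh": ["hip", "knee"], "calf": ["knee", "ankle"]}
)
print(first_pose["thigh"])  # [(50, 90), (52, 140)]
```

If the first target limb is the whole-body limb, `target_limbs` simply covers every key point, so the whole-body detection result becomes the first posture.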
S103: performing pose transformation on the first target limb based on the first pose, and determining a plurality of candidate images based on the transformed first target limb; and the postures of the first target limb in the candidate images after transformation are different.
In an embodiment of the present disclosure, the first target limb may be subjected to at least one of the following types of pose transformations based on the first pose: translation, zoom, flip, rotate, and crop. In particular implementations, at least one of the pose transformations described above may be performed on the coordinates of the limb key points of the first target limb based on the first pose.
When the first target limb comprises a plurality of sub-limbs, the sub-limbs can be subjected to posture transformation based on the sub-postures of the sub-limbs, that is, the sub-limbs are subjected to posture transformation based on the coordinates of the limb key points of the sub-limbs, so that the transformed sub-limbs are obtained. The sub-limbs in the first image may then be replaced with transformed sub-limbs, resulting in a candidate image. Here, the above-described operation may be repeatedly performed for each first image, and each operation may result in one candidate image, thereby resulting in a plurality of candidate images. The transformed poses for the first target limb in the different candidate images are not the same, i.e., the transformed poses for the sub-limbs in the different candidate images are not exactly the same.
Here, in the case where the first target limb includes a plurality of sub-limbs, the manner of performing pose transformation on each sub-limb may be completely different, or at least partially different, and this disclosure is not limited thereto in particular. For example, a rotation transformation may be performed for the thigh, and a translation and rotation transformation may be performed for the forearm.
S105: determining prediction data for each of the candidate images; the prediction data is used for indicating the probability that the posture of the first target limb in the corresponding candidate image after transformation is a preset posture.
In the embodiment of the disclosure, the plurality of first images may be subjected to cluster analysis, so that the first images corresponding to the same posture type in the plurality of first images are divided into the same cluster, wherein each cluster corresponds to one posture type. Wherein one gesture type may be used to indicate one preset gesture. At this time, the preset gesture may be determined based on the gesture type indicated by the cluster.
For example, as shown in fig. 2, ellipses 1, 2, and 3 indicate some of the clusters. For ellipse 1, the preset posture corresponding to the cluster is standing; for ellipse 2, the preset posture corresponding to the cluster is lying on the side; for ellipse 3, the preset posture corresponding to the cluster is squatting.
After the preset postures are determined, the probability that the posture of the first target limb in each candidate image after transformation is the preset posture can be determined, and the probability is determined as prediction data. Wherein the greater the probability, the higher the probability that the candidate image belongs to the preset posture.
S107: determining a first enhanced image among the plurality of candidate images based on the prediction data; the first target limb of the object in the first enhanced image is in a first preset posture, and the first preset posture is a preset posture in which the number of images in the image set meets a preset condition.
Here, an image set may be obtained in advance, and may be referred to as an image set a, where the image set a may be a known image set including the first image described above, and may also be a sub-image set in other known image sets, and this disclosure is not limited in this respect. The image set A comprises a plurality of images, and the posture of a first target limb of an object in each image is a preset posture.
Based on this, the first preset posture may be understood as a preset posture in which the number of images in the image set a satisfies a preset condition, for example, the preset posture may be a preset posture in which the number of images in the image set a does not satisfy a number requirement, wherein the number of images is the number of images in the first preset posture in the posture of the first target limb of the object in the image set a. For example, the first preset pose may be understood as a preset pose in which the number of images in the image set a is less than or equal to a first number threshold.
It should be understood that if the number of images in the image set to which the preset posture a belongs is smaller in the plurality of preset postures, it may indicate that the number of images including the object in the preset posture a is smaller, and at this time, the preset posture a may be regarded as the first preset posture.
Here, the image set to which the first preset pose belongs may be a sub-image set of the known image set to which the first image corresponds, that is: the sub-image set is an image set formed by first images corresponding to a first preset posture in the known image set. Besides, the image set to which the first preset pose belongs may also be a sub-image set in other known image sets, which is not specifically limited by the present disclosure.
Here, the first preset posture described above may also be referred to as a rare preset posture, i.e., the number of images in the image set to which it belongs is less than or equal to a preset number threshold; conversely, the remaining postures among the plurality of preset postures other than the first preset posture may be referred to as common postures, i.e., the number of images in the image sets to which they belong is greater than the preset number threshold.
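The rare/common split described above reduces to comparing per-posture image counts against a number threshold. A minimal sketch (the posture labels and the threshold value below are illustrative, not values taken from the disclosure):

```python
from collections import Counter

def split_poses(pose_labels, threshold):
    """Split preset postures into rare (first preset) and common postures by image count."""
    counts = Counter(pose_labels)
    rare = {p for p, n in counts.items() if n <= threshold}
    common = set(counts) - rare
    return rare, common

# hypothetical posture labels of the first images in image set A
labels = ["A", "A", "A", "A", "B", "B", "C"]
rare, common = split_poses(labels, threshold=2)
```

With these counts (A: 4, B: 2, C: 1) and threshold 2, postures B and C would be treated as rare preset postures and A as a common posture.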
In an optional embodiment, in the case that the first target limb includes sub-limbs and the first pose includes the sub-poses of the sub-limbs, as shown in fig. 3, the step S103 of performing pose transformation on the first target limb based on the first pose and determining multiple candidate images based on the transformed first target limb specifically includes the following steps:
step S301, based on the image segmentation result of the first image, segmenting the first image to obtain a segmented image of the sub-limb; wherein the image segmentation result is used for indicating a limb segmentation result of the object;
step S302, posture transformation is carried out on the sub-limbs in the segmentation image based on the sub-postures, and a first image to be fused is obtained after the posture transformation;
step S303, determining the candidate image based on the first image and the first image to be fused.
In the embodiment of the present disclosure, after the first image is acquired, the first image may be input into an image segmentation network, so as to obtain an image segmentation result of the first image; the image segmentation result indicates the segmentation result of each limb of the object in the first image, for example, the segmentation results of the limbs (thigh, calf, upper arm, forearm), head, upper body, lower body, and hands of the object can be obtained. The image segmentation network may be a pre-trained deep learning model, and the present disclosure does not specifically limit the model structure of the deep learning model.
After the image segmentation result is obtained, the limb image of each sub-limb, that is, the segmented image of the sub-limb, can be cropped from the first image based on the image segmentation result. Each sub-limb corresponds to one limb image; for example, the thigh, the calf, the upper arm and the forearm each correspond to one limb image. Here, the limb image of each sub-limb also carries a corresponding image tag, which is used to indicate the limb type of the sub-limb included in the limb image; for example, the image tags may be 00, 01, 10, 11, which respectively indicate that the limb type of the sub-limb in the corresponding limb image is: thigh, calf, upper arm, forearm.
As can be seen from the above description, each sub-limb corresponds to a sub-pose, which can be understood as the coordinates (x, y) of the limb key points of the sub-limb. The determination process of the sub-pose of each sub-limb is as follows:
first, a whole body pose detection may be performed on the object in each first image, so as to obtain a whole body pose detection result of the object, wherein the whole body pose detection result may be understood as coordinates of all the limb key points of the object. Then, the limb key points belonging to each limb in the coordinates of all the limb key points can be determined based on the image segmentation result. Next, each of the sub-limbs may be determined in a plurality of limbs of the subject, and a limb key point of each sub-limb may be obtained, so as to obtain a sub-pose of each sub-limb.
After the sub-poses of the sub-limbs are determined in the manner described above, the corresponding sub-limbs may be pose-transformed based on the sub-poses. Here, the coordinates of the limb key points of the sub-limb may be affine transformed, so as to obtain a transformed limb image (i.e., the first image to be fused).
Here, since each sub-limb includes at least two limb key points, by performing affine transformation on the at least two limb key points, translation, scaling, flipping, rotating, and shearing of a vector between the at least two limb key points can be achieved.
After the first image to be fused of each sub-limb is obtained, the candidate images can be determined based on the first image and the plurality of first images to be fused. In particular, the limb type of the sub-limb included in each first image to be fused may be determined based on the image tag of the first image to be fused, so that the corresponding fusion position is determined in the first image based on the limb type. And then, obtaining a candidate image by using the first image and the first image to be fused according to the fusion position.
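The fusion described above, in which the transformed limb image is pasted back into the base image at a fusion position derived from its image tag, can be sketched minimally as follows; the array sizes, mask, and paste position are hypothetical stand-ins for the values that would come from the segmentation result and image tag:

```python
import numpy as np

def fuse(base, patch, mask, top_left):
    """Paste `patch` into a copy of `base` where `mask` is set, at the fusion position."""
    out = base.copy()
    y, x = top_left
    h, w = patch.shape[:2]
    region = out[y:y + h, x:x + w]   # view into `out`
    region[mask] = patch[mask]       # copy only the limb pixels
    return out

base = np.zeros((6, 6), dtype=np.uint8)            # stand-in for the second image to be fused
patch = np.full((2, 2), 255, dtype=np.uint8)       # stand-in for the first image to be fused
mask = np.array([[True, False], [True, True]])     # limb pixels inside the patch
fused = fuse(base, patch, mask, (1, 1))            # (1, 1) is a hypothetical fusion position
```

Only the masked limb pixels are transferred, so background pixels of the limb image do not overwrite the base image.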
In an optional implementation manner, the step S303 of determining the candidate image based on the first image and the first image to be fused specifically includes the following steps:
s3031, processing the first image based on the image segmentation result to obtain a second image to be fused; the second image to be fused comprises a second target limb of the object, and the second target limb is the residual limb except the sub-limb;
s3032, determining a fusion position of the first image to be fused in the second image to be fused, fusing the first image to be fused and the second image to be fused based on the fusion position, and obtaining the candidate image after fusing.
As can be seen from the above description, the image segmentation result is used to indicate the segmentation result of each limb of the object in the first image, wherein the segmentation result of each limb can be used to indicate the position of each limb in the first image.
On the basis, the image area of each sub limb can be determined in the first image based on the segmentation result of each limb; then, the sub-limbs in the image area are erased, so that the first image (i.e., the second image to be fused) with each sub-limb deleted is obtained.
For example, fig. 4 (a) shows a schematic diagram of the first image, fig. 4 (b) shows the image segmentation result of the first image, and fig. 4 (c) shows the second image to be fused including the second target limb, that is, the second image to be fused with each sub-limb removed.
After the second image to be fused is obtained, the first image to be fused and the second image to be fused may be subjected to image fusion processing, and a candidate image, that is, a candidate image shown in fig. 4 (e) is obtained after fusion, where the first image to be fused may be shown in fig. 4 (d).
In a specific implementation, the limb type of the sub-limb included in the first image to be fused may be determined based on the image tag of the first image to be fused, so that the corresponding fusion position is determined in the second image to be fused based on the limb type. Then, the second image to be fused and the first image to be fused are subjected to fusion processing according to the fusion position, and a candidate image as shown in fig. 4 (e) is obtained.
In the above embodiment, by performing pose transformation on a plurality of sub-limbs of the object in the first image, a new object pose with diversity can be generated, so that candidate images including the object pose with diversity can be obtained, and the probability of screening a new first image meeting requirements from a plurality of candidate images is improved.
In an optional embodiment, in the case that the first pose is coordinates of limb key points, the step S103 of performing pose transformation on the first target limb based on the first pose specifically includes the following steps:
s1031, determining the posture transformation type of the first target limb;
s1032, determining a posture transformation matrix of the first target limb based on the posture transformation type;
and S1033, performing coordinate transformation on the coordinates of the limb key points of the first target limb based on the posture transformation matrix to obtain the transformed first target limb.
Here, the pose transformation may be an affine transformation, and the pose transformation type may then be understood as an affine transformation type. As can be seen from the above description, the affine transformation type includes at least one of: translation, scaling, flipping, rotation, and shearing. At this point, the pose transformation type of the first target limb may be selected from translation, scaling, flipping, rotation, and shearing.
When the first target limb comprises a plurality of sub-limbs, the posture transformation type of each sub-limb can be determined in translation, scaling, overturning, rotating and shearing, wherein the posture transformation types corresponding to different sub-limbs can be the same or different.
In order to improve the accuracy of posture transformation of a plurality of sub-limbs and prevent the posture which does not conform to the physiological structure of the subject from occurring, one or more corresponding transformation types may be preset for each sub-limb in advance. Thereafter, a pose transform type may be randomly determined for the sub-limb among the one or more transform types.
After the posture transformation type of each sub-limb is determined, posture transformation parameters need to be set for each posture transformation type. For example, in the case that the posture transformation type is translation, the translation distance, the translation direction, and the like may be set; in the case that the posture transformation type is rotation, information such as the rotation angle and the rotation direction may be set.
In specific implementation, for each posture transformation type, corresponding transformation constraint information may be preset, where the transformation constraint information of different types of sub-limbs under the posture transformation type may be different or may be the same. That is, transformation constraint information of a sub-limb with respect to each posture transformation type may be set based on a limb type of the sub-limb, thereby preventing a posture that does not conform to a physiological configuration of a subject from occurring under the constraint of the transformation constraint information. Then, under the constraint of the transformation constraint information, corresponding posture transformation parameters can be set for each posture transformation type. For example, the transformation constraint information may be a parameter value interval of the posture transformation parameter, and at this time, a corresponding parameter value may be randomly selected as the posture transformation parameter for the posture transformation type in the parameter value interval.
After determining the posture transformation parameters of each posture transformation type, a posture transformation matrix may be determined based on the posture transformation parameters. When there are multiple posture transformation types, each posture transformation parameter is a parameter matrix, and the multiple parameter matrices may be multiplied to obtain the posture transformation matrix. Then, coordinate transformation is performed on the limb key points of the corresponding sub-limb through the posture transformation matrix, so as to obtain the transformed sub-limb.
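The matrix composition and key-point coordinate transformation described above can be illustrated in homogeneous coordinates; the rotation angle and translation offsets below are arbitrary examples rather than parameters specified by the disclosure:

```python
import numpy as np

def rotation(theta):
    """3x3 homogeneous rotation matrix (one possible posture transformation parameter matrix)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def translation(tx, ty):
    """3x3 homogeneous translation matrix."""
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])

def apply(matrix, points):
    """Apply a 3x3 homogeneous transform to Nx2 limb key-point coordinates."""
    homo = np.hstack([points, np.ones((len(points), 1))])
    return (homo @ matrix.T)[:, :2]

# compose: multiply the parameter matrices to obtain the posture transformation matrix
M = translation(2.0, 0.0) @ rotation(np.pi / 2)
keypoints = np.array([[1.0, 0.0], [0.0, 1.0]])  # hypothetical limb key points
moved = apply(M, keypoints)
```

Here (1, 0) rotates to (0, 1) and then translates to (2, 1), while (0, 1) rotates to (-1, 0) and translates to (1, 0), showing how one matrix encodes the composed transformation.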
In the above embodiment, by performing affine transformation on the first target limb, a more reasonable posture transformation of the first target limb can be realized, so that reasonable and rich limb postures are obtained.
In the embodiment of the present disclosure, after the plurality of candidate images are determined in the above-described manner, the prediction data that the transformed posture of the object in each candidate image is each preset posture may be determined through a Gaussian mixture model, for example, the prediction probability that the transformed posture of the object is each preset posture.
Therefore, before determining the prediction probability that the transformed pose of the object is each preset pose, a Gaussian mixture model needs to be fitted based on the plurality of first images; a clustering result of the plurality of first images is then determined through the fitted Gaussian mixture model, so that the preset postures are determined based on the clustering result.
On this basis, the method provided by the embodiment of the present disclosure further includes, in a case that the number of the first images is multiple, determining the preset posture by:
step S11, clustering the first images based on the first postures of the objects in the first images to obtain a plurality of cluster clusters; each cluster corresponds to one posture type, and different clusters correspond to different posture types;
step S12, determining the posture type corresponding to each cluster as the preset posture.
In the embodiment of the present disclosure, the first poses of the objects in the first images may be processed through a Gaussian mixture model, so that the plurality of first images are clustered, resulting in the plurality of cluster clusters shown in fig. 2.
Here, the above Gaussian mixture model can be determined as follows:
firstly, constructing an initial Gaussian mixture model; then, solving and calculating the initial Gaussian mixture model through a target optimization iterative algorithm and the first posture to obtain a solving result of the initial Gaussian mixture model, wherein the solving result is used for determining model parameters of the initial Gaussian mixture model; finally, the Gaussian mixture model can be determined based on the solution result.
In the embodiment of the present disclosure, the constructed initial Gaussian mixture model contains unknown model parameters. At this time, the unknown model parameters in the initial Gaussian mixture model may be solved based on the first poses, wherein the model parameters include the mean and the variance. Then, the Gaussian mixture model can be determined based on the known model parameters, so that the first images are subjected to cluster analysis through the Gaussian mixture model to obtain a plurality of cluster clusters.
According to the above description, the first pose may be the coordinates of the limb key points; in this case, the initial Gaussian mixture model may be optimally solved through a target optimization iterative algorithm and the coordinates of the limb key points, so as to obtain the known model parameters. Here, the target optimization iterative algorithm may be the Expectation-Maximization (EM) algorithm.
After the model parameters are obtained by solving, the first postures may be clustered through the solved Gaussian mixture model, so as to obtain the probability that the first posture of the object in each first image is each preset posture; the plurality of first images may then be grouped based on these probabilities, resulting in a plurality of cluster clusters. At this time, the preset postures may be determined based on the posture types corresponding to the plurality of cluster clusters. Meanwhile, the prediction data of each candidate image can also be determined through the solved Gaussian mixture model.
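As one possible sketch of the clustering step, `sklearn.mixture.GaussianMixture` fits a Gaussian mixture by EM and exposes both the cluster assignment and the per-component probabilities; the toy pose vectors below are synthetic, and the disclosure does not prescribe a particular library:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# toy "first poses": flattened key-point coordinates forming two well-separated groups
rng = np.random.default_rng(0)
poses = np.vstack([rng.normal(0.0, 0.1, size=(20, 4)),
                   rng.normal(5.0, 0.1, size=(20, 4))])

gmm = GaussianMixture(n_components=2, random_state=0).fit(poses)  # EM fit
clusters = gmm.predict(poses)        # cluster assignment -> preset posture types
probs = gmm.predict_proba(poses)     # per-image probability of each preset posture
```

The same fitted model can then score the candidate images: `gmm.predict_proba` on a transformed pose yields the prediction data described above.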
In an optional embodiment, the step S11 of clustering the plurality of first images based on the first pose of the object in each first image to obtain a plurality of cluster clusters specifically includes the following steps:
step S121, carrying out normalization processing on each first posture to obtain a normalized first posture;
and S122, clustering the plurality of first images based on the normalized first posture to obtain a plurality of cluster clusters.
In the embodiment of the present disclosure, first, a detection box of the object in each first image may be determined, and the target image located within the detection box in the first image may be acquired.
Here, it may first be determined whether the first image in the training set contains a detection box (also called a bounding box) of the object; if so, the image located within the detection box in the first image is acquired to obtain the target image; if not, the first image may be input into a target detection model for detection processing, so as to detect the detection box of the object. The target detection model and the image segmentation network are two neural network models with different functions.
After the image within the detection box (i.e., the target image) is determined, the target image may be normalized to a preset image size (e.g., 256 × 256), and the first pose may be scaled accordingly, resulting in the normalized first pose.
And then, processing the normalized first posture through the Gaussian mixture model, thereby clustering the plurality of first images to obtain a plurality of cluster clusters.
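A minimal sketch of the normalization step, assuming the detection box is given as corner coordinates `(x0, y0, x1, y1)` and the key points are rescaled along with the crop (the box format and the 256 × 256 target size are illustrative):

```python
import numpy as np

def normalize_pose(keypoints, box, size=256):
    """Map key points inside a detection box into a fixed size x size frame."""
    x0, y0, x1, y1 = box
    scale = np.array([size / (x1 - x0), size / (y1 - y0)])
    return (keypoints - np.array([x0, y0])) * scale

# hypothetical limb key points and detection box
kps = np.array([[30.0, 40.0], [70.0, 120.0]])
norm = normalize_pose(kps, box=(20.0, 40.0, 120.0, 240.0))
```

Normalizing all first poses into the same coordinate frame removes scale and position differences between objects before the Gaussian mixture model clusters them.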
In the above embodiment, the size of the target image is normalized, so that the clustering precision of the gaussian mixture model can be improved, and a more accurate clustering result can be obtained. The clustering analysis is carried out on the plurality of first images through the Gaussian mixture model, the image distribution condition of the first images belonging to each preset posture can be obtained, and therefore the preset posture with a small number of corresponding images is determined in each preset posture to serve as the first preset posture (namely, the rare preset posture), and the first image corresponding to the first preset posture can be enhanced.
In an alternative embodiment, as shown in fig. 5, the step S107 of determining a first enhanced image in the candidate images based on the prediction data specifically includes the following steps:
step S21, acquiring the posture weight of each preset posture; the posture weight is used for indicating the proportion, among the plurality of first images, of the first images corresponding to the preset posture;
step S22, performing weighted summation on the prediction data of each candidate image and the posture weights to obtain a weighted summation result of each candidate image;
step S23, determining the candidate image corresponding to a target weighted summation result among the weighted summation results of the candidate images as the first enhanced image; the target weighted summation result is used for indicating that the probability that the posture of the first target limb of the object in the corresponding candidate image is the first preset posture meets the probability requirement.
In the embodiment of the present disclosure, a corresponding posture weight may be set for each preset posture based on the number of first images belonging to that preset posture among the plurality of first images, so that the posture weight indicates the proportion of the first images belonging to the preset posture among the plurality of first images, that is, the proportion of the first images belonging to each preset posture in the total number of the plurality of first images. The higher the proportion, the more common the corresponding preset posture; the lower the proportion, the rarer the corresponding preset posture. Here, the sum of the posture weights of all the preset postures is 1.
After the posture weights are determined, for each candidate image of each first image, the prediction data of the candidate image (i.e., the prediction probabilities that the candidate image belongs to each preset posture) may be weighted and summed with the posture weights corresponding to the preset postures, so as to obtain the weighted summation result of the candidate image.
Then, the candidate image corresponding to the target weighted summation result in the weighted summation results may be determined as the first enhanced image. In the following, several ways of determining a target weighted sum result among a plurality of weighted sum results will be described:
the method I comprises the following steps: a minimum weighted sum result in the plurality of candidate images of the first image may be determined as a target weighted sum result; then, the candidate image corresponding to the minimum weighted sum result is used as a first enhanced image.
Here, the minimum weighted summation result may be used to indicate that the probability that the transformed posture of the first target limb in the corresponding candidate image is a common preset posture is the smallest, or equivalently, that the probability that the transformed posture of the first target limb in the candidate image is the first preset posture is the largest, where the common preset posture is the second preset posture described in the following embodiment.
Mode 2: a weighted summation result smaller than or equal to a preset threshold among the plurality of candidate images of the first image is determined as the target weighted summation result; then, the candidate image corresponding to that weighted summation result is used as the first enhanced image.
Based on the screening principle described in the first mode, a preset threshold may be set in advance, and candidate images meeting the probability requirement are screened out from the multiple candidate images as first enhanced images through the preset threshold.
In the above embodiment, the preset posture with a small number of corresponding images can be determined as the first preset posture in each preset posture, so that the enhancement processing of the first image corresponding to the first preset posture can be realized.
In an alternative embodiment, as shown in fig. 6, the step S107 of determining a first enhanced image in the candidate images based on the prediction data specifically includes the following steps:
step S31, determining a target candidate image among the plurality of candidate images based on the prediction data; the prediction probability that the transformed posture of the first target limb in the target candidate image is a second preset posture is greater than or equal to a preset probability threshold;
step S32, determining the first enhanced image among the candidate images other than the target candidate image in the plurality of candidate images.
Here, the second preset posture may be understood as a preset posture, among the plurality of preset postures, for which the number of images in the image set is greater than or equal to a second number threshold, where the first number threshold is less than the second number threshold. That is, the second preset posture may be the preset posture whose image set contains the largest number of images, or whose number of images exceeds the second number threshold; relative to the first preset posture, the second preset posture may also be referred to as a common preset posture.
Based on this, it may be determined whether the transformed pose of the object in each candidate image is the first preset pose based on the probability that the transformed pose of the object in the candidate image is the second preset pose. And if the probability is determined to be greater than or equal to the preset probability threshold, determining that the transformation posture of the object in the candidate image is not the first preset posture but the second preset posture. For example, the preset probability threshold may be set to 0.5, and in a case where the predicted probability of determining that the transformed posture of the object in the candidate image is the second preset posture is greater than or equal to 0.5, it is determined that the transformed posture of the object in the candidate image is not the first preset posture.
In the embodiment of the disclosure, the probability that the posture of the first target limb after transformation in each candidate image is the second preset posture can be determined, so as to obtain a plurality of probabilities; then, an image having a probability greater than or equal to a preset probability threshold among the plurality of probabilities is determined as a target candidate image. Then, a first enhanced image is determined in a candidate image other than the target candidate image among the plurality of candidate images.
In specific implementation, the weighted summation result between the prediction data and the posture weights of each remaining candidate image may be calculated through the above steps S21 to S23; then, the remaining candidate image corresponding to the minimum weighted summation result is determined as the first enhanced image, and/or the remaining candidate images corresponding to weighted summation results smaller than or equal to a preset threshold are determined as the first enhanced image. Alternatively, the first enhanced image may be determined directly from the candidate images other than the target candidate image among the plurality of candidate images.
In the above embodiment, the preset posture with a small number of corresponding images can be determined as the first preset posture in each preset posture, so that the image set corresponding to the first preset posture can be enhanced.
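The screening by the prediction probability of the second preset posture can be sketched as follows; the probability list is hypothetical, and the 0.5 threshold matches the example value mentioned above:

```python
def filter_target_candidates(common_pose_probs, prob_threshold=0.5):
    """Discard target candidate images (probability of the common, second preset
    posture >= threshold) and keep the indices of the remaining candidates."""
    return [i for i, p in enumerate(common_pose_probs) if p < prob_threshold]

# hypothetical probabilities that each candidate's transformed posture is the second preset posture
remaining = filter_target_candidates([0.8, 0.3, 0.55, 0.1])
```

Candidates 0 and 2 exceed the threshold and are treated as target candidate images; the first enhanced image is then determined among candidates 1 and 3.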
The following takes the case where the image set is a training set as an example, and introduces the image enhancement method provided by the embodiment of the present disclosure with reference to fig. 7.
Firstly, a training set to be enhanced is obtained, wherein the training set to be enhanced comprises a plurality of first images. Then, a whole-body posture detection is performed on the object in each first image, resulting in a whole-body posture detection result, wherein the whole-body posture detection result can be understood as coordinates of all the limb key points of the object, for example, coordinates (x, y) as shown in fig. 7. Then, image segmentation processing can be carried out on the first image to obtain an image segmentation result of the first image; wherein the image segmentation result is used to indicate the limb segmentation result of the object, e.g. to indicate the position of each limb of the object in the first image.
Next, a Pose Transformation Module (PTM) may be used to perform Pose Transformation on the first target limb of the object in each first image, so as to obtain a plurality of candidate images of the first image, where the specific process is described as follows:
a pose of a first target limb of the subject in the first image may be determined based on the whole-body pose detection result, resulting in a first pose. If the first target limb is a limb of the subject, the pose of the first target limb may be the coordinates of a limb keypoint of the limb. Then, based on the image segmentation result of the first image, segmenting the first image to obtain a segmentation image of each sub-limb; secondly, posture transformation is carried out on the sub-limbs in the corresponding segmented image based on the sub-posture of each sub-limb in the first posture, and a first image to be fused is obtained after the posture transformation; then, processing the first image based on the image segmentation result to obtain a second image to be fused; and finally, determining the fusion position of each first image to be fused in the second image to be fused, fusing the first image to be fused and the second image to be fused based on the fusion position, and obtaining the candidate image after fusion.
For example, as shown in fig. 7, for any one of the first images, three candidate images may be obtained, where the poses of the objects in the three candidate images may be represented by (x 1, y 1), (x 2, y 2), and (x 3, y 3), respectively.
After a plurality of candidate images are determined, a first enhanced image can be determined among the plurality of candidate images through a Pose Clustering Module (PCM for short), and the specific process is described as follows:
Firstly, the prediction data that the transformed posture of the object in each candidate image is each preset posture is determined through the Gaussian mixture model. Assuming that the preset postures comprise a preset posture A, a preset posture B and a preset posture C, the prediction probabilities that the transformed posture of the object in each candidate image is the preset posture A, the preset posture B and the preset posture C can be determined. Then, the posture weights of the preset posture A, the preset posture B and the preset posture C are acquired, wherein the posture weight is used for indicating the proportion, among the plurality of first images, of the first images corresponding to the preset posture. Next, the prediction data of each candidate image and the posture weights are weighted and summed to obtain the weighted summation result of each candidate image. Finally, the candidate image corresponding to the minimum weighted summation result is determined as the first enhanced image, and/or the candidate image corresponding to a weighted summation result smaller than or equal to a preset threshold is determined as the first enhanced image.
For example, as shown in fig. 7, it is assumed that the probability that the transformed posture of the object in the candidate image 1 belongs to the preset posture a is 0.5, the probability that the object belongs to the preset posture B is 0.3, and the probability that the object belongs to the preset posture C is 0.2. The probability that the transformation posture of the object in the candidate image 2 belongs to the preset posture A is 0.3, the probability that the transformation posture belongs to the preset posture B is 0.4, and the probability that the transformation posture belongs to the preset posture C is 0.3. The probability that the transformation posture of the object in the candidate image 3 belongs to the preset posture a is 0.6, the probability that the transformation posture belongs to the preset posture B is 0.3, and the probability that the transformation posture belongs to the preset posture C is 0.1.
Assume that the posture weight of the preset posture A is 0.7, the posture weight of the preset posture B is 0.2, and the posture weight of the preset posture C is 0.1. From the weighted summation results, the weighted summation result between the prediction data of the candidate image 2 and the posture weights is the smallest; therefore, the candidate image 2 is the first enhanced image.
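The weighted summation in this example can be checked numerically; the sketch below reproduces the prediction probabilities and posture weights given above:

```python
import numpy as np

# prediction probabilities of candidate images 1-3 for preset postures A, B, C
pred = np.array([[0.5, 0.3, 0.2],
                 [0.3, 0.4, 0.3],
                 [0.6, 0.3, 0.1]])
weights = np.array([0.7, 0.2, 0.1])  # posture weights of A, B, C

scores = pred @ weights              # weighted summation result per candidate
best = int(np.argmin(scores))        # index of the first enhanced image
```

The weighted summation results are 0.43, 0.32 and 0.49 respectively, so candidate image 2 (index 1) has the minimum result, consistent with the conclusion above.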
It will be understood by those skilled in the art that in the method of the present invention, the order of writing the steps does not imply a strict order of execution and any limitations on the implementation, and the specific order of execution of the steps should be determined by their function and possible inherent logic.
Based on the same inventive concept, the embodiment of the present disclosure further provides a first image enhancement apparatus corresponding to the image enhancement method, and since the principle of solving the problem of the apparatus in the embodiment of the present disclosure is similar to the image enhancement method in the embodiment of the present disclosure, the implementation of the apparatus may refer to the implementation of the method, and the repeated parts are not described again.
Referring to fig. 8, a schematic diagram of an image enhancement apparatus provided in an embodiment of the present disclosure is shown. The apparatus includes a first determining unit 10, a posture transformation unit 20, a second determining unit 30, and a third determining unit 40, wherein:
a first determining unit 10, configured to determine a posture of a first target limb of an object in a first image, to obtain a first posture;
a posture transformation unit 20, configured to perform posture transformation on the first target limb based on the first posture, and determine a plurality of candidate images based on the transformed first target limb; the postures of the first target limb in the candidate images after transformation are different;
a second determining unit 30 for determining prediction data of each of the candidate images; the prediction data is used for indicating the probability that the posture of the first target limb in the corresponding candidate image after transformation is a preset posture;
a third determining unit 40, configured to determine a first enhanced image among the plurality of candidate images based on the prediction data; the first target limb of the object in the first enhanced image is in a first preset posture, and the first preset posture is a preset posture for which the number of images in the corresponding image set meets a preset condition.
In the above embodiment, performing pose transformation on the first target limb based on the first pose and generating a plurality of candidate images from the transformed limb changes the first pose of the object in the first image, automatically generating new poses for the object; this enriches the pose types of the object and yields candidate images covering multiple poses. Determining the first enhanced image based on the prediction data of each candidate image makes it possible to screen, from the plurality of candidate images, the first enhanced image belonging to the first preset posture. Here, the first preset posture can be understood as a preset posture whose image set does not yet contain the required number of images. Enhancing the image set corresponding to the first preset posture increases its number of images, thereby improving the quality with which a machine learning system learns the first preset posture.
In one possible embodiment, the posture transformation unit 20 is further configured to: segment the first image based on an image segmentation result of the first image to obtain a segmented image of the sub-limb, wherein the image segmentation result is used to indicate a limb segmentation result of the object; perform posture transformation on the sub-limb in the segmented image based on the sub-posture to obtain a posture-transformed first image to be fused; and determine the candidate image based on the first image and the first image to be fused.
In one possible embodiment, the posture transformation unit 20 is further configured to: process the first image based on the image segmentation result to obtain a second image to be fused, wherein the second image to be fused includes a second target limb of the object, the second target limb being the remaining limbs other than the sub-limb; and determine a fusion position of each first image to be fused in the second image to be fused, and fuse the first image to be fused and the second image to be fused based on the fusion position to obtain the candidate image.
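The segment-transform-fuse flow of the two embodiments above can be sketched as follows. This is a minimal NumPy illustration, not the patented implementation: the function name `fuse_limb`, the boolean-mask representation of the segmentation result, and the zero-filled background are all assumptions:

```python
import numpy as np

def fuse_limb(first_image, limb_mask, transform):
    """Cut the sub-limb out of the first image, transform it, and fuse it
    back over the remaining limbs to produce a candidate image.

    first_image: HxWx3 array; limb_mask: HxW boolean mask of the sub-limb
    (the image segmentation result); transform: a callable mapping the
    cropped limb patch to its pose-transformed version.
    """
    # First image to be fused: the sub-limb isolated along the mask.
    limb_patch = np.where(limb_mask[..., None], first_image, 0)
    transformed_patch = transform(limb_patch)

    # Second image to be fused: the remaining limbs, with the sub-limb removed.
    background = np.where(limb_mask[..., None], 0, first_image)

    # Fuse at the determined position: transformed pixels overwrite the hole.
    return np.where(transformed_patch > 0, transformed_patch, background)
```

With an identity `transform`, the fusion reconstructs the original image, which is a convenient sanity check for the pipeline.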
In one possible embodiment, the posture transformation unit 20 is further configured to: determine a posture transformation type of the first target limb; determine a posture transformation matrix for the first target limb based on the posture transformation type; and perform coordinate transformation on the coordinates of the limb key points of the first target limb based on the posture transformation matrix to obtain the transformed first target limb.
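One concrete instance of such a posture transformation matrix is a 2x2 rotation applied to limb key-point coordinates about a pivot joint. The choice of rotation, the function name and the array shapes below are illustrative assumptions; the disclosure does not fix a particular transformation type:

```python
import numpy as np

def rotate_keypoints(keypoints, pivot, angle_deg):
    """Rotate limb key-point coordinates about a pivot joint.

    keypoints: (K, 2) array of (x, y) coordinates; pivot: (2,) array,
    e.g. the shoulder joint when rotating an arm; angle_deg: rotation angle.
    """
    theta = np.deg2rad(angle_deg)
    # The posture transformation matrix: a standard 2D rotation matrix.
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    # Shift to the pivot, rotate, and shift back.
    return (keypoints - pivot) @ rot.T + pivot
```

For example, rotating the point (2, 0) by 90 degrees about the origin yields (0, 2).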
In one possible embodiment, in the case of a plurality of first images, the apparatus is further configured to determine the preset postures by: clustering the plurality of first images based on the first posture of the object in each first image to obtain a plurality of clusters, where each cluster corresponds to one posture type and different clusters correspond to different posture types; and determining the posture type corresponding to each cluster as a preset posture.
In one possible embodiment, the apparatus is further configured to: normalizing each first attitude to obtain a normalized first attitude; and clustering the plurality of first images based on the normalized first posture to obtain a plurality of cluster clusters.
In one possible embodiment, the apparatus is further configured to: determining a detection frame of an object in each first image, and acquiring a target image positioned in the detection frame in the first image; and normalizing the image size of the target image to obtain the normalized first posture.
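The normalization and clustering described in the embodiments above can be sketched together as follows. A minimal NumPy k-means stands in for whatever clustering algorithm an implementation would actually use; the function names, array shapes and the deterministic initialization are all assumptions:

```python
import numpy as np

def normalize_poses(poses, boxes):
    """Normalize limb key points by each image's detection box so that the
    object's position and scale do not influence the clustering.

    poses: (N, K, 2) key-point coordinates per first image;
    boxes: (N, 4) detection boxes as (x0, y0, x1, y1).
    """
    x0y0 = boxes[:, None, :2]
    wh = boxes[:, None, 2:] - x0y0
    # Coordinates relative to the box, scaled to [0, 1], flattened per image.
    return ((poses - x0y0) / wh).reshape(len(poses), -1)

def cluster_poses(features, k, iters=20):
    """Minimal k-means; each resulting cluster index stands for one
    preset posture type."""
    centroids = features[:k].copy()  # deterministic init, for the sketch only
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iters):
        dists = ((features[:, None] - centroids) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = features[labels == j].mean(axis=0)
    return labels
```

Because of the normalization, the same posture at different scales or positions maps to the same feature vector and lands in the same cluster.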
In one possible embodiment, the third determining unit is configured to: acquire the posture weight of each preset posture, the posture weight indicating the proportion, among the plurality of first images, of first images corresponding to that preset posture; perform weighted summation on the prediction data of each candidate image and the posture weights to obtain a weighted summation result for each candidate image; and determine the candidate image corresponding to a target weighted summation result as the first enhanced image, the target weighted summation result indicating that the probability that the first target limb of the object in the corresponding candidate image is in the first preset posture meets the probability requirement.
In one possible embodiment, the third determining unit is configured to: determining a target candidate image among the plurality of candidate images based on the prediction data; the probability that the transformation posture of the object in the target candidate image is a second preset posture is greater than or equal to a preset probability threshold value, and the second preset posture is a preset posture that the number of images in the image set meets the number requirement; determining the first enhanced image in a candidate image of the plurality of candidate images other than the target candidate image.
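The screening rule of this embodiment can be sketched as follows, reusing the fig. 7 probabilities for illustration. The threshold value and the assumption that preset posture A is a second preset posture (already well represented) are hypothetical:

```python
import numpy as np

# Probabilities that each candidate's transformed posture belongs to
# preset postures A, B and C (the fig. 7 example values).
predictions = np.array([
    [0.5, 0.3, 0.2],  # candidate image 1
    [0.3, 0.4, 0.3],  # candidate image 2
    [0.6, 0.3, 0.1],  # candidate image 3
])

second_preset = 0   # assume posture A already has enough images
threshold = 0.5     # preset probability threshold (hypothetical value)

# Target candidate images are those likely to show the well-represented
# posture; they are excluded, and the first enhanced image is chosen
# from the remainder.
keep = predictions[:, second_preset] < threshold
remaining = np.flatnonzero(keep)   # indices of the surviving candidates
```

Here candidates 1 and 3 (probabilities 0.5 and 0.6 for posture A) are excluded, leaving candidate image 2.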
The description of the processing flow of each module in the device and the interaction flow between the modules may refer to the related description in the above method embodiments, and will not be described in detail here.
Corresponding to the image enhancement method in fig. 1, an embodiment of the present disclosure further provides an electronic device 900, as shown in fig. 9, a schematic structural diagram of the electronic device 900 provided in the embodiment of the present disclosure includes:
a processor 91, a memory 92, and a bus 93. The memory 92 is used for storing execution instructions and includes an internal memory 921 and an external memory 922. The internal memory 921 temporarily stores operation data for the processor 91 as well as data exchanged with the external memory 922, such as a hard disk; the processor 91 exchanges data with the external memory 922 through the internal memory 921. When the electronic device 900 operates, the processor 91 communicates with the memory 92 through the bus 93, so that the processor 91 executes the following instructions:
determining the posture of a first target limb of an object in a first image to obtain a first posture;
performing pose transformation on the first target limb based on the first pose, and determining a plurality of candidate images based on the transformed first target limb; the postures of the first target limb in the candidate images after transformation are different;
determining prediction data for each of the candidate images; the prediction data is used for indicating the probability that the posture of the first target limb in the corresponding candidate image after transformation is a preset posture;
determining a first enhanced image among the plurality of candidate images based on the prediction data; the first target limb of the object in the first enhanced image is in a first preset posture, and the first preset posture is a preset posture for which the number of images in the corresponding image set meets a preset condition.
The embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the image enhancement method described in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The embodiments of the present disclosure also provide a computer program product, where the computer program product carries a program code, and instructions included in the program code may be used to execute the steps of the image enhancement method in the foregoing method embodiments, which may be referred to specifically in the foregoing method embodiments, and are not described herein again.
The computer program product may be implemented by hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium; in another alternative embodiment, the computer program product is embodied in a software product, such as a Software Development Kit (SDK).
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such an understanding, the technical solution of the present disclosure, in essence, or the part thereof contributing to the prior art, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing an electronic device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
Finally, it should be noted that the above-mentioned embodiments are merely specific embodiments of the present disclosure, used for illustrating its technical solutions rather than limiting them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person familiar with the technical field can still, within the technical scope of the present disclosure, modify the technical solutions described in the foregoing embodiments or readily conceive of changes, or make equivalent substitutions for some of their technical features; such modifications, changes or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present disclosure, and shall all be covered within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (12)

1. An image enhancement method, comprising:
determining the posture of a first target limb of an object in a first image to obtain a first posture;
performing pose transformation on the first target limb based on the first pose, and determining a plurality of candidate images based on the transformed first target limb; the postures of the first target limb in the candidate images after transformation are different;
determining prediction data for each of the candidate images; the prediction data is used for indicating the probability that the posture of the first target limb in the corresponding candidate image after transformation is a preset posture;
determining a first enhanced image among a plurality of the candidate images based on the prediction data; wherein the first target limb of the object in the first enhanced image is in a first preset posture, and the first preset posture is a preset posture for which the number of images in the corresponding image set meets a preset condition.
2. The method of claim 1, wherein the first target limb comprises a sub-limb, and the first pose comprises a sub-pose of the sub-limb;
the pose transforming the first target limb based on the first pose and determining a plurality of candidate images based on the transformed first target limb comprises:
based on the image segmentation result of the first image, segmenting the first image to obtain a segmented image of the sub-limb; wherein the image segmentation result is used to indicate a limb segmentation result of the subject;
performing posture transformation on the sub-limbs in the segmented image based on the sub-postures to obtain a first image to be fused after the posture transformation;
determining the candidate image based on the first image and the first image to be fused.
3. The method according to claim 2, wherein the determining the candidate image based on the first image and the first image to be fused comprises:
processing the first image based on the image segmentation result to obtain a second image to be fused; the second image to be fused comprises a second target limb of the object, and the second target limb is the residual limb except the sub-limb;
and determining the fusion position of each first image to be fused in the second image to be fused, fusing the first image to be fused and the second image to be fused based on the fusion position, and obtaining the candidate image after fusion.
4. The method of any one of claims 1 to 3, wherein the first pose is a coordinate of a limb keypoint; the gesture transforming the first target limb based on the first gesture includes:
determining a gesture transformation type of the first target limb;
determining a pose transformation matrix for the first target limb based on the pose transformation type;
and carrying out coordinate transformation on the coordinates of the limb key points of the first target limb based on the posture transformation matrix to obtain the transformed first target limb.
5. The method according to any one of claims 1 to 4, wherein there are a plurality of first images, and the method comprises determining the preset posture by:
clustering the plurality of first images based on the first postures of the objects in the first images to obtain a plurality of cluster clusters; each cluster corresponds to one attitude type, and different clusters correspond to different attitude types;
and determining the posture type corresponding to each cluster as the preset posture.
6. The method of claim 5, wherein clustering the plurality of first images based on the first pose of the object in each of the first images to obtain a plurality of clusters comprises:
normalizing each first attitude to obtain a normalized first attitude;
and clustering the plurality of first images based on the normalized first posture to obtain a plurality of cluster clusters.
7. The method of claim 6, wherein the normalizing each of the first poses to obtain a normalized first pose comprises:
determining a detection frame of an object in each first image, and acquiring a target image positioned in the detection frame in the first image;
and normalizing the image size of the target image to obtain the normalized first posture.
8. The method according to any of claims 1 to 7, wherein said determining a first enhanced image among a plurality of said candidate images based on said prediction data comprises:
acquiring the attitude weight of each preset attitude; the attitude weight is used for indicating the image quantity proportion of a first image corresponding to a preset attitude in the plurality of first images;
carrying out weighted summation on the prediction data of each candidate image and the attitude weight to obtain a weighted summation result of each candidate image;
determining a candidate image corresponding to a target weighted summation result in the weighted summation results of the candidate images as the first enhanced image; the target weighted sum result is used for indicating that the probability that the first target limb of the object in the corresponding candidate image is the first preset posture meets the probability requirement.
9. The method according to any of claims 1 to 8, wherein said determining a first enhanced image among said plurality of candidate images based on said prediction data comprises:
determining a target candidate image among the plurality of candidate images based on the prediction data; the probability that the transformation posture of the object in the target candidate image is a second preset posture is greater than or equal to a preset probability threshold value, and the second preset posture is a preset posture that the number of images in the image set meets the number requirement;
determining the first enhanced image in a candidate image of the plurality of candidate images other than the target candidate image.
10. An image enhancement apparatus, comprising:
the first determining unit is used for determining the posture of a first target limb of the object in the first image to obtain a first posture;
the posture transformation unit is used for carrying out posture transformation on the first target limb based on the first posture to obtain a transformed first target limb; and determining a plurality of candidate images based on the transformed first target limb; the postures of the first target limb in the candidate images after transformation are different;
a second determination unit configured to determine prediction data of each of the candidate images; the prediction data is used for indicating the probability that the posture of the first target limb in the corresponding candidate image after transformation is a preset posture;
a third determination unit configured to determine a first enhanced image among the plurality of candidate images based on the prediction data; wherein the first target limb of the object in the first enhanced image is in a first preset posture, and the first preset posture is a preset posture for which the number of images in the corresponding image set meets a preset condition.
11. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is operating, the machine-readable instructions when executed by the processor performing the steps of the image enhancement method according to any one of claims 1 to 9.
12. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the image enhancement method according to one of claims 1 to 9.
CN202210872483.9A 2022-07-20 2022-07-20 Image enhancement method, device, electronic equipment and storage medium Active CN115147323B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210872483.9A CN115147323B (en) 2022-07-20 2022-07-20 Image enhancement method, device, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN115147323A true CN115147323A (en) 2022-10-04
CN115147323B CN115147323B (en) 2024-07-12

Family

ID=83415082


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170148155A1 (en) * 2015-11-20 2017-05-25 Magic Leap, Inc. Methods and Systems for Large-Scale Determination of RGBD Camera Poses
CN110096925A (en) * 2018-01-30 2019-08-06 普天信息技术有限公司 Enhancement Method, acquisition methods and the device of Facial Expression Image
CN111882492A (en) * 2020-06-18 2020-11-03 天津中科智能识别产业技术研究院有限公司 Method for automatically enhancing image data
CN112138394A (en) * 2020-10-16 2020-12-29 腾讯科技(深圳)有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN113673517A (en) * 2021-08-23 2021-11-19 平安科技(深圳)有限公司 Data enhancement method, device, equipment and medium based on attitude detection


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
OU, PAN; WEI, QINGFENG; CHEN, MORAN: "Research on hand posture recognition based on heat maps", Application Research of Computers, no. 1, 30 June 2020 (2020-06-30) *


