WO2021068589A1 - Method and apparatus for determining object and key points thereof in image - Google Patents


Info

Publication number
WO2021068589A1
Authority
WO
WIPO (PCT)
Prior art keywords
target object
image
detection
position information
loss
Prior art date
Application number
PCT/CN2020/102518
Other languages
French (fr)
Chinese (zh)
Inventor
周婷
吕晋
周伟杰
Original Assignee
东软睿驰汽车技术(沈阳)有限公司
Priority date
Filing date
Publication date
Application filed by 东软睿驰汽车技术(沈阳)有限公司
Publication of WO2021068589A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/46 - Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Definitions

  • This application claims priority to the Chinese patent application filed with the Chinese Patent Office on October 9, 2019, with application number CN201910954556.7 and titled "A method and device for determining an object and its key points in an image", the entire content of which is incorporated herein by reference.
  • This application relates to the field of image processing technology, and in particular to a method and device for determining objects and their key points in an image.
  • A vehicle's fatigue driving detection system is usually used to determine whether the driver is in a fatigued driving state, so that corresponding reminder measures can be taken when it is determined that the driver is driving while fatigued.
  • The mental state of the driver is determined from the detection result.
  • The present application provides a method and device for determining an object and its key points in an image, which can effectively determine the positions of the object and its key points in the image, so as to accurately detect the objects and their key points in the image.
  • the embodiment of the present application provides a method for determining an object and its key points in an image, including:
  • According to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information of the target object are determined; wherein the key points of the target object are used to characterize the structural features of the target object.
  • Determining the position information of the target object on the image to be detected and the key point position information of the target object according to the image features corresponding to the image to be detected specifically includes:
  • determining the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and determining the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
  • According to the image features corresponding to the image to be detected, a pre-built first detection model is used to detect the target object and its key points, and the position information of the target object on the image to be detected and the key point position information of the target object are determined.
  • The first detection model includes a first object position detection network layer and a first key point position detection network layer.
  • The output result of the first object position detection network layer is part of the input data of the first key point position detection network layer.
  • The first object position detection network layer is used to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected.
  • The first key point position detection network layer is used to determine the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
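As a non-authoritative illustration of this architecture (not the application's actual network), the data flow of the first detection model can be sketched as follows; the backbone, both heads, and the number of key points are placeholder assumptions:

```python
import numpy as np

def extract_features(image):
    # Placeholder backbone: collapse the channel axis into one feature map.
    return image.mean(axis=-1)

def object_position_head(features):
    # Placeholder for the first object position detection network layer:
    # predicts a box (x, y, w, h) from the shared image features.
    h, w = features.shape
    return (w / 2.0, h / 2.0, w / 4.0, h / 4.0)

def keypoint_head(features, box):
    # Placeholder for the first key point position detection network layer.
    # Note it receives BOTH the shared features and the detected box, so no
    # second feature extraction is needed.
    x, y, bw, bh = box
    return [(x + f * bw, y) for f in (-0.5, -0.25, 0.0, 0.25, 0.5)]

image = np.zeros((64, 64, 3))
features = extract_features(image)        # the only feature extraction
box = object_position_head(features)      # position of the target object
keypoints = keypoint_head(features, box)  # key points of the target object
```

The key property sketched here is that both heads share one set of image features, matching the single-extraction pipeline described in this application.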
  • The training process of the first detection model specifically includes:
  • according to the image features corresponding to the training image, using the first detection model to detect the target object and its key points, and determining the predicted position information of the target object on the training image and the predicted key point position information of the target object;
  • judging whether the first detection model meets a first preset condition according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object, and the predicted key point position information of the target object;
  • if the first detection model does not meet the first preset condition, updating the first detection model, and continuing to execute "using the first detection model to detect the target object and its key points according to the image features corresponding to the training image, and determining the predicted position information of the target object and the predicted key point position information of the target object".
  • The first preset condition is: the rate of change of the detection loss of the first detection model is lower than a first preset loss threshold, and/or the number of training rounds of the first detection model reaches a first preset number-of-rounds threshold.
  • When the first preset condition includes that the rate of change of the detection loss of the first detection model is lower than the first preset loss threshold,
  • judging whether the first detection model meets the first preset condition according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object, and the predicted key point position information of the target object specifically includes:
  • determining the detection loss of the target object, where the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
  • determining the detection loss change rate of the first detection model according to the detection loss of the target object and the historical detection loss of the first detection model, where the historical detection loss of the first detection model is the detection loss of the first detection model determined in the historical training process;
  • if the detection loss change rate of the first detection model is lower than the first preset loss threshold, determining that the first detection model meets the first preset condition;
  • if the detection loss change rate of the first detection model is not lower than the first preset loss threshold, determining that the first detection model does not meet the first preset condition.
  • When the first preset condition includes both the loss threshold and the number-of-rounds threshold, judging whether the first detection model meets the first preset condition according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object, and the predicted key point position information of the target object specifically includes:
  • determining the detection loss of the target object, where the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
  • determining the detection loss change rate of the first detection model, where the historical detection loss of the first detection model is the detection loss of the first detection model determined in the historical training process;
  • if the detection loss change rate of the first detection model is lower than the first preset loss threshold, or the number of training rounds of the first detection model reaches the first preset number-of-rounds threshold, determining that the first detection model meets the first preset condition;
  • if the detection loss change rate of the first detection model is not lower than the first preset loss threshold, and the number of training rounds of the first detection model does not reach the first preset number-of-rounds threshold, determining that the first detection model does not meet the first preset condition.
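The stopping criteria above can be sketched in code; the threshold values and the relative definition of the loss change rate used here are assumptions for illustration, not fixed by the application:

```python
def detection_loss_change_rate(current_loss, historical_loss):
    # Change of the detection loss relative to the historical detection loss
    # (the loss determined in the previous training round).
    return abs(current_loss - historical_loss) / historical_loss

def meets_first_preset_condition(current_loss, historical_loss, rounds,
                                 loss_threshold=0.01, max_rounds=100):
    # "and/or" variant: the model meets the condition when the loss change
    # rate falls below the threshold OR enough training rounds have run.
    rate = detection_loss_change_rate(current_loss, historical_loss)
    return rate < loss_threshold or rounds >= max_rounds
```

With a loss that has nearly plateaued (0.1005 → 0.1000), the change rate is about 0.005, so training stops even before the round limit is reached.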
  • Determining the position information of the target object on the image to be detected and the key point position information of the target object according to the image features corresponding to the image to be detected specifically includes:
  • determining, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information in each image sub-region on the image to be detected, and determining the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image sub-region on the image to be detected; the image sub-region is the region of interest corresponding to an anchor.
  • Determining the position information of the target object on the image to be detected and the key point position information in each image sub-region on the image to be detected according to the image features corresponding to the image to be detected, and determining the key point position information of the target object based on the position information of the target object on the image to be detected and the key point position information in each image sub-region, specifically includes:
  • According to the image features corresponding to the image to be detected, a pre-built second detection model is used to detect the target object and its key points, and the position information of the target object on the image to be detected and the key point position information of the target object are determined.
  • The second detection model includes a second object position detection network layer, a second key point position detection network layer, and an object key point position determination network layer. The output result of the second object position detection network layer and the output result of the second key point position detection network layer are the input data of the object key point position determination network layer. The second object position detection network layer is used to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; the second key point position detection network layer is used to determine the key point position information in each image sub-region on the image to be detected according to the image features corresponding to the image to be detected; and the object key point position determination network layer is used to determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image sub-region on the image to be detected.
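As a rough, non-authoritative sketch of what the object key point position determination network layer could compute, the following picks the key points of the image sub-region (anchor roi) that best overlaps the detected object box; this overlap-based selection rule is an assumption for illustration, not the application's stated method:

```python
def iou(a, b):
    # Boxes given as (x, y, w, h) with (x, y) the centre-point coordinates.
    ax0, ay0, ax1, ay1 = a[0] - a[2]/2, a[1] - a[3]/2, a[0] + a[2]/2, a[1] + a[3]/2
    bx0, by0, bx1, by1 = b[0] - b[2]/2, b[1] - b[3]/2, b[0] + b[2]/2, b[1] + b[3]/2
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def determine_object_keypoints(object_box, subregion_rois, subregion_keypoints):
    # Combine the outputs of the two detection layers: keep the key points
    # predicted for the sub-region that best overlaps the object box.
    best = max(range(len(subregion_rois)),
               key=lambda i: iou(object_box, subregion_rois[i]))
    return subregion_keypoints[best]

rois = [(10, 10, 10, 10), (30, 30, 10, 10)]   # one roi per anchor
keypoints_per_roi = [[(9, 9)], [(29, 29)]]    # key points in each sub-region
face_keypoints = determine_object_keypoints((31, 31, 12, 12),
                                            rois, keypoints_per_roi)
```

Here the detected box overlaps only the second anchor's roi, so that sub-region's key points are kept as the object's key points.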
  • The training process of the second detection model specifically includes:
  • according to the image features corresponding to the training image, using the second detection model to detect the target object and its key points, and determining the predicted position information of the target object on the training image and the predicted key point position information of the target object;
  • judging whether the second detection model meets a second preset condition according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object, and the predicted key point position information of the target object;
  • if the second detection model does not meet the second preset condition, updating the second detection model, and continuing to execute "using the second detection model to detect the target object and its key points according to the image features corresponding to the training image, and determining the predicted position information of the target object and the predicted key point position information of the target object".
  • The second preset condition is: the rate of change of the detection loss of the second detection model is lower than a second preset loss threshold, and/or the number of training rounds of the second detection model reaches a second preset number-of-rounds threshold.
  • When the second preset condition includes that the rate of change of the detection loss of the second detection model is lower than the second preset loss threshold,
  • judging whether the second detection model meets the second preset condition according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object, and the predicted key point position information of the target object specifically includes:
  • determining the detection loss of the target object, where the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
  • determining the detection loss change rate of the second detection model according to the detection loss of the target object and the historical detection loss of the second detection model, where the historical detection loss of the second detection model is the detection loss of the second detection model determined in the historical training process;
  • if the detection loss change rate of the second detection model is lower than the second preset loss threshold, determining that the second detection model meets the second preset condition;
  • if the detection loss change rate of the second detection model is not lower than the second preset loss threshold, determining that the second detection model does not meet the second preset condition.
  • When the second preset condition includes both the loss threshold and the number-of-rounds threshold, judging whether the second detection model meets the second preset condition according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object, and the predicted key point position information of the target object specifically includes:
  • determining the detection loss of the target object, where the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
  • determining the detection loss change rate of the second detection model, where the historical detection loss of the second detection model is the detection loss of the second detection model determined in the historical training process;
  • if the detection loss change rate of the second detection model is lower than the second preset loss threshold, or the number of training rounds of the second detection model reaches the second preset number-of-rounds threshold, determining that the second detection model meets the second preset condition;
  • if the detection loss change rate of the second detection model is not lower than the second preset loss threshold, and the number of training rounds of the second detection model does not reach the second preset number-of-rounds threshold, determining that the second detection model does not meet the second preset condition.
  • the embodiment of the present application also provides an apparatus for determining an object and its key points in an image, including:
  • an extraction unit, configured to perform feature extraction on the image to be detected to obtain image features corresponding to the image to be detected;
  • a determining unit, configured to determine the position information of the target object on the image to be detected and the key point position information of the target object based on the image features corresponding to the image to be detected, where the key points of the target object are used to characterize the structural features of the target object.
  • An embodiment of the present application also provides a device, which includes a processor and a memory:
  • the memory is used to store a computer program
  • the processor is configured to execute any embodiment of the method for determining the object and its key points in the image provided above according to the computer program.
  • The embodiments of the present application also provide a computer-readable storage medium for storing a computer program, where the computer program is used to execute any of the methods for determining an object and its key points in an image provided above.
  • After the image features corresponding to the image to be detected are extracted, the position information of the target object on the image to be detected and the key point position information of the target object are determined directly according to the image features corresponding to the image to be detected.
  • the position information of the target object on the image to be detected can accurately represent the position of the target object in the image to be detected
  • the key point position information of the target object can accurately represent the key point position of the target object in the image to be detected
  • Therefore, the method for determining an object and its key points in an image provided by the embodiments of this application can effectively determine the position information of the target object and the position information of its key points from the image to be detected, so as to accurately detect the object and its key points in the image.
  • FIG. 1 is a schematic diagram of using two cascaded models to determine objects and their key points in an image according to an embodiment of the application;
  • FIG. 2 is a flowchart of a method for determining an object and its key points in an image provided by Method Embodiment 1 of this application;
  • FIG. 3 is a flowchart of a method for determining objects and their key points in an image provided by the second embodiment of the method of this application;
  • FIG. 4 is a schematic diagram when the first detection model provided by an embodiment of the application is applied to the detection of the position of a human face and its key points;
  • FIG. 5 is a flowchart of the training process of the first detection model provided by an embodiment of the application.
  • FIG. 6 is a flowchart of an implementation manner of step S53 according to an embodiment of this application.
  • FIG. 7 is a flowchart of another implementation manner of step S53 according to an embodiment of the application.
  • FIG. 8 is a flowchart of a method for determining an object and its key points in an image provided by the fourth embodiment of the method of this application;
  • FIG. 9 is a schematic diagram when the second detection model provided by an embodiment of the application is applied to the detection of the position of a human face and its key points;
  • FIG. 10 is a flowchart of the training process of the second detection model provided by an embodiment of the application.
  • FIG. 11 is a flowchart of an implementation manner of step S103 according to an embodiment of this application.
  • FIG. 12 is a flowchart of another implementation manner of step S103 according to an embodiment of this application.
  • FIG. 13 is a schematic structural diagram of an apparatus for determining an object and its key points in an image provided by an embodiment of the application;
  • FIG. 14 is a schematic diagram of the device structure provided by an embodiment of the application.
  • In related technology, cascaded object detection models (for example, face detection models) and object key point detection models (for example, face key point detection models) are used to sequentially determine the position information of the target object (for example, a human face) on the image to be detected and the key point position information of the target object.
  • The object detection model includes an image feature extraction network and an object position detection network, and the object key point detection model includes an image feature extraction network and an object key point detection network.
  • When the cascaded object detection model and object key point detection model are used to determine the target object and its key points, the following steps are performed in sequence: image feature extraction on the image to be detected → detection of the position information of the target object → image feature extraction on the image of the target object → detection of the key point position information of the target object.
  • For example, the cascaded face detection model and face key point detection model are used to determine a human face and its key points as follows: first, feature extraction is performed on the image to be detected to obtain a first image feature; second, the face detection network detects the first image feature to obtain a face image (that is, the face detection result); then, feature extraction is performed on the face image to obtain a second image feature; finally, the face key point detection network detects the second image feature to obtain an image carrying the face key point information (that is, the face key point detection result).
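The prior-art cascade described above can be sketched schematically; the networks here are trivial stand-ins, and the only point illustrated is that feature extraction must run twice:

```python
calls = {"extract": 0}

def extract_features(img):
    calls["extract"] += 1          # count feature extraction passes
    return img                     # stand-in: features == image

def face_detection_net(feats):
    return feats                   # stand-in: "crop" of the detected face

def face_keypoint_net(feats):
    return [(0, 0)]                # stand-in: one face key point

def cascaded_face_keypoints(image):
    feats1 = extract_features(image)         # 1st feature extraction
    face_image = face_detection_net(feats1)  # face detection result
    feats2 = extract_features(face_image)    # 2nd feature extraction
    return face_keypoint_net(feats2)         # face key point detection result

keypoints = cascaded_face_keypoints("image-to-be-detected")
```

Counting the calls confirms the duplicated extraction step that the single-model method of this application avoids.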
  • the inventor has conducted further research and found that when using the cascaded object detection model and the key point detection model of the object to determine the object and its key points, two image feature extractions are required. This increases the complexity of the process of determining objects and their key points in the image, reduces the efficiency of determining objects and their key points in the image, and also increases the memory consumption when determining objects and their key points in the image.
  • In addition, both the object detection model and the object key point detection model need to be implemented by deep neural networks, and each deep neural network consumes a large amount of memory and time during use and training. Therefore, the use and training of the cascaded object detection model and object key point detection model require a large amount of memory and time, which increases the cost of obtaining the object and its key points in the image and limits the scope of application of detection methods based on the cascaded object detection model and object key point detection model.
  • an embodiment of the present application provides a method for determining objects and their key points in an image.
  • the method specifically includes: After the image feature corresponding to the image to be detected is extracted from the image to be detected, the location information of the target object on the image to be detected and the key point location information of the target object are determined directly according to the image feature corresponding to the image to be detected.
  • the method for determining the object and its key points in the image provided by the embodiments of the present application, since the position information of the target object on the image to be detected can accurately represent the position of the target object in the image to be detected, and the key point position of the target object The information can accurately characterize the position of the key points of the target object in the image to be detected. Therefore, the method for determining the object and its key points in the image provided by the embodiments of the present application can effectively determine the position information and the position of the target object from the image to be detected. The location information of its key points can accurately detect the objects and their key points in the image.
  • the location information of the target object on the image to be detected and the key point location information of the target object can be determined directly based on the image feature corresponding to the image to be detected.
  • This avoids restoring the position information of the target object on the image to be detected into an object detection image and performing feature extraction on that image, so that only one feature extraction is required in the process of determining the position information of the target object and the key point position information of the target object. This simplifies the process of determining the object and its key points in the image, improves its efficiency, and reduces the memory consumption involved.
  • FIG. 2 is a flowchart of the method for determining an object and its key points in an image provided by Method Embodiment 1 of this application.
  • the method for determining objects and their key points in an image provided by the embodiments of the present application specifically includes steps S21-S22:
  • S21 Perform feature extraction on the image to be detected to obtain image features corresponding to the image to be detected.
  • the image to be detected refers to the image that needs to be detected.
  • the image feature corresponding to the image to be detected is used to characterize the feature information of the image to be detected.
  • the embodiment of the present application does not limit the image feature extraction method, and any existing or future image feature extraction method capable of extracting features from the image to be detected can be used.
  • the image feature extraction method may be a convolution algorithm.
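For instance, a minimal "valid" convolution over a single-channel image (implemented as cross-correlation, as most deep-learning frameworks do) could serve as the feature extraction step; the kernel and image below are toy values chosen for illustration:

```python
def conv2d(image, kernel):
    # Minimal 'valid' 2-D convolution: slide the kernel over the image and
    # sum the element-wise products at each position.
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

# A horizontal-difference kernel highlights vertical edges in the image.
features = conv2d([[1, 2, 3]], [[1, -1]])
```

In practice the extraction network stacks many such convolutions with learned kernels; this sketch only shows the basic operation.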
  • S22 Determine the position information of the target object on the image to be detected and the key point position information of the target object according to the image features corresponding to the image to be detected; wherein the key points of the target object are used to characterize the target The structural characteristics of the object.
  • the target object refers to the object that needs to be recognized on the image; moreover, the embodiment of the present application does not limit the target object, and the target object may be a human face, a vehicle, a tea cup, and other objects.
  • The target object can be set in advance according to the actual application scenario.
  • The position information of the target object on the image to be detected is used to record position-related information of the area where the target object is located in the image to be detected. The embodiment of this application does not limit the representation form of this position information; it can be represented by at least one of words, numbers, symbols, etc.
  • For example, the position information of the target object on the image to be detected can be described by the region of interest (roi) corresponding to an anchor and the offset parameters of the area where the target object is located relative to the roi corresponding to that anchor.
  • The roi corresponding to each anchor can be described by the regional parameters (x, y, w, h), where x and y represent the pixel coordinates of the center point of the roi corresponding to the anchor, w represents the width of that roi, and h represents its height.
  • The offset parameters can be expressed as (Δx, Δy, Δw, Δh), where Δx and Δy indicate the offset of the pixel coordinates of the center point of the target object in the image to be detected relative to the center point of the roi corresponding to the anchor, and Δw and Δh indicate the offsets of the width and height of the area where the target object is located relative to the width and height of that roi.
  • the target object is a human face
  • the image to be detected includes the first anchor to the Nth anchor
  • the area parameter of the roi corresponding to the first anchor is (x 1 , y 1 , w 1 , h 1 )
  • the area parameter of the roi corresponding to the second anchor is (x 2 , y 2 , w 2 , h 2 ),...
  • the area parameter of the roi corresponding to the Nth anchor is (x N , y N , w N , h N );
  • the roi corresponding to the T-th anchor carries a human face; and the offset parameter of the area of the face in the image to be detected relative to the roi corresponding to the T-th anchor is ( ⁇ x, ⁇ y, ⁇ w , ⁇ h), and 1 ⁇ T ⁇ N.
  • At this time, the position information of the face on the image to be detected can be expressed by the regional parameters of the area where the face is located in the image to be detected: (x T +Δx, y T +Δy, w T +Δw, h T +Δh).
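Following this description, decoding the face position from the T-th anchor's roi parameters and the predicted offset parameters is an element-wise sum; the numbers below are made up for illustration:

```python
def decode_position(anchor_roi, offsets):
    # anchor_roi = (x_T, y_T, w_T, h_T): roi of the anchor carrying the face
    # offsets    = (dx, dy, dw, dh): predicted offset parameters
    # Returns the regional parameters of the area where the face is located,
    # i.e. (x_T + dx, y_T + dy, w_T + dw, h_T + dh).
    return tuple(a + d for a, d in zip(anchor_roi, offsets))

face_box = decode_position((100, 80, 40, 40), (3, -2, 5, 4))
```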
  • the key points of the target object are used to characterize the structural characteristics of the target object.
  • the key points of the target object may include related points that can characterize the position of the facial features of the human face and the position of the contour of the face.
  • It should be noted that, in the embodiment of the present application, the position information of the target object on the image to be detected and the key point position information of the target object can be determined directly according to the image features corresponding to the image to be detected.
  • The embodiments of the present application further provide two specific implementations of step S22, which are described respectively as step S32 of Method Embodiment 2 and step S82 of Method Embodiment 4 below; see below for the technical details.
  • the above is the specific implementation of the method for determining the object and its key points in the image provided by method embodiment 1.
  • After the image features corresponding to the image to be detected are extracted from the image to be detected, the position information of the target object on the image to be detected and the key point position information of the target object are determined directly based on those image features.
  • the position information of the target object on the image to be detected can accurately represent the position of the target object in the image to be detected, and the key point position information of the target object can accurately represent the key point position of the target object in the image to be detected
  • The method for determining an object and its key points in an image provided by the embodiments of this application can effectively determine the position information of the target object and the position information of its key points from the image to be detected, so as to accurately detect the object and its key points in the image.
  • the location information of the target object on the image to be detected and the key point location information of the target object can be determined directly based on the image feature corresponding to the image to be detected.
• based on this, the embodiments of the present application further provide another embodiment of the method for determining an object and its key points in an image, which is explained and illustrated below in conjunction with the drawings.
  • the second method embodiment is an improvement made on the basis of the first method embodiment.
  • the parts in the second method embodiment with the same content as in the first method embodiment will not be repeated here.
• FIG. 3 is a flowchart of a method for determining an object and its key points in an image provided by the second method embodiment of the present application.
  • the method for determining the object and its key points in the image provided by the embodiment of the present application specifically includes steps S31-S32:
  • S31 Perform feature extraction on the image to be detected to obtain image features corresponding to the image to be detected.
  • step S31 is the same as the content of step S21 in the first embodiment of the above method, and for the sake of brevity, it will not be repeated here.
• S32: Determine the position information of the target object on the image to be detected according to the image feature corresponding to the image to be detected, and then determine the key point position information of the target object according to the image feature corresponding to the image to be detected and the position information of the target object on the image to be detected; wherein, the key points of the target object are used to characterize the structural features of the target object.
• in the second method embodiment, the image feature corresponding to the image to be detected is first used to determine the position information of the target object on the image to be detected, and then the position information of the target object on the image to be detected and the image feature corresponding to the image to be detected are used together to determine the key point position information of the target object.
• as an example, step S32 may specifically be: according to the image feature corresponding to the image to be detected, detecting the target object and its key points by using a pre-built first detection model, and determining the position information of the target object on the image to be detected and the key point position information of the target object.
  • the first detection model is used to detect the position information of the target object on the image to be detected and the key point position information of the target object according to the image features corresponding to the image to be detected.
• the first detection model includes a first object position detection network layer and a first key point position detection network layer, where the output result of the first object position detection network layer is the input data of the first key point position detection network layer. The first object position detection network layer is used to determine the position information of the target object on the image to be detected according to the image feature corresponding to the image to be detected, and the first key point position detection network layer is used to determine the key point position information of the target object according to the image feature corresponding to the image to be detected and the position information of the target object on the image to be detected.
• that is, the input data of the first detection model, namely the image feature corresponding to the image to be detected, is the input data of the first object position detection network layer, and the output data of the first object position detection network layer is the input data of the first key point position detection network layer.
• for example, when applied to the detection of a face position and its key points, the input data of the first detection model (the image features corresponding to the image to be detected) is the input data of the face detection network layer; here, "face detection network layer" is used to indicate the first object position detection network layer when applied to face position and key point detection, and "face key point detection network layer" is used to indicate the first key point position detection network layer when applied to the detection of the face position and its key points.
  • the embodiment of the present application does not limit the way in which the first key point position detection network layer in the first detection model uses the output data of the first object position detection network layer.
• for example, the first key point position detection network layer may directly use the output data of the first object position detection network layer; alternatively, image area related information including the target object may first be determined according to the output data of the first object position detection network layer, and the first key point position detection network layer may then use the image area related information including the target object. This is not specifically limited in the embodiments of the present application.
  • the first detection model is a deep neural network
  • the deep neural network is composed of a first object position detection network layer and a first key point position detection network layer.
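• as an illustrative aid (not part of the patent disclosure), the two-layer structure described above can be sketched as follows; the feature dimension, box parameterization, number of key points, and random weights are all assumptions made purely for this example:

```python
import numpy as np

rng = np.random.default_rng(0)

FEATURE_DIM = 128   # assumed size of the extracted image feature
BOX_DIM = 4         # assumed (x, y, w, h) position of the target object
NUM_KEYPOINTS = 5   # assumed key point count (e.g. 5 facial key points)

# First object position detection network layer: image feature -> box.
W_box = rng.standard_normal((FEATURE_DIM, BOX_DIM)) * 0.01

# First key point position detection network layer: its input combines the
# image feature with the box output by the previous layer, as described above.
W_kp = rng.standard_normal((FEATURE_DIM + BOX_DIM, NUM_KEYPOINTS * 2)) * 0.01

def first_detection_model(image_feature):
    """Return (box, keypoints) for one image feature vector."""
    box = image_feature @ W_box                      # object position info
    kp_input = np.concatenate([image_feature, box])  # feature + box
    keypoints = (kp_input @ W_kp).reshape(NUM_KEYPOINTS, 2)
    return box, keypoints

feature = rng.standard_normal(FEATURE_DIM)
box, keypoints = first_detection_model(feature)
print(box.shape, keypoints.shape)  # (4,) (5, 2)
```

• note that real implementations would use trained nonlinear layers; the sketch only shows the data flow in which the object position output feeds the key point head.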
• in addition, an embodiment of the present application further provides a process of training the first detection model, and the training process will be described in detail in the third method embodiment; for technical details, refer to the third method embodiment.
  • the above is the specific implementation of the method for determining the object and its key points in the image provided by method embodiment 2.
• in this method, after the image feature corresponding to the image to be detected is extracted from the image to be detected, the position information of the target object on the image to be detected is determined according to that image feature, and then the key point position information of the target object is directly determined based on the position information of the target object on the image to be detected and the image feature corresponding to the image to be detected.
• since the position information of the target object on the image to be detected can accurately represent the position of the target object in the image to be detected, and the key point position information of the target object can accurately represent the key point positions of the target object in the image to be detected, the method for determining an object and its key points in an image provided by the embodiments of the present application can effectively determine the position information of the target object and the position information of its key points from the image to be detected, so as to accurately detect the object and its key points in the image.
• in addition, since the position information of the target object on the image to be detected is obtained first, it can be directly used as the basis for determining the key point position information of the target object, so that only one image feature extraction process is required. This simplifies the process of determining objects and their key points in the image, improves the efficiency of determining objects and their key points in the image, and reduces the memory loss when determining objects and their key points in the image.
• moreover, the first detection model can be used to detect the target object and its key points, so as to determine the position information of the target object on the image to be detected and the key point position information of the target object.
• because the first detection model is a deep neural network that includes the first object position detection network layer and the first key point position detection network layer, only one deep neural network needs to be trained when training the first detection model, which reduces the memory and time consumption when training the first detection model; moreover, when using the first detection model to detect objects and their key points, only one deep neural network needs to be run, which reduces the memory and time loss when determining objects and their key points.
• in order to improve the detection effect of the first detection model, the first detection model can be trained. Based on this, the embodiments of the present application further provide a training process of the first detection model, which is explained and described below with reference to the accompanying drawings.
  • FIG. 5 is a flowchart of the training process of the first detection model provided in an embodiment of the application.
  • the training process of the first detection model provided in the embodiment of the application includes steps S51-S55:
  • S51 Obtain image features corresponding to the training image, actual position information of the target object on the training image, and actual key point position information of the target object.
  • the training image is used to represent the image used when training the first detection model; moreover, this application does not limit the type of training image.
• for example, the training image may be an image collected by an image acquisition device (for example, a camera), an image that carries the position information of the target object, or an image that carries the key point position information of the target object.
• the embodiments of the present application do not limit the number of training images; the number of training images can be preset, in particular according to the application scenario.
  • the actual position information of the target object on the training image is used to record the relevant information about the actual position of the target object in the training image.
  • the actual key point position information of the target object is used to record the actual position related information of the key point of the target object in the training image.
• the embodiments of the present application do not limit the method for acquiring the image features corresponding to the training image, and any method that can acquire the image features corresponding to the training image can be used; they do not limit the method for acquiring the actual position information of the target object on the training image, and any method that can acquire the actual position information of the target object on the training image can be used; nor do they limit the method for acquiring the actual key point position information of the target object, and any method that can acquire the actual key point position information of the target object can be used.
  • S52 Use the first detection model to detect the target object and its key points according to the image features corresponding to the training image, and determine the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object.
  • the predicted position information of the target object on the image to be detected is used to record the position related information of the target object on the image to be detected obtained by the first detection model.
  • the predicted key point position information of the target object is used to record the information about the predicted key point position of the target object in the image to be detected obtained by using the first detection model.
• during specific implementation, the image feature corresponding to the training image is input into the first detection model, so that the first detection model detects the target object and its key points according to the image feature corresponding to the training image, determines the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object, and outputs the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object.
• S53: According to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, determine whether the first detection model meets the first preset condition; if yes, execute step S55; if not, execute step S54.
  • the first preset condition is used to record the training cut-off condition information of the training process of the first detection model; moreover, the first preset condition may be preset or set according to the application scenario.
  • the first preset condition may specifically include: the rate of change of the detection loss of the first detection model is lower than the first preset loss threshold, and/or the number of training rounds of the first detection model reaches the first preset number of rounds threshold .
• during specific implementation, the predicted position information of the object and its key points detected by the first detection model can be compared with the actual position information of the object and its key points, so as to determine the prediction accuracy of the first detection model according to the comparison result. Specifically: according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, it is determined whether the first detection model meets the first preset condition.
• if the first detection model meets the first preset condition, it indicates that the predicted position information of the object and its key points detected by the first detection model is very close to the actual position information of the object and its key points, which indicates that the prediction accuracy of the first detection model is high and there is no need to continue training the first detection model; at this time, the training process of the first detection model can be ended. If the first detection model does not meet the first preset condition, it means that the difference between the predicted position information of the object and its key points detected by the first detection model and the actual position information of the object and its key points is large, which indicates that the prediction accuracy of the first detection model is low and further training of the first detection model is needed; at this time, the first detection model needs to be updated to improve the prediction accuracy of the updated first detection model.
• as an example, when the first preset condition includes that the change rate of the detection loss of the first detection model is lower than the first preset loss threshold, step S53 may specifically include steps S53A1-S53A7:
  • S53A1 Determine the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected.
• the detection loss of the target object is used to record the loss caused by the first detection model in the process of acquiring the position information of the target object on the image to be detected; moreover, the detection loss of the target object includes the recognition loss of the target object (L-softmax) and the position detection loss of the target object (L-regression).
  • the recognition loss of the target object is used to record the loss caused by the first detection model when recognizing the target object in the image. That is, the recognition loss of the target object is used to record whether the first detection model can recognize the target object from the image.
• for example, the recognition loss of the target object may include the loss caused by not recognizing the target object in the image, the loss caused by recognizing other objects in the image as the target object, and the loss caused by recognizing the target object in the image to be detected as another object.
• the position detection loss of the target object is used to record the loss caused by the first detection model in the process of determining the position information of the target object on the image to be detected; moreover, the position detection loss of the target object may include the loss caused by inaccurate predicted position information of the target object on the image to be detected.
• the position detection loss of the target object can usually be calculated using an L2 loss function (l2-loss), that is, by calculating the distance difference between the predicted position coordinates and the actual position coordinates.
• during specific implementation, after acquiring the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, the actual position information can be compared with the predicted position information to determine the detection loss of the target object of the first detection model; at the same time, the distance difference between the actual position information and the predicted position information can be calculated to determine the position detection loss of the target object of the first detection model.
• it should be noted that the embodiments of the present application do not limit the calculation method of the recognition loss of the target object, and any existing or future calculation method that can measure the recognition effect of the target object in the image can be used to determine the recognition loss of the target object.
  • the embodiment of the application does not limit the calculation method of the position detection loss of the target object, and any existing or future calculation method that can measure the position detection effect of the target object can be used to determine the position detection loss of the target object. .
  • S53A2 Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object.
• the key point position detection loss of the target object (loss-dpoint) is used to record the loss caused by the first detection model in the process of obtaining the key point position information of the target object; moreover, the key point position detection loss of the target object includes the loss caused by inaccurate predicted key point position information of the target object.
• the key point position detection loss of the target object can usually be calculated using an L2 loss function (l2-loss), that is, by calculating the distance difference between the predicted position coordinates and the actual position coordinates.
• it should be noted that the embodiments of the present application do not limit the calculation method of the key point position detection loss of the target object, and any existing or future calculation method that can measure the position detection effect of the key points of the target object can be used to determine the key point position detection loss of the target object.
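• as a minimal sketch of the L2-loss distance computation described above (summing the per-key-point distances is an assumption made for this example; the text does not specify how distances are aggregated):

```python
import math

def l2_loss(predicted, actual):
    """Euclidean (L2) distance between a predicted and an actual coordinate."""
    return math.dist(predicted, actual)

def keypoint_loss(predicted_kps, actual_kps):
    """Sum of per-key-point L2 distances (aggregation choice is an assumption)."""
    return sum(l2_loss(p, a) for p, a in zip(predicted_kps, actual_kps))

# Position detection loss: distance between predicted and actual box center.
print(l2_loss((3.0, 4.0), (0.0, 0.0)))                     # 5.0
# Key point position detection loss over two key points.
print(keypoint_loss([(1, 1), (2, 2)], [(1, 1), (2, 5)]))   # 3.0
```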
  • S53A3 Determine the detection loss of the first detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object.
• the detection loss of the first detection model (loss-detect) is used to record the loss caused by the first detection model in the process of acquiring the position information of the target object on the image to be detected and the key point position information of the target object.
• during specific implementation, a preset loss function can be used to determine the detection loss of the first detection model, and the loss function is:
• loss-detect = α × L-softmax + β × L-regression + F × γ × loss-dpoint
• wherein, loss-detect represents the detection loss of the first detection model; α represents the influence weight of the recognition loss of the target object; L-softmax represents the recognition loss of the target object; β represents the influence weight of the position detection loss of the target object; L-regression represents the position detection loss of the target object; F represents the existence flag of the key point position detection loss of the target object: if the training image includes the target object (that is, a positive sample training image), the F value corresponding to the training image is 1, and if the training image does not include the target object (that is, a negative sample training image), the F value corresponding to the training image is 0; γ represents the influence weight of the key point position detection loss of the target object; and loss-dpoint represents the key point position detection loss of the target object.
• in the above loss function, the value of F is determined according to whether the target object is included in the training image. Specifically: if a target object is included in a training image, the key point related information of the target object needs to be considered, so when measuring the detection loss of the first detection model on that training image, the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object all need to be considered; if no target object is included in a training image, there is no need to consider the key point related information of the target object, so when measuring the detection loss of the first detection model on that training image, only the recognition loss of the target object and the position detection loss of the target object need to be considered.
• it can be seen that the above loss function jointly supervises the detection effect of the first detection model by using three loss indicators: the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object. Therefore, when the above loss function is used to evaluate the detection effect of the first detection model on the object and its key points, it can effectively determine whether the current first detection model can accurately detect the object and its key points in the image, thus improving the detection accuracy of the trained first detection model.
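• the joint detection loss described above can be sketched as follows; the weight symbols (α, β, γ) and their default values of 1.0 are assumptions made for illustration, not values given in the source:

```python
def detection_loss(l_softmax, l_regression, loss_dpoint, has_target,
                   alpha=1.0, beta=1.0, gamma=1.0):
    """Joint loss: weighted recognition + position losses, plus the key point
    loss masked by the existence flag F (1 for positive samples, 0 otherwise)."""
    f = 1 if has_target else 0
    return alpha * l_softmax + beta * l_regression + f * gamma * loss_dpoint

# Positive sample training image: all three terms contribute.
print(detection_loss(0.25, 0.5, 0.25, has_target=True))   # 1.0
# Negative sample training image: the key point term is masked out by F = 0.
print(detection_loss(0.25, 0.5, 0.25, has_target=False))  # 0.75
```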
  • S53A4 Determine the detection loss change rate of the first detection model according to the detection loss of the first detection model and the historical detection loss of the first detection model.
• the historical detection loss of the first detection model is the detection loss of the first detection model determined during the historical training process. For example, suppose the first detection model is trained sequentially from the first round to the Y-th round, where the first round of training is earlier than the second round, the second round is earlier than the third round, ..., the (Y-1)-th round is earlier than the Y-th round, and the Y-th round is the training process currently being executed. Then the first round of training through the (Y-1)-th round of training are all historical training processes of the Y-th round of training, and the detection losses of the first detection model determined in the first round of training through the (Y-1)-th round of training are all historical detection losses for the Y-th round of training.
• the detection loss change rate of the first detection model is used to characterize the change in the detection effect of the first detection model during multiple rounds of training; moreover, the smaller the detection loss change rate of the first detection model, the better the detection effect of the first detection model, and the larger the detection loss change rate of the first detection model, the worse the detection effect of the first detection model.
  • S53A5 Determine whether the detection loss change rate of the first detection model is lower than the first preset loss threshold, if yes, execute step S53A6; if not, execute step S53A7.
• the first preset loss threshold can be set in advance, in particular according to the application scenario.
• in the embodiments of the present application, the detection loss of the first detection model is determined based on three supervision factors: the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object; the detection loss change rate of the first detection model is then determined based on the detection loss of the first detection model. At this time, if the detection loss change rate of the first detection model is lower than the first preset loss threshold, it means that the change in the detection loss of the first detection model is small and the model is close to a stable optimal model, thereby indicating that the accuracy of the objects and their key points in the image obtained by the first detection model is relatively high; it is therefore determined that the first detection model meets the first preset condition, and it can then be determined that the first detection model no longer needs to be trained. If the detection loss change rate is not lower than the first preset loss threshold, it means that the change in the detection loss of the first detection model is large and the model is far from a stable optimal model, thereby indicating that the accuracy of the objects and their key points in the image detected by the first detection model is low; it is therefore determined that the first detection model does not meet the first preset condition, and it can then be determined that the first detection model still needs to be trained again.
  • S53A6 Determine that the first detection model meets the first preset condition.
  • S53A7 Determine that the first detection model does not meet the first preset condition.
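• steps S53A4-S53A7 can be sketched as follows; the source does not give the exact formula for the detection loss change rate, so the relative change against the previous round's detection loss is assumed here:

```python
def detection_loss_change_rate(loss_history):
    """S53A4: change rate from the historical and current detection losses.
    loss_history: detection losses from round 1 up to the current round."""
    previous, current = loss_history[-2], loss_history[-1]
    return abs(current - previous) / previous

def meets_first_preset_condition(loss_history, loss_threshold):
    """S53A5-S53A7: the model meets the first preset condition when the
    detection loss change rate is lower than the first preset loss threshold."""
    return detection_loss_change_rate(loss_history) < loss_threshold

history = [4.0, 2.0, 1.0, 0.875]            # losses from successive rounds
print(detection_loss_change_rate(history))  # 0.125
print(meets_first_preset_condition(history, loss_threshold=0.15))  # True
```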
• it should be noted that the embodiments of the present application do not limit the execution order of step S53A1 and step S53A2: step S53A1 may be executed before step S53A2, step S53A2 may be executed before step S53A1, or step S53A1 and step S53A2 may be executed simultaneously.
• the above is one implementation manner of step S53; in this implementation manner, the fact that the detection loss change rate of the first detection model is lower than the first preset loss threshold is used as the training cut-off condition of the first detection model.
• in addition, in order to improve the evaluation accuracy of the detection effect of the first detection model, the embodiments of the present application also provide another specific implementation manner of step S53, as shown in FIG. When the first preset condition includes that the detection loss change rate of the first detection model is lower than the first preset loss threshold and/or that the number of training rounds of the first detection model reaches the first preset round number threshold, step S53 may specifically include steps S53B1-S53B7:
  • S53B1 Determine the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected.
  • the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object.
• the content of step S53B1 is the same as the content of step S53A1 described above; for the sake of brevity, it will not be repeated here, and reference may be made to the content of step S53A1 described above.
  • S53B2 Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object.
• the content of step S53B2 is the same as the content of step S53A2 described above; for the sake of brevity, it will not be repeated here, and reference may be made to the content of step S53A2 described above.
  • S53B3 Determine the detection loss of the first detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object.
• the content of step S53B3 is the same as the content of step S53A3 described above; for the sake of brevity, it will not be repeated here, and reference may be made to the content of step S53A3 described above.
  • S53B4 Determine the detection loss change rate of the first detection model according to the detection loss of the first detection model and the historical detection loss of the first detection model.
  • the historical detection loss of the first detection model is the detection loss of the first detection model determined during the historical training process.
• the content of step S53B4 is the same as the content of step S53A4 described above; for the sake of brevity, it will not be repeated here, and reference may be made to the content of step S53A4 described above.
• S53B5: Determine whether at least one of the following two conditions is met: the detection loss change rate of the first detection model is lower than the first preset loss threshold, and the number of training rounds of the first detection model reaches the first preset round number threshold; if yes, execute step S53B6; if not, execute step S53B7.
• it should be noted that as long as it is determined that the detection loss change rate of the first detection model is lower than the first preset loss threshold, step S53B6 can be executed directly, regardless of whether it is also judged that the number of training rounds of the first detection model reaches the first preset round number threshold; likewise, as long as it is determined that the number of training rounds of the first detection model reaches the first preset round number threshold, step S53B6 can be executed directly, regardless of whether it is also judged that the detection loss change rate of the first detection model is lower than the first preset loss threshold. Step S53B7 is executed only when it is determined both that the detection loss change rate of the first detection model is not lower than the first preset loss threshold and that the number of training rounds of the first detection model does not reach the first preset round number threshold. In addition, the embodiments of the present application do not limit the execution order of the judgment of whether the detection loss change rate of the first detection model is lower than the first preset loss threshold and the judgment of whether the number of training rounds of the first detection model reaches the first preset round number threshold: the two judgments may be executed in either order or simultaneously.
• S53B6: Determine that the first detection model meets the first preset condition.
  • S53B7 Determine that the first detection model does not meet the first preset condition.
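• the judgment of step S53B5 can be sketched as follows; the threshold values used below are placeholders, not values from the source:

```python
def meets_first_preset_condition_b(change_rate, rounds,
                                   loss_threshold, round_threshold):
    """S53B5: the cut-off is met when EITHER the detection loss change rate is
    below the first preset loss threshold OR the number of training rounds
    reaches the first preset round number threshold."""
    return change_rate < loss_threshold or rounds >= round_threshold

# Change rate already small enough -> stop (S53B6), regardless of round count.
print(meets_first_preset_condition_b(0.01, 3, 0.05, 100))    # True
# Round budget exhausted -> stop (S53B6), regardless of change rate.
print(meets_first_preset_condition_b(0.5, 100, 0.05, 100))   # True
# Neither condition met -> keep training (S53B7).
print(meets_first_preset_condition_b(0.5, 3, 0.05, 100))     # False
```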
• it should be noted that the embodiments of the present application do not limit the execution order of step S53B1 and step S53B2: step S53B1 may be executed before step S53B2, step S53B2 may be executed before step S53B1, or step S53B1 and step S53B2 may be executed simultaneously.
• the above is another implementation manner of step S53; in this implementation manner, the fact that the detection loss change rate of the first detection model is lower than the first preset loss threshold and/or that the number of training rounds of the first detection model reaches the first preset round number threshold is used as the training cut-off condition of the first detection model. This concludes the relevant content of step S53.
• in step S54, the gap between the predicted position information of the target object on the image to be detected, as detected by the first detection model, and the actual position information of the target object on the training image, together with the gap between the predicted key point position information of the target object and the actual key point position information of the target object, may be used to update the first detection model, so that the predicted position information of the target object on the image to be detected obtained by the updated first detection model is closer to the actual position information of the target object on the training image, and the predicted key point position information of the target object is closer to the actual key point position information of the target object, thereby enabling the updated first detection model to more accurately detect objects and their key points in the image.
  • First, the image features corresponding to the training image, the actual position information of the target object on the training image, and the actual key point position information of the target object are acquired.
  • Then, the first detection model is used to detect the target object and its key points according to the image features corresponding to the training image, and the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object are determined; then, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, it is determined whether the first detection model meets the first preset condition, so as to determine whether to continue training the first detection model according to the judgment result.
  • In this way, the embodiment of this application evaluates the detection effect of the first detection model based on four pieces of information: the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object. This makes the evaluation result more reasonable and effective, so that the detection effect of the first detection model that satisfies the first preset condition is better.
  • In addition, the embodiment of the present application jointly uses three loss factors, namely the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object, to evaluate the detection effect of the first detection model, which also makes the evaluation result more reasonable and effective, so that the detection effect of the first detection model that meets the first preset condition is better.
  • Based on the above method embodiments, an embodiment of the present application also provides another embodiment of the method for determining objects and their key points in an image, which will be explained and described below with reference to the accompanying drawings.
  • Method embodiment 4 is an improvement based on method embodiments 1 to 3. For the sake of brevity, the parts of method embodiment 4 that are the same as those in method embodiments 1 to 3 will not be repeated here.
  • FIG. 8 is a flowchart of a method for determining an object and its key points in an image provided in the fourth embodiment of the method of this application.
  • the method for determining an object and its key points in an image specifically includes steps S81-S82:
  • S81 Perform feature extraction on the image to be detected to obtain image features corresponding to the image to be detected.
  • step S81 is the same as the content of step S21 in the first embodiment of the method, and for the sake of brevity, the details are not repeated here.
  • S82 Determine the position information of the target object on the image to be detected and the key point position information in each image subregion of the image to be detected according to the image features corresponding to the image to be detected, and determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image subregion of the image to be detected.
  • the image sub-region is the region of interest corresponding to the anchor point. It should be noted that, for the technical details of the "region of interest corresponding to the anchor point", please refer to the related content in step S22 of the first method embodiment.
  • the image to be detected usually includes at least one image sub-region.
  • In the embodiment of the present application, the position information of the target object on the image to be detected and the key point position information in each image subregion of the image to be detected are first determined according to the image features corresponding to the image to be detected; the key point position information of the target object is then determined according to the position information of the target object on the image to be detected and the key point position information in each image subregion of the image to be detected.
  • the image to be detected includes the first image sub-region to the P-th image sub-region, and the target object is located in the W-th image sub-region.
  • In this case, step S82 specifically includes: determining, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information in the first image subregion to the P-th image subregion of the image to be detected; and then determining the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image subregion of the image to be detected, that is, determining the key point position information in the W-th image subregion as the key point position information of the target object. In other words, the key point position information in the image subregion where the target object is located is taken as the key point position information of the target object.
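  • The subregion selection described above can be sketched as follows; this is a minimal illustration rather than the claimed implementation, and the box layout (x1, y1, x2, y2), the dictionary keys, and the center-containment test are assumptions introduced only for the example:

```python
def box_center(box):
    """Return the (x, y) center of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def select_target_keypoints(target_box, subregions):
    """Take the key points of the image subregion that contains the target object.

    Each entry of `subregions` carries its own "box" and the key points
    predicted inside it; the subregion whose box contains the center of the
    target box plays the role of the W-th image subregion in the text.
    """
    cx, cy = box_center(target_box)
    for region in subregions:
        x1, y1, x2, y2 = region["box"]
        if x1 <= cx <= x2 and y1 <= cy <= y2:
            return region["keypoints"]
    return None  # no subregion contains the target object


subregions = [
    {"box": (0, 0, 50, 50), "keypoints": [(10, 10), (20, 20)]},
    {"box": (50, 0, 100, 50), "keypoints": [(60, 10), (80, 30)]},
]
# The target box lies in the second subregion, so its key points are chosen.
print(select_target_keypoints((55, 5, 95, 45), subregions))  # [(60, 10), (80, 30)]
```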
  • In addition, in a possible implementation, step S82 may specifically be: according to the image features corresponding to the image to be detected, using a pre-built second detection model to detect the target object and its key points, and determining the position information of the target object on the image to be detected and the key point position information of the target object.
  • the second detection model is used to detect the position information of the target object on the image to be detected and the key point position information of the target object according to the image features corresponding to the image to be detected.
  • The second detection model includes a second object position detection network layer, a second key point position detection network layer, and an object key point position determination network layer; the output result of the second object position detection network layer and the output result of the second key point position detection network layer are the input data of the object key point position determination network layer.
  • Among them, the second object position detection network layer is used to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; the second key point position detection network layer is used to determine the key point position information in each image subregion of the image to be detected according to the image features corresponding to the image to be detected; and the object key point position determination network layer is used to determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image subregion of the image to be detected.
  • the input data "image features corresponding to the image to be detected" of the second detection model are the input data of the second object position detection network layer and the input data of the second key point position detection network layer.
  • the output data of the second object position detection network layer and the output data of the second key point position detection network layer are the input data of the object key point position determination network layer.
  • In a face detection scenario, the input data of the second detection model, namely the image features corresponding to the image to be detected, is the input data of the face detection network layer and the input data of the face key point detection network layer; the output data of the face detection network layer and the output data of the face key point detection network layer are the input data of the face key point determination network layer.
  • Among them, the face detection network layer is used to indicate the "second object position detection network layer" when it is applied to the detection of face positions and their key points; the face key point detection network layer is used to indicate the "second key point position detection network layer" when it is applied to face position and key point detection; and the face key point determination network layer is used to indicate the "object key point position determination network layer" when it is applied to face position and key point detection.
  • In a possible implementation, the second detection model is a deep neural network, and the deep neural network is composed of the second object position detection network layer and the second key point position detection network layer.
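  • The two-branch structure can be sketched as follows; this is a toy sketch in which randomly initialized single-layer heads stand in for the trained network layers (all dimensions and weights are assumptions introduced for illustration, not part of the claimed model):

```python
import numpy as np

rng = np.random.default_rng(0)


def dense_relu(x, w, b):
    """One fully connected layer with ReLU, standing in for a deeper head."""
    return np.maximum(0.0, x @ w + b)


# Shared image features (the output of the feature-extraction step) feed both
# the object position head and the key point position head, mirroring the
# structure in which both network layers receive the same input data.
feature_dim, num_keypoints = 16, 5
features = rng.normal(size=(1, feature_dim))

w_obj = rng.normal(size=(feature_dim, 4))                  # box head: (x1, y1, x2, y2)
w_kpt = rng.normal(size=(feature_dim, num_keypoints * 2))  # key point head: (x, y) pairs

object_position = dense_relu(features, w_obj, np.zeros(4))
keypoint_positions = dense_relu(features, w_kpt, np.zeros(num_keypoints * 2))

# Both outputs then become the input of the object key point position
# determination network layer.
print(object_position.shape, keypoint_positions.shape)  # (1, 4) (1, 10)
```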
  • the above is the specific implementation of the method for determining the object and its key points in the image provided by method embodiment 4.
  • In the method for determining an object and its key points in an image provided by this embodiment, the image features corresponding to the image to be detected are first extracted from the image to be detected; then, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information in each image subregion of the image to be detected are determined; finally, the key point position information of the target object is determined according to the position information of the target object on the image to be detected and the key point position information in each image subregion of the image to be detected.
  • Since the position information of the target object on the image to be detected can accurately represent the position of the target object in the image to be detected, and the key point position information of the target object can accurately represent the key point positions of the target object in the image to be detected, the method for determining an object and its key points in an image provided by the embodiments of this application can effectively determine the position information of the target object and the position information of its key points from the image to be detected, thereby accurately detecting the object and its key points in the image.
  • In addition, in the embodiment of this application, the position information of the target object on the image to be detected and the key point position information in each image subregion of the image to be detected are determined directly according to the image features corresponding to the image to be detected, and the key point position information of the target object can be determined on that basis, without restoring the position information of the target object on the image to be detected to an object detection image and performing feature extraction on the object detection image. Therefore, only one image feature extraction is required in the process of determining the position information of the target object on the image to be detected and the key point position information of the target object, which simplifies the process of determining the object and its key points in the image, improves the efficiency of determining the object and its key points in the image, and reduces the memory loss when determining the object and its key points in the image.
  • In addition, the second detection model can be used to detect the target object and its key points, so as to determine the position information of the target object on the image to be detected and the key point position information of the target object.
  • Because the second detection model is a deep neural network that includes the second object position detection network layer and the second key point position detection network layer, only one deep neural network needs to be trained when training the second detection model, which reduces the memory and time consumption when training the second detection model; moreover, when using the second detection model to detect objects and their key points, only one deep neural network needs to be run, which reduces the memory and time loss when determining objects and their key points.
  • In addition, the second detection model can be trained to improve the detection effect of the second detection model.
  • the embodiment of the present application also provides a training process of a second detection model, which will be explained and described below in conjunction with the accompanying drawings.
  • FIG. 10 is a flowchart of the training process of the second detection model provided in an embodiment of the application.
  • the training process of the second detection model provided in the embodiment of the present application includes steps S101-S105:
  • S101 Obtain image features corresponding to the training image, actual position information of the target object on the training image, and actual key point position information of the target object.
  • It should be noted that the content of step S101 is the same as that of step S51 in the third method embodiment; for the sake of brevity, the details are not repeated here. For the technical details of step S101, please refer to step S51 in the third method embodiment.
  • S102 Use the second detection model to detect the target object and its key points according to the image features corresponding to the training image, and determine the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object.
  • Among them, the predicted position information of the target object on the image to be detected is used to record information about the predicted position of the target object on the image to be detected obtained by using the second detection model.
  • the predicted key point position information of the target object is used to record the information about the predicted key point position of the target object in the image to be detected obtained by using the second detection model.
  • In the embodiment of this application, the image features corresponding to the training image are input into the second detection model, so that the second detection model detects the target object and its key points according to the image features corresponding to the training image, determines the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object, and outputs the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object.
  • S103 According to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, determine whether the second detection model meets the second preset condition; if so, execute step S105; if not, execute step S104.
  • the second preset condition is used to record the training cut-off condition information of the training process of the second detection model; moreover, the second preset condition may be preset or set according to the application scenario.
  • the second preset condition may specifically include: the rate of change of the detection loss of the second detection model is lower than the second preset loss threshold, and/or the number of training rounds of the second detection model reaches the second preset number of rounds threshold .
  • In the embodiment of this application, the predicted position information of the object and its key points detected by the second detection model can be compared with the actual position information of the object and its key points, so as to determine, according to the comparison result, whether the prediction of the second detection model is accurate. Specifically, whether the second detection model meets the second preset condition is determined according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object.
  • If the second detection model meets the second preset condition, it indicates that the predicted position information of the object and its key points detected by the second detection model is very close to the actual position information of the object and its key points, which indicates that the prediction accuracy of the second detection model is high and there is no need to continue training the second detection model; at this time, the training process of the second detection model can be ended.
  • If the second detection model does not meet the second preset condition, it means that the difference between the predicted position information of the object and its key points detected by the second detection model and the actual position information of the object and its key points is large, which indicates that the prediction accuracy of the second detection model is low and the second detection model needs further training; at this time, the second detection model needs to be updated to improve the prediction accuracy of the updated second detection model.
  • In a possible implementation, when the second preset condition includes that the rate of change of the detection loss of the second detection model is lower than the second preset loss threshold, step S103 may specifically include steps S103A1-S103A7:
  • S103A1 Determine the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected.
  • the detection loss of the target object is used to record the loss caused by the second detection model in the process of acquiring the position information of the target object on the image to be detected; moreover, the detection loss of the target object includes the recognition loss of the target object (L- softmax) and the position detection loss of the target object (L-regression).
  • the recognition loss of the target object is used to record the loss caused by the second detection model when recognizing the target object in the image. That is, the recognition loss of the target object is used to record whether the second detection model can recognize the target object from the image.
  • The recognition loss of the target object may include the loss caused by failing to recognize the target object in the image, the loss caused by recognizing other objects in the image as the target object, the loss caused by recognizing the target object in the image to be detected as another object, and so on.
  • The position detection loss of the target object is used to record the loss caused by the second detection model in the process of determining the position information of the target object on the image to be detected; moreover, the position detection loss of the target object may include the loss caused by inaccurate predicted position information of the target object on the image to be detected.
  • It should be noted that the position detection loss of the target object can usually be calculated with the L2 loss function (l2-loss), that is, by calculating the distance difference between the predicted position coordinates and the actual position coordinates. In the embodiment of this application, the actual position information can be compared with the predicted position information, that is, the distance difference between the actual position information and the predicted position information can be calculated, so as to determine the position detection loss of the target object for the second detection model.
  • It should also be noted that the embodiment of this application does not limit the calculation method of the recognition loss of the target object; any existing or future calculation method that can measure the recognition effect on the target object in an image can be used to determine the recognition loss of the target object.
  • Similarly, the embodiment of this application does not limit the calculation method of the position detection loss of the target object; any existing or future calculation method that can measure the position detection effect on the target object can be used to determine the position detection loss of the target object.
  • S103A2 Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object.
  • The key point position detection loss (loss-dpoint) of the target object is used to record the loss caused by the second detection model in the process of obtaining the key point position information of the target object; moreover, the key point position detection loss of the target object includes the loss caused by inaccurate predicted key point position information of the target object. It should be noted that the key point position detection loss of the target object can usually also be calculated with the L2 loss function (l2-loss), that is, by calculating the distance difference between the predicted position coordinates and the actual position coordinates.
  • the embodiment of the application does not limit the calculation method of the key point position detection loss of the target object, and any existing or future calculation method that can measure the position detection effect of the key point of the target object can be used. To determine the detection loss of the key point position of the target object.
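  • As a concrete illustration of the l2-loss calculation described above, the following sketch assumes a summed-squared-difference form averaged over key points; the text itself does not fix the exact formula, so these function shapes are assumptions for the example:

```python
def l2_loss(predicted, actual):
    """Squared coordinate distance between a predicted and an actual point."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


def keypoint_l2_loss(pred_points, true_points):
    """Average l2-loss over a set of predicted/actual (x, y) key points."""
    per_point = [l2_loss(p, t) for p, t in zip(pred_points, true_points)]
    return sum(per_point) / len(per_point)


print(l2_loss((3.0, 4.0), (0.0, 0.0)))             # 25.0 (squared 3-4-5 offset)
print(keypoint_l2_loss([(0.0, 0.0), (3.0, 4.0)],
                       [(0.0, 0.0), (0.0, 0.0)]))  # 12.5
```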
  • S103A3 Determine the detection loss of the second detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object.
  • the loss-detect of the second detection model is used to record the loss caused by the second detection model in the process of acquiring the position information of the target object on the image to be detected and the key point position information of the target object.
  • In the embodiment of this application, a preset loss function can be used to determine the detection loss of the second detection model, and the loss function is:
  • loss-detect = α × L-softmax + β × L-regression + F × γ × loss-dpoint
  • where loss-detect represents the detection loss of the second detection model; α represents the influence weight of the recognition loss of the target object; L-softmax represents the recognition loss of the target object; β represents the influence weight of the position detection loss of the target object; L-regression represents the position detection loss of the target object; F represents the existence identification of the key point position detection loss of the target object, where if the training image includes the target object (that is, a positive sample training image), the F value corresponding to the training image is 1, and if the training image does not include the target object (that is, a negative sample training image), the F value corresponding to the training image is 0; γ represents the influence weight of the key point position detection loss of the target object; and loss-dpoint represents the key point position detection loss of the target object.
  • It should be noted that the value of F is determined according to whether the target object is included in the training image. Specifically, if a training image includes the target object, the key point related information of the target object needs to be considered, so when measuring the detection loss of the second detection model on that training image, the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object all need to be considered; if a training image does not include the target object, there is no need to consider the key point related information of the target object, so when measuring the detection loss of the second detection model on that training image, only the recognition loss of the target object and the position detection loss of the target object need to be considered.
  • The loss function mentioned above jointly supervises the detection effect of the second detection model by using three loss indicators: the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object. Therefore, when the above loss function is used to evaluate the detection effect of the second detection model on objects and their key points, it can effectively be determined whether the current second detection model can accurately detect the object and its key points in the image, thereby improving the detection accuracy of the trained second detection model.
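  • The loss function above can be sketched in code as follows; only the formula structure comes from the text, while the weight values and argument names are placeholders introduced for the example:

```python
def detection_loss(l_softmax, l_regression, loss_dpoint, has_target,
                   alpha=1.0, beta=1.0, gamma=1.0):
    """loss-detect = alpha * L-softmax + beta * L-regression + F * gamma * loss-dpoint.

    F is 1 for a positive sample training image (it contains the target
    object) and 0 for a negative sample, so the key point term only
    contributes for positive samples.
    """
    f = 1 if has_target else 0
    return alpha * l_softmax + beta * l_regression + f * gamma * loss_dpoint


# Positive sample: all three loss indicators contribute.
print(detection_loss(1.0, 2.0, 3.0, has_target=True))   # 6.0
# Negative sample: the key point term is masked out by F = 0.
print(detection_loss(1.0, 2.0, 3.0, has_target=False))  # 3.0
```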
  • S103A4 Determine the detection loss change rate of the second detection model according to the detection loss of the second detection model and the historical detection loss of the second detection model.
  • Among them, the historical detection loss of the second detection model is the detection loss of the second detection model determined during the historical training process. For example, suppose that the second detection model is trained sequentially from the first round of training to the Y-th round of training, where the first round of training is earlier than the second round of training, the second round of training is earlier than the third round of training, ..., the (Y-1)-th round of training is earlier than the Y-th round of training, and the Y-th round of training is the current training process being executed. In this case, the first round of training to the (Y-1)-th round of training are all historical training processes relative to the Y-th round of training; accordingly, the detection losses of the second detection model determined in the first round of training to the (Y-1)-th round of training are all historical detection losses for the Y-th round of training.
  • The detection loss change rate of the second detection model is used to characterize the change in the detection effect of the second detection model over multiple rounds of training; moreover, the smaller the detection loss change rate of the second detection model is, the better the detection effect of the second detection model is, and the larger the detection loss change rate of the second detection model is, the worse the detection effect of the second detection model is.
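  • The text does not fix an exact formula for the detection loss change rate; one plausible definition, the relative change between the latest two rounds, can be sketched as follows (the function name and the infinity fallback are assumptions for the example):

```python
def detection_loss_change_rate(loss_history):
    """Relative change between the current and the previous detection loss.

    `loss_history` lists one detection loss per training round; the last
    entry is the current round and the earlier entries are the historical
    detection losses.
    """
    if len(loss_history) < 2:
        return float("inf")  # not enough history to measure a change yet
    previous, current = loss_history[-2], loss_history[-1]
    return abs(current - previous) / previous


print(detection_loss_change_rate([10.0, 5.0]))       # 0.5
print(detection_loss_change_rate([10.0, 5.0, 4.9]))  # ~0.02: near a stable model
```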
  • S103A5 Determine whether the detection loss change rate of the second detection model is lower than the second preset loss threshold, if yes, execute step S103A6; if not, execute step S103A7.
  • Among them, the second preset loss threshold can be set in advance, and in particular can be set according to the application scenario.
  • In the embodiment of this application, the detection loss change rate of the second detection model is determined based on the detection loss of the second detection model. At this time, if the detection loss change rate of the second detection model is lower than the second preset loss threshold, it means that the detection loss of the second detection model changes little and the model is close to a stable optimal model, thereby indicating that the accuracy of the objects and their key points in the image obtained by detection with the second detection model is relatively high; it is therefore determined that the second detection model has reached the second preset condition, and it can then be determined that the second detection model does not need to be trained again. However, if the detection loss change rate of the second detection model is not lower than the second preset loss threshold, it means that the detection loss of the second detection model changes greatly and the model is still far from a stable optimal model, thereby indicating that the accuracy of the objects and their key points in the image detected by the second detection model is low; it is therefore determined that the second detection model does not meet the second preset condition, and it can be determined that the second detection model still needs to be trained again.
  • S103A6 Determine that the second detection model meets a second preset condition.
  • S103A7 Determine that the second detection model does not meet the second preset condition.
  • step S103A1 and step S103A2 can be executed in sequence, step S103A2 and step S103A1 can be executed in sequence, or step S103A1 and step S103A2 can be executed simultaneously.
  • The above is an embodiment of step S103 that uses the detection loss change rate of the second detection model being lower than the second preset loss threshold as the training cut-off condition of the second detection model.
  • In another possible implementation, step S103 may specifically include steps S103B1-S103B7:
  • S103B1 Determine the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected.
  • the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object.
  • It should be noted that step S103B1 is the same as step S103A1 described above; for the sake of brevity, it will not be repeated here. For the technical details of step S103B1, please refer to the content of step S103A1 described above.
  • S103B2 Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object.
  • step S103B2 is the same as the content of step S103A2 described above. For the sake of brevity, details are not repeated here. For technical details of step S103B2, please refer to the content of step S103A2 described above.
  • S103B3 Determine the detection loss of the second detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object.
  • It should be noted that step S103B3 is the same as step S103A3 described above; for the sake of brevity, it will not be repeated here. For the technical details of step S103B3, please refer to the content of step S103A3 described above.
  • S103B4 Determine the detection loss change rate of the second detection model according to the detection loss of the second detection model and the historical detection loss of the second detection model.
  • the historical detection loss of the second detection model is the detection loss of the second detection model determined during the historical training process.
• The content of step S103B4 is the same as that of step S103A4 described above; for the sake of brevity, it is not repeated here. For technical details of step S103B4, please refer to step S103A4 described above.
• S103B5: Determine whether at least one of the following two conditions is met: the detection loss change rate of the second detection model is lower than the second preset loss threshold, or the number of training rounds of the second detection model reaches the second preset round number threshold. If yes, execute step S103B6; if not, execute step S103B7.
• It should be noted that the embodiment of this application does not limit the execution order of the action "judging whether the detection loss change rate of the second detection model is lower than the second preset loss threshold" and the action "judging whether the number of training rounds of the second detection model reaches the second preset round number threshold". As long as it is determined that the detection loss change rate of the second detection model is lower than the second preset loss threshold, step S103B6 is executed directly, regardless of whether the round-number judgment has been performed; likewise, as long as it is determined that the number of training rounds of the second detection model reaches the second preset round number threshold, step S103B6 is executed directly, regardless of whether the loss-change-rate judgment has been performed. Step S103B7 is executed only when it is determined both that the detection loss change rate of the second detection model is not lower than the second preset loss threshold and that the number of training rounds of the second detection model does not reach the second preset round number threshold.
  • S103B6 Determine that the second detection model meets a second preset condition.
• It should be noted that step S103B1 and step S103B2 may be executed in either order or simultaneously: step S103B1 may be executed before step S103B2, step S103B2 may be executed before step S103B1, or the two steps may be executed at the same time.
• The above is another implementation manner of step S103.
• This implementation uses the detection loss change rate of the second detection model being lower than the second preset loss threshold, or the number of training rounds of the second detection model reaching the second preset round number threshold, as the training cut-off condition of the second detection model.
• The above is the relevant content of step S103.
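The cut-off judgment of steps S103B4 to S103B6 can be sketched as follows. This is only an illustrative sketch: the patent does not fix how the change rate is computed, and the helper names and threshold values below are assumptions.

```python
def loss_change_rate(current_loss, historical_loss):
    """Relative change between the current detection loss of the second
    detection model and its historical detection loss (the detection loss
    recorded in a previous training round)."""
    if historical_loss == 0:
        return float("inf")
    return abs(current_loss - historical_loss) / historical_loss


def training_cut_off(current_loss, historical_loss, rounds,
                     loss_threshold=1e-3, round_threshold=100):
    """S103B5: stop training when EITHER the detection-loss change rate is
    lower than the preset loss threshold OR the number of training rounds
    reaches the preset round-number threshold."""
    return (loss_change_rate(current_loss, historical_loss) < loss_threshold
            or rounds >= round_threshold)
```

Because the two judgments are combined with a logical OR, their evaluation order does not matter, which matches the note above that the embodiment does not limit the execution order of the two actions.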
• In the embodiments of the present application, the difference between the predicted position information of the target object on the image to be detected, as detected by the second detection model, and the actual position information of the target object on the training image, together with the difference between the predicted key point position information of the target object and the actual key point position information of the target object, may be used to update the second detection model. In this way, the predicted position information of the target object on the image to be detected output by the updated second detection model is closer to the actual position information of the target object on the training image, and the predicted key point position information of the target object is closer to the actual key point position information of the target object, so that the updated second detection model can more accurately detect objects and their key points in the image.
• In the embodiments of the present application, after the image features corresponding to the training image, the actual position information of the target object on the training image, and the actual key point position information of the target object are acquired, the second detection model is first used to detect the target object and its key points according to the image features corresponding to the training image, determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object. Then, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, it is judged whether the second detection model meets the second preset condition, so as to determine, according to the judgment result, whether to continue training the second detection model.
• It can be seen that the embodiment of this application evaluates the detection effect of the second detection model based on four pieces of information: the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object. This makes the evaluation result more reasonable and effective, so that the second detection model that satisfies the second preset condition has a better detection effect.
• In addition, the embodiments of the present application also evaluate the detection effect of the second detection model using three loss factors: the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object. This likewise makes the evaluation result more reasonable and effective, so that the second detection model that meets the second preset condition has a better detection effect.
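As a concrete illustration of combining the three loss factors, the detection loss of the model could be formed as a weighted sum. The weighted-sum form and the weight values are assumptions for illustration; the text only states that all three factors enter the evaluation.

```python
def model_detection_loss(recognition_loss, position_loss, keypoint_loss,
                         weights=(1.0, 1.0, 1.0)):
    """Combine the recognition loss of the target object, the position
    detection loss of the target object, and the key point position
    detection loss of the target object into a single detection loss for
    the model (illustrative weighted sum)."""
    w_rec, w_pos, w_kpt = weights
    return (w_rec * recognition_loss
            + w_pos * position_loss
            + w_kpt * keypoint_loss)
```

In practice the weights would be tuned to balance the three loss terms against one another.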
• It should be noted that the execution subject of the "method for determining objects and their key points in an image" and the execution subject of the "training process of the detection model" may be the same execution subject or different execution subjects; the embodiment of this application does not specifically limit this.
  • the "method for determining objects in images and their key points” can be executed by terminals, servers, vehicles, and other devices that can execute “methods for determining objects in images and their key points”.
  • the execution subject of the "training process of the detection model” can be terminals, servers, vehicles, and other devices capable of executing the "training process of the detection model”.
• Based on the method for determining an object and its key points in an image provided by the foregoing method embodiments, an embodiment of the present application also provides an apparatus for determining objects and their key points in an image, which is explained and described below with reference to the accompanying drawings.
  • FIG. 13 is a schematic structural diagram of an apparatus for determining an object and its key points in an image provided by an embodiment of the application.
  • the device 130 for determining an object and its key points in an image includes:
  • the extraction unit 131 is configured to perform feature extraction on the image to be detected to obtain image features corresponding to the image to be detected;
• the determining unit 132 is configured to determine the position information of the target object on the image to be detected and the key point position information of the target object according to the image features corresponding to the image to be detected; wherein the key points of the target object are used to characterize the structural features of the target object.
• In a possible implementation, the determining unit 132 is specifically configured to: determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and determine the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
  • the determining unit 132 is specifically configured to:
• According to the image features corresponding to the image to be detected, use a pre-built first detection model to detect the target object and its key points, and determine the position information of the target object on the image to be detected and the key point position information of the target object;
• wherein the first detection model includes a first object position detection network layer and a first key point position detection network layer; the output result of the first object position detection network layer is the input data of the first key point position detection network layer; the first object position detection network layer is used to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; and the first key point position detection network layer is used to determine the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
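The cascaded data flow of the first detection model (the object-position layer's output feeding the key-point layer together with the image features) can be sketched as below. The layer internals here are placeholders invented for illustration; the patent specifies only their inputs and outputs.

```python
def first_object_position_layer(image_features):
    # Placeholder: pretend each feature value yields one bounding box.
    return [{"box": (f, f, f + 10, f + 10)} for f in image_features]


def first_keypoint_layer(image_features, object_positions):
    # Placeholder: derive one key point per detected object, using both
    # the image features and the object position (cascaded input).
    return [(obj["box"][0] + 5, obj["box"][1] + 5) for obj in object_positions]


def first_detection_model(image_features):
    # Cascade: the output result of the object-position layer is the
    # input data of the key-point layer, alongside the image features.
    positions = first_object_position_layer(image_features)
    keypoints = first_keypoint_layer(image_features, positions)
    return positions, keypoints
```

The point of the sketch is the dependency direction: key points are computed only after, and conditioned on, the detected object positions.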
  • the training process of the first detection model specifically includes:
• According to the image features corresponding to the training image, use the first detection model to detect the target object and its key points, and determine the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object;
• According to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, judge whether the first detection model meets the first preset condition;
• If the first detection model does not meet the first preset condition, update the first detection model, and continue to execute "using the first detection model to detect the target object and its key points according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object".
• In a possible implementation, the first preset condition is: the detection loss change rate of the first detection model is lower than the first preset loss threshold, and/or the number of training rounds of the first detection model reaches the first preset round number threshold.
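The iterative training procedure above (detect, judge the preset condition, update, repeat) can be sketched with a toy model. `SimpleModel`, the squared-error loss, and the bias-nudging update rule are all illustrative assumptions, not the patented training method.

```python
class SimpleModel:
    """Toy one-parameter 'detector': predicts position = feature + bias."""
    def __init__(self, bias=0.0):
        self.bias = bias

    def detect(self, features):
        return [f + self.bias for f in features]


def train(model, features, actual, tol=1e-3, max_rounds=1000):
    # Detect, judge the preset condition, update, and repeat --
    # mirroring the training loop described above.
    for _ in range(max_rounds):
        predicted = model.detect(features)
        loss = sum((p - a) ** 2 for p, a in zip(predicted, actual))
        if loss < tol:  # preset condition met: stop training
            break
        # Naive update: nudge the bias toward reducing the mean error.
        err = sum(p - a for p, a in zip(predicted, actual)) / len(features)
        model.bias -= 0.5 * err
    return model
```

A real implementation would update millions of network weights by gradient descent, but the control flow (predict, evaluate the condition, update, loop) is the same.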
• In a possible implementation, when the first preset condition includes that the detection loss change rate of the first detection model is lower than the first preset loss threshold, the judging whether the first detection model meets the first preset condition specifically includes:
• According to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, the detection loss of the target object is determined; wherein the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
• According to the detection loss of the first detection model and the historical detection loss of the first detection model, the detection loss change rate of the first detection model is determined; wherein the historical detection loss of the first detection model is the detection loss of the first detection model determined during the historical training process;
• If the detection loss change rate of the first detection model is lower than the first preset loss threshold, it is determined that the first detection model meets the first preset condition;
• If the detection loss change rate of the first detection model is not lower than the first preset loss threshold, it is determined that the first detection model does not meet the first preset condition.
• In a possible implementation, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the first detection model meets the first preset condition specifically includes:
• According to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, the detection loss of the target object is determined; wherein the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
• According to the detection loss of the first detection model and the historical detection loss of the first detection model, the detection loss change rate of the first detection model is determined; wherein the historical detection loss of the first detection model is the detection loss of the first detection model determined during the historical training process;
• If the detection loss change rate of the first detection model is lower than the first preset loss threshold, or the number of training rounds of the first detection model reaches the first preset round number threshold, it is determined that the first detection model meets the first preset condition;
• If the detection loss change rate of the first detection model is not lower than the first preset loss threshold, and the number of training rounds of the first detection model does not reach the first preset round number threshold, it is determined that the first detection model does not meet the first preset condition.
  • the determining unit 132 is specifically configured to:
• determine, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information in each image subregion on the image to be detected; and determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image subregion on the image to be detected; wherein the image subregion is the region of interest corresponding to an anchor point.
  • the determining unit 132 is specifically configured to:
• According to the image features corresponding to the image to be detected, use a pre-built second detection model to detect the target object and its key points, and determine the position information of the target object on the image to be detected and the key point position information of the target object;
• wherein the second detection model includes a second object position detection network layer, a second key point position detection network layer, and an object key point position determination network layer; the output results of the second object position detection network layer and the second key point position detection network layer are the input data of the object key point position determination network layer; the second object position detection network layer is used to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; the second key point position detection network layer is used to determine the key point position information in each image subregion on the image to be detected according to the image features corresponding to the image to be detected; and the object key point position determination network layer is used to determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image subregion on the image to be detected.
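The two-branch structure of the second detection model, with the object key point position determination network layer fusing both branch outputs, can be sketched as below. The geometry-based fusion rule (keeping candidate key points that fall inside a detected position box) and all layer internals are illustrative assumptions.

```python
def second_object_position_layer(image_features):
    # Placeholder: one axis-aligned box per feature value.
    return [(f, f, f + 10, f + 10) for f in image_features]


def second_keypoint_layer(image_features):
    # Placeholder: candidate key points per image subregion (the region
    # of interest corresponding to each anchor point), plus one stray
    # candidate that belongs to no object.
    return [(f + 2, f + 2) for f in image_features] + [(100, 100)]


def object_keypoint_determination_layer(positions, subregion_keypoints):
    # Fuse both branch outputs: for each detected object, keep only the
    # candidate key points that fall inside its position box.
    result = []
    for (x0, y0, x1, y1) in positions:
        result.append([(x, y) for (x, y) in subregion_keypoints
                       if x0 <= x <= x1 and y0 <= y <= y1])
    return result


def second_detection_model(image_features):
    # The two detection layers run on the same image features; their
    # outputs are the input data of the determination layer.
    positions = second_object_position_layer(image_features)
    candidates = second_keypoint_layer(image_features)
    return positions, object_keypoint_determination_layer(positions, candidates)
```

Unlike the first detection model's cascade, the two branches here are independent, and only the final determination layer relates key point candidates to detected objects.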
  • the training process of the second detection model specifically includes:
• According to the image features corresponding to the training image, use the second detection model to detect the target object and its key points, and determine the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object;
• According to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, judge whether the second detection model meets the second preset condition;
• If the second detection model does not meet the second preset condition, update the second detection model, and continue to execute "using the second detection model to detect the target object and its key points according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object".
• In a possible implementation, the second preset condition is: the detection loss change rate of the second detection model is lower than the second preset loss threshold, and/or the number of training rounds of the second detection model reaches the second preset round number threshold.
• In a possible implementation, when the second preset condition includes that the detection loss change rate of the second detection model is lower than the second preset loss threshold, the judging whether the second detection model meets the second preset condition specifically includes:
• According to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, the detection loss of the target object is determined; wherein the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
• According to the detection loss of the second detection model and the historical detection loss of the second detection model, the detection loss change rate of the second detection model is determined; wherein the historical detection loss of the second detection model is the detection loss of the second detection model determined during the historical training process;
• If the detection loss change rate of the second detection model is lower than the second preset loss threshold, it is determined that the second detection model meets the second preset condition;
• If the detection loss change rate of the second detection model is not lower than the second preset loss threshold, it is determined that the second detection model does not meet the second preset condition.
• In a possible implementation, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the second detection model meets the second preset condition specifically includes:
• According to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, the detection loss of the target object is determined; wherein the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
• According to the detection loss of the second detection model and the historical detection loss of the second detection model, the detection loss change rate of the second detection model is determined; wherein the historical detection loss of the second detection model is the detection loss of the second detection model determined during the historical training process;
• If the detection loss change rate of the second detection model is lower than the second preset loss threshold, or the number of training rounds of the second detection model reaches the second preset round number threshold, it is determined that the second detection model meets the second preset condition;
• If the detection loss change rate of the second detection model is not lower than the second preset loss threshold, and the number of training rounds of the second detection model does not reach the second preset round number threshold, it is determined that the second detection model does not meet the second preset condition.
  • the above is the specific implementation of the device 130 for determining the object and its key points in the image provided by the device embodiment.
• In the embodiments of the present application, after the image features corresponding to the image to be detected are extracted from the image to be detected, the position information of the target object on the image to be detected and the key point position information of the target object are determined directly based on the image features corresponding to the image to be detected.
• Because the position information of the target object on the image to be detected can accurately represent the position of the target object in the image to be detected, and the key point position information of the target object can accurately represent the key point positions of the target object in the image to be detected, the method for determining an object and its key points in an image provided by the embodiments of the application can effectively determine the position information of the target object and the position information of its key points from the image to be detected, so as to accurately detect the object in the image and its key points.
  • the location information of the target object on the image to be detected and the key point location information of the target object can be determined directly based on the image feature corresponding to the image to be detected.
  • an embodiment of the present application also provides a device, which will be explained and described below with reference to the accompanying drawings.
  • FIG. 14 is a schematic diagram of the device structure provided by an embodiment of the application.
  • the device 140 provided in the embodiment of the present application includes a processor 141 and a memory 142:
  • the memory 142 is used to store computer programs
  • the processor 141 is configured to execute any implementation manner of the method for determining an object and its key points in an image provided by the foregoing method embodiment according to the computer program. In other words, the processor 141 is configured to perform the following steps:
• According to the image features corresponding to the image to be detected, determine the position information of the target object on the image to be detected and the key point position information of the target object; wherein the key points of the target object are used to characterize the structural features of the target object.
  • the determining the position information of the target object on the image to be detected and the key point position information of the target object according to the image feature corresponding to the image to be detected is specifically:
• The position information of the target object on the image to be detected is determined according to the image features corresponding to the image to be detected, and the key point position information of the target object is determined according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected. In a possible implementation, this specifically includes:
• According to the image features corresponding to the image to be detected, use a pre-built first detection model to detect the target object and its key points, and determine the position information of the target object on the image to be detected and the key point position information of the target object;
• wherein the first detection model includes a first object position detection network layer and a first key point position detection network layer; the output result of the first object position detection network layer is the input data of the first key point position detection network layer; the first object position detection network layer is used to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; and the first key point position detection network layer is used to determine the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
  • the training process of the first detection model specifically includes:
• According to the image features corresponding to the training image, use the first detection model to detect the target object and its key points, and determine the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object;
• According to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, judge whether the first detection model meets the first preset condition;
• If the first detection model does not meet the first preset condition, update the first detection model, and continue to execute "using the first detection model to detect the target object and its key points according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object".
  • the first preset condition is: the rate of change of the detection loss of the first detection model is lower than a first preset loss threshold, and/or the number of training rounds of the first detection model reaches the first preset round Number threshold.
• In a possible implementation, when the first preset condition includes that the detection loss change rate of the first detection model is lower than the first preset loss threshold, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the first detection model meets the first preset condition specifically includes:
• According to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, the detection loss of the target object is determined; wherein the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
• According to the detection loss of the first detection model and the historical detection loss of the first detection model, the detection loss change rate of the first detection model is determined; wherein the historical detection loss of the first detection model is the detection loss of the first detection model determined during the historical training process;
• If the detection loss change rate of the first detection model is lower than the first preset loss threshold, it is determined that the first detection model meets the first preset condition;
• If the detection loss change rate of the first detection model is not lower than the first preset loss threshold, it is determined that the first detection model does not meet the first preset condition.
• In a possible implementation, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the first detection model meets the first preset condition specifically includes:
• According to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, the detection loss of the target object is determined; wherein the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
• According to the detection loss of the first detection model and the historical detection loss of the first detection model, the detection loss change rate of the first detection model is determined; wherein the historical detection loss of the first detection model is the detection loss of the first detection model determined during the historical training process;
• If the detection loss change rate of the first detection model is lower than the first preset loss threshold, or the number of training rounds of the first detection model reaches the first preset round number threshold, it is determined that the first detection model meets the first preset condition;
• If the detection loss change rate of the first detection model is not lower than the first preset loss threshold, and the number of training rounds of the first detection model does not reach the first preset round number threshold, it is determined that the first detection model does not meet the first preset condition.
  • the determining the position information of the target object on the image to be detected and the key point position information of the target object according to the image feature corresponding to the image to be detected is specifically:
• determine, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information in each image subregion on the image to be detected; and determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image subregion on the image to be detected; wherein the image subregion is the region of interest corresponding to an anchor point.
• In a possible implementation, the determining, according to the image features corresponding to the image to be detected, of the position information of the target object on the image to be detected and the key point position information in each image subregion on the image to be detected, and the determining of the key point position information of the target object based on the position information of the target object on the image to be detected and the key point position information in each image subregion on the image to be detected, is specifically:
• According to the image features corresponding to the image to be detected, use a pre-built second detection model to detect the target object and its key points, and determine the position information of the target object on the image to be detected and the key point position information of the target object;
• wherein the second detection model includes a second object position detection network layer, a second key point position detection network layer, and an object key point position determination network layer; the output results of the second object position detection network layer and the second key point position detection network layer are the input data of the object key point position determination network layer; the second object position detection network layer is used to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; the second key point position detection network layer is used to determine the key point position information in each image subregion on the image to be detected according to the image features corresponding to the image to be detected; and the object key point position determination network layer is used to determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image subregion on the image to be detected.
  • the training process of the second detection model specifically includes:
• According to the image features corresponding to the training image, use the second detection model to detect the target object and its key points, and determine the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object;
• According to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, judge whether the second detection model meets the second preset condition;
• If the second detection model does not meet the second preset condition, update the second detection model, and continue to execute "using the second detection model to detect the target object and its key points according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object".
  • the second preset condition is: the rate of change of the detection loss of the second detection model is lower than a second preset loss threshold, and/or the number of training rounds of the second detection model reaches the second preset round Number threshold.
• In a possible implementation, when the second preset condition includes that the detection loss change rate of the second detection model is lower than the second preset loss threshold, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the second detection model meets the second preset condition specifically includes:
  • the detection loss of the target object is determined according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
  • the key point position detection loss of the target object is determined according to the actual key point position information of the target object and the predicted key point position information of the target object;
  • the detection loss of the second detection model is determined according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
  • the detection loss change rate of the second detection model is determined according to the detection loss of the second detection model and the historical detection loss of the second detection model; the historical detection loss of the second detection model is the detection loss of the second detection model determined in the historical training process;
  • if the detection loss change rate of the second detection model is lower than the second preset loss threshold, it is determined that the second detection model meets the second preset condition;
  • if the detection loss change rate of the second detection model is not lower than the second preset loss threshold, it is determined that the second detection model does not meet the second preset condition.
  • the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the second detection model meets the second preset condition specifically includes:
  • the detection loss of the target object is determined according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
  • the key point position detection loss of the target object is determined according to the actual key point position information of the target object and the predicted key point position information of the target object;
  • the detection loss of the second detection model is determined according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
  • the detection loss change rate of the second detection model is determined according to the detection loss of the second detection model and the historical detection loss of the second detection model; the historical detection loss of the second detection model is the detection loss of the second detection model determined in the historical training process;
  • if the detection loss change rate of the second detection model is lower than the second preset loss threshold, or the number of training rounds of the second detection model reaches the second preset round-number threshold, it is determined that the second detection model meets the second preset condition;
  • if the detection loss change rate of the second detection model is not lower than the second preset loss threshold, and the number of training rounds of the second detection model does not reach the second preset round-number threshold, it is determined that the second detection model does not meet the second preset condition.
  • an embodiment of the present application also provides a computer-readable storage medium.
  • the embodiment of the present application provides a computer-readable storage medium, the computer-readable storage medium is used to store a computer program, and the computer program is used to execute any embodiment of the method for determining an object and its key points in an image provided by the foregoing method embodiments. That is, the computer program is used to perform the following steps:
  • according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information of the target object are determined; wherein the key points of the target object are used to characterize the structural features of the target object.
  • the determining, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information of the target object is specifically:
  • the position information of the target object on the image to be detected is determined according to the image features corresponding to the image to be detected, and the key point position information of the target object is determined according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected, which specifically includes:
  • according to the image features corresponding to the image to be detected, a pre-built first detection model is used to detect the target object and its key points, and the position information of the target object on the image to be detected and the key point position information of the target object are determined;
  • the first detection model includes a first object position detection network layer and a first key point position detection network layer;
  • the output result of the first object position detection network layer is the input data of the first key point position detection network layer;
  • the first object position detection network layer is used to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected;
  • the first key point position detection network layer is used to determine the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
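The cascade the items above describe — the object position layer's output feeding the key point layer along with the shared image features — can be sketched as follows. This is a minimal illustration, not the patented implementation: each "layer" here is a placeholder function standing in for a real network layer, and the (x, y, w, h) box format and the fixed key point offsets are assumptions introduced for the example.

```python
# Minimal sketch of the cascaded first detection model. Each "layer" is a
# placeholder function; a real model would use learned network layers.

def first_object_position_layer(features):
    # Placeholder: predict a bounding box (x, y, w, h) from the image features.
    cx, cy = features["center"]  # assumed precomputed feature for this sketch
    return (cx - 10, cy - 10, 20, 20)

def first_keypoint_layer(features, box):
    # Placeholder: predict key points conditioned on the detected object
    # position, as the cascade above requires.
    x, y, w, h = box
    offsets = [(0.25, 0.5), (0.75, 0.5), (0.5, 0.75)]  # assumed structural points
    return [(x + w * fx, y + h * fy) for fx, fy in offsets]

def first_detection_model(features):
    box = first_object_position_layer(features)   # object position on the image
    keypoints = first_keypoint_layer(features, box)  # input includes that position
    return box, keypoints

box, kps = first_detection_model({"center": (50, 40)})
```

The point of the structure is the data flow: the key point layer receives the object position as input rather than predicting key points from the raw features alone.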
  • the training process of the first detection model specifically includes:
  • according to the image features corresponding to the training image, the first detection model is used to detect the target object and its key points, and the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object are determined;
  • according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, it is judged whether the first detection model meets a first preset condition;
  • if it is determined that the first detection model does not meet the first preset condition, the first detection model is updated, and the step of "using the first detection model to detect the target object and its key points according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object" continues to be executed.
  • the first preset condition is: the rate of change of the detection loss of the first detection model is lower than a first preset loss threshold, and/or the number of training rounds of the first detection model reaches a first preset round-number threshold.
  • when the first preset condition includes that the rate of change of the detection loss of the first detection model is lower than the first preset loss threshold, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the first detection model meets the first preset condition specifically includes:
  • the detection loss of the target object is determined according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
  • the key point position detection loss of the target object is determined according to the actual key point position information of the target object and the predicted key point position information of the target object;
  • the detection loss of the first detection model is determined according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
  • the detection loss change rate of the first detection model is determined according to the detection loss of the first detection model and the historical detection loss of the first detection model; the historical detection loss of the first detection model is the detection loss of the first detection model determined in the historical training process;
  • if the detection loss change rate of the first detection model is lower than the first preset loss threshold, it is determined that the first detection model meets the first preset condition;
  • if the detection loss change rate of the first detection model is not lower than the first preset loss threshold, it is determined that the first detection model does not meet the first preset condition.
  • the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the first detection model meets the first preset condition specifically includes:
  • the detection loss of the target object is determined according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
  • the key point position detection loss of the target object is determined according to the actual key point position information of the target object and the predicted key point position information of the target object;
  • the detection loss of the first detection model is determined according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
  • the detection loss change rate of the first detection model is determined according to the detection loss of the first detection model and the historical detection loss of the first detection model; the historical detection loss of the first detection model is the detection loss of the first detection model determined in the historical training process;
  • if the detection loss change rate of the first detection model is lower than the first preset loss threshold, or the number of training rounds of the first detection model reaches the first preset round-number threshold, it is determined that the first detection model meets the first preset condition;
  • if the detection loss change rate of the first detection model is not lower than the first preset loss threshold, and the number of training rounds of the first detection model does not reach the first preset round-number threshold, it is determined that the first detection model does not meet the first preset condition.
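The stopping rule described in the items above (the detection-loss change rate falls below a preset threshold, or the number of training rounds reaches a preset threshold) can be sketched as a small helper. The relative change-rate formula used here is an assumption — the text does not define how the rate is computed from the current and historical detection losses.

```python
def meets_preset_condition(loss_history, rounds, loss_threshold, round_threshold):
    # Condition 1: the number of training rounds reaches the preset threshold.
    if rounds >= round_threshold:
        return True
    # Condition 2: the detection-loss change rate — assumed here to be the
    # relative change between the current loss and the previous (historical)
    # loss — falls below the preset loss threshold.
    if len(loss_history) >= 2 and loss_history[-2] != 0:
        change_rate = abs(loss_history[-1] - loss_history[-2]) / abs(loss_history[-2])
        return change_rate < loss_threshold
    return False
```

Training would repeat the detect-and-compare step, updating the model each round, until this helper returns True.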
  • the determining, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information of the target object is specifically:
  • according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information in each image subregion on the image to be detected are determined, and the key point position information of the target object is determined according to the position information of the target object on the image to be detected and the key point position information in each image subregion on the image to be detected; each image subregion is a region of interest corresponding to an anchor point.
  • the determining, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information in each image subregion on the image to be detected, and determining the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image subregion on the image to be detected is specifically:
  • according to the image features corresponding to the image to be detected, a pre-built second detection model is used to detect the target object and its key points, and the position information of the target object on the image to be detected and the key point position information of the target object are determined;
  • the second detection model includes a second object position detection network layer, a second key point position detection network layer, and an object key point position determination network layer; the output results of the second object position detection network layer and the second key point position detection network layer are the input data of the object key point position determination network layer; the second object position detection network layer is used to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; the second key point position detection network layer is used to determine the position information of the key points in each image subregion on the image to be detected according to the image features corresponding to the image to be detected; and the object key point position determination network layer is used to determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image subregion of the image to be detected.
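The parallel structure just described — an object position branch and a per-subregion key point branch feeding a determination layer — can be sketched as below. The fusion rule shown (keep only the key points that fall inside the detected object box) is an illustrative assumption; the source does not commit to a specific determination rule, and the branch functions are placeholders rather than real network layers.

```python
def second_object_position_layer(features):
    # Placeholder branch: predicts the target object's box (x, y, w, h).
    return features["assumed_box"]

def second_keypoint_layer(features):
    # Placeholder branch: predicts key points per image subregion, i.e. per
    # region of interest corresponding to an anchor point.
    return features["assumed_roi_keypoints"]  # {roi_id: [(x, y), ...]}

def object_keypoint_determination_layer(box, roi_keypoints):
    # Fuse the two branches: keep the key points lying inside the detected box.
    x, y, w, h = box
    return [(px, py)
            for kps in roi_keypoints.values()
            for px, py in kps
            if x <= px <= x + w and y <= py <= y + h]

features = {
    "assumed_box": (10, 10, 20, 20),
    "assumed_roi_keypoints": {0: [(12, 14), (50, 50)], 1: [(25, 28)]},
}
box = second_object_position_layer(features)
kps = object_keypoint_determination_layer(box, second_keypoint_layer(features))
```

Unlike the cascaded first model, both branches here consume the image features independently, and the object position only enters at the final determination step.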
  • the training process of the second detection model specifically includes:
  • according to the image features corresponding to the training image, the second detection model is used to detect the target object and its key points, and the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object are determined;
  • according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, it is judged whether the second detection model meets a second preset condition;
  • if it is determined that the second detection model does not meet the second preset condition, the second detection model is updated, and the step of "using the second detection model to detect the target object and its key points according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object" continues to be executed.
  • the second preset condition is: the rate of change of the detection loss of the second detection model is lower than a second preset loss threshold, and/or the number of training rounds of the second detection model reaches a second preset round-number threshold.
  • when the second preset condition includes that the rate of change of the detection loss of the second detection model is lower than the second preset loss threshold, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the second detection model meets the second preset condition specifically includes:
  • the detection loss of the target object is determined according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
  • the key point position detection loss of the target object is determined according to the actual key point position information of the target object and the predicted key point position information of the target object;
  • the detection loss of the second detection model is determined according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
  • the detection loss change rate of the second detection model is determined according to the detection loss of the second detection model and the historical detection loss of the second detection model; the historical detection loss of the second detection model is the detection loss of the second detection model determined in the historical training process;
  • if the detection loss change rate of the second detection model is lower than the second preset loss threshold, it is determined that the second detection model meets the second preset condition;
  • if the detection loss change rate of the second detection model is not lower than the second preset loss threshold, it is determined that the second detection model does not meet the second preset condition.
  • the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the second detection model meets the second preset condition specifically includes:
  • the detection loss of the target object is determined according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
  • the key point position detection loss of the target object is determined according to the actual key point position information of the target object and the predicted key point position information of the target object;
  • the detection loss of the second detection model is determined according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
  • the detection loss change rate of the second detection model is determined according to the detection loss of the second detection model and the historical detection loss of the second detection model; the historical detection loss of the second detection model is the detection loss of the second detection model determined in the historical training process;
  • if the detection loss change rate of the second detection model is lower than the second preset loss threshold, or the number of training rounds of the second detection model reaches the second preset round-number threshold, it is determined that the second detection model meets the second preset condition;
  • if the detection loss change rate of the second detection model is not lower than the second preset loss threshold, and the number of training rounds of the second detection model does not reach the second preset round-number threshold, it is determined that the second detection model does not meet the second preset condition.
  • "At least one (item)" refers to one or more, and "multiple" refers to two or more.
  • "And/or" is used to describe the association relationship between associated objects, indicating that three relationships may exist; for example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B can be singular or plural.
  • the character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
  • "at least one of the following items" or a similar expression refers to any combination of these items, including any combination of a single item or multiple items.
  • "at least one of a, b, or c" can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can be single or multiple.


Abstract

A method and apparatus for determining an object and key points thereof in an image. After image features corresponding to an image to be detected are extracted from said image, position information of a target object on said image and position information of key points of the target object are directly determined according to the image features corresponding to said image. The position information of the target object on said image can accurately represent the position of the target object in said image, and the position information of the key points of the target object can accurately represent the positions of the key points of the target object in said image. Therefore, according to the method for determining an object and key points thereof in an image provided by the present invention, the position information of the target object and the position information of the key points thereof can be effectively determined from the image to be detected, and the object and the key points thereof in the image can thus be accurately detected.

Description

Method and device for determining objects and their key points in images
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on October 9, 2019, with application number CN201910954556.7 and entitled "Method and device for determining an object and its key points in an image", the entire content of which is incorporated herein by reference.
Technical Field
This application relates to the field of image processing technology, and in particular to a method and device for determining objects and their key points in an image.
Background
With the popularization of vehicles, driving safety has become increasingly important. Driver fatigue, in particular, is an important factor affecting driving safety.
At present, a vehicle's fatigue-driving detection system is usually used to determine whether the driver is in a fatigued driving state, so that corresponding reminder measures can be taken for the driver when it is determined that the driver is fatigued. In order to effectively determine whether the driver is in a fatigued driving state, face detection and face key point detection need to be performed on an image that includes the driver's face, so that the face detection result and the face key point detection result can subsequently be used to determine the driver's mental state.
The above analysis shows that, in order to improve driving safety, the face and its key points in an image need to be detected accurately, so that the driver's mental state can be accurately judged based on the detected face and key points. However, how to accurately detect an object (for example, a face) and its key points in an image is a technical problem that urgently needs to be solved.
Summary
In order to solve the above technical problems in the prior art, this application provides a method and device for determining an object and its key points in an image, which can effectively determine the position of an object and the positions of its key points in the image, so that the object and its key points can be detected accurately.
To achieve the foregoing objectives, the technical solutions provided by the embodiments of this application are as follows:
An embodiment of this application provides a method for determining an object and its key points in an image, including:
performing feature extraction on an image to be detected to obtain image features corresponding to the image to be detected;
determining, according to the image features corresponding to the image to be detected, the position information of a target object on the image to be detected and the key point position information of the target object; wherein the key points of the target object are used to characterize the structural features of the target object.
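As a toy stand-in for the feature extraction step above, the sketch below applies a 3×3 mean filter to a grayscale image given as a list of lists. An actual implementation would use a learned convolutional backbone, so both the filter choice and the image format are assumptions made only for illustration.

```python
def extract_features(image):
    # Toy feature extraction: a 3x3 mean filter over a 2-D grayscale image.
    # Stands in for the convolutional backbone an actual system would use.
    h, w = len(image), len(image[0])
    features = [[0.0] * (w - 2) for _ in range(h - 2)]
    for i in range(h - 2):
        for j in range(w - 2):
            window = [image[i + di][j + dj] for di in range(3) for dj in range(3)]
            features[i][j] = sum(window) / 9.0
    return features

feats = extract_features([[9, 9, 9], [9, 0, 9], [9, 9, 9]])
```

The resulting feature map is what the subsequent detection steps consume in place of the raw image.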
Optionally, the determining, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information of the target object is specifically:
determining the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and determining the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
Optionally, the determining the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and determining the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected specifically includes:
detecting the target object and its key points by using a pre-built first detection model according to the image features corresponding to the image to be detected, and determining the position information of the target object on the image to be detected and the key point position information of the target object;
wherein the first detection model includes a first object position detection network layer and a first key point position detection network layer; the output result of the first object position detection network layer is the input data of the first key point position detection network layer; the first object position detection network layer is used to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; and the first key point position detection network layer is used to determine the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
Optionally, the training process of the first detection model specifically includes:
obtaining the image features corresponding to a training image, the actual position information of the target object on the training image, and the actual key point position information of the target object;
detecting the target object and its key points by using the first detection model according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object;
judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the first detection model meets a first preset condition;
if it is determined that the first detection model does not meet the first preset condition, updating the first detection model, and continuing to execute "detecting the target object and its key points by using the first detection model according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object".
Optionally, the first preset condition is: the rate of change of the detection loss of the first detection model is lower than a first preset loss threshold, and/or the number of training rounds of the first detection model reaches a first preset round-number threshold.
Optionally, when the first preset condition includes that the rate of change of the detection loss of the first detection model is lower than the first preset loss threshold, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the first detection model meets the first preset condition specifically includes:
determining the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; wherein the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
determining the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
determining the detection loss of the first detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
determining the detection loss change rate of the first detection model according to the detection loss of the first detection model and the historical detection loss of the first detection model; the historical detection loss of the first detection model is the detection loss of the first detection model determined in the historical training process;
if the detection loss change rate of the first detection model is lower than the first preset loss threshold, determining that the first detection model meets the first preset condition;
if the detection loss change rate of the first detection model is not lower than the first preset loss threshold, determining that the first detection model does not meet the first preset condition.
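The loss composition described above — the model's detection loss built from the recognition loss, the position detection loss, and the key point position detection loss — can be sketched as a weighted sum. The equal default weights and the squared-error form of the key point loss are illustrative assumptions; the text only states which components enter the total, not how they are combined.

```python
def keypoint_position_loss(actual_kps, predicted_kps):
    # Assumed squared-error loss between actual and predicted key points.
    return sum((ax - px) ** 2 + (ay - py) ** 2
               for (ax, ay), (px, py) in zip(actual_kps, predicted_kps))

def first_model_detection_loss(recognition_loss, position_loss, kp_loss,
                               w_cls=1.0, w_pos=1.0, w_kp=1.0):
    # Weighted sum of the three components; the weights are assumptions.
    return w_cls * recognition_loss + w_pos * position_loss + w_kp * kp_loss

kp_loss = keypoint_position_loss([(0, 0), (2, 2)], [(1, 0), (2, 4)])
total = first_model_detection_loss(0.5, 1.5, kp_loss)
```

The change rate checked by the stopping condition would then be computed from successive values of this total loss across training rounds.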
Optionally, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the first detection model meets the first preset condition specifically includes:
determining the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; wherein the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
determining the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
determining the detection loss of the first detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
determining the detection loss change rate of the first detection model according to the detection loss of the first detection model and the historical detection loss of the first detection model; the historical detection loss of the first detection model is the detection loss of the first detection model determined in the historical training process;
if the detection loss change rate of the first detection model is lower than the first preset loss threshold, or the number of training rounds of the first detection model reaches the first preset round-number threshold, determining that the first detection model meets the first preset condition;
if the detection loss change rate of the first detection model is not lower than the first preset loss threshold, and the number of training rounds of the first detection model does not reach the first preset round-number threshold, determining that the first detection model does not meet the first preset condition.
可选的,所述根据所述待检测图像对应的图像特征,确定目标物体在待检测图像上的位置信息以及所述目标物体的关键点位置信息,具体为:Optionally, the determining the position information of the target object on the image to be detected and the key point position information of the target object according to the image feature corresponding to the image to be detected is specifically:
根据所述待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息和待检测图像上各个图像子区域内的关键点位置信息，并根据所述目标物体在待检测图像上的位置信息和所述待检测图像上各个图像子区域内的关键点位置信息，确定所述目标物体的关键点位置信息；所述图像子区域是锚点对应的感兴趣区域。According to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected are determined, and the key point position information of the target object is determined according to the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected; the image sub-region is the region of interest corresponding to an anchor point.
可选的，所述根据所述待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息和待检测图像上各个图像子区域内的关键点位置信息，并根据所述目标物体在待检测图像上的位置信息和所述待检测图像上各个图像子区域内的关键点位置信息，确定所述目标物体的关键点位置信息，具体为：Optionally, the determining, according to the image features corresponding to the image to be detected, of the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected, and the determining of the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected, is specifically:
根据所述待检测图像对应的图像特征,利用预先构建的第二检测模型对目标物体及其关键点进行检测,确定目标物体在待检测图像上的位置信息以及目标物体的关键点位置信息;According to the image features corresponding to the image to be detected, use a pre-built second detection model to detect the target object and its key points, and determine the position information of the target object on the image to be detected and the key point position information of the target object;
其中，所述第二检测模型包括第二物体位置检测网络层、第二关键点位置检测网络层和物体关键点位置确定网络层，且所述第二物体位置检测网络层的输出结果和所述第二关键点位置检测网络层的输出结果是所述物体关键点位置确定网络层的输入数据，且所述第二物体位置检测网络层用于根据所述待检测图像对应的图像特征确定目标物体在待检测图像上的位置信息；且所述第二关键点位置检测网络层用于根据所述待检测图像对应的图像特征确定待检测图像上各个图像子区域内的关键点位置信息；且所述物体关键点位置确定网络层用于根据所述目标物体在待检测图像上的位置信息和所述待检测图像上各个图像子区域内的关键点位置信息，确定所述目标物体的关键点位置信息。Wherein, the second detection model includes a second object position detection network layer, a second key point position detection network layer, and an object key point position determination network layer; the output of the second object position detection network layer and the output of the second key point position detection network layer are the input data of the object key point position determination network layer; the second object position detection network layer is used to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; the second key point position detection network layer is used to determine the key point position information in each image sub-region of the image to be detected according to the image features corresponding to the image to be detected; and the object key point position determination network layer is used to determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected.
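The data flow through the third layer described above can be illustrated with a minimal sketch: given the detected object box and the keypoints predicted inside each anchor's region of interest, select the ROI belonging to the object and express its keypoints in image coordinates. The nearest-center matching rule and the ROI-local coordinate convention below are assumptions for illustration; the patent does not specify how the object key point position determination network layer combines its two inputs.

```python
def determine_object_keypoints(object_box, roi_keypoints):
    """Sketch of the object key point position determination layer.

    object_box     -- (x, y, w, h) of the detected target object (center form).
    roi_keypoints  -- list of ((x, y, w, h), [(px, py), ...]) pairs: each
                      anchor's ROI and the keypoints predicted inside it,
                      given relative to the ROI's top-left corner.
    """
    ox, oy, _ow, _oh = object_box

    def center_distance(item):
        (rx, ry, _w, _h), _pts = item
        return (rx - ox) ** 2 + (ry - oy) ** 2

    # Assumed matching rule: take the ROI whose center is nearest the
    # detected object's center.
    (rx, ry, rw, rh), local_pts = min(roi_keypoints, key=center_distance)
    # Map ROI-local keypoints into image coordinates.
    left, top = rx - rw / 2, ry - rh / 2
    return [(left + px, top + py) for px, py in local_pts]
```

The sketch shows why the layer needs both inputs: the object box selects which sub-region's keypoints belong to the target, and the ROI geometry anchors those keypoints on the full image.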
可选的,所述第二检测模型的训练过程,具体包括:Optionally, the training process of the second detection model specifically includes:
获取训练图像对应的图像特征、目标物体在训练图像上的实际位置信息以及目标物体的实际关键点位置信息;Obtain the image features corresponding to the training image, the actual position information of the target object on the training image, and the actual key point position information of the target object;
根据所述训练图像对应的图像特征,利用第二检测模型对目标物体及其关键点进行检测,确定目标物体在待检测图像上的预测位置信息以及目标物体的预测关键点位置信息;According to the image features corresponding to the training image, use the second detection model to detect the target object and its key points, and determine the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object;
根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息,判断所述第二检测模型是否达到第二预设条件;According to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, Judging whether the second detection model meets a second preset condition;
若确定所述第二检测模型未达到第二预设条件，则更新第二检测模型，继续执行"根据所述训练图像对应的图像特征，利用第二检测模型对目标物体及其关键点进行检测，确定目标物体在待检测图像上的预测位置信息以及目标物体的预测关键点位置信息"。If it is determined that the second detection model does not meet the second preset condition, updating the second detection model and continuing to execute "according to the image features corresponding to the training image, using the second detection model to detect the target object and its key points, and determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object".
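The iterative training procedure above — detect, compare against the ground truth, update the model, and repeat until the preset condition is met — can be sketched with a toy one-parameter model. All names, the squared-error stand-in loss, and the additive update are illustrative, not from the patent.

```python
def train_until_condition(init_param, step, target,
                          loss_change_threshold=1e-4, max_rounds=1000):
    """Toy training loop: stop when the loss change rate falls below the
    preset threshold or the preset number of rounds is reached."""
    param = init_param
    previous_loss = None
    for round_no in range(1, max_rounds + 1):
        prediction = param                       # "detect" with current model
        loss = (prediction - target) ** 2        # stand-in detection loss
        if previous_loss is not None:
            change = abs(previous_loss - loss) / max(previous_loss, 1e-12)
            if change < loss_change_threshold:   # preset condition met
                return param, round_no
        previous_loss = loss
        param += step * (target - prediction)    # update the model
    return param, max_rounds
```

The structure mirrors the claim: the loss is recomputed each round, the stop test compares it with the historical loss, and only when the condition fails is the model updated and the detection step re-executed.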
可选的，所述第二预设条件为：第二检测模型的检测损失的变化率低于第二预设损失阈值，和/或，第二检测模型的训练轮数达到第二预设轮数阈值。Optionally, the second preset condition is: the change rate of the detection loss of the second detection model is lower than a second preset loss threshold, and/or, the number of training rounds of the second detection model reaches a second preset number-of-rounds threshold.
可选的，当所述第二预设条件包括第二检测模型的检测损失的变化率低于第二预设损失阈值时，所述根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第二检测模型是否达到第二预设条件，具体包括：Optionally, when the second preset condition includes that the change rate of the detection loss of the second detection model is lower than the second preset loss threshold, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the second detection model meets the second preset condition specifically includes:
根据所述目标物体在训练图像上的实际位置信息以及目标物体在待检测图像上的预测位置信息，确定目标物体的检测损失；其中，所述目标物体的检测损失包括目标物体的识别损失和目标物体的位置检测损失；According to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, the detection loss of the target object is determined; wherein, the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
根据所述目标物体的实际关键点位置信息以及所述目标物体的预测关键点位置信息,确定目标物体的关键点位置检测损失;Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
根据所述目标物体的识别损失、所述目标物体的位置检测损失和所述目标物体的关键点位置检测损失,确定第二检测模型的检测损失;Determine the detection loss of the second detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
根据所述第二检测模型的检测损失以及第二检测模型的历史检测损失，确定第二检测模型的检测损失变化率；所述第二检测模型的历史检测损失是在历史训练过程中确定的第二检测模型的检测损失；According to the detection loss of the second detection model and the historical detection loss of the second detection model, the detection loss change rate of the second detection model is determined; the historical detection loss of the second detection model is the detection loss of the second detection model determined during a historical training process;
若所述第二检测模型的检测损失变化率低于第二预设损失阈值,则确定所述第二检测模型达到第二预设条件;If the detection loss change rate of the second detection model is lower than a second preset loss threshold, determining that the second detection model meets the second preset condition;
若所述第二检测模型的检测损失变化率不低于第二预设损失阈值,则确定所述第二检测模型未达到第二预设条件。If the detection loss change rate of the second detection model is not lower than the second preset loss threshold, it is determined that the second detection model does not meet the second preset condition.
可选的，所述根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第二检测模型是否达到第二预设条件，具体包括：Optionally, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the second detection model meets the second preset condition specifically includes:
根据所述目标物体在训练图像上的实际位置信息以及目标物体在待检测图像上的预测位置信息，确定目标物体的检测损失；其中，所述目标物体的检测损失包括目标物体的识别损失和目标物体的位置检测损失；According to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, the detection loss of the target object is determined; wherein, the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
根据所述目标物体的实际关键点位置信息以及所述目标物体的预测关键点位置信息,确定目标物体的关键点位置检测损失;Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
根据所述目标物体的识别损失、所述目标物体的位置检测损失和所述目标物体的关键点位置检测损失,确定第二检测模型的检测损失;Determine the detection loss of the second detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
根据所述第二检测模型的检测损失以及第二检测模型的历史检测损失，确定第二检测模型的检测损失变化率；所述第二检测模型的历史检测损失是在历史训练过程中确定的第二检测模型的检测损失；According to the detection loss of the second detection model and the historical detection loss of the second detection model, the detection loss change rate of the second detection model is determined; the historical detection loss of the second detection model is the detection loss of the second detection model determined during a historical training process;
若所述第二检测模型的检测损失变化率低于第二预设损失阈值，或第二检测模型的训练轮数达到第二预设轮数阈值，则确定所述第二检测模型达到第二预设条件；If the detection loss change rate of the second detection model is lower than the second preset loss threshold, or the number of training rounds of the second detection model reaches the second preset number-of-rounds threshold, it is determined that the second detection model meets the second preset condition;
若所述第二检测模型的检测损失变化率不低于第二预设损失阈值，且第二检测模型的训练轮数未达到第二预设轮数阈值，则确定所述第二检测模型未达到第二预设条件。If the detection loss change rate of the second detection model is not lower than the second preset loss threshold, and the number of training rounds of the second detection model does not reach the second preset number-of-rounds threshold, it is determined that the second detection model does not meet the second preset condition.
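The model's detection loss is determined from the three component losses named above (recognition loss, position detection loss, and key point position detection loss). A weighted sum is one common way to combine them; this is an assumption for illustration, since the patent does not fix the combination rule, and the weight names are hypothetical.

```python
def combined_detection_loss(recognition_loss, position_loss, keypoint_loss,
                            w_cls=1.0, w_box=1.0, w_kpt=1.0):
    """Assumed weighted-sum combination of the three component losses
    into the detection model's overall detection loss."""
    return (w_cls * recognition_loss
            + w_box * position_loss
            + w_kpt * keypoint_loss)
```

In practice the weights let training balance classification accuracy against box and keypoint localization accuracy.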
本申请实施例还提供了一种图像中物体及其关键点的确定装置,包括:The embodiment of the present application also provides an apparatus for determining an object and its key points in an image, including:
提取单元,用于对待检测图像进行特征提取,得到所述待检测图像对应的图像特征;An extraction unit, configured to perform feature extraction on the image to be detected to obtain image features corresponding to the image to be detected;
确定单元，用于根据所述待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息以及所述目标物体的关键点位置信息；其中，所述目标物体的关键点用于表征所述目标物体的结构特征。The determining unit is configured to determine, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information of the target object; wherein the key points of the target object are used to characterize the structural features of the target object.
本申请实施例还提供了一种设备,所述设备包括处理器以及存储器:An embodiment of the present application also provides a device, which includes a processor and a memory:
所述存储器用于存储计算机程序;The memory is used to store a computer program;
所述处理器用于根据所述计算机程序执行上述提供的图像中物体及其关键点的确定方法的任一实施方式。The processor is configured to execute any embodiment of the method for determining the object and its key points in the image provided above according to the computer program.
本申请实施例还提供了一种计算机可读存储介质，所述计算机可读存储介质用于存储计算机程序，所述计算机程序用于执行上述提供的图像中物体及其关键点的确定方法的任一实施方式。The embodiments of the present application also provide a computer-readable storage medium, the computer-readable storage medium is used to store a computer program, and the computer program is used to execute any implementation of the method for determining an object and its key points in an image provided above.
与现有技术相比,本申请实施例至少具有以下优点:Compared with the prior art, the embodiments of the present application have at least the following advantages:
本申请实施例提供的图像中物体及其关键点的确定方法中，在从待检测图像中提取到该待检测图像对应的图像特征之后，直接根据该待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息以及该目标物体的关键点位置信息。其中，由于目标物体在待检测图像上的位置信息能够准确地表征待检测图像中目标物体的位置，且目标物体的关键点位置信息能够准确地表征待检测图像中目标物体的关键点位置，因而，本申请实施例提供的图像中物体及其关键点的确定方法能够有效地从待检测图像中确定出目标物体的位置信息及其关键点的位置信息，从而能够准确地检测出图像中物体及其关键点。In the method for determining an object and its key points in an image provided by the embodiments of the present application, after the image features corresponding to the image to be detected are extracted from the image to be detected, the position information of the target object on the image to be detected and the key point position information of the target object are determined directly according to the image features corresponding to the image to be detected. Since the position information of the target object on the image to be detected can accurately represent the position of the target object in the image to be detected, and the key point position information of the target object can accurately represent the key point positions of the target object in the image to be detected, the method for determining an object and its key points in an image provided by the embodiments of the present application can effectively determine the position information of the target object and the position information of its key points from the image to be detected, and can therefore accurately detect the object and its key points in the image.
附图说明Description of the drawings
为了更清楚地说明本申请实施例或现有技术中的技术方案，下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍，显而易见地，下面描述中的附图仅仅是本申请中记载的一些实施例，对于本领域普通技术人员来讲，在不付出创造性劳动的前提下，还可以根据这些附图获得其它的附图。In order to describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments described in this application, and for those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative work.
图1为本申请实施例提供的利用级联的两个模型确定图像中物体及其关键点的示意图;FIG. 1 is a schematic diagram of using two cascaded models to determine objects and their key points in an image according to an embodiment of the application;
图2为本申请方法实施例一提供的图像中物体及其关键点的确定方法流程图;2 is a flowchart of a method for determining an object and its key points in an image provided by Embodiment 1 of the method of this application;
图3为本申请方法实施例二提供的图像中物体及其关键点的确定方法流程图;FIG. 3 is a flowchart of a method for determining objects and their key points in an image provided by the second embodiment of the method of this application;
图4为本申请实施例提供的第一检测模型应用于人脸位置及其关键点检测时的示意图;FIG. 4 is a schematic diagram when the first detection model provided by an embodiment of the application is applied to the detection of the position of a human face and its key points;
图5为本申请实施例提供的第一检测模型的训练过程流程图;FIG. 5 is a flowchart of the training process of the first detection model provided by an embodiment of the application;
图6为本申请实施例提供的步骤S53的一种实施方式的流程图;FIG. 6 is a flowchart of an implementation manner of step S53 according to an embodiment of this application;
图7为本申请实施例提供的步骤S53的另一种实施方式的流程图;FIG. 7 is a flowchart of another implementation manner of step S53 according to an embodiment of the application;
图8为本申请方法实施例四提供的图像中物体及其关键点的确定方法流程图;FIG. 8 is a flowchart of a method for determining an object and its key points in an image provided by the fourth embodiment of the method of this application;
图9为本申请实施例提供的第二检测模型应用于人脸位置及其关键点检测时的示意图;FIG. 9 is a schematic diagram when the second detection model provided by an embodiment of the application is applied to the detection of the position of a human face and its key points;
图10为本申请实施例提供的第二检测模型的训练过程流程图;FIG. 10 is a flowchart of the training process of the second detection model provided by an embodiment of the application;
图11为本申请实施例提供的步骤S103的一种实施方式的流程图;FIG. 11 is a flowchart of an implementation manner of step S103 according to an embodiment of this application;
图12为本申请实施例提供的步骤S103的另一种实施方式的流程图;FIG. 12 is a flowchart of another implementation manner of step S103 according to an embodiment of this application;
图13为本申请实施例提供的图像中物体及其关键点的确定装置的结构示意图;FIG. 13 is a schematic structural diagram of an apparatus for determining an object and its key points in an image provided by an embodiment of the application;
图14为本申请实施例提供的设备结构示意图。FIG. 14 is a schematic diagram of the device structure provided by an embodiment of the application.
具体实施方式Detailed Description of the Embodiments
为了解决背景技术部分的技术问题，发明人经过研究发现：可以利用级联的物体检测模型（例如，人脸检测模型）以及物体的关键点检测模型（例如，人脸关键点检测模型），来依次确定目标物体（例如，人脸）在待检测图像上的位置信息和目标物体的关键点位置信息。其中，由于物体检测模型包括图像特征提取网络和物体位置检测网络，而且物体的关键点检测模型包括图像特征提取网络和物体关键点检测网络，因而，在利用级联的物体检测模型以及物体的关键点检测模型确定目标物体及其关键点时，需要依次经过以下步骤：对待检测图像进行图像特征提取→检测目标物体的位置信息→对目标物体的图像进行图像特征提取→检测目标物体的关键点的位置信息。In order to solve the technical problems in the background section, the inventor found through research that a cascaded object detection model (for example, a face detection model) and an object key point detection model (for example, a face key point detection model) can be used to sequentially determine the position information of the target object (for example, a human face) on the image to be detected and the key point position information of the target object. Since the object detection model includes an image feature extraction network and an object position detection network, and the object key point detection model includes an image feature extraction network and an object key point detection network, determining the target object and its key points with the cascaded object detection model and object key point detection model requires the following steps in sequence: extracting image features from the image to be detected → detecting the position information of the target object → extracting image features from the image of the target object → detecting the position information of the key points of the target object.
为了便于理解和解释上述过程,下面结合人脸及其关键点的检测过程进行说明。In order to facilitate the understanding and explanation of the above process, the following describes the detection process of the human face and its key points.
作为示例，如图1所示，当目标物体为人脸时，则利用级联的人脸检测模型以及人脸关键点检测模型确定人脸及其关键点的过程，具体为：首先，对待检测图像进行特征提取，获取第一图像特征；其次，利用人脸检测网络对第一图像特征进行检测，得到人脸图像（也就是，人脸检测结果）；然后，对人脸图像进行特征提取，获取第二图像特征；最后，利用人脸关键点检测网络对第二图像特征进行检测，得到携带有人脸关键点信息的图像（也就是，人脸关键点检测结果）。As an example, as shown in FIG. 1, when the target object is a human face, the process of determining the face and its key points by using the cascaded face detection model and face key point detection model is specifically as follows: first, feature extraction is performed on the image to be detected to obtain first image features; second, the first image features are detected by the face detection network to obtain a face image (that is, the face detection result); then, feature extraction is performed on the face image to obtain second image features; finally, the second image features are detected by the face key point detection network to obtain an image carrying face key point information (that is, the face key point detection result).
基于背景技术部分的技术问题以及上述技术方案，发明人进行了进一步研究发现：在利用级联的物体检测模型以及物体的关键点检测模型确定物体及其关键点时，需要进行两次图像特征提取，增加了图像中物体及其关键点的确定过程的复杂性，降低了图像中物体及其关键点的确定效率，还增加了确定图像中物体及其关键点时的内存消耗。另外，由于物体检测模型和物体的关键点检测模型均需要利用深度神经网络进行实现，而且每个深度神经网络在使用以及训练时均会消耗较大的内存以及时间，因而，级联的物体检测模型以及物体的关键点检测模型的使用过程以及训练过程均需要消耗较大的内存以及时间，从而增加了图像中物体及其关键点的获取成本，限制了基于级联的物体检测模型以及物体的关键点检测模型的图像中物体及其关键点的检测方法的应用范围。Based on the technical problems in the background section and the above technical solution, the inventor conducted further research and found that when the cascaded object detection model and object key point detection model are used to determine an object and its key points, image feature extraction needs to be performed twice, which increases the complexity of the process of determining the object and its key points in the image, reduces the efficiency of determining the object and its key points in the image, and also increases the memory consumption when determining the object and its key points in the image. In addition, since both the object detection model and the object key point detection model need to be implemented with deep neural networks, and each deep neural network consumes considerable memory and time during use and training, both the use process and the training process of the cascaded models consume considerable memory and time, which increases the cost of obtaining the object and its key points in the image and limits the scope of application of the detection method based on the cascaded object detection model and object key point detection model.
为了解决背景技术部分的技术问题以及克服上述级联的两个模型的技术方案所存在的缺陷,本申请实施例提供了一种图像中物体及其关键点的确定方法,该方法具体包括:在从待检测图像中提取到该待检测图像对应的图像特征之后,直接根据该待检测图像对应的图像特征,确定目标物体在待检测图像上的位置信息以及该目标物体的关键点位置信息。In order to solve the technical problems of the background technology and overcome the defects of the technical solutions of the two cascaded models described above, an embodiment of the present application provides a method for determining objects and their key points in an image. The method specifically includes: After the image feature corresponding to the image to be detected is extracted from the image to be detected, the location information of the target object on the image to be detected and the key point location information of the target object are determined directly according to the image feature corresponding to the image to be detected.
在本申请实施例提供的图像中物体及其关键点的确定方法中，由于目标物体在待检测图像上的位置信息能够准确地表征待检测图像中目标物体的位置，且目标物体的关键点位置信息能够准确地表征待检测图像中目标物体的关键点位置，因而，本申请实施例提供的图像中物体及其关键点的确定方法能够有效地从待检测图像中确定出目标物体的位置信息及其关键点的位置信息，从而能够准确地检测出图像中物体及其关键点。另外，由于在获取到待检测图像对应的图像特征之后，直接基于该待检测图像对应的图像特征确定目标物体在待检测图像上的位置信息以及所述目标物体的关键点位置信息即可，无需进行将目标物体在待检测图像上的位置信息还原为物体检测图像以及对该物体检测图像进行特征提取的过程，使得在确定目标物体在待检测图像上的位置信息以及目标物体的关键点位置信息过程中只需进行一次图像特征提取过程，如此简化了图像中物体及其关键点的确定过程，提高了图像中物体及其关键点的确定效率，并降低了在确定图像中物体及其关键点时的内存损耗。In the method for determining an object and its key points in an image provided by the embodiments of the present application, since the position information of the target object on the image to be detected can accurately represent the position of the target object in the image to be detected, and the key point position information of the target object can accurately represent the key point positions of the target object in the image to be detected, the method can effectively determine the position information of the target object and the position information of its key points from the image to be detected, and can therefore accurately detect the object and its key points in the image. In addition, since after the image features corresponding to the image to be detected are obtained, the position information of the target object on the image to be detected and the key point position information of the target object can be determined directly based on those image features, there is no need to restore the position information of the target object on the image to be detected into an object detection image and perform feature extraction on that object detection image, so that only one image feature extraction process is needed in the process of determining the position information of the target object on the image to be detected and the key point position information of the target object. This simplifies the process of determining the object and its key points in the image, improves the efficiency of determining the object and its key points in the image, and reduces the memory consumption when determining the object and its key points in the image.
为了使本技术领域的人员更好地理解本发明方案，下面将结合本发明实施例中的附图，对本发明实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例仅是本发明一部分实施例，而不是全部的实施例。基于本发明中的实施例，本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例，都属于本发明保护的范围。In order to enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
方法实施例一Method embodiment one
参见图2，该图为本申请方法实施例一提供的图像中物体及其关键点的确定方法流程图。Refer to FIG. 2, which is a flowchart of the method for determining an object and its key points in an image provided by Method Embodiment 1 of this application.
本申请实施例提供的图像中物体及其关键点的确定方法,具体包括步骤S21-S22:The method for determining objects and their key points in an image provided by the embodiments of the present application specifically includes steps S21-S22:
S21:对待检测图像进行特征提取,得到所述待检测图像对应的图像特征。S21: Perform feature extraction on the image to be detected to obtain image features corresponding to the image to be detected.
待检测图像是指需要进行物体检测的图像。The image to be detected refers to the image that needs to be detected.
待检测图像对应的图像特征用于表征该待检测图像自身所具有的特征信息。The image feature corresponding to the image to be detected is used to characterize the feature information of the image to be detected.
本申请实施例不限定图像特征提取方法,可以采用任一种现有或未来出现的能够从待检测图像中进行特征提取的图像特征提取方法。例如,图像特征提取方法可以是卷积算法。The embodiment of the present application does not limit the image feature extraction method, and any existing or future image feature extraction method capable of extracting features from the image to be detected can be used. For example, the image feature extraction method may be a convolution algorithm.
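As a non-limiting illustration of the convolution operation mentioned above, a minimal 'valid'-mode 2-D convolution (strictly, cross-correlation, as commonly used in convolutional feature extractors) can be written as follows. Real feature extraction networks stack many such operations with learned kernels; this sketch only shows the elementary computation.

```python
def convolve2d_valid(image, kernel):
    """Slide the kernel over the image ('valid' mode, no padding) and
    return the resulting feature map as a list of lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Sum of elementwise products over the kernel window.
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out
```

A 3×3 image of ones convolved with a 2×2 kernel of ones, for instance, yields a 2×2 feature map whose entries are all 4.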
S22:根据所述待检测图像对应的图像特征,确定目标物体在待检测图像上的位置信息以及所述目标物体的关键点位置信息;其中,所述目标物体的关键点用于表征所述目标物体的结构特征。S22: Determine the position information of the target object on the image to be detected and the key point position information of the target object according to the image features corresponding to the image to be detected; wherein the key points of the target object are used to characterize the target The structural characteristics of the object.
目标物体是指需要在图像上进行识别的对象;而且,本申请实施例不限定目标物体,目标物体可以是人脸、车辆、茶杯、以及其他物体。另外,目标物体可以提前设定,尤其可以根据应用场景设定。The target object refers to the object that needs to be recognized on the image; moreover, the embodiment of the present application does not limit the target object, and the target object may be a human face, a vehicle, a tea cup, and other objects. In addition, the target object can be set in advance, especially according to the application scenario.
目标物体在待检测图像上的位置信息用于记录在待检测图像中，目标物体所处区域的位置相关信息；而且，本申请实施例不限定目标物体在待检测图像上的位置信息的表示形式，可以采用文字、数字、符号等中的至少一个进行表示。例如，目标物体在待检测图像上的位置信息可以利用锚点（anchor）对应的感兴趣区域（region of interest，roi）以及待检测图像中目标物体所处区域相对于该anchor对应的roi的偏移参数来进行描述。其中，每个anchor对应的roi可以利用区域参数(x,y,w,h)表示，且x和y用于表示该anchor对应的roi的中心点像素坐标；w用于表示该anchor对应的roi的宽度；h用于表示该anchor对应的roi的高度。偏移参数可以利用区域偏移参数(△x,△y,△w,△h)表示，且△x和△y用于表示待检测图像中目标物体所处区域的中心点像素坐标相对于该anchor对应的roi的中心点像素坐标的偏移量；△w用于表示待检测图像中目标物体所处区域的宽度相对于该anchor对应的roi的宽度偏移量；△h用于表示待检测图像中目标物体所处区域的高度相对于该anchor对应的roi的高度偏移量。The position information of the target object on the image to be detected is used to record the position-related information of the area where the target object is located in the image to be detected. Moreover, the embodiments of the present application do not limit the representation form of the position information of the target object on the image to be detected, which can be represented by at least one of words, numbers, symbols, and so on. For example, the position information of the target object on the image to be detected can be described by the region of interest (roi) corresponding to an anchor point (anchor) and the offset parameters of the area where the target object is located in the image to be detected relative to the roi corresponding to that anchor. The roi corresponding to each anchor can be represented by the region parameters (x,y,w,h), where x and y represent the pixel coordinates of the center point of the roi corresponding to the anchor, w represents the width of the roi corresponding to the anchor, and h represents the height of the roi corresponding to the anchor. The offset parameters can be represented by the region offset parameters (△x,△y,△w,△h), where △x and △y represent the offset of the pixel coordinates of the center point of the area where the target object is located in the image to be detected relative to the pixel coordinates of the center point of the roi corresponding to the anchor, △w represents the offset of the width of that area relative to the width of the roi corresponding to the anchor, and △h represents the offset of the height of that area relative to the height of the roi corresponding to the anchor.
为了便于理解和解释目标物体在待检测图像上的位置信息,下面结合示例进行说明。In order to facilitate the understanding and explanation of the position information of the target object on the image to be detected, the following description is combined with examples.
作为示例，假设，目标物体为人脸；且待检测图像包括第1个anchor至第N个anchor；且第1个anchor对应的roi的区域参数为(x_1,y_1,w_1,h_1)，第2个anchor对应的roi的区域参数为(x_2,y_2,w_2,h_2)，……，第N个anchor对应的roi的区域参数为(x_N,y_N,w_N,h_N)；且第T个anchor对应的roi中携带有人脸；且待检测图像中人脸所处区域相对于第T个anchor对应的roi的偏移参数为(△x,△y,△w,△h)，且1≤T≤N。基于上述假设可知，人脸在待检测图像上的位置信息可以利用待检测图像中人脸所处区域的区域参数(x_T+△x,y_T+△y,w_T+△w,h_T+△h)表示。As an example, suppose the target object is a human face; the image to be detected includes the 1st anchor to the Nth anchor; the region parameters of the roi corresponding to the 1st anchor are (x_1,y_1,w_1,h_1), the region parameters of the roi corresponding to the 2nd anchor are (x_2,y_2,w_2,h_2), ..., and the region parameters of the roi corresponding to the Nth anchor are (x_N,y_N,w_N,h_N); the roi corresponding to the Tth anchor carries the face; and the offset parameters of the area where the face is located in the image to be detected relative to the roi corresponding to the Tth anchor are (△x,△y,△w,△h), with 1≤T≤N. Based on the above assumptions, the position information of the face on the image to be detected can be represented by the region parameters (x_T+△x,y_T+△y,w_T+△w,h_T+△h) of the area where the face is located in the image to be detected.
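The additive offset scheme in the example above can be sketched as follows. Purely additive offsets are assumed here to match the example; practical detectors often use normalized or logarithmic box encodings instead.

```python
def decode_anchor(anchor_roi, offsets):
    """Apply the region offset parameters (dx, dy, dw, dh) to an anchor's
    region of interest (x, y, w, h) to recover the region parameters of
    the area where the target object is located."""
    x, y, w, h = anchor_roi
    dx, dy, dw, dh = offsets
    return (x + dx, y + dy, w + dw, h + dh)
```

For an anchor roi (50, 40, 20, 30) and offsets (2, -3, 4, 1), the decoded target region is (52, 37, 24, 31), directly mirroring the (x_T+△x, y_T+△y, w_T+△w, h_T+△h) form in the example.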
目标物体的关键点用于表征目标物体的结构特征。例如,当目标物体为人脸时,则目标物体的关键点可以包括能够表征人脸五官位置以及脸部轮廓位置的相关点。The key points of the target object are used to characterize the structural characteristics of the target object. For example, when the target object is a human face, the key points of the target object may include related points that can characterize the position of the facial features of the human face and the position of the contour of the face.
In the embodiments of the present application, after the image features corresponding to the image to be detected are acquired, the position information of the target object on the image to be detected and the key point position information of the target object can be determined directly according to the image features corresponding to the image to be detected.
In addition, the embodiments of the present application also provide two specific implementations of step S22, which are introduced in method embodiment two and method embodiment four below; that is, the two specific implementations of step S22 are step S32 in method embodiment two and step S82 in method embodiment four. See below for technical details.
The above is the specific implementation of the method for determining an object and its key points in an image provided by method embodiment one. In this implementation, after the image features corresponding to the image to be detected are extracted from the image to be detected, the position information of the target object on the image to be detected and the key point position information of the target object are determined directly according to those image features. Because the position information of the target object on the image to be detected can accurately characterize the position of the target object in the image to be detected, and the key point position information of the target object can accurately characterize the positions of the target object's key points, the method for determining an object and its key points in an image provided by the embodiments of the present application can effectively determine the position information of the target object and of its key points from the image to be detected, and can therefore accurately detect the object and its key points in the image.
In addition, after the image features corresponding to the image to be detected are acquired, the position information of the target object on the image to be detected and the key point position information of the target object are determined directly on the basis of those image features; there is no need to restore the position information of the target object into an object detection image and then perform feature extraction on that object detection image. As a result, only one image feature extraction process is required when determining the position information of the target object and the key point position information of the target object, which simplifies the process of determining the object and its key points in the image, improves the efficiency of that determination, and reduces the memory consumption involved.
To improve the accuracy of the position detection of the object and its key points in the image, based on the method for determining an object and its key points in an image provided by method embodiment one, the embodiments of the present application further provide another implementation of the method, which is explained and illustrated below in conjunction with the drawings.
Method embodiment two
Method embodiment two is an improvement made on the basis of method embodiment one. For brevity, the parts of method embodiment two that are identical to method embodiment one are not repeated here.
Refer to FIG. 3, which is a flowchart of the method for determining an object and its key points in an image provided by method embodiment two of the present application.
The method for determining an object and its key points in an image provided by the embodiments of the present application specifically includes steps S31-S32:
S31: Perform feature extraction on the image to be detected to obtain the image features corresponding to the image to be detected.
It should be noted that the specific content of step S31 is the same as that of step S21 in method embodiment one above, and for brevity it is not repeated here.
S32: Determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and determine the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected; the key points of the target object are used to characterize the structural features of the target object.
In the embodiments of the present application, after the image features corresponding to the image to be detected are acquired, those image features are first used to determine the position information of the target object on the image to be detected, and then the position information of the target object on the image to be detected and the image features corresponding to the image to be detected are used together to determine the key point position information of the target object.
In addition, to improve the efficiency of determining the object and its key points in the image, a first detection model formed by fusing an object position detection network and an object key point detection network can be used to determine the object and its key points. On this basis, the embodiments of the present application further provide an implementation of step S32, in which step S32 may specifically be: according to the image features corresponding to the image to be detected, use a pre-built first detection model to detect the target object and its key points, and determine the position information of the target object on the image to be detected and the key point position information of the target object.
The first detection model is used to detect the position information of the target object on the image to be detected and the key point position information of the target object according to the image features corresponding to the image to be detected.
The first detection model includes a first object position detection network layer and a first key point position detection network layer, where the output of the first object position detection network layer is input data of the first key point position detection network layer. The first object position detection network layer is used to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and the first key point position detection network layer is used to determine the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected. That is, in the first detection model, the input data of the first detection model (the image features corresponding to the image to be detected) is the input data of the first object position detection network layer, and the output data of the first object position detection network layer is input data of the first key point position detection network layer.
For example, when the target object is a human face, as shown in FIG. 4, the input data of the first detection model (the image features corresponding to the image to be detected) is the input data of the face detection network layer, and the output data of the face detection network layer is input data of the face key point detection network layer. It should be noted that in FIG. 4, "face detection network layer" denotes the "first object position detection network layer" as applied to the detection of the face position and its key points, and "face key point detection network layer" denotes the "first key point position detection network layer" as applied to the detection of the face position and its key points.
It should be noted that the embodiments of the present application do not limit the way in which the first key point position detection network layer in the first detection model uses the output data of the first object position detection network layer. For example, the first key point position detection network layer may use the output data of the first object position detection network layer directly; alternatively, information about the image region that includes the target object may first be determined according to the output data of the first object position detection network layer, and that region information may then be used by the first key point position detection network layer. The embodiments of the present application impose no specific limitation on this.
It should be noted that, in the embodiments of the present application, the first detection model is a single deep neural network composed of the first object position detection network layer and the first key point position detection network layer. In this way, when the first detection model is used to detect an object and its key points, only one deep neural network needs to be run, which saves running memory and running time.
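The data flow of the first detection model can be sketched as follows. This is an assumed, minimal illustration only: the three functions are trivial stand-ins for the real backbone and network layers (which the patent does not specify), chosen to show that the image features are extracted once and shared by both detection layers.

```python
# Stand-in for a CNN backbone: reduce each image row to its mean value.
def extract_features(image):
    return [sum(row) / len(row) for row in image]

# Stand-in for the first object position detection network layer:
# maps the shared features to a dummy region (x, y, w, h).
def object_position_layer(features):
    return (features[0], features[1], 10.0, 10.0)

# Stand-in for the first key point position detection network layer:
# consumes BOTH the shared image features and the detected position.
def keypoint_layer(features, position):
    x, y, w, h = position
    return [(x + w / 2, y + h / 2)]  # e.g. one key point at the region center

def first_detection_model(image):
    features = extract_features(image)          # feature extraction runs once
    position = object_position_layer(features)  # output feeds the next layer
    keypoints = keypoint_layer(features, position)
    return position, keypoints

position, keypoints = first_detection_model([[1.0, 3.0], [2.0, 4.0]])
```

The point of the structure is that `extract_features` is called only once per image; neither detection layer triggers a second feature extraction pass.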
It should also be noted that the embodiments of the present application further provide a training process for the first detection model, which will be described in detail in method embodiment three; see method embodiment three for technical details.
The above is the specific implementation of the method for determining an object and its key points in an image provided by method embodiment two. In this implementation, after the image features corresponding to the image to be detected are extracted from the image to be detected, the position information of the target object on the image to be detected is first determined according to those image features, and the key point position information of the target object is then determined directly according to the position information of the target object on the image to be detected and the image features corresponding to the image to be detected. Because the position information of the target object on the image to be detected can accurately characterize the position of the target object in the image to be detected, and the key point position information of the target object can accurately characterize the positions of the target object's key points, the method provided by the embodiments of the present application can effectively determine the position information of the target object and of its key points from the image to be detected, and can therefore accurately detect the object and its key points in the image.
In addition, after the position information of the target object on the image to be detected is acquired, it can be used directly as the basis for determining the key point position information of the target object; there is no need to restore the position information of the target object into an object detection image and then perform feature extraction on that object detection image. As a result, only one image feature extraction process is required when determining the position information of the target object and the key point position information of the target object, which simplifies the process of determining the object and its key points in the image, improves the efficiency of that determination, and reduces the memory consumption involved.
In addition, to further reduce the memory and time consumed when determining the object and its key points in the image, the first detection model can be used to detect the target object and its key points and to determine the position information of the target object on the image to be detected and the key point position information of the target object. Because the first detection model is a single deep neural network that includes the first object position detection network layer and the first key point position detection network layer, only one deep neural network needs to be trained when training the first detection model, which reduces the memory and time consumed during training; moreover, when the first detection model is used to detect an object and its key points, only one deep neural network needs to be run, which reduces the memory and time consumed when determining the object and its key points.
Based on the methods for determining an object and its key points in an image provided by method embodiment one and method embodiment two above, to further improve the accuracy of the determined object and its detection points, the first detection model can be trained to improve its detection effect. On this basis, the embodiments of the present application further provide a training process for the first detection model, which is explained and illustrated below in conjunction with the drawings.
Method embodiment three
Refer to FIG. 5, which is a flowchart of the training process of the first detection model provided by the embodiments of the present application.
The training process of the first detection model provided by the embodiments of the present application includes steps S51-S55:
S51: Obtain the image features corresponding to the training image, the actual position information of the target object on the training image, and the actual key point position information of the target object.
The training image is the image used when training the first detection model. The present application does not limit the type of training image; for example, the training image may be an image collected by an image acquisition device (for example, a camera), an image carrying position information of the target object, or an image carrying key point position information of the target object. In addition, the embodiments of the present application do not limit the number of training images, which can be preset, in particular according to the application scenario.
The actual position information of the target object on the training image records information about the actual position of the target object in the training image.
The actual key point position information of the target object records information about the actual positions of the key points of the target object in the training image.
It should be noted that the embodiments of the present application do not limit the method for acquiring the image features corresponding to the training image; any method capable of acquiring them can be used. Likewise, the embodiments of the present application do not limit the method for acquiring the actual position information of the target object on the training image, nor the method for acquiring the actual key point position information of the target object; any method capable of acquiring each can be used.
S52: According to the image features corresponding to the training image, use the first detection model to detect the target object and its key points, and determine the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object.
The predicted position information of the target object on the image to be detected records information about the position of the target object on the image to be detected obtained by detection with the first detection model.
The predicted key point position information of the target object records information about the predicted positions of the key points of the target object in the image to be detected obtained by detection with the first detection model.
In the embodiments of the present application, after the image features corresponding to the training image are acquired, they are input into the first detection model, so that the first detection model detects the target object and its key points according to those image features, determines the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object, and outputs both.
S53: According to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, determine whether the first detection model meets a first preset condition; if so, execute step S55; if not, execute step S54.
The first preset condition records the training cut-off condition of the training process of the first detection model; it may be preset or set according to the application scenario. For example, the first preset condition may specifically include: the rate of change of the detection loss of the first detection model is lower than a first preset loss threshold, and/or the number of training rounds of the first detection model reaches a first preset round-number threshold.
In the embodiments of the present application, the predicted position information of the object and its key points detected by the first detection model can be compared with the actual position information of the object and its key points, so that the prediction accuracy of the first detection model is determined according to the comparison result. Specifically: according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, determine whether the first detection model meets the first preset condition. If the first detection model meets the first preset condition, the predicted position information of the object and its key points detected by the first detection model is very close to the actual position information of the object and its key points, which indicates that the prediction accuracy of the first detection model is high and no further training is needed; the training process of the first detection model can then be ended. If the first detection model does not meet the first preset condition, there is a large gap between the predicted position information of the object and its key points detected by the first detection model and the actual position information of the object and its key points, which indicates that the prediction accuracy of the first detection model is low and further training is needed; the first detection model then needs to be updated to improve the prediction accuracy of the updated first detection model.
In addition, to improve the accuracy of the evaluation of the detection effect of the first detection model, the embodiments of the present application further provide a specific implementation of step S53, as shown in FIG. 6. In this implementation, when the first preset condition includes that the rate of change of the detection loss of the first detection model is lower than the first preset loss threshold, step S53 may specifically include steps S53A1-S53A7:
S53A1: Determine the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected.
The detection loss of the target object records the loss incurred by the first detection model in the process of acquiring the position information of the target object on the image to be detected; the detection loss of the target object includes the recognition loss of the target object (L-softmax) and the position detection loss of the target object (L-regression).
The recognition loss of the target object records the loss incurred by the first detection model when recognizing the target object in the image; that is, it records whether the first detection model can recognize the target object from the image. The recognition loss of the target object may include the loss caused by failing to recognize the target object in the image, the loss caused by recognizing another object in the image as the target object, and the loss caused by recognizing the target object in the image to be detected as another object.
The position detection loss of the target object records the loss incurred by the first detection model in the process of determining the position information of the target object on the image to be detected; it may include the loss caused by inaccurate predicted position information of the target object on the image to be detected. It should be noted that the position detection loss of the target object can usually be computed with an l2 loss function (l2-loss), that is, by computing the distance between the predicted position coordinates and the actual position coordinates.
Based on the above, after the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected are acquired, the actual position information can be compared with the predicted position information to determine the recognition loss of the target object for the first detection model; at the same time, the distance between the actual position information and the predicted position information can be computed to determine the position detection loss of the target object for the first detection model.
It should be noted that the embodiments of the present application do not limit the method for computing the recognition loss of the target object; any existing or future computation method capable of measuring the recognition effect for the target object in an image can be used. Likewise, the embodiments of the present application do not limit the method for computing the position detection loss of the target object; any existing or future computation method capable of measuring the position detection effect for the target object can be used.
S53A2: Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object.
The key point position detection loss of the target object (loss-dpoint) records the loss incurred by the first detection model in the process of acquiring the key point position information of the target object; it includes the loss caused by inaccurate predicted key point position information of the target object. It should be noted that the key point position detection loss of the target object can usually be computed with an l2 loss function (l2-loss), that is, by computing the distance between the predicted position coordinates and the actual position coordinates.
It should be noted that the embodiments of the present application do not limit the method for computing the key point position detection loss of the target object; any existing or future computation method capable of measuring the position detection effect for the key points of the target object can be used.
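Both losses above are described as l2-style distances between predicted and actual coordinates. A minimal sketch, assuming a squared-distance form (the patent does not fix the exact formulation, so the function and values below are illustrative):

```python
# l2-style loss: squared distance between predicted and actual coordinates.
def l2_loss(predicted, actual):
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))

# Position detection loss (L-regression) for one region (x, y, w, h):
loss_regression = l2_loss((102.0, 79.0, 43.0, 42.0),
                          (100.0, 80.0, 40.0, 40.0))

# Key point position detection loss (loss-dpoint) summed over (x, y) pairs:
predicted_pts = [(10.0, 12.0), (20.0, 22.0)]
actual_pts = [(11.0, 12.0), (20.0, 21.0)]
loss_dpoint = sum(l2_loss(p, a) for p, a in zip(predicted_pts, actual_pts))
```

With these illustrative values, `loss_regression` is 18.0 and `loss_dpoint` is 2.0; both decrease toward 0 as the predictions approach the ground truth.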
S53A3: Determine the detection loss of the first detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object.
第一检测模型的检测损失(loss-detect)用于记录第一检测模型在获取目标物体在待检测图像上的位置信息以及目标物体的关键点位置信息的过程中所造成的损失。The detection loss of the first detection model (loss-detect) is used to record the loss caused by the first detection model in the process of obtaining the position information of the target object on the image to be detected and the key point position information of the target object.
在本申请实施例中,为了提高对模型检测效果的评价准确性,可以利用预先设定的损失函数来确定第一检测模型的检测损失,且该损失函数为:In this embodiment of the application, in order to improve the accuracy of the evaluation of the model detection effect, a preset loss function can be used to determine the detection loss of the first detection model, and the loss function is:
loss-detect=α×L-softmax+β×L-regression+F×σ×loss-dpoint,loss-detect=α×L-softmax+β×L-regression+F×σ×loss-dpoint,
式中,loss-detect表示第一检测模型的检测损失;α表示目标物体的识别损失的影响权重;L-softmax表示目标物体的识别损失;β表示目标物体的位置检测损失的影响权重;L-regression表示目标物体的位置检测损失;F表示目标物体的关键点位置检测损失的存在标识,若训练图像中包括目标物体(也就是,正样本训练图像),则该训练图像对应的F值为1,若训练图像中不包括目标物体(也就是,负样本训练图像),则该训练图像对应的F值为0;σ表示目标物体的关键点位置检测损失的影响权重;loss-dpoint表示目标物体的关键点位置检测损失。In the formula, loss-detect represents the detection loss of the first detection model; α represents the influence weight of the recognition loss of the target object; L-softmax represents the recognition loss of the target object; β represents the influence weight of the position detection loss of the target object; L-regression represents the position detection loss of the target object; F represents the existence flag of the key point position detection loss of the target object: if the training image includes the target object (that is, a positive-sample training image), the F value corresponding to the training image is 1, and if the training image does not include the target object (that is, a negative-sample training image), the F value corresponding to the training image is 0; σ represents the influence weight of the key point position detection loss of the target object; and loss-dpoint represents the key point position detection loss of the target object.
需要说明的是,F的值是根据训练图像中是否包括目标物体确定的,具体为:若一个训练图像中包括目标物体,则需要考虑目标物体的关键点相关信息,此时在衡量第一检测模型对该训练图像的检测损失时需要考虑目标物体的识别损失、目标物体的位置检测损失和目标物体的关键点位置检测损失;若一个训练图像中不包括目标物体,则无需考虑目标物体的关键点相关信息,此时在衡量第一检测模型对该训练图像的检测损失时只需要考虑目标物体的识别损失和目标物体的位置检测损失即可。It should be noted that the value of F is determined according to whether the training image includes the target object. Specifically, if a training image includes the target object, the key point related information of the target object needs to be considered; in this case, when measuring the detection loss of the first detection model on this training image, the recognition loss of the target object, the position detection loss of the target object and the key point position detection loss of the target object all need to be considered. If a training image does not include the target object, the key point related information of the target object does not need to be considered; in this case, when measuring the detection loss of the first detection model on this training image, only the recognition loss of the target object and the position detection loss of the target object need to be considered.
在本申请实施例中,由于上述损失函数是通过利用目标物体的识别损失、目标物体的位置检测损失和目标物体的关键点位置检测损失这三个损失指标来共同监督第一检测模型的检测效果的,因而在利用上述损失函数评价第一检测模型的对物体及其关键点的检测效果时,能够有效地确定出当前第一检测模型是否能够准确地检测出图像中的物体及其关键点,从而提高了训练好的第一检测模型的检测准确性。In the embodiments of the present application, since the above loss function jointly supervises the detection effect of the first detection model through three loss indicators, namely the recognition loss of the target object, the position detection loss of the target object and the key point position detection loss of the target object, when the above loss function is used to evaluate the detection effect of the first detection model on objects and their key points, it can effectively determine whether the current first detection model can accurately detect the objects and their key points in an image, thereby improving the detection accuracy of the trained first detection model.
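The loss function above can be sketched in Python as follows. The default weight values are illustrative assumptions only — the specification leaves α, β and σ as unspecified influence weights.

```python
def detection_loss(l_softmax, l_regression, loss_dpoint,
                   has_target, alpha=1.0, beta=1.0, sigma=1.0):
    """loss-detect = alpha*L-softmax + beta*L-regression + F*sigma*loss-dpoint,
    where F is 1 for a positive-sample training image (the image contains
    the target object) and 0 for a negative-sample training image."""
    f = 1 if has_target else 0
    return alpha * l_softmax + beta * l_regression + f * sigma * loss_dpoint

# Positive sample: all three loss terms contribute.
pos = detection_loss(0.5, 0.3, 0.2, has_target=True)
# Negative sample: the key point term is switched off by F = 0.
neg = detection_loss(0.5, 0.3, 0.2, has_target=False)
```

The existence flag F thus implements exactly the case split described above: for negative-sample training images, only the recognition loss and the position detection loss remain in the total.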
S53A4:根据所述第一检测模型的检测损失以及第一检测模型的历史检测损失,确定第一检测模型的检测损失变化率。S53A4: Determine the detection loss change rate of the first detection model according to the detection loss of the first detection model and the historical detection loss of the first detection model.
第一检测模型的历史检测损失是在历史训练过程中确定的第一检测模型的检测损失。例如,假设对第一检测模型依次进行了第1轮训练至第Y轮训练,且第1轮训练早于第2轮训练、第2轮训练早于第3轮训练、……、第Y-1轮训练早于第Y轮训练,且第Y轮训练是当前正在执行的训练过程,此时,对于第Y轮训练来说,第1轮训练至第Y-1轮训练均是第Y轮训练的历史训练过程,那么,在第1轮训练中确定的第一检测模型的检测损失至在第Y-1轮训练中确定的第一检测模型的检测损失均是第Y轮训练的历史检测损失。The historical detection loss of the first detection model is the detection loss of the first detection model determined in a historical training process. For example, suppose the first detection model has been trained in round 1 through round Y in sequence, where round 1 is earlier than round 2, round 2 is earlier than round 3, ..., round Y-1 is earlier than round Y, and round Y is the training process currently being executed. In this case, for round Y, round 1 through round Y-1 are all historical training processes of round Y; accordingly, the detection losses of the first detection model determined in round 1 through round Y-1 are all historical detection losses for round Y.
第一检测模型的检测损失变化率用于表征第一检测模型在多轮训练过程中检测效果的变化信息;而且,第一检测模型的检测损失变化率越小表示第一检测模型的检测效果越好,且第一检测模型的检测损失变化率越大表示第一检测模型的检测效果越差。The detection loss change rate of the first detection model is used to characterize how the detection effect of the first detection model changes over multiple rounds of training; moreover, a smaller detection loss change rate of the first detection model indicates a better detection effect of the first detection model, and a larger detection loss change rate indicates a worse detection effect.
S53A5:判断所述第一检测模型的检测损失变化率是否低于第一预设损失阈值,若是,则执行步骤S53A6;若否,则执行步骤S53A7。S53A5: Determine whether the detection loss change rate of the first detection model is lower than the first preset loss threshold, if yes, execute step S53A6; if not, execute step S53A7.
第一预设损失阈值可以预先设定,尤其可以根据应用场景设定。The first preset loss threshold can be set in advance, and in particular can be set according to the application scenario.
在本申请实施例中,在根据目标物体的识别损失、目标物体的位置检测损失和目标物体的关键点位置检测损失这三个监督因素确定出第一检测模型的检测损失后,再基于第一检测模型的检测损失确定第一检测模型的检测损失变化率。此时,如果第一检测模型的检测损失变化率低于第一预设损失阈值,则表示该第一检测模型检测损失变化较小且接近于稳定的最优模型,从而表示利用该第一检测模型检测所得的图像中物体及其关键点准确性较高,从而确定该第一检测模型已达到第一预设条件,进而可以确定该第一检测模型无需再进行训练;如果第一检测模型的检测损失变化率不低于第一预设损失阈值,则表示该第一检测模型检测损失变化较大且距离稳定的最优模型较远,从而表示利用该第一检测模型检测所得的图像中物体及其关键点准确性较低,从而确定该第一检测模型未达到第一预设条件,进而可以确定该第一检测模型仍需再次进行训练。In the embodiments of the present application, after the detection loss of the first detection model is determined according to the three supervision factors, namely the recognition loss of the target object, the position detection loss of the target object and the key point position detection loss of the target object, the detection loss change rate of the first detection model is determined based on the detection loss of the first detection model. At this time, if the detection loss change rate of the first detection model is lower than the first preset loss threshold, it indicates that the detection loss of the first detection model changes little and the model is close to a stable optimal model, which means that the objects and their key points detected in images by the first detection model are highly accurate; it is therefore determined that the first detection model has reached the first preset condition, and consequently that the first detection model no longer needs to be trained. If the detection loss change rate of the first detection model is not lower than the first preset loss threshold, it indicates that the detection loss of the first detection model changes substantially and the model is far from a stable optimal model, which means that the objects and their key points detected in images by the first detection model have low accuracy; it is therefore determined that the first detection model has not reached the first preset condition, and consequently that the first detection model still needs to be trained again.
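A minimal sketch of steps S53A4-S53A5, under the assumption that the change rate is measured as the relative difference between the two most recent detection losses; the specification does not fix the exact formula, so this choice (and the function names) is illustrative.

```python
def loss_change_rate(history):
    """Detection loss change rate, computed here as the relative
    difference between the two most recent detection losses.
    `history` holds the per-round detection losses, oldest first."""
    if len(history) < 2:
        return float("inf")  # not enough rounds to judge stability
    prev, curr = history[-2], history[-1]
    return abs(curr - prev) / max(prev, 1e-12)

def reaches_first_condition(history, threshold):
    """S53A5: the model meets the first preset condition when the
    change rate drops below the first preset loss threshold."""
    return loss_change_rate(history) < threshold

# Loss has plateaued -> change rate is tiny -> stop training.
stable = reaches_first_condition([0.90, 0.52, 0.515, 0.514], 0.01)
# Loss is still dropping fast -> keep training.
unstable = reaches_first_condition([0.90, 0.52, 0.40], 0.01)
```

Using the relative rather than absolute difference makes the threshold comparable across models whose losses live on different scales, which is one common way to operationalize "close to a stable optimal model".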
S53A6:确定所述第一检测模型达到第一预设条件。S53A6: Determine that the first detection model meets the first preset condition.
S53A7:确定所述第一检测模型未达到第一预设条件。S53A7: Determine that the first detection model does not meet the first preset condition.
需要说明的是,步骤S53A1和步骤S53A2之间没有固定的执行顺序,可以依次执行步骤S53A1和步骤S53A2,也可以依次执行步骤S53A2和步骤S53A1,还可以同时执行步骤S53A1和步骤S53A2。It should be noted that there is no fixed execution order between step S53A1 and step S53A2. Step S53A1 and step S53A2 can be executed in sequence, step S53A2 and step S53A1 can be executed in sequence, or step S53A1 and step S53A2 can be executed simultaneously.
以上为步骤S53的一种实施方式,该实施方式将第一检测模型的检测损失变化率低于第一预设损失阈值作为了第一检测模型的训练截止条件。The foregoing is one implementation of step S53. This implementation takes the detection loss change rate of the first detection model being lower than the first preset loss threshold as the training cut-off condition of the first detection model.
另外,为了提高对第一检测模型的检测效果的评价准确性,本申请实施例还提供了步骤S53的另一种具体实施方式,如图7所示,在该实施方式中,当第一预设条件包括第一检测模型的检测损失的变化率低于第一预设损失阈值时,步骤S53具体可以包括步骤S53B1-S53B7:In addition, in order to improve the evaluation accuracy of the detection effect of the first detection model, the embodiments of the present application further provide another specific implementation of step S53, as shown in FIG. 7. In this implementation, when the first preset condition includes that the change rate of the detection loss of the first detection model is lower than the first preset loss threshold, step S53 may specifically include steps S53B1-S53B7:
S53B1:根据所述目标物体在训练图像上的实际位置信息以及目标物体在待检测图像上的预测位置信息,确定目标物体的检测损失。S53B1: Determine the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected.
其中,所述目标物体的检测损失包括目标物体的识别损失和目标物体的位置检测损失。Wherein, the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object.
需要说明的是,步骤S53B1的内容与上述步骤S53A1的内容相同,为了简要起见,在此不再赘述,步骤S53B1的技术详情请参照上述步骤S53A1的内容。It should be noted that the content of step S53B1 is the same as the content of step S53A1 described above. For the sake of brevity, it will not be repeated here. For technical details of step S53B1, please refer to the content of step S53A1 described above.
S53B2:根据所述目标物体的实际关键点位置信息以及所述目标物体的预测关键点位置信息,确定目标物体的关键点位置检测损失。S53B2: Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object.
需要说明的是,步骤S53B2的内容与上述步骤S53A2的内容相同,为了简要起见,在此不再赘述,步骤S53B2的技术详情请参照上述步骤S53A2的内容。It should be noted that the content of step S53B2 is the same as the content of step S53A2 described above. For the sake of brevity, it will not be repeated here. For technical details of step S53B2, please refer to the content of step S53A2 described above.
S53B3:根据所述目标物体的识别损失、所述目标物体的位置检测损失和所述目标物体的关键点位置检测损失,确定第一检测模型的检测损失。S53B3: Determine the detection loss of the first detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object.
需要说明的是,步骤S53B3的内容与上述步骤S53A3的内容相同,为了简要起见,在此不再赘述,步骤S53B3的技术详情请参照上述步骤S53A3的内容。It should be noted that the content of step S53B3 is the same as the content of step S53A3 described above. For the sake of brevity, it will not be repeated here. For technical details of step S53B3, please refer to the content of step S53A3 described above.
S53B4:根据所述第一检测模型的检测损失以及第一检测模型的历史检测损失,确定第一检测模型的检测损失变化率。S53B4: Determine the detection loss change rate of the first detection model according to the detection loss of the first detection model and the historical detection loss of the first detection model.
第一检测模型的历史检测损失是在历史训练过程中确定的第一检测模型的检测损失。The historical detection loss of the first detection model is the detection loss of the first detection model determined during the historical training process.
需要说明的是,步骤S53B4的内容与上述步骤S53A4的内容相同,为了简要起见,在此不再赘述,步骤S53B4的技术详情请参照上述步骤S53A4的内容。It should be noted that the content of step S53B4 is the same as the content of step S53A4. For the sake of brevity, it will not be repeated here. For technical details of step S53B4, please refer to the content of step S53A4.
S53B5:判断是否满足以下两个条件中的至少一个条件,且该两个条件为:第一检测模型的检测损失变化率低于第一预设损失阈值和第一检测模型的训练轮数达到第一预设轮数阈值;若是,则执行步骤S53B6;若否,则执行步骤S53B7。S53B5: Determine whether at least one of the following two conditions is met, the two conditions being: the detection loss change rate of the first detection model is lower than the first preset loss threshold, and the number of training rounds of the first detection model reaches the first preset round-number threshold; if yes, execute step S53B6; if not, execute step S53B7.
在本申请实施例中,在获取到第一检测模型的检测损失变化率以及第一检测模型的训练轮数之后,需要分别执行动作“判断第一检测模型的检测损失变化率是否低于第一预设损失阈值”以及动作“判断第一检测模型的训练轮数是否达到第一预设轮数阈值”。其中,只要确定了第一检测模型的检测损失变化率低于第一预设损失阈值,无论有没有执行动作“判断第一检测模型的训练轮数是否达到第一预设轮数阈值”,均直接执行步骤S53B6;而且,只要确定了第一检测模型的训练轮数达到第一预设轮数阈值,无论有没有执行动作“判断第一检测模型的检测损失变化率是否低于第一预设损失阈值”,均直接执行步骤S53B6;而且,只有确定了第一检测模型的检测损失变化率不低于第一预设损失阈值,且第一检测模型的训练轮数未达到第一预设轮数阈值时,才执行步骤S53B7。In the embodiments of the present application, after the detection loss change rate of the first detection model and the number of training rounds of the first detection model are obtained, the action of "determining whether the detection loss change rate of the first detection model is lower than the first preset loss threshold" and the action of "determining whether the number of training rounds of the first detection model reaches the first preset round-number threshold" need to be performed respectively. As long as it is determined that the detection loss change rate of the first detection model is lower than the first preset loss threshold, step S53B6 is executed directly, regardless of whether the action of "determining whether the number of training rounds of the first detection model reaches the first preset round-number threshold" has been performed; likewise, as long as it is determined that the number of training rounds of the first detection model reaches the first preset round-number threshold, step S53B6 is executed directly, regardless of whether the action of "determining whether the detection loss change rate of the first detection model is lower than the first preset loss threshold" has been performed. Only when it is determined both that the detection loss change rate of the first detection model is not lower than the first preset loss threshold and that the number of training rounds of the first detection model has not reached the first preset round-number threshold is step S53B7 executed.
需要说明的是,本申请实施例不限定动作“判断第一检测模型的检测损失变化率是否低于第一预设损失阈值”和动作“判断第一检测模型的训练轮数是否达到第一预设轮数阈值”的执行顺序,可以先执行前者再执行后者,也可以先执行后者再执行前者,还可以同时执行该两个动作。It should be noted that the embodiments of the present application do not limit the execution order of the action "determining whether the detection loss change rate of the first detection model is lower than the first preset loss threshold" and the action "determining whether the number of training rounds of the first detection model reaches the first preset round-number threshold": the former may be performed first and the latter second, the latter may be performed first and the former second, or the two actions may be performed simultaneously.
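The combined cut-off check of step S53B5 — stop as soon as either condition holds — can be sketched as follows; the function and parameter names are illustrative.

```python
def reaches_first_preset_condition(change_rate, rounds_done,
                                   loss_threshold, round_threshold):
    """S53B5: the first detection model meets the first preset condition
    when the detection loss change rate is below the first preset loss
    threshold OR the number of training rounds reaches the first preset
    round-number threshold (an inclusive-or, matching the text above)."""
    return change_rate < loss_threshold or rounds_done >= round_threshold

# Stops because the loss has stabilized, well before the round cap.
stop_by_loss = reaches_first_preset_condition(0.005, 10, 0.01, 100)
# Stops because the round cap is hit, even though the loss still moves.
stop_by_rounds = reaches_first_preset_condition(0.20, 100, 0.01, 100)
# Neither condition holds -> training continues (step S53B7).
keep_training = reaches_first_preset_condition(0.20, 10, 0.01, 100)
```

Because `or` short-circuits, evaluating either condition first gives the same outcome, mirroring the statement that the two judgment actions have no fixed execution order.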
S53B6:确定所述第一检测模型达到第一预设条件。S53B6: Determine that the first detection model reaches a first preset condition.
S53B7:确定所述第一检测模型未达到第一预设条件。S53B7: Determine that the first detection model does not meet the first preset condition.
需要说明的是,步骤S53B1和步骤S53B2之间没有固定的执行顺序,可以依次执行步骤S53B1和步骤S53B2,也可以依次执行步骤S53B2和步骤S53B1,还可以同时执行步骤S53B1和步骤S53B2。It should be noted that there is no fixed execution order between step S53B1 and step S53B2. Step S53B1 and step S53B2 can be executed in sequence, step S53B2 and step S53B1 can be executed in sequence, or step S53B1 and step S53B2 can be executed simultaneously.
以上为步骤S53的另一种实施方式,该实施方式将第一检测模型的检测损失变化率低于第一预设损失阈值以及第一检测模型的训练轮数达到第一预设轮数阈值共同作为了第一检测模型的训练截止条件。The above is another implementation of step S53. This implementation jointly takes the detection loss change rate of the first detection model being lower than the first preset loss threshold and the number of training rounds of the first detection model reaching the first preset round-number threshold as the training cut-off conditions of the first detection model.
以上为步骤S53的相关内容。The above is the relevant content of step S53.
S54:更新第一检测模型,继续执行步骤S52。S54: Update the first detection model, and continue to execute step S52.
在本申请实施例中,在确定第一检测模型未达到第一预设条件时,可以根据由第一检测模型检测所得的目标物体在待检测图像上的预测位置信息与目标物体在训练图像上的实际位置信息之间的差距,以及目标物体的预测关键点位置信息与目标物体的实际关键点位置信息之间的差距,来对第一检测模型进行更新,使得更新后的第一检测模型检测所得的目标物体在待检测图像上的预测位置信息更接近于目标物体在训练图像上的实际位置信息,并使得目标物体的预测关键点位置信息更接近于目标物体的实际关键点位置信息,从而使得更新后的第一检测模型能够更准确地检测出图像中物体及其关键点。In the embodiments of the present application, when it is determined that the first detection model has not reached the first preset condition, the first detection model may be updated according to the gap between the predicted position information of the target object on the image to be detected obtained by the first detection model and the actual position information of the target object on the training image, as well as the gap between the predicted key point position information of the target object and the actual key point position information of the target object, so that the predicted position information of the target object on the image to be detected obtained by the updated first detection model is closer to the actual position information of the target object on the training image, and the predicted key point position information of the target object is closer to the actual key point position information of the target object, thereby enabling the updated first detection model to detect the objects and their key points in an image more accurately.
S55:结束对第一检测模型的训练过程。S55: End the training process for the first detection model.
上述为方法实施例三提供的第一检测模型的训练过程的具体实施方式,在该实施方式中,在获取到训练图像对应的图像特征、目标物体在训练图像上的实际位置信息以及目标物体的实际关键点位置信息之后,先根据所述训练图像对应的图像特征,利用第一检测模型对目标物体及其关键点进行检测,确定目标物体在待检测图像上的预测位置信息以及目标物体的预测关键点位置信息;再根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息,判断所述第一检测模型是否达到第一预设条件,以便根据该判断结果确定是否继续对该第一检测模型进行训练。其中,因本申请实施例是根据目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息这四个信息来评价第一检测模型的检测效果的,使得该评价结果更合理有效,从而使得满足第一预设条件的第一检测模型的检测效果更好。The foregoing is a specific implementation of the training process of the first detection model provided in Method Embodiment 3. In this implementation, after the image features corresponding to the training image, the actual position information of the target object on the training image and the actual key point position information of the target object are obtained, the first detection model is first used, according to the image features corresponding to the training image, to detect the target object and its key points and to determine the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object; then, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object, it is determined whether the first detection model has reached the first preset condition, so as to decide according to this judgment result whether to continue training the first detection model. Since the embodiments of the present application evaluate the detection effect of the first detection model based on these four pieces of information, namely the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object, the evaluation result is more reasonable and effective, so that the first detection model satisfying the first preset condition has a better detection effect.
另外,为了进一步提高对第一检测模型的检测效果的评价准确性,本申请实施例还利用目标物体的识别损失、目标物体的位置检测损失和目标物体的关键点位置检测损失这三个损失因素来共同监督第一检测模型的检测效果,使得该评价结果更合理有效,从而使得满足第一预设条件的第一检测模型的检测效果更好。In addition, in order to further improve the evaluation accuracy of the detection effect of the first detection model, the embodiments of the present application also use three loss factors, namely the recognition loss of the target object, the position detection loss of the target object and the key point position detection loss of the target object, to jointly supervise the detection effect of the first detection model, which makes the evaluation result more reasonable and effective, so that the first detection model satisfying the first preset condition has a better detection effect.
为了提高图像中物体及其关键点的位置检测准确性,基于上述方法实施例提供的图像中物体及其关键点的确定方法,本申请实施例还提供了图像中物体及其关键点的确定方法的又一种实施方式,下面结合附图进行解释和说明。In order to improve the position detection accuracy of objects and their key points in an image, based on the methods for determining an object and its key points in an image provided by the foregoing method embodiments, the embodiments of the present application further provide yet another implementation of the method for determining an object and its key points in an image, which is explained and described below with reference to the accompanying drawings.
方法实施例四Method embodiment four
方法实施例四是在方法实施例一至方法实施例三的基础上进行的改进,为了简要起见,方法实施例四中与方法实施例一至方法实施例三中内容相同的部分,在此不再赘述。Method Embodiment 4 is an improvement on the basis of Method Embodiment 1 to Method Embodiment 3. For the sake of brevity, the parts of Method Embodiment 4 that are the same as those in Method Embodiment 1 to Method Embodiment 3 will not be repeated here.
参见图8,该图为本申请方法实施例四提供的图像中物体及其关键点的确定方法流程图。Refer to FIG. 8, which is a flowchart of a method for determining an object and its key points in an image provided in the fourth embodiment of the method of this application.
本申请实施例提供的图像中物体及其关键点的确定方法,具体包括步骤S81-S82:The method for determining an object and its key points in an image provided by the embodiment of the present application specifically includes steps S81-S82:
S81:对待检测图像进行特征提取,得到所述待检测图像对应的图像特征。S81: Perform feature extraction on the image to be detected to obtain image features corresponding to the image to be detected.
需要说明的是,步骤S81的具体内容与上述方法实施例一中步骤S21的内容相同,为了简要起见,在此不再赘述。It should be noted that the specific content of step S81 is the same as the content of step S21 in the first embodiment of the method, and for the sake of brevity, the details are not repeated here.
S82:根据所述待检测图像对应的图像特征,确定目标物体在待检测图像上的位置信息和待检测图像上各个图像子区域内的关键点位置信息,并根据所述目标物体在待检测图像上的位置信息和所述待检测图像上各个图像子区域内的关键点位置信息,确定所述目标物体的关键点位置信息。S82: Determine, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected, and determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected.
其中,图像子区域是锚点对应的感兴趣区域。需要说明的是,“锚点对应的感兴趣区域”的技术详情请参照方法实施例一的步骤S22中相关内容。另外,在待检测图像中通常包括至少一个图像子区域。Among them, the image sub-region is the region of interest corresponding to the anchor point. It should be noted that, for the technical details of the "region of interest corresponding to the anchor point", please refer to the related content in step S22 of the first method embodiment. In addition, the image to be detected usually includes at least one image sub-region.
在本申请实施例中,在获取到待检测图像对应的图像特征之后,先根据该待检测图像对应的图像特征分别确定目标物体在待检测图像上的位置信息和待检测图像上各个图像子区域内的关键点位置信息,再根据目标物体在待检测图像上的位置信息和所述待检测图像上各个图像子区域内的关键点位置信息,确定所述目标物体的关键点位置信息。作为示例,假设待检测图像包括第1个图像子区域至第P个图像子区域,且目标物体位于第W个图像子区域内。基于该假设,步骤S82具体为:根据待检测图像对应的图像特征分别确定目标物体在待检测图像上的位置信息和待检测图像上第1个图像子区域内的关键点位置信息至第P个图像子区域内的关键点位置信息,此时,由于目标物体位于第W个图像子区域内,因而,根据目标物体在待检测图像上的位置信息和待检测图像上各个图像子区域内的关键点位置信息,确定目标物体的关键点位置信息,也就是,确定目标物体在第W个图像子区域内的关键点位置信息。In the embodiments of the present application, after the image features corresponding to the image to be detected are obtained, the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected are first determined according to the image features corresponding to the image to be detected; then the key point position information of the target object is determined according to the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected. As an example, suppose the image to be detected includes the 1st image sub-region through the P-th image sub-region, and the target object is located in the W-th image sub-region. Based on this assumption, step S82 is specifically as follows: the position information of the target object on the image to be detected and the key point position information in the 1st image sub-region through the P-th image sub-region of the image to be detected are determined respectively according to the image features corresponding to the image to be detected; at this time, since the target object is located in the W-th image sub-region, the key point position information of the target object is determined according to the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected, that is, the key point position information of the target object in the W-th image sub-region is determined.
需要说明的是,在本申请实施例中,在获取到目标物体在待检测图像上的位置信息和待检测图像上各个图像子区域内的关键点位置信息之后,需要将目标物体所在图像子区域内的关键点位置作为该目标物体的关键点位置信息。It should be noted that, in the embodiments of the present application, after the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected are obtained, the key point positions in the image sub-region where the target object is located need to be taken as the key point position information of the target object.
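The selection step above — take the key points of the sub-region that contains the target object — can be sketched as follows. Boxes are assumed to be (x1, y1, x2, y2) tuples, and containment is judged by the box centre for simplicity; the specification does not fix how the target is matched to a sub-region, so this matching rule is an illustrative assumption.

```python
def keypoints_for_target(target_box, subregion_boxes, subregion_keypoints):
    """Return the key points of the image sub-region that contains the
    target object (step S82). Each box is (x1, y1, x2, y2); the target is
    matched to a sub-region by checking which box contains its centre."""
    cx = (target_box[0] + target_box[2]) / 2
    cy = (target_box[1] + target_box[3]) / 2
    for box, keypoints in zip(subregion_boxes, subregion_keypoints):
        x1, y1, x2, y2 = box
        if x1 <= cx <= x2 and y1 <= cy <= y2:
            return keypoints
    return None  # target centre falls inside no sub-region

# Two sub-regions (anchors' regions of interest), each with its own
# detected key points; the target box lies in the second region.
regions = [(0, 0, 50, 50), (50, 0, 100, 50)]
kps = [[(10, 10)], [(70, 20)]]
result = keypoints_for_target((55, 5, 95, 45), regions, kps)
```

In the W-th-sub-region example of the text, this function would return the key point list of the W-th region while discarding those of the other P-1 regions.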
另外,为了提高图像中物体及其关键点的确定效率,可以利用由物体位置检测网络和物体关键点检测网络融合而成的第二检测模型确定物体及其关键点。基于此,本申请实施例还提供了步骤S82的一种实施方式,在该实施方式中,步骤S82具体可以为:根据所述待检测图像对应的图像特征,利用预先构建的第二检测模型对目标物体及其关键点进行检测,确定目标物体在待检测图像上的位置信息以及目标物体的关键点位置信息。In addition, in order to improve the efficiency of determining objects and their key points in an image, a second detection model formed by fusing an object position detection network and an object key point detection network can be used to determine the object and its key points. Based on this, the embodiments of the present application further provide an implementation of step S82. In this implementation, step S82 may specifically be: according to the image features corresponding to the image to be detected, using a pre-built second detection model to detect the target object and its key points, and determining the position information of the target object on the image to be detected and the key point position information of the target object.
第二检测模型用于根据待检测图像对应的图像特征对目标物体在待检测图像上的位置信息以及目标物体的关键点位置信息进行检测。The second detection model is used to detect the position information of the target object on the image to be detected and the key point position information of the target object according to the image features corresponding to the image to be detected.
第二检测模型包括第二物体位置检测网络层、第二关键点位置检测网络层和物体关键点位置确定网络层,且所述第二物体位置检测网络层的输出结果和所述第二关键点位置检测网络层的输出结果是所述物体关键点位置确定网络层的输入数据,且所述第二物体位置检测网络层用于根据所述待检测图像对应的图像特征确定目标物体在待检测图像上的位置信息;且所述第二关键点位置检测网络层用于根据所述待检测图像对应的图像特征确定待检测图像上各个图像子区域内的关键点位置信息;且所述物体关键点位置确定网络层用于根据所述目标物体在待检测图像上的位置信息和所述待检测图像上各个图像子区域内的关键点位置信息,确定所述目标物体的关键点位置信息。也就是,在第二检测模型中,第二检测模型的输入数据“待检测图像对应的图像特征”分别是第二物体位置检测网络层的输入数据和第二关键点位置检测网络层的输入数据,而且第二物体位置检测网络层的输出数据和第二关键点位置检测网络层的输出数据是物体关键点位置确定网络层的输入数据。例如,当目标物体为人脸时,如图9所示,第二检测模型的输入数据“待检测图像对应的图像特征”就是人脸检测网络层的输入数据和人脸关键点检测网络层的输入数据,而且人脸检测网络层的输出数据和人脸关键点检测网络层的输出数据是人脸关键点确定网络层的输入数据。需要说明的是:在图9中,“人脸检测网络层”用于表示应用于人脸位置及其关键点检测时的“第二物体位置检测网络层”;“人脸关键点检测网络层”用于表示应用于人脸位置及其关键点检测时的“第二关键点位置检测网络层”;“人脸关键点确定网络层”用于表示应用于人脸位置及其关键点检测时的“物体关键点位置确定网络层”。The second detection model includes a second object position detection network layer, a second key point position detection network layer and an object key point position determination network layer, where the output results of the second object position detection network layer and of the second key point position detection network layer are the input data of the object key point position determination network layer; the second object position detection network layer is used to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; the second key point position detection network layer is used to determine the key point position information in each image sub-region of the image to be detected according to the image features corresponding to the image to be detected; and the object key point position determination network layer is used to determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected. That is, in the second detection model, the input data of the second detection model, namely "the image features corresponding to the image to be detected", serve as the input data of the second object position detection network layer and of the second key point position detection network layer respectively, while the output data of the second object position detection network layer and of the second key point position detection network layer serve as the input data of the object key point position determination network layer. For example, when the target object is a human face, as shown in FIG. 9, the input data of the second detection model, "the image features corresponding to the image to be detected", serve as the input data of the face detection network layer and of the face key point detection network layer, while the output data of the face detection network layer and of the face key point detection network layer serve as the input data of the face key point determination network layer. It should be noted that in FIG. 9, "face detection network layer" denotes the "second object position detection network layer" as applied to detecting the face position and its key points; "face key point detection network layer" denotes the "second key point position detection network layer" as applied to detecting the face position and its key points; and "face key point determination network layer" denotes the "object key point position determination network layer" as applied to detecting the face position and its key points.
It should be noted that, in the embodiments of the present application, the second detection model is a single deep neural network composed of the second object position detection network layer and the second key point position detection network layer. As a result, when the second detection model is used to detect an object and its key points, only one deep neural network needs to be run, which saves running memory and running time.
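As a concrete illustration of the data flow described above, the following is a minimal sketch. All three functions are hypothetical stand-ins (the patent does not disclose the network internals), and the (x, y, w, h) box layout, the feature slicing, and the inside-the-box rule for assigning sub-region key points to the object are illustrative assumptions.

```python
import numpy as np

def object_position_branch(features):
    # Stand-in for the second object position detection network layer:
    # predict one bounding box (x, y, w, h) from the shared features.
    return features[:4]

def keypoint_position_branch(features, num_subregions=4):
    # Stand-in for the second key point position detection network layer:
    # one key point (px, py) per image sub-region.
    return features[4:4 + num_subregions * 2].reshape(num_subregions, 2)

def determine_object_keypoints(box, subregion_keypoints):
    # Stand-in for the object key point position determination network layer:
    # keep the sub-region key points that fall inside the predicted box.
    x, y, w, h = box
    return [(px, py) for px, py in subregion_keypoints
            if x <= px <= x + w and y <= py <= y + h]

features = np.array([10.0, 10.0, 20.0, 20.0,   # slice read by the box branch
                     15.0, 15.0, 50.0, 50.0,   # slice read by the key point
                     12.0, 28.0, 70.0, 5.0])   # branch (4 sub-region points)
box = object_position_branch(features)         # both branches consume the
kps = keypoint_position_branch(features)       # same shared image features
object_kps = determine_object_keypoints(box, kps)
```

Both branches read the same shared features, and only their outputs meet in the final determination step, which mirrors the single-feature-extraction design described above.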
It should also be noted that the embodiments of the present application further provide a training process for the second detection model; this training process is described in detail in method embodiment five, and for technical details, refer to method embodiment five.
The above is the specific implementation of the method for determining an object and its key points in an image provided by method embodiment four. In this implementation, after the image features corresponding to the image to be detected are extracted from that image, the position information of the target object on the image to be detected and the key point position information within each image sub-region of the image to be detected are first determined from those image features, and the key point position information of the target object is then determined from the position information of the target object on the image to be detected and the key point position information within each image sub-region.
Because the position information of the target object on the image to be detected accurately characterizes the position of the target object in that image, and the key point position information of the target object accurately characterizes the key point positions of the target object in that image, the method for determining an object and its key points in an image provided by the embodiments of the present application can effectively determine, from the image to be detected, the position information of the target object and the position information of its key points, and can therefore accurately detect the object and its key points in the image. In addition, after the image features corresponding to the image to be detected are obtained, the position information of the target object on the image to be detected and the key point position information within each image sub-region are determined directly from those features, and the key point position information of the target object is then determined from those two results. There is no need to restore the position information of the target object on the image to be detected into an object detection image and then extract features from that object detection image, so only one image feature extraction is required in the process of determining the position information of the target object on the image to be detected and the key point position information of the target object. This simplifies the process of determining an object and its key points in an image, improves the efficiency of that determination, and reduces the memory consumption involved.
In addition, to further reduce the memory and time consumption of determining an object and its key points in an image, the second detection model can be used to detect the target object and its key points, determining the position information of the target object on the image to be detected and the key point position information of the target object. Because the second detection model is a single deep neural network that includes the second object position detection network layer and the second key point position detection network layer, training it requires training only one deep neural network, which reduces the memory and time consumed by training; likewise, using it to detect an object and its key points requires running only one deep neural network, which reduces the memory and time consumed by detection.
Based on the methods for determining an object and its key points in an image provided by method embodiments one to four, the second detection model can be trained to further improve the accuracy of the determined objects and their key points and thereby improve the detection effect of the second detection model. Accordingly, the embodiments of the present application also provide a training process for the second detection model, which is explained and described below with reference to the accompanying drawings.
Method embodiment five
Refer to FIG. 10, which is a flowchart of the training process of the second detection model provided by an embodiment of the present application.
The training process of the second detection model provided by the embodiments of the present application includes steps S101 to S105:
S101: Obtain the image features corresponding to a training image, the actual position information of the target object on the training image, and the actual key point position information of the target object.
It should be noted that the content of step S101 is the same as that of step S51 in method embodiment three above; for technical details, refer to step S51 of method embodiment three.
S102: Based on the image features corresponding to the training image, detect the target object and its key points with the second detection model, and determine the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object.
The predicted position information of the target object on the image to be detected records the position-related information of the target object on the image to be detected as obtained through detection by the second detection model.
The predicted key point position information of the target object records the predicted key-point-position-related information of the target object in the image to be detected as obtained through detection by the second detection model.
In the embodiments of the present application, after the image features corresponding to the training image are obtained, those image features are input into the second detection model, so that the second detection model detects the target object and its key points from them, determines the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object, and outputs both.
S103: Based on the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, judge whether the second detection model meets a second preset condition; if so, execute step S105; if not, execute step S104.
The second preset condition records the training cut-off condition of the training process of the second detection model; it may be preset or set according to the application scenario. For example, the second preset condition may specifically include: the detection loss change rate of the second detection model is lower than a second preset loss threshold, and/or the number of training rounds of the second detection model reaches a second preset round-number threshold.
In the embodiments of the present application, the predicted position information of the object and its key points obtained through detection by the second detection model can be compared with the actual position information of the object and its key points, and the prediction accuracy of the second detection model can be determined from the comparison. Specifically, based on the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, it is judged whether the second detection model meets the second preset condition. If the second detection model meets the second preset condition, the predicted position information of the object and its key points is very close to the actual position information of the object and its key points, indicating that the prediction accuracy of the second detection model is high and no further training is needed, so the training process can end at this point. If the second detection model does not meet the second preset condition, the gap between the predicted and actual position information of the object and its key points is large, indicating that the prediction accuracy of the second detection model is low and further training is needed; in that case the second detection model is updated so that the updated second detection model has higher prediction accuracy.
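The control flow of steps S101 to S105 can be sketched as a training loop. Everything below is a toy stand-in: an elementwise linear "model" with a squared-error loss replaces the patent's network and its actual detection loss, purely to show the predict / evaluate / stop-or-update branching.

```python
def predict(weights, features):
    # S102 stand-in: a toy "second detection model" (elementwise linear).
    return [w * f for w, f in zip(weights, features)]

def toy_loss(actual, predicted):
    # Stand-in for the detection loss: squared error, actual vs. predicted.
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

def train(weights, features, actual, max_rounds=50, lr=0.1, threshold=1e-3):
    prev_loss = None
    for _ in range(max_rounds):
        predicted = predict(weights, features)             # S102
        loss = toy_loss(actual, predicted)                 # S103: evaluate
        if prev_loss is not None:
            change_rate = abs(prev_loss - loss) / max(prev_loss, 1e-12)
            if change_rate < threshold:
                return weights                             # S105: stop
        # S104: update the model (a gradient step on each toy weight)
        weights = [w - lr * 2 * (p - a) * f
                   for w, f, a, p in zip(weights, features, actual, predicted)]
        prev_loss = loss
    return weights

trained = train([0.0, 0.0], features=[1.0, 1.0], actual=[2.0, 3.0])
```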
In addition, to improve the accuracy of evaluating the detection effect of the second detection model, the embodiments of the present application also provide a specific implementation of step S103, as shown in FIG. 11. In this implementation, when the second preset condition includes that the detection loss change rate of the second detection model is lower than the second preset loss threshold, step S103 may specifically include steps S103A1 to S103A7:
S103A1: Determine the detection loss of the target object based on the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected.
The detection loss of the target object records the loss incurred by the second detection model in the process of obtaining the position information of the target object on the image to be detected; it includes the recognition loss of the target object (L-softmax) and the position detection loss of the target object (L-regression).
The recognition loss of the target object records the loss incurred by the second detection model when recognizing the target object in an image; that is, it reflects whether the second detection model can recognize the target object from the image. It may include losses caused by failing to recognize the target object in the image, by recognizing another object in the image as the target object, or by recognizing the target object in the image to be detected as another object.
The position detection loss of the target object records the loss incurred by the second detection model in the process of determining the position information of the target object on the image to be detected; it may include losses caused by inaccurate predicted position information of the target object on the image to be detected. It should be noted that the position detection loss of the target object can usually be computed with an l2 loss function (l2-loss), that is, by computing the distance between the predicted position coordinates and the actual position coordinates.
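As a small illustration of the l2-loss computation just described, with an assumed (x, y, w, h) box layout:

```python
def position_detection_loss(actual_box, predicted_box):
    # Squared-error (l2-style) distance between the actual and predicted
    # box coordinates; the (x, y, w, h) layout is an illustrative choice.
    return sum((p - a) ** 2 for a, p in zip(actual_box, predicted_box))

loss = position_detection_loss([10, 10, 20, 20], [12, 9, 20, 22])
```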
Based on the above, after the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected are obtained, the actual and predicted position information can be compared to determine the recognition loss of the target object for the second detection model; meanwhile, the distance between the actual and predicted position information can be computed to determine the position detection loss of the target object for the second detection model.
It should be noted that the embodiments of the present application do not limit the calculation method of the recognition loss of the target object: any existing or future calculation method capable of measuring the effect of recognizing the target object in an image may be used. Likewise, the embodiments of the present application do not limit the calculation method of the position detection loss of the target object: any existing or future calculation method capable of measuring the position detection effect for the target object may be used.
S103A2: Determine the key point position detection loss of the target object based on the actual key point position information of the target object and the predicted key point position information of the target object.
The key point position detection loss of the target object (loss-dpoint) records the loss incurred by the second detection model in the process of obtaining the key point position information of the target object; it includes losses caused by inaccurate predicted key point position information of the target object. It should be noted that the key point position detection loss of the target object can also usually be computed with the l2 loss function (l2-loss), that is, by computing the distance between the predicted and actual key point coordinates.
It should be noted that the embodiments of the present application do not limit the calculation method of the key point position detection loss of the target object: any existing or future calculation method capable of measuring the position detection effect for the key points of the target object may be used.
S103A3: Determine the detection loss of the second detection model based on the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object.
The detection loss of the second detection model (loss-detect) records the loss incurred by the second detection model in the process of obtaining the position information of the target object on the image to be detected and the key point position information of the target object.
In the embodiments of the present application, to improve the accuracy of evaluating the model's detection effect, a preset loss function can be used to determine the detection loss of the second detection model, and the loss function is:
loss-detect = α × L-softmax + β × L-regression + F × σ × loss-dpoint
In this formula, loss-detect is the detection loss of the second detection model; α is the weight of the recognition loss of the target object; L-softmax is the recognition loss of the target object; β is the weight of the position detection loss of the target object; L-regression is the position detection loss of the target object; F is the presence flag of the key point position detection loss of the target object (if the training image includes the target object, that is, it is a positive-sample training image, its F value is 1; if the training image does not include the target object, that is, it is a negative-sample training image, its F value is 0); σ is the weight of the key point position detection loss of the target object; and loss-dpoint is the key point position detection loss of the target object.
It should be noted that the value of F is determined by whether the training image includes the target object. Specifically, if a training image includes the target object, the key-point-related information of the target object needs to be considered, so the detection loss of the second detection model on that training image must account for the recognition loss, the position detection loss, and the key point position detection loss of the target object; if a training image does not include the target object, the key-point-related information need not be considered, so the detection loss on that training image only needs to account for the recognition loss and the position detection loss of the target object.
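The loss function above translates directly into code. The weight values for α, β, and σ below are illustrative assumptions; the patent does not specify them.

```python
def detection_loss(l_softmax, l_regression, loss_dpoint, has_target,
                   alpha=1.0, beta=1.0, sigma=0.5):
    # F is 1 for a positive-sample training image and 0 for a negative
    # sample, so the key point term contributes only when a target exists.
    f = 1.0 if has_target else 0.0
    return alpha * l_softmax + beta * l_regression + f * sigma * loss_dpoint

positive = detection_loss(0.2, 0.4, 0.6, has_target=True)   # all three terms
negative = detection_loss(0.2, 0.4, 0.6, has_target=False)  # key point term off
```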
In the embodiments of the present application, because the above loss function jointly supervises the detection effect of the second detection model through three loss indicators, namely the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object, evaluating the model's detection of objects and their key points with this loss function makes it possible to effectively determine whether the current second detection model can accurately detect the objects and key points in an image, which improves the detection accuracy of the trained second detection model.
S103A4: Determine the detection loss change rate of the second detection model based on the detection loss of the second detection model and the historical detection losses of the second detection model.
The historical detection losses of the second detection model are the detection losses of the second detection model determined during previous training rounds. For example, suppose the second detection model has been trained in rounds 1 through Y in order, where round 1 precedes round 2, round 2 precedes round 3, and so on, with round Y-1 preceding round Y, and round Y is the training round currently being executed. Then, for round Y, rounds 1 through Y-1 are all historical training rounds, and the detection losses of the second detection model determined in rounds 1 through Y-1 are all historical detection losses for round Y.
The detection loss change rate of the second detection model characterizes how the detection effect of the second detection model changes over multiple training rounds: the smaller the change rate, the better the detection effect of the second detection model, and the larger the change rate, the worse its detection effect.
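The patent does not give an explicit formula for the detection loss change rate; one plausible reading, used in this sketch, is the relative change between the two most recent detection losses.

```python
def loss_change_rate(loss_history):
    # Relative change between the two most recent detection losses; an
    # assumed formula, since the patent leaves the exact definition open.
    if len(loss_history) < 2:
        return float("inf")  # not enough rounds to measure a change yet
    previous, current = loss_history[-2], loss_history[-1]
    return abs(current - previous) / previous

rate = loss_change_rate([1.0, 0.5, 0.45])  # uses only the last two losses
```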
S103A5: Judge whether the detection loss change rate of the second detection model is lower than the second preset loss threshold; if so, execute step S103A6; if not, execute step S103A7.
The second preset loss threshold may be preset, and in particular may be set according to the application scenario.
In the embodiments of the present application, after the detection loss of the second detection model is determined from the three supervision factors, namely the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object, the detection loss change rate of the second detection model is determined from that detection loss. At this point, if the detection loss change rate of the second detection model is lower than the second preset loss threshold, the detection loss of the second detection model varies little and the model is close to a stable optimum, meaning that the objects and key points it detects in images are highly accurate; it is therefore determined that the second detection model has reached the second preset condition and needs no further training. If the detection loss change rate of the second detection model is not lower than the second preset loss threshold, the detection loss of the second detection model varies greatly and the model is far from a stable optimum, meaning that the objects and key points it detects in images are less accurate; it is therefore determined that the second detection model has not reached the second preset condition and still needs to be trained again.
S103A6: Determine that the second detection model meets the second preset condition.
S103A7: Determine that the second detection model does not meet the second preset condition.
It should be noted that there is no fixed execution order between steps S103A1 and S103A2: step S103A1 may be executed before step S103A2, step S103A2 may be executed before step S103A1, or the two steps may be executed simultaneously.
The above is one implementation of step S103; in this implementation, the detection loss change rate of the second detection model falling below the second preset loss threshold serves as the training cut-off condition of the second detection model.
In addition, to improve the accuracy of evaluating the detection effect of the second detection model, the embodiments of the present application also provide another specific implementation of step S103, as shown in FIG. 12. In this implementation, when the second preset condition includes that the detection loss change rate of the second detection model is lower than the second preset loss threshold and/or that the number of training rounds of the second detection model reaches the second preset round-number threshold, step S103 may specifically include steps S103B1 to S103B7:
S103B1: Determine the detection loss of the target object based on the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected.
The detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object.
It should be noted that the content of step S103B1 is the same as that of step S103A1 above and is not repeated here for brevity; for technical details, refer to step S103A1.
S103B2: Determine the key point position detection loss of the target object based on the actual key point position information of the target object and the predicted key point position information of the target object.
It should be noted that the content of step S103B2 is the same as that of step S103A2 above and is not repeated here for brevity; for technical details, refer to step S103A2.
S103B3: Determine the detection loss of the second detection model based on the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object.
It should be noted that the content of step S103B3 is the same as that of step S103A3 above and is not repeated here for brevity; for technical details, refer to step S103A3.
S103B4: Determine the detection loss change rate of the second detection model based on the detection loss of the second detection model and the historical detection losses of the second detection model.
The historical detection losses of the second detection model are the detection losses of the second detection model determined during previous training rounds.
It should be noted that the content of step S103B4 is the same as that of step S103A4 above and is not repeated here for brevity; for technical details, refer to step S103A4.
S103B5:判断是否满足以下两个条件中的至少一个条件，且该两个条件为:第二检测模型的检测损失变化率低于第二预设损失阈值和第二检测模型的训练轮数达到第二预设轮数阈值;若是，则执行步骤S103B6;若否，则执行步骤S103B7。S103B5: Determine whether at least one of the following two conditions is met, the two conditions being: the detection loss change rate of the second detection model is lower than the second preset loss threshold, and the number of training rounds of the second detection model reaches the second preset round number threshold; if yes, execute step S103B6; if not, execute step S103B7.
在本申请实施例中，在获取到第二检测模型的检测损失变化率以及第二检测模型的训练轮数之后，需要分别执行动作“判断第二检测模型的检测损失变化率是否低于第二预设损失阈值”以及动作“判断第二检测模型的训练轮数是否达到第二预设轮数阈值”。其中，只要确定了第二检测模型的检测损失变化率低于第二预设损失阈值，无论有没有执行动作“判断第二检测模型的训练轮数是否达到第二预设轮数阈值”，均直接执行步骤S103B6；而且，只要确定了第二检测模型的训练轮数达到第二预设轮数阈值，无论有没有执行动作“判断第二检测模型的检测损失变化率是否低于第二预设损失阈值”，均直接执行步骤S103B6；只有确定了第二检测模型的检测损失变化率不低于第二预设损失阈值，且第二检测模型的训练轮数未达到第二预设轮数阈值时，才执行步骤S103B7。In the embodiment of the present application, after the detection loss change rate of the second detection model and the number of training rounds of the second detection model are obtained, two actions need to be executed respectively: "judging whether the detection loss change rate of the second detection model is lower than the second preset loss threshold" and "judging whether the number of training rounds of the second detection model reaches the second preset round number threshold". As long as it is determined that the detection loss change rate of the second detection model is lower than the second preset loss threshold, step S103B6 is executed directly, regardless of whether the round-number judgment has been executed; likewise, as long as it is determined that the number of training rounds of the second detection model reaches the second preset round number threshold, step S103B6 is executed directly, regardless of whether the loss-change-rate judgment has been executed; step S103B7 is executed only when it is determined that the detection loss change rate of the second detection model is not lower than the second preset loss threshold and the number of training rounds of the second detection model has not reached the second preset round number threshold.
需要说明的是，本申请实施例不限定动作“判断第二检测模型的检测损失变化率是否低于第二预设损失阈值”和动作“判断第二检测模型的训练轮数是否达到第二预设轮数阈值”的执行顺序，可以依次执行动作“判断第二检测模型的检测损失变化率是否低于第二预设损失阈值”和动作“判断第二检测模型的训练轮数是否达到第二预设轮数阈值”，也可以依次执行动作“判断第二检测模型的训练轮数是否达到第二预设轮数阈值”和动作“判断第二检测模型的检测损失变化率是否低于第二预设损失阈值”，还可以同时执行该两个动作。It should be noted that the embodiment of the present application does not limit the execution order of the action "judging whether the detection loss change rate of the second detection model is lower than the second preset loss threshold" and the action "judging whether the number of training rounds of the second detection model reaches the second preset round number threshold": the loss-change-rate judgment may be executed first and the round-number judgment second, the round-number judgment may be executed first and the loss-change-rate judgment second, or the two actions may be executed simultaneously.
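The stopping check described in steps S103B4 and S103B5 can be sketched as a small Python function. The relative-change formula for the loss change rate and the concrete threshold values are illustrative assumptions; the specification does not fix them:

```python
def should_stop_training(current_loss: float,
                         history_loss: float,
                         num_rounds: int,
                         loss_rate_threshold: float = 1e-3,
                         max_rounds: int = 100) -> bool:
    """Return True when either training cut-off condition of step S103B5 holds.

    The loss change rate is computed here as the relative change between the
    current detection loss and the historical (previous-round) detection loss;
    the exact formula is an assumption made for illustration.
    """
    loss_change_rate = abs(current_loss - history_loss) / max(history_loss, 1e-12)
    # Condition 1: the detection loss change rate is below the preset loss threshold.
    if loss_change_rate < loss_rate_threshold:
        return True
    # Condition 2: the number of training rounds reaches the preset round threshold.
    if num_rounds >= max_rounds:
        return True
    return False
```

As the text notes, the two conditions are checked independently and either one alone suffices to end training, which the early returns above mirror.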
S103B6:确定所述第二检测模型达到第二预设条件。S103B6: Determine that the second detection model meets a second preset condition.
S103B7:确定所述第二检测模型未达到第二预设条件。S103B7: Determine that the second detection model does not meet the second preset condition.
需要说明的是,步骤S103B1和步骤S103B2之间没有固定的执行顺序,可以依次执行步骤S103B1和步骤S103B2,也可以依次执行步骤S103B2和步骤S103B1,还可以同时执行步骤S103B1和步骤S103B2。It should be noted that there is no fixed execution order between step S103B1 and step S103B2. Step S103B1 and step S103B2 can be executed in sequence, step S103B2 and step S103B1 can be executed in sequence, or step S103B1 and step S103B2 can be executed simultaneously.
以上为步骤S103的另一种实施方式，该实施方式将第二检测模型的检测损失变化率低于第二预设损失阈值以及第二检测模型的训练轮数达到第二预设轮数阈值共同作为了第二检测模型的训练截止条件。The above is another implementation manner of step S103. In this implementation manner, the detection loss change rate of the second detection model being lower than the second preset loss threshold and the number of training rounds of the second detection model reaching the second preset round number threshold jointly serve as the training cut-off conditions of the second detection model.
以上为步骤S103的相关内容。The above is the relevant content of step S103.
S104:更新第二检测模型,继续执行步骤S102。S104: Update the second detection model, and continue to perform step S102.
在本申请实施例中，在确定第二检测模型未达到第二预设条件时，可以根据由第二检测模型检测所得的目标物体在待检测图像上的预测位置信息与目标物体在训练图像上的实际位置信息之间的差距，以及目标物体的预测关键点位置信息与目标物体的实际关键点位置信息之间的差距，来对第二检测模型进行更新，使得更新后的第二检测模型检测所得的目标物体在待检测图像上的预测位置信息更接近于目标物体在训练图像上的实际位置信息，并使得目标物体的预测关键点位置信息更接近于目标物体的实际关键点位置信息，从而使得更新后的第二检测模型能够更准确地检测出图像中物体及其关键点。In the embodiment of the present application, when it is determined that the second detection model does not meet the second preset condition, the second detection model may be updated according to the gap between the predicted position information of the target object on the image to be detected, obtained by the second detection model, and the actual position information of the target object on the training image, as well as the gap between the predicted key point position information of the target object and the actual key point position information of the target object, so that the predicted position information obtained by the updated second detection model is closer to the actual position information of the target object on the training image, and the predicted key point position information is closer to the actual key point position information of the target object, thereby enabling the updated second detection model to detect objects and their key points in an image more accurately.
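The update of step S104 corresponds to an ordinary gradient-based parameter adjustment driven by the gap between prediction and ground truth. A minimal pure-Python sketch on a single scalar parameter illustrates the idea; the squared-error loss, learning rate, and identity "model" are all illustrative assumptions, not the specification's actual network update:

```python
def training_round(param: float, predict, actual: float, lr: float = 0.1) -> float:
    """One S104-style update: move the parameter so the prediction
    gets closer to the ground truth, under a squared-error loss."""
    pred = predict(param)
    # Gradient of (pred - actual)^2 w.r.t. param, assuming predict(p) = p here.
    grad = 2.0 * (pred - actual)
    return param - lr * grad

# Toy model: the "prediction" is just the parameter itself.
param = 0.0
for _ in range(50):                      # repeat S102..S104 until converged
    param = training_round(param, lambda p: p, actual=3.0)
print(round(param, 3))                   # converges toward the target value 3.0
```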
S105:结束对第二检测模型的训练过程。S105: End the training process for the second detection model.
上述为方法实施例五提供的第二检测模型的训练过程的具体实施方式，在该实施方式中，在获取到训练图像对应的图像特征、目标物体在训练图像上的实际位置信息以及目标物体的实际关键点位置信息之后，先根据所述训练图像对应的图像特征，利用第二检测模型对目标物体及其关键点进行检测，确定目标物体在待检测图像上的预测位置信息以及目标物体的预测关键点位置信息；再根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第二检测模型是否达到第二预设条件，以便根据该判断结果确定是否继续对该第二检测模型进行训练。其中，因本申请实施例是根据目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息这四个信息来评价第二检测模型的检测效果的，使得该评价结果更合理有效，从而使得满足第二预设条件的第二检测模型的检测效果更好。The foregoing is a specific implementation of the training process of the second detection model provided by method embodiment five. In this implementation, after the image features corresponding to the training image, the actual position information of the target object on the training image, and the actual key point position information of the target object are obtained, the second detection model is first used to detect the target object and its key points according to the image features corresponding to the training image, determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object; then, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, it is judged whether the second detection model meets the second preset condition, so as to determine, according to the judgment result, whether to continue training the second detection model. Since the embodiment of the present application evaluates the detection effect of the second detection model based on four pieces of information, namely the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, the evaluation result is more reasonable and effective, so that the detection effect of a second detection model that satisfies the second preset condition is better.
另外，为了进一步提高对第二检测模型的检测效果的评价准确性，本申请实施例还利用目标物体的识别损失、目标物体的位置检测损失和目标物体的关键点位置检测损失这三个损失因素来共同监督第二检测模型的检测效果，使得该评价结果更合理有效，从而使得满足第二预设条件的第二检测模型的检测效果更好。In addition, in order to further improve the accuracy of evaluating the detection effect of the second detection model, the embodiment of the present application also uses three loss factors, namely the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object, to jointly supervise the detection effect of the second detection model, making the evaluation result more reasonable and effective, so that the detection effect of a second detection model that satisfies the second preset condition is better.
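The three loss factors named above are typically combined into a single scalar detection loss before it is used in the stopping check. A weighted sum is one plausible concrete choice; the weights below are illustrative assumptions, since the specification does not fix a combination rule:

```python
def detection_loss(recognition_loss: float,
                   position_loss: float,
                   keypoint_loss: float,
                   w_rec: float = 1.0,
                   w_pos: float = 1.0,
                   w_kpt: float = 1.0) -> float:
    """Combine the three supervision signals (recognition loss, position
    detection loss, key point position detection loss) into one detection
    loss via a weighted sum. The weights are illustrative assumptions."""
    return w_rec * recognition_loss + w_pos * position_loss + w_kpt * keypoint_loss
```

For example, `detection_loss(1.0, 2.0, 3.0)` with the default unit weights simply adds the three terms.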
需要说明的是，在本申请实施例中“图像中物体及其关键点的确定方法”的执行主体与“检测模型的训练过程”的执行主体可以是同一个执行主体，也可以是不同的执行主体，本申请实施例对此不做具体限定。其中，“图像中物体及其关键点的确定方法”的执行主体可以是终端、服务器、车辆以及其他能够执行“图像中物体及其关键点的确定方法”的设备。“检测模型的训练过程”的执行主体可以是终端、服务器、车辆以及其他能够执行“检测模型的训练过程”的设备。It should be noted that, in the embodiment of the present application, the execution subject of the "method for determining an object and its key points in an image" and the execution subject of the "training process of the detection model" may be the same execution subject or different execution subjects, which is not specifically limited in the embodiment of the present application. The execution subject of the "method for determining an object and its key points in an image" may be a terminal, a server, a vehicle, or another device capable of executing that method; the execution subject of the "training process of the detection model" may likewise be a terminal, a server, a vehicle, or another device capable of executing that training process.
基于上述方法实施例提供的图像中物体及其关键点的确定方法的具体实施方式，本申请实施例还提供了一种图像中物体及其关键点的确定装置，下面结合附图进行解释和说明。Based on the specific implementation of the method for determining an object and its key points in an image provided by the foregoing method embodiments, an embodiment of the present application further provides an apparatus for determining an object and its key points in an image, which is explained and described below with reference to the accompanying drawings.
装置实施例Device embodiment
装置实施例提供的图像中物体及其关键点的确定装置的技术详情,请参照上述方法实施例。For technical details of the device for determining the object and its key points in the image provided by the device embodiment, please refer to the above method embodiment.
参见图13,该图为本申请实施例提供的图像中物体及其关键点的确定装置的结构示意图。Refer to FIG. 13, which is a schematic structural diagram of an apparatus for determining an object and its key points in an image provided by an embodiment of the application.
本申请实施例提供的图像中物体及其关键点的确定装置130,包括:The device 130 for determining an object and its key points in an image provided by the embodiment of the present application includes:
提取单元131,用于对待检测图像进行特征提取,得到所述待检测图像对应的图像特征;The extraction unit 131 is configured to perform feature extraction on the image to be detected to obtain image features corresponding to the image to be detected;
确定单元132，用于根据所述待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息以及所述目标物体的关键点位置信息；其中，所述目标物体的关键点用于表征所述目标物体的结构特征。The determining unit 132 is configured to determine the position information of the target object on the image to be detected and the key point position information of the target object according to the image features corresponding to the image to be detected; wherein the key points of the target object are used to characterize the structural features of the target object.
作为一种实施方式，为了提高图像中物体及其关键点的确定准确性，所述确定单元132，具体用于：根据所述待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息，并根据所述待检测图像对应的图像特征和所述目标物体在待检测图像上的位置信息，确定目标物体的关键点位置信息。As an implementation manner, in order to improve the accuracy of determining an object and its key points in an image, the determining unit 132 is specifically configured to: determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and determine the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
作为一种实施方式,为了提高图像中物体及其关键点的确定准确性,所述确定单元132,具体用于:As an implementation manner, in order to improve the accuracy of determining objects and their key points in the image, the determining unit 132 is specifically configured to:
根据所述待检测图像对应的图像特征,利用预先构建的第一检测模型对目标物体及其关键点进行检测,确定目标物体在待检测图像上的位置信息以及目标物体的关键点位置信息;According to the image features corresponding to the image to be detected, use a pre-built first detection model to detect the target object and its key points, and determine the position information of the target object on the image to be detected and the key point position information of the target object;
其中，所述第一检测模型包括第一物体位置检测网络层和第一关键点位置检测网络层，且所述第一物体位置检测网络层的输出结果是所述第一关键点位置检测网络层的输入数据，且所述第一物体位置检测网络层用于根据所述待检测图像对应的图像特征确定目标物体在待检测图像上的位置信息，所述第一关键点位置检测网络层用于根据所述待检测图像对应的图像特征和所述目标物体在待检测图像上的位置信息，确定目标物体的关键点位置信息。Wherein, the first detection model includes a first object position detection network layer and a first key point position detection network layer; the output result of the first object position detection network layer is the input data of the first key point position detection network layer; the first object position detection network layer is configured to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; and the first key point position detection network layer is configured to determine the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
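The cascade just described (the object-position head's output feeding the key-point head together with the image features) can be sketched as two plain functions. The box and key-point rules below are placeholder assumptions standing in for the real network layers:

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]        # (x1, y1, x2, y2)
Point = Tuple[float, float]

def object_position_head(features: List[float]) -> List[Box]:
    """First object position detection network layer (placeholder):
    maps image features to object boxes on the image to be detected."""
    # Dummy rule for illustration: one fixed box regardless of the features.
    return [(0.0, 0.0, 10.0, 10.0)]

def keypoint_head(features: List[float], boxes: List[Box]) -> List[List[Point]]:
    """First key point position detection network layer (placeholder):
    consumes BOTH the image features and the boxes produced by the
    object-position head, mirroring the cascade described in the text."""
    # Dummy rule: report each box centre as the single key point.
    return [[((x1 + x2) / 2, (y1 + y2) / 2)] for (x1, y1, x2, y2) in boxes]

features = [0.1, 0.2, 0.3]
boxes = object_position_head(features)
keypoints = keypoint_head(features, boxes)
```

The point of the sketch is only the data flow: a single feature extraction feeds both heads, and the second head additionally receives the first head's output.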
作为一种实施方式,为了提高图像中物体及其关键点的确定准确性,所述第一检测模型的训练过程,具体包括:As an implementation manner, in order to improve the accuracy of determining the objects and their key points in the image, the training process of the first detection model specifically includes:
获取训练图像对应的图像特征、目标物体在训练图像上的实际位置信息以及目标物体的实际关键点位置信息;Obtain the image features corresponding to the training image, the actual position information of the target object on the training image, and the actual key point position information of the target object;
根据所述训练图像对应的图像特征,利用第一检测模型对目标物体及其关键点进行检测,确定目标物体在待检测图像上的预测位置信息以及目标物体的预测关键点位置信息;According to the image features corresponding to the training image, use the first detection model to detect the target object and its key points, and determine the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object;
根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第一检测模型是否达到第一预设条件；According to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, judge whether the first detection model meets the first preset condition;
若确定所述第一检测模型未达到第一预设条件，则更新第一检测模型，继续执行“根据所述训练图像对应的图像特征，利用第一检测模型对目标物体及其关键点进行检测，确定目标物体在待检测图像上的预测位置信息以及目标物体的预测关键点位置信息”。If it is determined that the first detection model does not meet the first preset condition, update the first detection model, and continue to execute "detecting the target object and its key points by using the first detection model according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object".
作为一种实施方式，为了提高图像中物体及其关键点的确定准确性，所述第一预设条件为：第一检测模型的检测损失的变化率低于第一预设损失阈值，和/或，第一检测模型的训练轮数达到第一预设轮数阈值。As an implementation manner, in order to improve the accuracy of determining an object and its key points in an image, the first preset condition is: the change rate of the detection loss of the first detection model is lower than the first preset loss threshold, and/or the number of training rounds of the first detection model reaches the first preset round number threshold.
作为一种实施方式，为了提高图像中物体及其关键点的确定准确性，当所述第一预设条件包括第一检测模型的检测损失的变化率低于第一预设损失阈值时，所述根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第一检测模型是否达到第一预设条件，具体包括：As an implementation manner, in order to improve the accuracy of determining an object and its key points in an image, when the first preset condition includes that the change rate of the detection loss of the first detection model is lower than the first preset loss threshold, the judging whether the first detection model meets the first preset condition according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object specifically includes:
根据所述目标物体在训练图像上的实际位置信息以及目标物体在待检测图像上的预测位置信息，确定目标物体的检测损失；其中，所述目标物体的检测损失包括目标物体的识别损失和目标物体的位置检测损失；Determine the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; wherein, the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
根据所述目标物体的实际关键点位置信息以及所述目标物体的预测关键点位置信息,确定目标物体的关键点位置检测损失;Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
根据所述目标物体的识别损失、所述目标物体的位置检测损失和所述目标物体的关键点位置检测损失,确定第一检测模型的检测损失;Determine the detection loss of the first detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
根据所述第一检测模型的检测损失以及第一检测模型的历史检测损失，确定第一检测模型的检测损失变化率；所述第一检测模型的历史检测损失是在历史训练过程中确定的第一检测模型的检测损失；Determine the detection loss change rate of the first detection model according to the detection loss of the first detection model and the historical detection loss of the first detection model; the historical detection loss of the first detection model is the detection loss of the first detection model determined during the historical training process;
若所述第一检测模型的检测损失变化率低于第一预设损失阈值,则确定所述第一检测模型达到第一预设条件;If the detection loss change rate of the first detection model is lower than the first preset loss threshold, determining that the first detection model meets the first preset condition;
若所述第一检测模型的检测损失变化率不低于第一预设损失阈值,则确定所述第一检测模型未达到第一预设条件。If the detection loss change rate of the first detection model is not lower than the first preset loss threshold, it is determined that the first detection model does not meet the first preset condition.
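The key point position detection loss used in the steps above compares the actual and predicted key point coordinates. A mean-squared-distance formulation is one plausible concrete choice; it is an assumption for illustration, since the specification does not name the distance function:

```python
from typing import List, Tuple

Point = Tuple[float, float]

def keypoint_position_loss(actual: List[Point], predicted: List[Point]) -> float:
    """Mean squared distance between matched actual/predicted key points.
    Assumes the two lists are already matched pairwise in the same order."""
    assert len(actual) == len(predicted), "key points must be matched pairwise"
    total = 0.0
    for (ax, ay), (px, py) in zip(actual, predicted):
        total += (ax - px) ** 2 + (ay - py) ** 2
    return total / len(actual)
```

A perfect prediction yields a loss of 0.0, and the loss grows with the squared pixel offset, which is what lets it supervise the key-point head alongside the recognition and position losses.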
作为一种实施方式，为了提高图像中物体及其关键点的确定准确性，所述根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第一检测模型是否达到第一预设条件，具体包括：As an implementation manner, in order to improve the accuracy of determining an object and its key points in an image, the judging whether the first detection model meets the first preset condition according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object specifically includes:
根据所述目标物体在训练图像上的实际位置信息以及目标物体在待检测图像上的预测位置信息，确定目标物体的检测损失；其中，所述目标物体的检测损失包括目标物体的识别损失和目标物体的位置检测损失；Determine the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; wherein, the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
根据所述目标物体的实际关键点位置信息以及所述目标物体的预测关键点位置信息,确定目标物体的关键点位置检测损失;Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
根据所述目标物体的识别损失、所述目标物体的位置检测损失和所述目标物体的关键点位置检测损失,确定第一检测模型的检测损失;Determine the detection loss of the first detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
根据所述第一检测模型的检测损失以及第一检测模型的历史检测损失，确定第一检测模型的检测损失变化率；所述第一检测模型的历史检测损失是在历史训练过程中确定的第一检测模型的检测损失；Determine the detection loss change rate of the first detection model according to the detection loss of the first detection model and the historical detection loss of the first detection model; the historical detection loss of the first detection model is the detection loss of the first detection model determined during the historical training process;
若所述第一检测模型的检测损失变化率低于第一预设损失阈值，或第一检测模型的训练轮数达到第一预设轮数阈值，则确定所述第一检测模型达到第一预设条件；If the detection loss change rate of the first detection model is lower than the first preset loss threshold, or the number of training rounds of the first detection model reaches the first preset round number threshold, determine that the first detection model meets the first preset condition;
若所述第一检测模型的检测损失变化率不低于第一预设损失阈值，且第一检测模型的训练轮数未达到第一预设轮数阈值，则确定所述第一检测模型未达到第一预设条件。If the detection loss change rate of the first detection model is not lower than the first preset loss threshold, and the number of training rounds of the first detection model does not reach the first preset round number threshold, determine that the first detection model does not meet the first preset condition.
作为一种实施方式,为了提高图像中物体及其关键点的确定准确性,所述确定单元132,具体用于:As an implementation manner, in order to improve the accuracy of determining objects and their key points in the image, the determining unit 132 is specifically configured to:
根据所述待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息和待检测图像上各个图像子区域内的关键点位置信息，并根据所述目标物体在待检测图像上的位置信息和所述待检测图像上各个图像子区域内的关键点位置信息，确定所述目标物体的关键点位置信息；所述图像子区域是锚点对应的感兴趣区域。According to the image features corresponding to the image to be detected, determine the position information of the target object on the image to be detected and the key point position information in each image sub-region on the image to be detected, and determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image sub-region on the image to be detected; the image sub-region is a region of interest corresponding to an anchor point.
作为一种实施方式,为了提高图像中物体及其关键点的确定准确性,所述确定单元132,具体用于:As an implementation manner, in order to improve the accuracy of determining objects and their key points in the image, the determining unit 132 is specifically configured to:
根据所述待检测图像对应的图像特征,利用预先构建的第二检测模型对目标物体及其关键点进行检测,确定目标物体在待检测图像上的位置信息以及目标物体的关键点位置信息;According to the image features corresponding to the image to be detected, use a pre-built second detection model to detect the target object and its key points, and determine the position information of the target object on the image to be detected and the key point position information of the target object;
其中，所述第二检测模型包括第二物体位置检测网络层、第二关键点位置检测网络层和物体关键点位置确定网络层，且所述第二物体位置检测网络层的输出结果和所述第二关键点位置检测网络层的输出结果是所述物体关键点位置确定网络层的输入数据，且所述第二物体位置检测网络层用于根据所述待检测图像对应的图像特征确定目标物体在待检测图像上的位置信息；且所述第二关键点位置检测网络层用于根据所述待检测图像对应的图像特征确定待检测图像上各个图像子区域内的关键点位置信息；且所述物体关键点位置确定网络层用于根据所述目标物体在待检测图像上的位置信息和所述待检测图像上各个图像子区域内的关键点位置信息，确定所述目标物体的关键点位置信息。Wherein, the second detection model includes a second object position detection network layer, a second key point position detection network layer, and an object key point position determination network layer; the output results of the second object position detection network layer and of the second key point position detection network layer are the input data of the object key point position determination network layer; the second object position detection network layer is configured to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; the second key point position detection network layer is configured to determine the key point position information in each image sub-region on the image to be detected according to the image features corresponding to the image to be detected; and the object key point position determination network layer is configured to determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image sub-region on the image to be detected.
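The assembly step performed by the object key point position determination network layer (selecting, for each detected object, the key points predicted in the anchor sub-regions that belong to it) can be sketched as a simple geometric filter. Treating "falls inside the object's box" as the selection rule is an illustrative assumption:

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]        # (x1, y1, x2, y2)
Point = Tuple[float, float]

def keypoints_for_object(box: Box, region_keypoints: List[Point]) -> List[Point]:
    """Object key point position determination layer (sketch): keep the
    key points, detected independently per anchor sub-region, that fall
    inside the object's box on the image to be detected."""
    x1, y1, x2, y2 = box
    return [(x, y) for (x, y) in region_keypoints
            if x1 <= x <= x2 and y1 <= y <= y2]

# Key points predicted in each region of interest across the image:
candidates = [(2.0, 3.0), (15.0, 4.0), (9.0, 9.0)]
result = keypoints_for_object((0.0, 0.0, 10.0, 10.0), candidates)
# keeps only the candidates lying inside the box
```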
作为一种实施方式,为了提高图像中物体及其关键点的确定准确性,所述第二检测模型的训练过程,具体包括:As an implementation manner, in order to improve the accuracy of determining the objects and their key points in the image, the training process of the second detection model specifically includes:
获取训练图像对应的图像特征、目标物体在训练图像上的实际位置信息以及目标物体的实际关键点位置信息;Obtain the image features corresponding to the training image, the actual position information of the target object on the training image, and the actual key point position information of the target object;
根据所述训练图像对应的图像特征,利用第二检测模型对目标物体及其关键点进行检测,确定目标物体在待检测图像上的预测位置信息以及目标物体的预测关键点位置信息;According to the image features corresponding to the training image, use the second detection model to detect the target object and its key points, and determine the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object;
根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第二检测模型是否达到第二预设条件；According to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, judge whether the second detection model meets the second preset condition;
若确定所述第二检测模型未达到第二预设条件，则更新第二检测模型，继续执行“根据所述训练图像对应的图像特征，利用第二检测模型对目标物体及其关键点进行检测，确定目标物体在待检测图像上的预测位置信息以及目标物体的预测关键点位置信息”。If it is determined that the second detection model does not meet the second preset condition, update the second detection model, and continue to execute "detecting the target object and its key points by using the second detection model according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object".
作为一种实施方式，为了提高图像中物体及其关键点的确定准确性，所述第二预设条件为：第二检测模型的检测损失的变化率低于第二预设损失阈值，和/或，第二检测模型的训练轮数达到第二预设轮数阈值。As an implementation manner, in order to improve the accuracy of determining an object and its key points in an image, the second preset condition is: the change rate of the detection loss of the second detection model is lower than the second preset loss threshold, and/or the number of training rounds of the second detection model reaches the second preset round number threshold.
作为一种实施方式，为了提高图像中物体及其关键点的确定准确性，当所述第二预设条件包括第二检测模型的检测损失的变化率低于第二预设损失阈值时，所述根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第二检测模型是否达到第二预设条件，具体包括：As an implementation manner, in order to improve the accuracy of determining an object and its key points in an image, when the second preset condition includes that the change rate of the detection loss of the second detection model is lower than the second preset loss threshold, the judging whether the second detection model meets the second preset condition according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object specifically includes:
根据所述目标物体在训练图像上的实际位置信息以及目标物体在待检测图像上的预测位置信息，确定目标物体的检测损失；其中，所述目标物体的检测损失包括目标物体的识别损失和目标物体的位置检测损失；Determine the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; wherein, the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
根据所述目标物体的实际关键点位置信息以及所述目标物体的预测关键点位置信息,确定目标物体的关键点位置检测损失;Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
根据所述目标物体的识别损失、所述目标物体的位置检测损失和所述目标物体的关键点位置检测损失,确定第二检测模型的检测损失;Determine the detection loss of the second detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
根据所述第二检测模型的检测损失以及第二检测模型的历史检测损失，确定第二检测模型的检测损失变化率；所述第二检测模型的历史检测损失是在历史训练过程中确定的第二检测模型的检测损失；Determine the detection loss change rate of the second detection model according to the detection loss of the second detection model and the historical detection loss of the second detection model; the historical detection loss of the second detection model is the detection loss of the second detection model determined during the historical training process;
若所述第二检测模型的检测损失变化率低于第二预设损失阈值,则确定所述第二检测模型达到第二预设条件;If the detection loss change rate of the second detection model is lower than a second preset loss threshold, determining that the second detection model meets the second preset condition;
若所述第二检测模型的检测损失变化率不低于第二预设损失阈值,则确定所述第二检测模型未达到第二预设条件。If the detection loss change rate of the second detection model is not lower than the second preset loss threshold, it is determined that the second detection model does not meet the second preset condition.
作为一种实施方式，为了提高图像中物体及其关键点的确定准确性，所述根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第二检测模型是否达到第二预设条件，具体包括：As an implementation manner, in order to improve the accuracy of determining an object and its key points in an image, the judging whether the second detection model meets the second preset condition according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object specifically includes:
根据所述目标物体在训练图像上的实际位置信息以及目标物体在待检测图像上的预测位置信息，确定目标物体的检测损失；其中，所述目标物体的检测损失包括目标物体的识别损失和目标物体的位置检测损失；Determine the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; wherein, the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
根据所述目标物体的实际关键点位置信息以及所述目标物体的预测关键点位置信息,确定目标物体的关键点位置检测损失;Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
根据所述目标物体的识别损失、所述目标物体的位置检测损失和所述目标物体的关键点位置检测损失,确定第二检测模型的检测损失;Determine the detection loss of the second detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
根据所述第二检测模型的检测损失以及第二检测模型的历史检测损失，确定第二检测模型的检测损失变化率；所述第二检测模型的历史检测损失是在历史训练过程中确定的第二检测模型的检测损失；Determining the detection loss change rate of the second detection model according to the detection loss of the second detection model and the historical detection loss of the second detection model; the historical detection loss of the second detection model is the detection loss of the second detection model determined in a historical training process;
若所述第二检测模型的检测损失变化率低于第二预设损失阈值，或第二检测模型的训练轮数达到第二预设轮数阈值，则确定所述第二检测模型达到第二预设条件；If the detection loss change rate of the second detection model is lower than the second preset loss threshold, or the number of training rounds of the second detection model reaches the second preset round-number threshold, it is determined that the second detection model meets the second preset condition;
若所述第二检测模型的检测损失变化率不低于第二预设损失阈值，且第二检测模型的训练轮数未达到第二预设轮数阈值，则确定所述第二检测模型未达到第二预设条件。If the detection loss change rate of the second detection model is not lower than the second preset loss threshold, and the number of training rounds of the second detection model has not reached the second preset round-number threshold, it is determined that the second detection model does not meet the second preset condition.
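The preset-condition check above (loss change rate below a threshold, or the round count reaching its threshold) can be sketched as follows. This is a hypothetical minimal example: the function and parameter names, the use of a relative change rate, and the epsilon guard are assumptions for illustration, not the specific implementation of this application.

```python
def loss_change_rate(current_loss, history_loss):
    # Relative change between the current detection loss and the historical
    # detection loss from an earlier training round (an assumed definition
    # of "change rate"); the epsilon guards against division by zero.
    return abs(current_loss - history_loss) / max(abs(history_loss), 1e-12)

def meets_preset_condition(current_loss, history_loss, loss_threshold,
                           rounds=None, rounds_threshold=None):
    # Condition 1: the detection loss change rate is below the preset
    # loss threshold (training has converged).
    if loss_change_rate(current_loss, history_loss) < loss_threshold:
        return True
    # Condition 2 (optional): the number of training rounds has reached
    # the preset round-number threshold.
    if rounds is not None and rounds_threshold is not None:
        return rounds >= rounds_threshold
    return False
```

The same check covers both variants in the text: passing only the loss arguments implements the loss-only condition, while also passing the round count implements the "or the training rounds reach the threshold" condition.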
以上为装置实施例提供的图像中物体及其关键点的确定装置130的具体实施方式，在该实施方式中，在从待检测图像中提取到该待检测图像对应的图像特征之后，直接根据该待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息以及该目标物体的关键点位置信息。其中，由于目标物体在待检测图像上的位置信息能够准确地表征待检测图像中目标物体的位置，且目标物体的关键点位置信息能够准确地表征待检测图像中目标物体的关键点位置，因而，本申请实施例提供的图像中物体及其关键点的确定方法能够有效地从待检测图像中确定出目标物体的位置信息及其关键点的位置信息，从而能够准确地检测出图像中物体及其关键点。另外，由于在获取到待检测图像对应的图像特征之后，直接基于该待检测图像对应的图像特征确定目标物体在待检测图像上的位置信息以及所述目标物体的关键点位置信息即可，无需进行将目标物体在待检测图像上的位置信息还原为物体检测图像以及对该物体检测图像进行特征提取的过程，使得在确定目标物体在待检测图像上的位置信息以及目标物体的关键点位置信息过程中只需进行一次图像特征提取过程，如此简化了图像中物体及其关键点的确定过程，提高了图像中物体及其关键点的确定效率，并降低了在确定图像中物体及其关键点时的内存损耗。The above is the specific implementation of the apparatus 130 for determining an object and its key points in an image provided by the apparatus embodiment. In this implementation, after the image features corresponding to the image to be detected are extracted from the image to be detected, the position information of the target object on the image to be detected and the key point position information of the target object are determined directly according to the image features corresponding to the image to be detected. Since the position information of the target object on the image to be detected can accurately represent the position of the target object in the image to be detected, and the key point position information of the target object can accurately represent the positions of the key points of the target object in the image to be detected, the method for determining an object and its key points in an image provided by the embodiments of the present application can effectively determine the position information of the target object and the position information of its key points from the image to be detected, and thus can accurately detect the object and its key points in the image.
In addition, after the image features corresponding to the image to be detected are obtained, the position information of the target object on the image to be detected and the key point position information of the target object can be determined directly on the basis of those image features, without restoring the position information of the target object on the image to be detected into an object detection image and performing feature extraction on that object detection image. As a result, only one image feature extraction process is needed when determining the position information of the target object on the image to be detected and the key point position information of the target object, which simplifies the process of determining the object and its key points in the image, improves the efficiency of that determination, and reduces memory consumption during the determination.
基于上述方法实施例提供的图像中物体及其关键点的确定方法的具体实施方式,本申请实施例还提供了一种设备,下面结合附图进行解释和说明。Based on the specific implementation of the method for determining the object and its key points in the image provided by the foregoing method embodiment, an embodiment of the present application also provides a device, which will be explained and described below with reference to the accompanying drawings.
设备实施例Device embodiment
设备实施例提供的设备技术详情,请参照上述方法实施例。For details of the device technology provided by the device embodiment, please refer to the above method embodiment.
参见图14,该图为本申请实施例提供的设备结构示意图。Refer to FIG. 14, which is a schematic diagram of the device structure provided by an embodiment of the application.
本申请实施例提供的设备140,包括处理器141以及存储器142:The device 140 provided in the embodiment of the present application includes a processor 141 and a memory 142:
所述存储器142用于存储计算机程序;The memory 142 is used to store computer programs;
所述处理器141用于根据所述计算机程序执行上述方法实施例提供的图像中物体及其关键点的确定方法的任一实施方式。也就是说,处理器141用于执行以下步骤:The processor 141 is configured to execute any implementation manner of the method for determining an object and its key points in an image provided by the foregoing method embodiment according to the computer program. In other words, the processor 141 is configured to perform the following steps:
对待检测图像进行特征提取,得到所述待检测图像对应的图像特征;Performing feature extraction on the image to be detected to obtain image features corresponding to the image to be detected;
根据所述待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息以及所述目标物体的关键点位置信息；其中，所述目标物体的关键点用于表征所述目标物体的结构特征。Determining, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information of the target object; wherein the key points of the target object are used to characterize the structural features of the target object.
可选的,所述根据所述待检测图像对应的图像特征,确定目标物体在待检测图像上的位置信息以及所述目标物体的关键点位置信息,具体为:Optionally, the determining the position information of the target object on the image to be detected and the key point position information of the target object according to the image feature corresponding to the image to be detected is specifically:
根据所述待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息，并根据所述待检测图像对应的图像特征和所述目标物体在待检测图像上的位置信息，确定目标物体的关键点位置信息。Determining the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and determining the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
可选的，所述根据所述待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息，并根据所述待检测图像对应的图像特征和所述目标物体在待检测图像上的位置信息，确定目标物体的关键点位置信息，具体包括：Optionally, the determining the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and determining the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected specifically includes:
根据所述待检测图像对应的图像特征,利用预先构建的第一检测模型对目标物体及其关键点进行检测,确定目标物体在待检测图像上的位置信息以及目标物体的关键点位置信息;According to the image features corresponding to the image to be detected, use a pre-built first detection model to detect the target object and its key points, and determine the position information of the target object on the image to be detected and the key point position information of the target object;
其中，所述第一检测模型包括第一物体位置检测网络层和第一关键点位置检测网络层，且所述第一物体位置检测网络层的输出结果是所述第一关键点位置检测网络层的输入数据，且所述第一物体位置检测网络层用于根据所述待检测图像对应的图像特征确定目标物体在待检测图像上的位置信息，所述第一关键点位置检测网络层用于根据所述待检测图像对应的图像特征和所述目标物体在待检测图像上的位置信息，确定目标物体的关键点位置信息。Wherein, the first detection model includes a first object position detection network layer and a first key point position detection network layer, the output result of the first object position detection network layer is the input data of the first key point position detection network layer, the first object position detection network layer is configured to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and the first key point position detection network layer is configured to determine the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
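The cascade described above, in which the output of the first object position detection network layer becomes part of the input of the first key point position detection network layer, can be sketched in pure Python. The layer callables and data shapes below are placeholders invented for illustration; the application does not prescribe concrete layer implementations.

```python
def first_detection_model(features, object_position_layer, keypoint_position_layer):
    # Stage 1: the first object position detection network layer maps the
    # image features to the position information of the target object.
    boxes = object_position_layer(features)
    # Stage 2: the first key point position detection network layer takes
    # both the image features and the stage-1 output, so the image features
    # are extracted only once for both tasks.
    keypoints = keypoint_position_layer(features, boxes)
    return boxes, keypoints
```

For example, with dummy layers that return one box and its key points, the function simply threads the box through to the key point stage; in a real network both callables would be learned layers operating on the same feature map.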
可选的,所述第一检测模型的训练过程,具体包括:Optionally, the training process of the first detection model specifically includes:
获取训练图像对应的图像特征、目标物体在训练图像上的实际位置信息以及目标物体的实际关键点位置信息;Obtain the image features corresponding to the training image, the actual position information of the target object on the training image, and the actual key point position information of the target object;
根据所述训练图像对应的图像特征,利用第一检测模型对目标物体及其关键点进行检测,确定目标物体在待检测图像上的预测位置信息以及目标物体的预测关键点位置信息;According to the image features corresponding to the training image, use the first detection model to detect the target object and its key points, and determine the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object;
根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第一检测模型是否达到第一预设条件；Judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the first detection model meets a first preset condition;
若确定所述第一检测模型未达到第一预设条件，则更新第一检测模型，继续执行“根据所述训练图像对应的图像特征，利用第一检测模型对目标物体及其关键点进行检测，确定目标物体在待检测图像上的预测位置信息以及目标物体的预测关键点位置信息”。If it is determined that the first detection model does not meet the first preset condition, updating the first detection model, and continuing to perform the step of "detecting the target object and its key points by using the first detection model according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object".
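The training steps listed above form a loop: detect with the current model, judge the preset condition, and update the model when the condition is not met. A hypothetical driver for that loop might look like the following; `predict`, `meets_condition`, and `update` are illustrative callables, and the `max_rounds` safety cap is an added assumption not stated in the text.

```python
def train_until_condition(predict, meets_condition, update, max_rounds=1000):
    # Repeat: detect the target object and its key points on the training
    # images, judge the preset condition, and update the model until the
    # condition is met (or the safety cap is hit).
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        prediction = predict()
        if meets_condition(prediction, rounds):
            return rounds
        update()
    return rounds
```

A toy "model" whose single parameter is incremented each round illustrates the control flow: the loop stops in the first round whose prediction satisfies the condition.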
可选的，所述第一预设条件为：第一检测模型的检测损失的变化率低于第一预设损失阈值，和/或，第一检测模型的训练轮数达到第一预设轮数阈值。Optionally, the first preset condition is: the change rate of the detection loss of the first detection model is lower than a first preset loss threshold, and/or the number of training rounds of the first detection model reaches a first preset round-number threshold.
可选的，当所述第一预设条件包括第一检测模型的检测损失的变化率低于第一预设损失阈值时，所述根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第一检测模型是否达到第一预设条件，具体包括：Optionally, when the first preset condition includes that the change rate of the detection loss of the first detection model is lower than the first preset loss threshold, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the first detection model meets the first preset condition specifically includes:
根据所述目标物体在训练图像上的实际位置信息以及目标物体在待检测图像上的预测位置信息，确定目标物体的检测损失；其中，所述目标物体的检测损失包括目标物体的识别损失和目标物体的位置检测损失；Determining the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; wherein the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
根据所述目标物体的实际关键点位置信息以及所述目标物体的预测关键点位置信息,确定目标物体的关键点位置检测损失;Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
根据所述目标物体的识别损失、所述目标物体的位置检测损失和所述目标物体的关键点位置检测损失,确定第一检测模型的检测损失;Determine the detection loss of the first detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
根据所述第一检测模型的检测损失以及第一检测模型的历史检测损失，确定第一检测模型的检测损失变化率；所述第一检测模型的历史检测损失是在历史训练过程中确定的第一检测模型的检测损失；Determining the detection loss change rate of the first detection model according to the detection loss of the first detection model and the historical detection loss of the first detection model; the historical detection loss of the first detection model is the detection loss of the first detection model determined in a historical training process;
若所述第一检测模型的检测损失变化率低于第一预设损失阈值,则确定所述第一检测模型达到第一预设条件;If the detection loss change rate of the first detection model is lower than the first preset loss threshold, determining that the first detection model meets the first preset condition;
若所述第一检测模型的检测损失变化率不低于第一预设损失阈值,则确定所述第一检测模型未达到第一预设条件。If the detection loss change rate of the first detection model is not lower than the first preset loss threshold, it is determined that the first detection model does not meet the first preset condition.
可选的，所述根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第一检测模型是否达到第一预设条件，具体包括：Optionally, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the first detection model meets the first preset condition specifically includes:
根据所述目标物体在训练图像上的实际位置信息以及目标物体在待检测图像上的预测位置信息，确定目标物体的检测损失；其中，所述目标物体的检测损失包括目标物体的识别损失和目标物体的位置检测损失；Determining the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; wherein the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
根据所述目标物体的实际关键点位置信息以及所述目标物体的预测关键点位置信息,确定目标物体的关键点位置检测损失;Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
根据所述目标物体的识别损失、所述目标物体的位置检测损失和所述目标物体的关键点位置检测损失,确定第一检测模型的检测损失;Determine the detection loss of the first detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
根据所述第一检测模型的检测损失以及第一检测模型的历史检测损失，确定第一检测模型的检测损失变化率；所述第一检测模型的历史检测损失是在历史训练过程中确定的第一检测模型的检测损失；Determining the detection loss change rate of the first detection model according to the detection loss of the first detection model and the historical detection loss of the first detection model; the historical detection loss of the first detection model is the detection loss of the first detection model determined in a historical training process;
若所述第一检测模型的检测损失变化率低于第一预设损失阈值，或第一检测模型的训练轮数达到第一预设轮数阈值，则确定所述第一检测模型达到第一预设条件；If the detection loss change rate of the first detection model is lower than the first preset loss threshold, or the number of training rounds of the first detection model reaches the first preset round-number threshold, it is determined that the first detection model meets the first preset condition;
若所述第一检测模型的检测损失变化率不低于第一预设损失阈值，且第一检测模型的训练轮数未达到第一预设轮数阈值，则确定所述第一检测模型未达到第一预设条件。If the detection loss change rate of the first detection model is not lower than the first preset loss threshold, and the number of training rounds of the first detection model has not reached the first preset round-number threshold, it is determined that the first detection model does not meet the first preset condition.
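The text repeatedly determines the detection loss of the model from three components: the recognition loss, the position detection loss, and the key point position detection loss of the target object. A common way to combine such components is a weighted sum; the weighted-sum form and the default weights below are assumptions for illustration, since the application only states that the total loss is determined from the three components.

```python
def detection_model_loss(recognition_loss, position_loss, keypoint_loss,
                         weights=(1.0, 1.0, 1.0)):
    # Total detection loss of the detection model, combining the recognition
    # loss, the position detection loss, and the key point position detection
    # loss of the target object (assumed weighted sum).
    w_cls, w_pos, w_kp = weights
    return w_cls * recognition_loss + w_pos * position_loss + w_kp * keypoint_loss
```

The change-rate stopping criterion in the surrounding passages would then be evaluated on successive values of this total loss across training rounds.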
可选的,所述根据所述待检测图像对应的图像特征,确定目标物体在待检测图像上的位置信息以及所述目标物体的关键点位置信息,具体为:Optionally, the determining the position information of the target object on the image to be detected and the key point position information of the target object according to the image feature corresponding to the image to be detected is specifically:
根据所述待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息和待检测图像上各个图像子区域内的关键点位置信息，并根据所述目标物体在待检测图像上的位置信息和所述待检测图像上各个图像子区域内的关键点位置信息，确定所述目标物体的关键点位置信息；所述图像子区域是锚点对应的感兴趣区域。Determining, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected, and determining the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected; the image sub-regions are regions of interest corresponding to anchor points.
可选的，所述根据所述待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息和待检测图像上各个图像子区域内的关键点位置信息，并根据所述目标物体在待检测图像上的位置信息和所述待检测图像上各个图像子区域内的关键点位置信息，确定所述目标物体的关键点位置信息，具体为：Optionally, the determining, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected, and determining the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected is specifically:
根据所述待检测图像对应的图像特征,利用预先构建的第二检测模型对目标物体及其关键点进行检测,确定目标物体在待检测图像上的位置信息以及目标物体的关键点位置信息;According to the image features corresponding to the image to be detected, use a pre-built second detection model to detect the target object and its key points, and determine the position information of the target object on the image to be detected and the key point position information of the target object;
其中，所述第二检测模型包括第二物体位置检测网络层、第二关键点位置检测网络层和物体关键点位置确定网络层，且所述第二物体位置检测网络层的输出结果和所述第二关键点位置检测网络层的输出结果是所述物体关键点位置确定网络层的输入数据，且所述第二物体位置检测网络层用于根据所述待检测图像对应的图像特征确定目标物体在待检测图像上的位置信息；且所述第二关键点位置检测网络层用于根据所述待检测图像对应的图像特征确定待检测图像上各个图像子区域内的关键点位置信息；且所述物体关键点位置确定网络层用于根据所述目标物体在待检测图像上的位置信息和所述待检测图像上各个图像子区域内的关键点位置信息，确定所述目标物体的关键点位置信息。Wherein, the second detection model includes a second object position detection network layer, a second key point position detection network layer, and an object key point position determination network layer; the output results of the second object position detection network layer and of the second key point position detection network layer are the input data of the object key point position determination network layer; the second object position detection network layer is configured to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; the second key point position detection network layer is configured to determine the key point position information in each image sub-region of the image to be detected according to the image features corresponding to the image to be detected; and the object key point position determination network layer is configured to determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image sub-region of the image to be detected.
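In the second detection model described above, the object position branch and the per-sub-region key point branch run on the same image features, and the object key point position determination network layer combines their outputs. One plausible (assumed) combination rule is to keep, for each detected object, the sub-region key points that fall inside that object's box; the geometry helpers and names below are illustrative only, not the application's prescribed implementation.

```python
def point_in_box(point, box):
    # Box given as (x0, y0, x1, y1); point as (x, y).
    x, y = point
    x0, y0, x1, y1 = box
    return x0 <= x <= x1 and y0 <= y <= y1

def determine_object_keypoints(object_boxes, subregion_keypoints):
    # Assumed combination rule for the object key point position
    # determination network layer: for each detected object position,
    # keep the key points (predicted in the anchor sub-regions) that
    # fall inside that object's box.
    return [
        [kp for kp in subregion_keypoints if point_in_box(kp, box)]
        for box in object_boxes
    ]
```

With one detected box `(0, 0, 5, 5)` and candidate key points `(1, 1)` and `(6, 6)`, only `(1, 1)` is assigned to the object, matching the idea of filtering per-sub-region key points by the object's position.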
可选的,所述第二检测模型的训练过程,具体包括:Optionally, the training process of the second detection model specifically includes:
获取训练图像对应的图像特征、目标物体在训练图像上的实际位置信息以及目标物体的实际关键点位置信息;Obtain the image features corresponding to the training image, the actual position information of the target object on the training image, and the actual key point position information of the target object;
根据所述训练图像对应的图像特征,利用第二检测模型对目标物体及其关键点进行检测,确定目标物体在待检测图像上的预测位置信息以及目标物体的预测关键点位置信息;According to the image features corresponding to the training image, use the second detection model to detect the target object and its key points, and determine the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object;
根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第二检测模型是否达到第二预设条件；Judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the second detection model meets a second preset condition;
若确定所述第二检测模型未达到第二预设条件，则更新第二检测模型，继续执行“根据所述训练图像对应的图像特征，利用第二检测模型对目标物体及其关键点进行检测，确定目标物体在待检测图像上的预测位置信息以及目标物体的预测关键点位置信息”。If it is determined that the second detection model does not meet the second preset condition, updating the second detection model, and continuing to perform the step of "detecting the target object and its key points by using the second detection model according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key point position information of the target object".
可选的，所述第二预设条件为：第二检测模型的检测损失的变化率低于第二预设损失阈值，和/或，第二检测模型的训练轮数达到第二预设轮数阈值。Optionally, the second preset condition is: the change rate of the detection loss of the second detection model is lower than a second preset loss threshold, and/or the number of training rounds of the second detection model reaches a second preset round-number threshold.
可选的，当所述第二预设条件包括第二检测模型的检测损失的变化率低于第二预设损失阈值时，所述根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第二检测模型是否达到第二预设条件，具体包括：Optionally, when the second preset condition includes that the change rate of the detection loss of the second detection model is lower than the second preset loss threshold, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the second detection model meets the second preset condition specifically includes:
根据所述目标物体在训练图像上的实际位置信息以及目标物体在待检测图像上的预测位置信息，确定目标物体的检测损失；其中，所述目标物体的检测损失包括目标物体的识别损失和目标物体的位置检测损失；Determining the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; wherein the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
根据所述目标物体的实际关键点位置信息以及所述目标物体的预测关键点位置信息,确定目标物体的关键点位置检测损失;Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
根据所述目标物体的识别损失、所述目标物体的位置检测损失和所述目标物体的关键点位置检测损失,确定第二检测模型的检测损失;Determine the detection loss of the second detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
根据所述第二检测模型的检测损失以及第二检测模型的历史检测损失，确定第二检测模型的检测损失变化率；所述第二检测模型的历史检测损失是在历史训练过程中确定的第二检测模型的检测损失；Determining the detection loss change rate of the second detection model according to the detection loss of the second detection model and the historical detection loss of the second detection model; the historical detection loss of the second detection model is the detection loss of the second detection model determined in a historical training process;
若所述第二检测模型的检测损失变化率低于第二预设损失阈值,则确定所述第二检测模型达到第二预设条件;If the detection loss change rate of the second detection model is lower than a second preset loss threshold, determining that the second detection model meets the second preset condition;
若所述第二检测模型的检测损失变化率不低于第二预设损失阈值,则确定所述第二检测模型未达到第二预设条件。If the detection loss change rate of the second detection model is not lower than the second preset loss threshold, it is determined that the second detection model does not meet the second preset condition.
可选的，所述根据所述目标物体在训练图像上的实际位置信息、所述目标物体的实际关键点位置信息、所述目标物体在待检测图像上的预测位置信息以及所述目标物体的预测关键点位置信息，判断所述第二检测模型是否达到第二预设条件，具体包括：Optionally, the judging, according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object, whether the second detection model meets the second preset condition specifically includes:
根据所述目标物体在训练图像上的实际位置信息以及目标物体在待检测图像上的预测位置信息，确定目标物体的检测损失；其中，所述目标物体的检测损失包括目标物体的识别损失和目标物体的位置检测损失；Determining the detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected; wherein the detection loss of the target object includes the recognition loss of the target object and the position detection loss of the target object;
根据所述目标物体的实际关键点位置信息以及所述目标物体的预测关键点位置信息,确定目标物体的关键点位置检测损失;Determine the key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
根据所述目标物体的识别损失、所述目标物体的位置检测损失和所述目标物体的关键点位置检测损失,确定第二检测模型的检测损失;Determine the detection loss of the second detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
根据所述第二检测模型的检测损失以及第二检测模型的历史检测损失，确定第二检测模型的检测损失变化率；所述第二检测模型的历史检测损失是在历史训练过程中确定的第二检测模型的检测损失；Determining the detection loss change rate of the second detection model according to the detection loss of the second detection model and the historical detection loss of the second detection model; the historical detection loss of the second detection model is the detection loss of the second detection model determined in a historical training process;
若所述第二检测模型的检测损失变化率低于第二预设损失阈值，或第二检测模型的训练轮数达到第二预设轮数阈值，则确定所述第二检测模型达到第二预设条件；If the detection loss change rate of the second detection model is lower than the second preset loss threshold, or the number of training rounds of the second detection model reaches the second preset round-number threshold, it is determined that the second detection model meets the second preset condition;
若所述第二检测模型的检测损失变化率不低于第二预设损失阈值，且第二检测模型的训练轮数未达到第二预设轮数阈值，则确定所述第二检测模型未达到第二预设条件。If the detection loss change rate of the second detection model is not lower than the second preset loss threshold, and the number of training rounds of the second detection model has not reached the second preset round-number threshold, it is determined that the second detection model does not meet the second preset condition.
以上为本申请实施例提供的设备140的相关内容。The above is the relevant content of the device 140 provided in the embodiment of the application.
基于上述方法实施例提供的图像中物体及其关键点的确定方法,本申请实施例还提供了一种计算机可读存储介质。Based on the method for determining objects and their key points in an image provided by the foregoing method embodiments, an embodiment of the present application also provides a computer-readable storage medium.
介质实施例Media Examples
介质实施例提供的计算机可读存储介质的技术详情,请参照上述方法实施例。For technical details of the computer-readable storage medium provided by the media embodiment, please refer to the foregoing method embodiment.
本申请实施例提供了一种计算机可读存储介质，所述计算机可读存储介质用于存储计算机程序，所述计算机程序用于执行上述方法实施例提供的图像中物体及其关键点的确定方法的任一实施方式。也就是说，该计算机程序用于执行以下步骤：An embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium is used to store a computer program, and the computer program is used to execute any implementation of the method for determining an object and its key points in an image provided by the foregoing method embodiments. That is, the computer program is used to perform the following steps:
对待检测图像进行特征提取,得到所述待检测图像对应的图像特征;Performing feature extraction on the image to be detected to obtain image features corresponding to the image to be detected;
根据所述待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息以及所述目标物体的关键点位置信息；其中，所述目标物体的关键点用于表征所述目标物体的结构特征。Determining, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information of the target object; wherein the key points of the target object are used to characterize the structural features of the target object.
可选的,所述根据所述待检测图像对应的图像特征,确定目标物体在待检测图像上的位置信息以及所述目标物体的关键点位置信息,具体为:Optionally, the determining the position information of the target object on the image to be detected and the key point position information of the target object according to the image feature corresponding to the image to be detected is specifically:
根据所述待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息，并根据所述待检测图像对应的图像特征和所述目标物体在待检测图像上的位置信息，确定目标物体的关键点位置信息。Determining the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and determining the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
可选的，所述根据所述待检测图像对应的图像特征，确定目标物体在待检测图像上的位置信息，并根据所述待检测图像对应的图像特征和所述目标物体在待检测图像上的位置信息，确定目标物体的关键点位置信息，具体包括：Optionally, the determining the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and determining the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected specifically includes:
根据所述待检测图像对应的图像特征,利用预先构建的第一检测模型对目标物体及其关键点进行检测,确定目标物体在待检测图像上的位置信息以及目标物体的关键点位置信息;According to the image features corresponding to the image to be detected, use a pre-built first detection model to detect the target object and its key points, and determine the position information of the target object on the image to be detected and the key point position information of the target object;
其中，所述第一检测模型包括第一物体位置检测网络层和第一关键点位置检测网络层，且所述第一物体位置检测网络层的输出结果是所述第一关键点位置检测网络层的输入数据，且所述第一物体位置检测网络层用于根据所述待检测图像对应的图像特征确定目标物体在待检测图像上的位置信息，所述第一关键点位置检测网络层用于根据所述待检测图像对应的图像特征和所述目标物体在待检测图像上的位置信息，确定目标物体的关键点位置信息。Wherein, the first detection model includes a first object position detection network layer and a first key point position detection network layer, the output result of the first object position detection network layer is the input data of the first key point position detection network layer, the first object position detection network layer is configured to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and the first key point position detection network layer is configured to determine the key point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
Optionally, the training process of the first detection model specifically includes:
obtaining the image features corresponding to a training image, the actual position information of the target object on the training image, and the actual key-point position information of the target object;
detecting, according to the image features corresponding to the training image, the target object and its key points by using the first detection model, and determining the predicted position information of the target object on the image to be detected and the predicted key-point position information of the target object;
judging whether the first detection model meets a first preset condition according to the actual position information of the target object on the training image, the actual key-point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key-point position information of the target object;
if it is determined that the first detection model does not meet the first preset condition, updating the first detection model and returning to the step of "detecting the target object and its key points by using the first detection model according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key-point position information of the target object".
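The iterative training loop just described (detect, judge the preset condition, update if not met, repeat) can be sketched as below. The stopping rule shown is the loss-change-rate criterion introduced in the following paragraphs; `detect_loss` and `update` are placeholders for the model's loss evaluation and parameter update, which the document leaves unspecified.

```python
def train_first_model(model, data, detect_loss, update,
                      rate_threshold=1e-3, max_rounds=1000):
    """Detect, check the first preset condition, update, repeat."""
    history = []
    for _ in range(max_rounds):
        loss = detect_loss(model, data)   # compare predicted vs. actual positions
        if history:
            prev = history[-1]            # historical detection loss
            rate = abs(prev - loss) / max(abs(prev), 1e-12)
            if rate < rate_threshold:     # loss change rate low enough: stop
                history.append(loss)
                return model, history
        history.append(loss)
        model = update(model, loss)       # condition not met: update and continue
    return model, history

# Toy run: the "model" is a scalar whose loss halves each round, so the
# change rate stays at 0.5 and training stops once 0.5 < rate_threshold.
model, hist = train_first_model(
    model=1.0, data=None,
    detect_loss=lambda m, d: m,
    update=lambda m, loss: m * 0.5,
    rate_threshold=0.6,
)
print(len(hist))
```

In the toy run the loop evaluates the loss twice (1.0, then 0.5), finds the change rate 0.5 below the threshold 0.6, and stops.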
Optionally, the first preset condition is: the rate of change of the detection loss of the first detection model is lower than a first preset loss threshold, and/or the number of training rounds of the first detection model reaches a first preset round-number threshold.
Optionally, when the first preset condition includes that the rate of change of the detection loss of the first detection model is lower than the first preset loss threshold, judging whether the first detection model meets the first preset condition according to the actual position information of the target object on the training image, the actual key-point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key-point position information of the target object specifically includes:
determining a detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, wherein the detection loss of the target object includes a recognition loss of the target object and a position detection loss of the target object;
determining a key-point position detection loss of the target object according to the actual key-point position information of the target object and the predicted key-point position information of the target object;
determining a detection loss of the first detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key-point position detection loss of the target object;
determining a rate of change of the detection loss of the first detection model according to the detection loss of the first detection model and a historical detection loss of the first detection model, wherein the historical detection loss of the first detection model is a detection loss of the first detection model determined in a historical training process;
if the rate of change of the detection loss of the first detection model is lower than the first preset loss threshold, determining that the first detection model meets the first preset condition;
if the rate of change of the detection loss of the first detection model is not lower than the first preset loss threshold, determining that the first detection model does not meet the first preset condition.
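The composite loss and change-rate computation above can be sketched as follows. The individual loss forms (binary cross-entropy for recognition, squared error for positions) and the equal weighting are assumptions; the document only states that the recognition loss, position detection loss, and key-point position detection loss are combined into the model's detection loss.

```python
import numpy as np

def first_model_detection_loss(pred_cls, true_cls, pred_box, true_box,
                               pred_kp, true_kp):
    """Detection loss = recognition loss + object position loss
    + key-point position loss (assumed forms and equal weights)."""
    eps = 1e-12
    recognition = -np.mean(true_cls * np.log(pred_cls + eps)
                           + (1 - true_cls) * np.log(1 - pred_cls + eps))
    position = np.mean((pred_box - true_box) ** 2)   # box regression loss
    keypoint = np.mean((pred_kp - true_kp) ** 2)     # key-point regression loss
    return recognition + position + keypoint

def loss_change_rate(current, historical):
    """Change rate of the detection loss relative to the loss recorded
    in the historical training process."""
    return abs(historical - current) / max(abs(historical), 1e-12)

# A perfect prediction yields an (almost) zero composite loss.
perfect = first_model_detection_loss(
    pred_cls=np.array([1.0]), true_cls=np.array([1.0]),
    pred_box=np.zeros(4), true_box=np.zeros(4),
    pred_kp=np.zeros(8), true_kp=np.zeros(8),
)
print(abs(perfect) < 1e-9, loss_change_rate(0.5, 1.0))
```

The change rate is then compared against the first preset loss threshold to decide whether training may stop.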
Optionally, judging whether the first detection model meets the first preset condition according to the actual position information of the target object on the training image, the actual key-point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key-point position information of the target object specifically includes:
determining a detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, wherein the detection loss of the target object includes a recognition loss of the target object and a position detection loss of the target object;
determining a key-point position detection loss of the target object according to the actual key-point position information of the target object and the predicted key-point position information of the target object;
determining a detection loss of the first detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key-point position detection loss of the target object;
determining a rate of change of the detection loss of the first detection model according to the detection loss of the first detection model and a historical detection loss of the first detection model, wherein the historical detection loss of the first detection model is a detection loss of the first detection model determined in a historical training process;
if the rate of change of the detection loss of the first detection model is lower than the first preset loss threshold, or the number of training rounds of the first detection model reaches the first preset round-number threshold, determining that the first detection model meets the first preset condition;
if the rate of change of the detection loss of the first detection model is not lower than the first preset loss threshold, and the number of training rounds of the first detection model does not reach the first preset round-number threshold, determining that the first detection model does not meet the first preset condition.
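The combined stopping criterion just stated, a disjunction of the two sub-conditions, reduces to a single boolean test. The sketch below encodes it directly; only the function and parameter names are invented.

```python
def first_preset_condition_met(loss_change_rate, training_rounds,
                               rate_threshold, round_threshold):
    """First preset condition: the detection-loss change rate is below the
    preset loss threshold OR the training rounds reach the preset round
    threshold. Training continues only when neither sub-condition holds."""
    return (loss_change_rate < rate_threshold
            or training_rounds >= round_threshold)

# Neither sub-condition holds: keep training.
print(first_preset_condition_met(0.05, 3, rate_threshold=0.01, round_threshold=10))
# Round threshold reached: stop even though the loss is still changing.
print(first_preset_condition_met(0.05, 10, rate_threshold=0.01, round_threshold=10))
```

The round-number branch guarantees termination even when the loss never plateaus, which is why the two criteria are combined with "or".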
Optionally, determining the position information of the target object on the image to be detected and the key-point position information of the target object according to the image features corresponding to the image to be detected is specifically:
determining, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and key-point position information in each image sub-region on the image to be detected, and determining the key-point position information of the target object according to the position information of the target object on the image to be detected and the key-point position information in each image sub-region on the image to be detected, wherein each image sub-region is a region of interest corresponding to an anchor point.
Optionally, determining, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key-point position information in each image sub-region on the image to be detected, and determining the key-point position information of the target object according to the position information of the target object on the image to be detected and the key-point position information in each image sub-region on the image to be detected is specifically:
detecting, according to the image features corresponding to the image to be detected, the target object and its key points by using a pre-built second detection model, and determining the position information of the target object on the image to be detected and the key-point position information of the target object;
wherein the second detection model includes a second object-position detection network layer, a second key-point position detection network layer, and an object key-point position determination network layer; the outputs of the second object-position detection network layer and of the second key-point position detection network layer are input data of the object key-point position determination network layer; the second object-position detection network layer is configured to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; the second key-point position detection network layer is configured to determine the key-point position information in each image sub-region on the image to be detected according to the image features corresponding to the image to be detected; and the object key-point position determination network layer is configured to determine the key-point position information of the target object according to the position information of the target object on the image to be detected and the key-point position information in each image sub-region on the image to be detected.
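A minimal sketch of the object key-point position determination layer in the second detection model follows. It receives the object's box from the object-position layer and per-sub-region key-point candidates from the key-point layer; here it keeps the candidates whose sub-region (anchor region of interest) centre lies inside the box. The containment rule is purely an assumption for illustration; the document states only that both outputs feed this layer, not how they are fused.

```python
def determine_object_keypoints(object_box, subregion_keypoints):
    """Fuse the object's position with per-anchor-ROI key-point candidates
    by keeping candidates whose sub-region centre falls inside the box
    (assumed selection rule)."""
    x0, y0, x1, y1 = object_box
    selected = []
    for (cx, cy), kps in subregion_keypoints:
        if x0 <= cx <= x1 and y0 <= cy <= y1:  # sub-region belongs to the object
            selected.extend(kps)
    return selected

box = (0.0, 0.0, 10.0, 10.0)                   # target object's position
candidates = [
    ((4.0, 5.0), [(3.0, 4.0), (6.0, 7.0)]),    # ROI centre inside the box
    ((25.0, 30.0), [(24.0, 29.0)]),            # ROI centre outside the box
]
print(determine_object_keypoints(box, candidates))
```

Only the key points detected in sub-regions associated with the object's box are reported as the target object's key points.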
Optionally, the training process of the second detection model specifically includes:
obtaining the image features corresponding to a training image, the actual position information of the target object on the training image, and the actual key-point position information of the target object;
detecting, according to the image features corresponding to the training image, the target object and its key points by using the second detection model, and determining the predicted position information of the target object on the image to be detected and the predicted key-point position information of the target object;
judging whether the second detection model meets a second preset condition according to the actual position information of the target object on the training image, the actual key-point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key-point position information of the target object;
if it is determined that the second detection model does not meet the second preset condition, updating the second detection model and returning to the step of "detecting the target object and its key points by using the second detection model according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key-point position information of the target object".
Optionally, the second preset condition is: the rate of change of the detection loss of the second detection model is lower than a second preset loss threshold, and/or the number of training rounds of the second detection model reaches a second preset round-number threshold.
Optionally, when the second preset condition includes that the rate of change of the detection loss of the second detection model is lower than the second preset loss threshold, judging whether the second detection model meets the second preset condition according to the actual position information of the target object on the training image, the actual key-point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key-point position information of the target object specifically includes:
determining a detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, wherein the detection loss of the target object includes a recognition loss of the target object and a position detection loss of the target object;
determining a key-point position detection loss of the target object according to the actual key-point position information of the target object and the predicted key-point position information of the target object;
determining a detection loss of the second detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key-point position detection loss of the target object;
determining a rate of change of the detection loss of the second detection model according to the detection loss of the second detection model and a historical detection loss of the second detection model, wherein the historical detection loss of the second detection model is a detection loss of the second detection model determined in a historical training process;
if the rate of change of the detection loss of the second detection model is lower than the second preset loss threshold, determining that the second detection model meets the second preset condition;
if the rate of change of the detection loss of the second detection model is not lower than the second preset loss threshold, determining that the second detection model does not meet the second preset condition.
Optionally, judging whether the second detection model meets the second preset condition according to the actual position information of the target object on the training image, the actual key-point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key-point position information of the target object specifically includes:
determining a detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, wherein the detection loss of the target object includes a recognition loss of the target object and a position detection loss of the target object;
determining a key-point position detection loss of the target object according to the actual key-point position information of the target object and the predicted key-point position information of the target object;
determining a detection loss of the second detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key-point position detection loss of the target object;
determining a rate of change of the detection loss of the second detection model according to the detection loss of the second detection model and a historical detection loss of the second detection model, wherein the historical detection loss of the second detection model is a detection loss of the second detection model determined in a historical training process;
if the rate of change of the detection loss of the second detection model is lower than the second preset loss threshold, or the number of training rounds of the second detection model reaches the second preset round-number threshold, determining that the second detection model meets the second preset condition;
if the rate of change of the detection loss of the second detection model is not lower than the second preset loss threshold, and the number of training rounds of the second detection model does not reach the second preset round-number threshold, determining that the second detection model does not meet the second preset condition.
The above is the relevant content of the computer-readable storage medium provided by the embodiments of the present application.
It should be understood that in the present application, "at least one (item)" means one or more, and "a plurality of" means two or more. The term "and/or" describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate that only A exists, only B exists, or both A and B exist, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects before and after it. "At least one of the following" or a similar expression refers to any combination of the listed items, including any combination of single or plural items. For example, "at least one of a, b, or c" may indicate a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where each of a, b, and c may be single or multiple.
The above are merely preferred embodiments of the present invention and are not intended to limit the present invention in any way. Although the present invention has been disclosed above by way of preferred embodiments, these are not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the present invention, use the methods and technical content disclosed above to make many possible variations and modifications to the technical solution of the present invention, or modify it into equivalent embodiments. Therefore, any simple modifications, equivalent changes, and refinements made to the above embodiments in accordance with the technical essence of the present invention, without departing from the content of the technical solution of the present invention, still fall within the protection scope of the technical solution of the present invention.

Claims (16)

  1. A method for determining an object and key points thereof in an image, characterized in that the method comprises:
    performing feature extraction on an image to be detected to obtain image features corresponding to the image to be detected; and
    determining, according to the image features corresponding to the image to be detected, position information of a target object on the image to be detected and key-point position information of the target object, wherein the key points of the target object are used to characterize structural features of the target object.
  2. The method according to claim 1, characterized in that determining the position information of the target object on the image to be detected and the key-point position information of the target object according to the image features corresponding to the image to be detected is specifically:
    determining the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and determining the key-point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
  3. The method according to claim 2, characterized in that determining the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected, and determining the key-point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected specifically includes:
    detecting, according to the image features corresponding to the image to be detected, the target object and its key points by using a pre-built first detection model, and determining the position information of the target object on the image to be detected and the key-point position information of the target object;
    wherein the first detection model includes a first object-position detection network layer and a first key-point position detection network layer; the output of the first object-position detection network layer is input data of the first key-point position detection network layer; the first object-position detection network layer is configured to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; and the first key-point position detection network layer is configured to determine the key-point position information of the target object according to the image features corresponding to the image to be detected and the position information of the target object on the image to be detected.
  4. The method according to claim 3, characterized in that the training process of the first detection model specifically includes:
    obtaining the image features corresponding to a training image, the actual position information of the target object on the training image, and the actual key-point position information of the target object;
    detecting, according to the image features corresponding to the training image, the target object and its key points by using the first detection model, and determining the predicted position information of the target object on the image to be detected and the predicted key-point position information of the target object;
    judging whether the first detection model meets a first preset condition according to the actual position information of the target object on the training image, the actual key-point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key-point position information of the target object;
    if it is determined that the first detection model does not meet the first preset condition, updating the first detection model and returning to the step of "detecting the target object and its key points by using the first detection model according to the image features corresponding to the training image, and determining the predicted position information of the target object on the image to be detected and the predicted key-point position information of the target object".
  5. The method according to claim 4, characterized in that the first preset condition is: the rate of change of the detection loss of the first detection model is lower than a first preset loss threshold, and/or the number of training rounds of the first detection model reaches a first preset round-number threshold.
  6. The method according to claim 5, characterized in that, when the first preset condition includes that the rate of change of the detection loss of the first detection model is lower than the first preset loss threshold, judging whether the first detection model meets the first preset condition according to the actual position information of the target object on the training image, the actual key-point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key-point position information of the target object specifically includes:
    determining a detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, wherein the detection loss of the target object includes a recognition loss of the target object and a position detection loss of the target object;
    determining a key-point position detection loss of the target object according to the actual key-point position information of the target object and the predicted key-point position information of the target object;
    determining a detection loss of the first detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key-point position detection loss of the target object;
    determining a rate of change of the detection loss of the first detection model according to the detection loss of the first detection model and a historical detection loss of the first detection model, wherein the historical detection loss of the first detection model is a detection loss of the first detection model determined in a historical training process;
    if the rate of change of the detection loss of the first detection model is lower than the first preset loss threshold, determining that the first detection model meets the first preset condition;
    if the rate of change of the detection loss of the first detection model is not lower than the first preset loss threshold, determining that the first detection model does not meet the first preset condition.
  7. The method according to claim 4, wherein judging whether the first detection model meets the first preset condition according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object specifically comprises:
    determining a detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, wherein the detection loss of the target object comprises a recognition loss of the target object and a position detection loss of the target object;
    determining a key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
    determining a detection loss of the first detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
    determining a detection loss change rate of the first detection model according to the detection loss of the first detection model and a historical detection loss of the first detection model, wherein the historical detection loss of the first detection model is a detection loss of the first detection model determined during the historical training process;
    determining that the first detection model meets the first preset condition if the detection loss change rate of the first detection model is lower than a first preset loss threshold or the number of training rounds of the first detection model reaches a first preset round number threshold; and
    determining that the first detection model does not meet the first preset condition if the detection loss change rate of the first detection model is not lower than the first preset loss threshold and the number of training rounds of the first detection model has not reached the first preset round number threshold.
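Outside the claim language, the stopping test recited in claim 7 can be sketched in a few lines of Python. Everything below is illustrative rather than the patented implementation: the claim fixes only the decision logic, so the way the three losses are combined (a plain sum is assumed here), the change-rate formula, and the threshold values are all assumptions.

```python
def detection_loss(recognition_loss, position_loss, keypoint_loss):
    """Combine the per-object losses into the model's detection loss.

    The claim does not fix the combination rule; a plain sum is assumed.
    """
    return recognition_loss + position_loss + keypoint_loss


def meets_preset_condition(current_loss, history_loss, rounds,
                           loss_rate_threshold=0.01, max_rounds=100):
    """Return True when either branch of the first preset condition holds.

    The change rate is computed relative to the loss recorded in a previous
    training round (the 'historical detection loss' of the claim).
    """
    change_rate = abs(current_loss - history_loss) / max(history_loss, 1e-12)
    return change_rate < loss_rate_threshold or rounds >= max_rounds
```

Both branches of the claim map onto the single boolean expression: training stops once the loss has flattened out, or once the round budget is exhausted, whichever comes first.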
  8. The method according to claim 1, wherein determining the position information of the target object on the image to be detected and the key point position information of the target object according to the image features corresponding to the image to be detected specifically comprises:
    determining, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and key point position information in each image subregion of the image to be detected, and determining the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image subregion of the image to be detected, wherein each image subregion is a region of interest corresponding to an anchor point.
  9. The method according to claim 8, wherein determining, according to the image features corresponding to the image to be detected, the position information of the target object on the image to be detected and the key point position information in each image subregion of the image to be detected, and determining the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image subregion of the image to be detected specifically comprises:
    detecting the target object and its key points by using a pre-built second detection model according to the image features corresponding to the image to be detected, to determine the position information of the target object on the image to be detected and the key point position information of the target object;
    wherein the second detection model comprises a second object position detection network layer, a second key point position detection network layer, and an object key point position determination network layer; the output results of the second object position detection network layer and the second key point position detection network layer are the input data of the object key point position determination network layer; the second object position detection network layer is configured to determine the position information of the target object on the image to be detected according to the image features corresponding to the image to be detected; the second key point position detection network layer is configured to determine the key point position information in each image subregion of the image to be detected according to the image features corresponding to the image to be detected; and the object key point position determination network layer is configured to determine the key point position information of the target object according to the position information of the target object on the image to be detected and the key point position information in each image subregion of the image to be detected.
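The data flow recited in claim 9 — two heads feeding a third layer that combines their outputs — can be illustrated with a toy combination rule. This is only a sketch under assumptions: the claim does not specify how the determination layer associates subregion keypoints with object boxes, so the box representation and the containment test below are invented for illustration.

```python
def determine_object_keypoints(object_boxes, subregion_keypoints):
    """Assign anchor-subregion keypoints to the objects whose boxes contain them.

    object_boxes: output of the object position detection head, as a list of
        (x1, y1, x2, y2) boxes.
    subregion_keypoints: output of the keypoint detection head, as a list of
        (x, y) keypoint lists, one list per region of interest.
    Returns one keypoint list per detected object.
    """
    def inside(pt, box):
        x, y = pt
        x1, y1, x2, y2 = box
        return x1 <= x <= x2 and y1 <= y <= y2

    # Flatten all subregion keypoints, then keep those falling in each box.
    return [[pt for pts in subregion_keypoints for pt in pts if inside(pt, box)]
            for box in object_boxes]
```

The point of the sketch is structural: the determination layer consumes both heads' outputs, so keypoints are reported per object rather than per anchor subregion.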
  10. The method according to claim 9, wherein the training process of the second detection model specifically comprises:
    obtaining image features corresponding to a training image, the actual position information of the target object on the training image, and the actual key point position information of the target object;
    detecting the target object and its key points by using the second detection model according to the image features corresponding to the training image, to determine the predicted position information of the target object and the predicted key point position information of the target object;
    judging whether the second detection model meets a second preset condition according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object, and the predicted key point position information of the target object; and
    if it is determined that the second detection model does not meet the second preset condition, updating the second detection model and returning to the step of "detecting the target object and its key points by using the second detection model according to the image features corresponding to the training image, to determine the predicted position information of the target object and the predicted key point position information of the target object".
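The predict-check-update cycle of claim 10 is a standard iterative training loop. The skeleton below is a minimal sketch of that control flow only; `predict`, `update`, and `condition_met` are hypothetical callables standing in for the unspecified forward pass, parameter update, and preset-condition test.

```python
def train_until_condition(model, features, labels, predict, update,
                          condition_met, max_iters=1000):
    """Iterate the detect -> judge -> update cycle described in claim 10.

    Returns the model once the preset condition is met (or after max_iters,
    a safety bound added here that is not part of the claim).
    """
    for _ in range(max_iters):
        predictions = predict(model, features)       # detection step
        if condition_met(predictions, labels):       # preset-condition test
            break
        model = update(model, predictions, labels)   # model update step
    return model
```

A toy usage with scalar "models" shows the loop terminating as soon as the condition holds, mirroring the "continue executing the detection step" wording of the claim.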
  11. The method according to claim 10, wherein the second preset condition is that the change rate of the detection loss of the second detection model is lower than a second preset loss threshold, and/or the number of training rounds of the second detection model reaches a second preset round number threshold.
  12. The method according to claim 11, wherein, when the second preset condition comprises the change rate of the detection loss of the second detection model being lower than the second preset loss threshold, judging whether the second detection model meets the second preset condition according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object specifically comprises:
    determining a detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, wherein the detection loss of the target object comprises a recognition loss of the target object and a position detection loss of the target object;
    determining a key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
    determining a detection loss of the second detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
    determining a detection loss change rate of the second detection model according to the detection loss of the second detection model and a historical detection loss of the second detection model, wherein the historical detection loss of the second detection model is a detection loss of the second detection model determined during the historical training process;
    determining that the second detection model meets the second preset condition if the detection loss change rate of the second detection model is lower than the second preset loss threshold; and
    determining that the second detection model does not meet the second preset condition if the detection loss change rate of the second detection model is not lower than the second preset loss threshold.
  13. The method according to claim 10, wherein judging whether the second detection model meets the second preset condition according to the actual position information of the target object on the training image, the actual key point position information of the target object, the predicted position information of the target object on the image to be detected, and the predicted key point position information of the target object specifically comprises:
    determining a detection loss of the target object according to the actual position information of the target object on the training image and the predicted position information of the target object on the image to be detected, wherein the detection loss of the target object comprises a recognition loss of the target object and a position detection loss of the target object;
    determining a key point position detection loss of the target object according to the actual key point position information of the target object and the predicted key point position information of the target object;
    determining a detection loss of the second detection model according to the recognition loss of the target object, the position detection loss of the target object, and the key point position detection loss of the target object;
    determining a detection loss change rate of the second detection model according to the detection loss of the second detection model and a historical detection loss of the second detection model, wherein the historical detection loss of the second detection model is a detection loss of the second detection model determined during the historical training process;
    determining that the second detection model meets the second preset condition if the detection loss change rate of the second detection model is lower than the second preset loss threshold or the number of training rounds of the second detection model reaches the second preset round number threshold; and
    determining that the second detection model does not meet the second preset condition if the detection loss change rate of the second detection model is not lower than the second preset loss threshold and the number of training rounds of the second detection model has not reached the second preset round number threshold.
  14. An apparatus for determining an object and its key points in an image, comprising:
    an extraction unit, configured to perform feature extraction on an image to be detected to obtain image features corresponding to the image to be detected; and
    a determination unit, configured to determine, according to the image features corresponding to the image to be detected, position information of a target object on the image to be detected and key point position information of the target object, wherein the key points of the target object are used to characterize structural features of the target object.
  15. A device, comprising a processor and a memory, wherein:
    the memory is configured to store a computer program; and
    the processor is configured to execute, according to the computer program, the method according to any one of claims 1-13.
  16. A computer-readable storage medium, wherein the computer-readable storage medium is configured to store a computer program, and the computer program is used to execute the method according to any one of claims 1-13.
PCT/CN2020/102518 2019-10-09 2020-07-17 Method and apparatus for determining object and key points thereof in image WO2021068589A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910954556.7 2019-10-09
CN201910954556.7A CN110765898B (en) 2019-10-09 2019-10-09 Method and device for determining object and key point thereof in image

Publications (1)

Publication Number Publication Date
WO2021068589A1 true WO2021068589A1 (en) 2021-04-15

Family

ID=69331015

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/102518 WO2021068589A1 (en) 2019-10-09 2020-07-17 Method and apparatus for determining object and key points thereof in image

Country Status (2)

Country Link
CN (1) CN110765898B (en)
WO (1) WO2021068589A1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765898B (en) * 2019-10-09 2022-11-22 东软睿驰汽车技术(沈阳)有限公司 Method and device for determining object and key point thereof in image

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108038469A (en) * 2017-12-27 2018-05-15 百度在线网络技术(北京)有限公司 Method and apparatus for detecting human body
CN108205655A (en) * 2017-11-07 2018-06-26 北京市商汤科技开发有限公司 A kind of key point Forecasting Methodology, device, electronic equipment and storage medium
CN108256431A (en) * 2017-12-20 2018-07-06 中车工业研究院有限公司 A kind of hand position identification method and device
CN108985259A (en) * 2018-08-03 2018-12-11 百度在线网络技术(北京)有限公司 Human motion recognition method and device
CN109598234A (en) * 2018-12-04 2019-04-09 深圳美图创新科技有限公司 Critical point detection method and apparatus
CN110765898A (en) * 2019-10-09 2020-02-07 东软睿驰汽车技术(沈阳)有限公司 Method and device for determining object and key point thereof in image

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107590482A (en) * 2017-09-29 2018-01-16 百度在线网络技术(北京)有限公司 information generating method and device
CN108121951A (en) * 2017-12-11 2018-06-05 北京小米移动软件有限公司 Characteristic point positioning method and device
CN108121952B (en) * 2017-12-12 2022-03-08 北京小米移动软件有限公司 Face key point positioning method, device, equipment and storage medium
EP3547211B1 (en) * 2018-03-30 2021-11-17 Naver Corporation Methods for training a cnn and classifying an action performed by a subject in an inputted video using said cnn
CN109460704B (en) * 2018-09-18 2020-09-15 厦门瑞为信息技术有限公司 Fatigue detection method and system based on deep learning and computer equipment
CN109617909B (en) * 2019-01-07 2021-04-27 福州大学 Malicious domain name detection method based on SMOTE and BI-LSTM network
CN109934129B (en) * 2019-02-27 2023-05-30 嘉兴学院 Face feature point positioning method, device, computer equipment and storage medium


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115083020A (en) * 2022-07-22 2022-09-20 海易科技(北京)有限公司 Information generation method and device, electronic equipment and computer readable medium
CN115083020B (en) * 2022-07-22 2022-11-01 海易科技(北京)有限公司 Information generation method and device, electronic equipment and computer readable medium

Also Published As

Publication number Publication date
CN110765898B (en) 2022-11-22
CN110765898A (en) 2020-02-07


Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 20874210; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: PCT application non-entry in European phase
    Ref document number: 20874210; Country of ref document: EP; Kind code of ref document: A1