CN111523452B - Method and device for detecting human body position in image - Google Patents

Method and device for detecting human body position in image

Info

Publication number
CN111523452B
Authority
CN
China
Prior art keywords
human body
regression
image
position coordinates
body frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010321816.XA
Other languages
Chinese (zh)
Other versions
CN111523452A (en)
Inventor
钟东宏
袁宇辰
孙昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010321816.XA
Publication of CN111523452A
Application granted
Publication of CN111523452B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method and a device for detecting the position of a human body in an image, relating to the field of artificial intelligence. The method comprises: determining, from an image to be detected that includes a human body image, the position coordinates of the human body frame corresponding to the human body image; based on a learned evaluation threshold vector, performing multiple regression calculations on the position coordinates of the human body frame using a frame regression loss function, to obtain a regression offset parameter set corresponding to the evaluation threshold vector; performing multidimensional comparison analysis on each group of regression offset parameters in the regression offset parameter set to determine an optimal regression offset parameter group; and generating the final position coordinates of the human body frame based on the optimal regression offset parameter group. The scheme improves the accuracy of the detected position coordinates of the human body frame, and therefore the accuracy of the human body detection result.

Description

Method and device for detecting human body position in image
Technical Field
The embodiments of the application relate to the technical field of computers, in particular to the field of artificial intelligence, and more particularly to a method and a device for detecting the position of a human body in an image.
Background
With the continuous development of the internet and artificial intelligence technology, more and more fields involve automated computing and analysis, and human body detection for security monitoring is one of the important scenarios.
In the security video field, the human body frame produced by common human body detection sometimes deviates from the actual target: the frame overlaps the target but does not cover the whole human body. When such an inaccurate detection frame is used for subsequent tasks (such as classification and tracking), further errors are often introduced. If the detection frame covers the target human body better, the effect of the whole service can be greatly improved.
Disclosure of Invention
The application provides a method, a device, equipment and a storage medium for detecting the position of a human body in an image.
According to a first aspect, the present application provides a method for detecting the position of a human body in an image, the method comprising: determining, from an image to be detected that includes a human body image, the position coordinates of the human body frame corresponding to the human body image; based on a learned evaluation threshold vector, performing multiple regression calculations on the position coordinates of the human body frame using a frame regression loss function, to obtain a regression offset parameter set corresponding to the evaluation threshold vector; performing multidimensional comparison analysis on each group of regression offset parameters in the regression offset parameter set to determine an optimal regression offset parameter group; and generating the final position coordinates of the human body frame based on the optimal regression offset parameter group.
In some embodiments, performing multiple regression calculations on the position coordinates of the human body frame using the frame regression loss function, based on the learned evaluation threshold vector, to obtain the regression offset parameter set corresponding to the evaluation threshold vector includes: based on each evaluation threshold in the learned evaluation threshold vector, performing a cascade regression calculation on the position coordinates of the human body frame using the frame regression loss function, to obtain the regression offset parameter set corresponding to the evaluation threshold vector, wherein the evaluation threshold vector is an ordered set of evaluation thresholds for the image overlap degree (IOU), and the values in the ordered set increase from front to back.
In some embodiments, performing multiple regression calculations on the position coordinates of the human body frame using the frame regression loss function, based on the learned evaluation threshold vector, to obtain the regression offset parameter set corresponding to the evaluation threshold vector includes: based on the learned evaluation threshold vector, performing a cascade regression calculation on the position coordinates of the human body frame using the frame regression loss function, to obtain the regression offset parameter set corresponding to the evaluation threshold vector; wherein the evaluation threshold vector represents the initial evaluation threshold and the cascade step length required by the cascade regression calculation, and each regression calculation in the cascade is performed based on the group of regression offset parameters obtained by the previous regression calculation.
In some embodiments, the method further comprises: obtaining, based on each group of regression offset parameters in the regression offset parameter set, the position coordinates of the human body frame corresponding to that group; and comparing the position coordinates of the human body frame corresponding to each group with a preset human body frame to determine the final position coordinates of the human body frame, wherein the final position coordinates are the position coordinates closest to the preset human body frame.
In some embodiments, determining, from the image to be detected that includes a human body image, the position coordinates of the human body frame corresponding to the human body image includes: detecting the image to be detected with a trained deep network model to obtain the position coordinates of the human body frame corresponding to the human body image in the image to be detected, wherein the deep network model is trained by varying the evaluation threshold, taken from the evaluation threshold vector, in the frame regression loss function.
In some embodiments, the deep network model is trained by the following steps: obtaining a training sample set, wherein the training samples in the training sample set comprise images to be detected of various categories; and, using a deep learning method, taking the images to be detected of various categories in the training sample set as the input of the detection network, and the position coordinates of the human body frames of the human body images in the input images together with the learned evaluation threshold vector as the output, training to obtain the deep network model, wherein the learning target of the evaluation threshold vector is to make the output position coordinates of the human body frame approach the position coordinates of the real human body frame.
In a second aspect, the present application provides an apparatus for detecting a position of a human body in an image, the apparatus comprising: a human body position determining unit configured to determine position coordinates of a human body frame corresponding to a human body image from an image to be detected including the human body image; the regression offset calculation unit is configured to perform multiple regression calculation on the position coordinates of the human body frame corresponding to the human body image in the image to be detected by utilizing the frame regression loss function based on the learned evaluation threshold vector, so as to obtain a regression offset parameter set corresponding to the evaluation threshold vector; the offset parameter analysis unit is configured to carry out multidimensional comparison analysis on each group of regression offset parameters in the regression offset parameter set to determine an optimal regression offset parameter set; and the first position generating unit is configured to generate the position coordinates of the final human body frame based on the optimal regression offset parameter set.
In some embodiments, the regression offset calculation unit includes: the first offset calculation module is configured to perform cascade regression calculation on position coordinates of a human body frame corresponding to a human body image in an image to be detected by using a frame regression loss function based on each evaluation threshold value in the learned evaluation threshold value vector to obtain a regression offset parameter set corresponding to the evaluation threshold value vector, wherein the evaluation threshold value vector is an ordered value set of evaluation threshold values representing the image overlapping degree IOU, and the numerical values in the ordered value set gradually increase from front to back.
In some embodiments, the regression offset calculation unit includes: the second offset calculation module is configured to perform cascade regression calculation by using a frame regression loss function based on the learned evaluation threshold vector and the position coordinates of the frame of the human body corresponding to the human body image in the image to be detected, so as to obtain a regression offset parameter set corresponding to the evaluation threshold vector; the evaluation threshold vector is used for representing an initial evaluation threshold and a cascade step length required by cascade regression calculation, and each regression calculation in the cascade regression calculation is performed based on a regression offset parameter set obtained by the last regression calculation.
In some embodiments, the apparatus further comprises: the human body position calculation unit is configured to obtain the position coordinates of the human body frame corresponding to each group of regression offset parameters based on each group of regression offset parameters in the regression offset parameter group set; the second position generating unit is configured to compare the position coordinates of the human body frame corresponding to each group of regression offset parameters with the preset human body frame respectively, and determine the position coordinates of the final human body frame, wherein the position coordinates of the final human body frame are the position coordinates representing the nearest preset human body frame.
In some embodiments, the human body position determining unit includes: the human body position detection module is configured to detect an image to be detected by using a depth network model obtained through training to obtain position coordinates of a human body frame corresponding to a human body image in the image to be detected, wherein the depth network model is obtained through training by changing an evaluation threshold value in an evaluation threshold value vector in a frame regression loss function.
In some embodiments, the depth network model of the human body position detection module is trained based on the following modules: a sample acquisition module configured to acquire a training sample set, wherein training samples in the training sample set comprise: various categories of images to be detected; the sample training module is configured to output the position coordinates of the human body frame of the human body image in the input various types of images to be detected and the learned evaluation threshold vector by taking various types of images to be detected included in the training sample set as the input of the detection network by using a deep learning method, and train to obtain a deep network model, wherein the learning target of the evaluation threshold vector is to enable the position coordinates of the output human body frame to approach the position coordinates of the real human body frame.
In a third aspect, the present application provides an electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described in any one of the implementations of the first aspect.
In a fourth aspect, the application provides a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform a method as described in any of the implementations of the first aspect.
According to the technology of the application, the position coordinates of the human body frame corresponding to the human body image are determined from the image to be detected. Based on the learned evaluation threshold vector, multiple regression calculations are performed on those position coordinates using the frame regression loss function to obtain the regression offset parameter set corresponding to the evaluation threshold vector. Multidimensional comparison analysis of each group of regression offset parameters in the set determines the optimal regression offset parameter group, and the final position coordinates of the human body frame are generated from it. This optimizes the human body detection algorithm and addresses the problem in the prior art that performing only one regression calculation with the frame regression loss function predicts the position coordinates of the human body frame with insufficient accuracy. The accuracy of the detected position coordinates of the human body frame is improved, and the human body detection result is more accurate.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the application or to delineate the scope of the application. Other features of the present application will become apparent from the description that follows.
Drawings
The drawings are included to provide a better understanding of the present application and are not to be construed as limiting the application.
FIG. 1 is a schematic diagram of a first embodiment of a method for detecting a position of a person in an image according to the present application;
FIG. 2 is a scene diagram of a method for detecting a position of a person in an image in which embodiments of the application may be implemented;
FIG. 3 is a foreground interactive schematic interface corresponding to a background for performing the method of the application for detecting a position of a person in an image;
FIG. 4 is a schematic diagram of a second embodiment of a method for detecting a position of a person in an image according to the present application;
FIG. 5 is a schematic structural view of one embodiment of an apparatus for detecting a position of a human body in an image according to the present application;
FIG. 6 is a block diagram of an electronic device for implementing a method for detecting a position of a person in an image according to an embodiment of the application.
Detailed Description
Exemplary embodiments of the present application will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
Fig. 1 shows a schematic diagram 100 of a first embodiment of a method for detecting a position of a person in an image according to the application. The method for detecting the position of the human body in the image comprises the following steps:
Step 101, determining the position coordinates of the human body frame corresponding to the human body image from the image to be detected including the human body image.
In this embodiment, the execution body determines, according to a preset frame position algorithm, the position coordinates of the human body frame corresponding to the human body image in the image to be detected.
Step 102, based on the learned evaluation threshold vector, performing multiple regression calculation on the position coordinates of the human body frame corresponding to the human body image in the image to be detected by using the frame regression loss function, and obtaining a regression offset parameter set corresponding to the evaluation threshold vector.
In this embodiment, the executing body sequentially performs regression calculation on the position coordinates of the human body frame corresponding to the human body image in the image to be detected according to each evaluation threshold value in the learned evaluation threshold value vector by using the frame regression loss function, so as to obtain a regression offset parameter set composed of regression calculation results. The parameters in the regression offset parameter set may include an offset of the human body frame center point coordinates and a scaling ratio of the human body frame.
Step 103, performing multidimensional comparison analysis on each group of regression offset parameters in the regression offset parameter set to determine the optimal regression offset parameter group.
In this embodiment, multidimensional comparison analysis is performed on each group of regression offset parameters obtained by the regression calculations, and the optimal regression offset parameter group is determined. The multidimensional comparison may include comparing each group of regression offset parameters with the corresponding parameters of a standard frame, or comparing the different groups of regression offset parameters with one another.
Step 104, generating the final position coordinates of the human body frame based on the optimal regression offset parameter group.
In this embodiment, the position coordinates of the human body frame corresponding to the optimal regression offset parameter group are used as the final position coordinates of the human body frame.
It should be noted that the above-mentioned frame regression loss function is a well-known technique widely studied and applied at present, and will not be described here.
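The text does not say which frame regression loss function is used. As an illustration only, the sketch below shows Smooth L1, one widely used loss for bounding-box regression. Choosing Smooth L1, the `beta` parameter, and the function names are assumptions made for this example and are not taken from the application.

```python
from typing import Sequence


def smooth_l1(pred: float, target: float, beta: float = 1.0) -> float:
    """Smooth L1 penalty for a single regression parameter.

    Quadratic for small errors (|diff| < beta), linear for large ones.
    """
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


def frame_regression_loss(
    pred_offsets: Sequence[float],
    target_offsets: Sequence[float],
    beta: float = 1.0,
) -> float:
    """Sum the per-parameter penalties over one group of offset parameters."""
    return sum(smooth_l1(p, t, beta) for p, t in zip(pred_offsets, target_offsets))
```

The quadratic region keeps gradients small near the target, while the linear region limits the influence of badly placed frames.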
With continued reference to FIG. 2, the method 200 for detecting a position of a person in an image of the present embodiment runs in an electronic device 201. In the field of security monitoring, the electronic device 201 used for monitoring first determines the position coordinates 202 of the human body frame corresponding to the human body image from the image to be detected. Based on the learned evaluation threshold vector, it then performs multiple regression calculations on those position coordinates using the frame regression loss function to obtain the regression offset parameter set 203 corresponding to the evaluation threshold vector. Based on the optimal regression offset parameter group, it generates the final position coordinates 204 of the human body frame, determines from those coordinates whether the person has entered the monitored area 205, and sends the monitoring result to the monitoring personnel by voice or text. The information received is shown in FIG. 3.
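The last decision in this scenario, whether the person has entered the monitored area, can be sketched as a rectangle overlap test. The text does not describe how the area is represented or which criterion is applied, so the axis-aligned rectangle, the overlap criterion, and all names below are assumptions for illustration.

```python
from typing import Tuple

# (x1, y1, x2, y2) with x1 < x2 and y1 < y2, in pixel coordinates
Rect = Tuple[float, float, float, float]


def overlaps(frame: Rect, area: Rect) -> bool:
    """Return True when the human body frame and the monitored area intersect."""
    fx1, fy1, fx2, fy2 = frame
    ax1, ay1, ax2, ay2 = area
    return fx1 < ax2 and ax1 < fx2 and fy1 < ay2 and ay1 < fy2


def monitoring_message(frame: Rect, area: Rect) -> str:
    """Build the text that would be sent as the monitoring result."""
    if overlaps(frame, area):
        return "Person detected inside the monitored area"
    return "No person inside the monitored area"
```

A stricter criterion, such as requiring the bottom edge of the frame to lie inside the area, could be substituted without changing the rest of the flow.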
In the method for detecting the position of a human body in an image provided by this embodiment, the position coordinates of the human body frame corresponding to the human body image are determined from the image to be detected. Based on the learned evaluation threshold vector, multiple regression calculations are performed on those position coordinates using the frame regression loss function to obtain the regression offset parameter set corresponding to the evaluation threshold vector. Multidimensional comparison analysis of each group of regression offset parameters in the set determines the optimal regression offset parameter group, from which the final position coordinates of the human body frame are generated. This optimizes the human body detection algorithm and addresses the problem in the prior art that performing only one regression calculation with the frame regression loss function predicts the position coordinates of the human body frame with insufficient accuracy. The accuracy of the detected position coordinates of the human body frame is improved, and the human body detection result is more accurate.
With further reference to fig. 4, a schematic diagram 400 of a second embodiment of a method for detecting a position of a person in an image is shown. The flow of the method comprises the following steps:
Step 401, detecting the image to be detected with the trained deep network model to obtain the position coordinates of the human body frame corresponding to the human body image in the image to be detected.
In this embodiment, the image to be detected is input into the detection network of the deep network model to obtain the position coordinates of the human body frame corresponding to the human body image in the image to be detected. The deep network model is trained by varying the evaluation threshold, taken from the evaluation threshold vector, in the frame regression loss function. Detecting the position coordinates of the human body frame with deep learning makes the detection result more accurate.
In some optional implementations of the present embodiment, the deep network model is trained by the following steps: obtaining a training sample set, wherein the training samples in the training sample set comprise images to be detected of various categories; and, using a deep learning method, taking the images to be detected of various categories in the training sample set as the input of the detection network, and the position coordinates of the human body frames of the human body images in the input images together with the learned evaluation threshold vector as the output, training to obtain the deep network model, wherein the learning target of the evaluation threshold vector is to make the output position coordinates of the human body frame approach the position coordinates of the real human body frame. Learning the evaluation threshold vector with the deep model avoids the prediction errors in the position coordinates of the human body frame that arise when model training judges samples only against a fixed evaluation threshold of 0.5. The evaluation threshold vector is therefore more accurate, and so is the detection of the position coordinates of the human body frame.
Step 402, based on each evaluation threshold in the learned evaluation threshold vector, performing cascade regression calculation respectively by using the frame regression loss function on the position coordinates of the frame of the human body corresponding to the human body image in the image to be detected, so as to obtain a regression offset parameter set corresponding to the evaluation threshold vector.
In this embodiment, a cascade regression calculation is performed on the position coordinates of the human body frame corresponding to the human body image in the image to be detected by changing the evaluation threshold, taken from the evaluation threshold vector, in the frame regression loss function, so as to obtain the regression offset parameter set corresponding to the evaluation threshold vector. The evaluation threshold vector is an ordered set of evaluation thresholds for the image overlap degree (IOU), and the values in the ordered set increase from front to back. The cascade regression yields groups of regression offset parameters that come progressively closer to the position coordinates of the real human body frame, which improves the efficiency of the regression.
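To make the role of an increasing IOU threshold concrete, the sketch below computes the overlap degree between two frames and lists which candidate frames still qualify at each threshold. The corner-format frames, the function names, and the threshold values used in the usage note are assumptions chosen for illustration; the application does not give concrete values for the learned vector.

```python
from typing import List, Sequence, Tuple

Frame = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Frame, b: Frame) -> float:
    """Intersection over union of two axis-aligned frames."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_increasing(thresholds: Sequence[float]) -> bool:
    """Check that the threshold vector increases from front to back."""
    return all(x < y for x, y in zip(thresholds, thresholds[1:]))


def qualifying_indices(
    candidates: Sequence[Frame], reference: Frame, thresholds: Sequence[float]
) -> List[List[int]]:
    """For each threshold, list the candidates whose IOU with the reference meets it."""
    overlaps = [iou(c, reference) for c in candidates]
    return [[i for i, o in enumerate(overlaps) if o >= t] for t in thresholds]
```

With an illustrative vector such as `[0.5, 0.6, 0.7]`, each later threshold admits fewer but better-aligned frames, which is the property the cascade relies on.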
In some optional implementations of the present embodiment, performing multiple regression calculations on the position coordinates of the human body frame using the frame regression loss function, based on the learned evaluation threshold vector, to obtain the regression offset parameter set corresponding to the evaluation threshold vector includes: based on the learned evaluation threshold vector, performing a cascade regression calculation on the position coordinates of the human body frame corresponding to the human body image in the image to be detected using the frame regression loss function, to obtain the regression offset parameter set corresponding to the evaluation threshold vector; wherein the evaluation threshold vector represents the initial evaluation threshold and the cascade step length required by the cascade regression calculation, and each regression calculation in the cascade is performed based on the group of regression offset parameters obtained by the previous regression calculation. The cascade regression yields groups of regression offset parameters that come progressively closer to the position coordinates of the real human body frame, which improves the efficiency of the regression.
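A minimal sketch of this variant follows, assuming the threshold vector is expanded from an initial threshold and a step length, and that each stage receives the offset group from the previous stage. The number of stages, the identity starting offsets, and the `toy_regressor` stand-in are hypothetical. In the described method each stage would be a trained regressor, not the fixed rule used here.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Tuple[float, float, float, float]    # (cx, cy, w, h)
Offsets = Tuple[float, float, float, float]  # (dx, dy, scale_w, scale_h)
Regressor = Callable[[Frame, Offsets, float], Offsets]

IDENTITY: Offsets = (0.0, 0.0, 1.0, 1.0)  # no shift, no scaling


def thresholds_from(initial: float, step: float, stages: int) -> List[float]:
    """Expand (initial threshold, cascade step) into an increasing vector."""
    return [round(initial + i * step, 6) for i in range(stages)]


def cascade_regress(
    frame: Frame, thresholds: Sequence[float], regress: Regressor
) -> List[Offsets]:
    """Run one regression per threshold, feeding each stage the previous offsets."""
    groups: List[Offsets] = []
    previous = IDENTITY
    for threshold in thresholds:
        previous = regress(frame, previous, threshold)
        groups.append(previous)
    return groups


def toy_regressor(frame: Frame, previous: Offsets, threshold: float) -> Offsets:
    """Stand-in for a trained stage: move the offsets part-way to a fixed target."""
    target = (10.0, 0.0, 1.0, 1.0)  # hypothetical ideal offsets
    return tuple(p + threshold * (t - p) for p, t in zip(previous, target))


groups = cascade_regress((50.0, 50.0, 20.0, 40.0), thresholds_from(0.5, 0.1, 3), toy_regressor)
```

The returned list holds one offset group per threshold, which is the regression offset parameter set that the later comparison step selects from.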
Step 403, obtaining the position coordinates of the human body frame corresponding to each group of regression offset parameters based on each group of regression offset parameters in the regression offset parameter group set.
In some embodiments, the position coordinates of the human body frame corresponding to each set of regression offset parameters are obtained based on each set of regression offset parameters in the regression offset parameter set, where the parameters in the regression offset parameter set include an offset of the human body frame center point coordinates and a scaling ratio of the human body frame.
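Turning one group of regression offset parameters back into frame coordinates can be sketched as below. The text states only that a group contains a center-point offset and a scaling ratio. Treating the offset as an additive pixel shift and the ratio as a multiplicative factor on width and height is an assumption made for this example, as are the corner-format frame and the function name.

```python
from typing import Tuple

Frame = Tuple[float, float, float, float]    # (x1, y1, x2, y2)
Offsets = Tuple[float, float, float, float]  # (dx, dy, scale_w, scale_h)


def apply_offsets(frame: Frame, offsets: Offsets) -> Frame:
    """Shift the frame center by (dx, dy) and scale its width and height."""
    x1, y1, x2, y2 = frame
    dx, dy, scale_w, scale_h = offsets
    width, height = x2 - x1, y2 - y1
    # Move the center point by the regressed offset.
    cx = x1 + width / 2.0 + dx
    cy = y1 + height / 2.0 + dy
    # Resize around the new center.
    width, height = width * scale_w, height * scale_h
    return (cx - width / 2.0, cy - height / 2.0, cx + width / 2.0, cy + height / 2.0)
```

Detectors in the R-CNN family often normalize the shift by the frame size and regress the logarithm of the scale instead. That parameterization could replace the body of `apply_offsets` without affecting its callers.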
Step 404, comparing the position coordinates of the human body frame corresponding to each group of regression offset parameters with the preset human body frame, and determining the final position coordinates of the human body frame.
In this embodiment, the position coordinates of the human body frame corresponding to each group of regression offset parameters are compared with the preset human body frame to determine the final position coordinates of the human body frame, which also verifies the result of the cascade regression calculation. The final position coordinates of the human body frame are the position coordinates closest to the preset human body frame, and the preset human body frame may be set manually based on the position coordinates of the real human body frame. Verifying the cascade regression calculation improves the detection precision of the position coordinates of the human body frame.
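The comparison in step 404 can be sketched as a nearest-frame selection. The text does not say how closeness is measured, so the sum of absolute coordinate differences used here is an assumption; an overlap measure such as IOU would be an equally plausible choice. The function names are hypothetical.

```python
def coordinate_distance(box, preset_box):
    """Sum of absolute differences between corresponding coordinates."""
    return sum(abs(a - b) for a, b in zip(box, preset_box))


def closest_to_preset(candidate_boxes, preset_box):
    """Return (index, box) of the candidate frame nearest the preset frame."""
    if not candidate_boxes:
        raise ValueError("at least one candidate frame is required")
    best_index = min(
        range(len(candidate_boxes)),
        key=lambda i: coordinate_distance(candidate_boxes[i], preset_box),
    )
    return best_index, candidate_boxes[best_index]
```

Returning the index alongside the frame makes it possible to look up which group of regression offset parameters produced the selected frame.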
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 1, the schematic diagram 400 of the method for detecting the position of a human body in an image in this embodiment learns the evaluation threshold vector by using the depth model. This avoids the errors in predicting the position coordinates of the human body frame that arise when model training judges only against a fixed evaluation threshold of 0.5, so the evaluation threshold vector is more accurate and the detection of the position coordinates of the human body frame is more accurate. In addition, the cascade regression operation yields regression offset parameters that come progressively closer to the position coordinates of the real human body frame, and the result of the cascade regression operation is verified, which further improves the accuracy of detecting the position coordinates of the human body frame.
With further reference to fig. 5, as an implementation of the method shown in the above figures, the present application provides an embodiment of an apparatus for detecting a position of a human body in an image, which corresponds to the embodiment of the method shown in fig. 1, and which is particularly applicable to various electronic devices.
As shown in fig. 5, the apparatus 500 for detecting a position of a human body in an image according to the present embodiment includes: a human body position determining unit 501, a regression offset calculating unit 502, an offset parameter analyzing unit 503, and a first position generating unit 504. The human body position determining unit is configured to determine the position coordinates of a human body frame corresponding to the human body image from the image to be detected comprising the human body image. The regression offset calculating unit is configured to perform, based on the learned evaluation threshold vector, multiple regression calculation on the position coordinates of the human body frame corresponding to the human body image in the image to be detected by using the frame regression loss function, so as to obtain a regression offset parameter set corresponding to the evaluation threshold vector. The offset parameter analyzing unit is configured to perform multidimensional comparison analysis on each group of regression offset parameters in the regression offset parameter set to determine an optimal regression offset parameter set. The first position generating unit is configured to generate the final position coordinates of the human body frame based on the optimal regression offset parameter set.
In this embodiment, the specific processes and the technical effects of the human body position determining unit 501, the regression offset calculating unit 502, the offset parameter analyzing unit 503 and the first position generating unit 504 of the apparatus 500 for detecting the human body position in the image can refer to the relevant descriptions of the steps 101 to 104 in the corresponding embodiment of fig. 1, and are not repeated here.
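The division of the apparatus 500 into four units can be pictured as four pluggable callables chained in order. The following sketch is only one possible arrangement; the class name `HumanBoxDetector`, the field names, and the callable signatures are assumptions introduced for illustration and do not come from this application.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]
Offsets = Tuple[float, float, float, float]


@dataclass
class HumanBoxDetector:
    # Counterpart of the human body position determining unit 501.
    determine_box: Callable[[object], Box]
    # Counterpart of the regression offset calculating unit 502.
    compute_offsets: Callable[[Box, Sequence[float]], List[Offsets]]
    # Counterpart of the offset parameter analyzing unit 503.
    pick_best_offsets: Callable[[List[Offsets]], Offsets]
    # Counterpart of the first position generating unit 504.
    generate_box: Callable[[Box, Offsets], Box]

    def detect(self, image: object, thresholds: Sequence[float]) -> Box:
        """Chain the four units: initial frame, offsets, best group, final frame."""
        box = self.determine_box(image)
        groups = self.compute_offsets(box, thresholds)
        best = self.pick_best_offsets(groups)
        return self.generate_box(box, best)
```

Keeping each unit behind a plain callable means an individual stage, for example the analysis that picks the optimal group, can be swapped without touching the others.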
In some optional implementations of the present embodiment, the regression offset calculation unit includes: the first offset calculation module is configured to perform cascade regression calculation on position coordinates of a human body frame corresponding to a human body image in an image to be detected by using a frame regression loss function based on each evaluation threshold value in the learned evaluation threshold value vector to obtain a regression offset parameter set corresponding to the evaluation threshold value vector, wherein the evaluation threshold value vector is an ordered value set of evaluation threshold values representing the image overlapping degree IOU, and the numerical values in the ordered value set gradually increase from front to back.
In some optional implementations of the present embodiment, the regression offset calculation unit includes: the second offset calculation module is configured to perform cascade regression calculation by using a frame regression loss function based on the learned evaluation threshold vector and the position coordinates of the frame of the human body corresponding to the human body image in the image to be detected, so as to obtain a regression offset parameter set corresponding to the evaluation threshold vector; the evaluation threshold vector is used for representing an initial evaluation threshold and a cascade step length required by cascade regression calculation, and each regression calculation in the cascade regression calculation is performed based on a regression offset parameter set obtained by the last regression calculation.
In some optional implementations of this embodiment, the apparatus further includes: a human body position calculation unit configured to obtain the position coordinates of the human body frame corresponding to each group of regression offset parameters in the regression offset parameter set; and a second position generating unit configured to compare the position coordinates of the human body frame corresponding to each group of regression offset parameters with the preset human body frame and determine the final position coordinates of the human body frame, wherein the final position coordinates of the human body frame are the position coordinates closest to the preset human body frame.
In some optional implementations of the present embodiment, the human body position determining unit includes: the human body position detection module is configured to detect an image to be detected by using a depth network model obtained through training to obtain position coordinates of a human body frame corresponding to a human body image in the image to be detected, wherein the depth network model is obtained through training by changing an evaluation threshold value in an evaluation threshold value vector in a frame regression loss function.
In some optional implementations of this embodiment, the depth network model of the human body position detection module is trained based on the following modules: a sample acquisition module configured to acquire a training sample set, wherein the training samples in the training sample set comprise various categories of images to be detected; and a sample training module configured to use a deep learning method to take the various categories of images to be detected included in the training sample set as the input of the detection network, take the position coordinates of the human body frame of the human body image in the input images and the learned evaluation threshold vector as the output, and train to obtain the depth network model, wherein the learning target of the evaluation threshold vector is to make the position coordinates of the output human body frame approach the position coordinates of the real human body frame.
According to an embodiment of the present application, the present application also provides an electronic device and a readable storage medium.
Fig. 6 is a block diagram of an electronic device for the method of detecting a position of a human body in an image according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 6, the electronic device includes: one or more processors 601, a memory 602, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories and multiple types of memory. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 601 is taken as an example in fig. 6.
The memory 602 is a non-transitory computer readable storage medium provided by the present application. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method for detecting a position of a person in an image provided by the present application. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the method for detecting a position of a human body in an image provided by the present application.
The memory 602 is used as a non-transitory computer readable storage medium, and may be used to store a non-transitory software program, a non-transitory computer executable program, and modules, such as program instructions/modules corresponding to a method for detecting a position of a human body in an image in an embodiment of the present application (e.g., the human body position determining unit 501, the regression offset calculating unit 502, the offset parameter analyzing unit 503, and the first position generating unit 504 shown in fig. 5). The processor 601 executes various functional applications of the server and data processing by running non-transitory software programs, instructions, and modules stored in the memory 602, that is, implements the method for detecting a position of a person in an image in the above-described method embodiment.
The memory 602 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for a function; the storage data area may store data created from the use of an electronic device for detecting the position of a person in an image, or the like. In addition, the memory 602 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 602 optionally includes memory remotely located relative to processor 601, which may be connected via a network to an electronic device for detecting the location of a person in an image. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for the method of detecting a position of a person in an image may further comprise: an input device 603 and an output device 604. The processor 601, memory 602, input device 603 and output device 604 may be connected by a bus or in other ways; connection by a bus is taken as an example in fig. 6.
The input device 603 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for detecting the position of a person in an image. Examples of such input devices include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, and a joystick. The output device 604 may include a display device, auxiliary lighting devices (e.g., LEDs), tactile feedback devices (e.g., vibration motors), and the like. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical scheme provided by the embodiments of the application, the position coordinates of the human body frame corresponding to the human body image are determined from the image to be detected comprising the human body image. Based on the learned evaluation threshold vector, multiple regression calculation is performed on those position coordinates by using the frame regression loss function, and a regression offset parameter set corresponding to the evaluation threshold vector is obtained. Multidimensional comparison analysis is then performed on each group of regression offset parameters in the regression offset parameter set to determine an optimal regression offset parameter set, and the final position coordinates of the human body frame are generated based on the optimal regression offset parameter set. This optimizes the human body detection algorithm and solves the problem in the prior art that the position coordinates of the human body frame are not accurate enough because only a single regression calculation is performed with the frame regression loss function. The accuracy of detecting the position coordinates of the human body frame is improved, and the detection result for the human body image is more accurate.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed embodiments are achieved, which is not limited herein.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims (12)

1. A method for detecting a position of a human body in an image, the method comprising:
determining the position coordinates of a human body frame corresponding to a human body image from an image to be detected comprising the human body image;
based on the learned evaluation threshold vector, performing multiple regression calculation on the position coordinates of the human body frame corresponding to the human body image in the image to be detected by utilizing a frame regression loss function to obtain a regression offset parameter set corresponding to the evaluation threshold vector, wherein parameters in the regression offset parameter set comprise offset of the center point coordinates of the human body frame and the scaling ratio of the human body frame;
performing multidimensional comparison analysis on each group of regression offset parameters in the regression offset parameter set to determine an optimal regression offset parameter set, wherein the multidimensional comparison analysis comprises comparison analysis based on the regression offset parameters of each group and corresponding parameters of a standard frame respectively;
generating final position coordinates of the human body frame based on the optimal regression offset parameter set;
the method further comprises the steps of:
based on each group of regression offset parameters in the regression offset parameter set, obtaining the position coordinates of the human body frame corresponding to each group of regression offset parameters;
and comparing the position coordinates of the human body frame corresponding to each group of regression offset parameters with a preset human body frame respectively, determining the final position coordinates of the human body frame, and verifying the cascade regression operation result, wherein the preset human body frame is set based on the position coordinates of the real human body frame, and the final position coordinates of the human body frame are the position coordinates closest to the preset human body frame.
2. The method of claim 1, wherein the performing, based on the learned evaluation threshold vector, multiple regression calculation on the position coordinates of the human body frame corresponding to the human body image in the image to be detected by using the frame regression loss function to obtain a regression offset parameter set corresponding to the evaluation threshold vector includes:
respectively performing, based on each evaluation threshold in the learned evaluation threshold vector, cascade regression calculation on the position coordinates of the human body frame corresponding to the human body image in the image to be detected by using the frame regression loss function to obtain a regression offset parameter set corresponding to the evaluation threshold vector, wherein the evaluation threshold vector is an ordered value set of evaluation thresholds representing the image overlapping degree IOU, and the values in the ordered value set gradually increase from front to back.
3. The method of claim 1, wherein the performing, based on the learned evaluation threshold vector, multiple regression calculation on the position coordinates of the human body frame corresponding to the human body image in the image to be detected by using the frame regression loss function to obtain a regression offset parameter set corresponding to the evaluation threshold vector includes:
based on the evaluation threshold vector obtained by learning, carrying out cascade regression calculation on the position coordinates of the human body frame corresponding to the human body image in the image to be detected by utilizing the frame regression loss function to obtain a regression offset parameter set corresponding to the evaluation threshold vector;
the evaluation threshold vector is used for representing an initial evaluation threshold and a cascade step length required by cascade regression calculation, and each regression calculation in the cascade regression calculation is performed based on a regression offset parameter set obtained by the last regression calculation.
4. The method of claim 1, wherein the determining, from the image to be detected including the human body image, the position coordinates of the human body frame corresponding to the human body image includes:
and detecting the image to be detected by using the depth network model obtained through training to obtain the position coordinates of the human body frame corresponding to the human body image in the image to be detected, wherein the depth network model is obtained through training by changing the evaluation threshold value in the evaluation threshold value vector in the frame regression loss function.
5. The method of claim 4, wherein the depth network model is trained based on:
obtaining a training sample set, wherein training samples in the training sample set comprise: various categories of images to be detected;
and outputting the position coordinates of the human body frames of the human body images in the input various types of images to be detected and the learned evaluation threshold vectors by using a deep learning method, wherein the learning target of the evaluation threshold vectors is to enable the position coordinates of the output human body frames to approach the position coordinates of the real human body frames.
6. An apparatus for detecting a position of a human body in an image, the apparatus comprising:
a human body position determining unit configured to determine position coordinates of a human body frame corresponding to a human body image from an image to be detected including the human body image;
the regression offset calculation unit is configured to perform multiple regression calculation on the position coordinates of the human body frame corresponding to the human body image in the image to be detected by utilizing a frame regression loss function based on the learned evaluation threshold vector to obtain a regression offset parameter set corresponding to the evaluation threshold vector, wherein parameters in the regression offset parameter set comprise offset of the center point coordinates of the human body frame and scaling ratio of the human body frame;
the deviation parameter analysis unit is configured to carry out multidimensional comparison analysis on each group of regression deviation parameters in the regression deviation parameter set to determine an optimal regression deviation parameter set, and the multidimensional comparison analysis comprises comparison analysis based on the regression deviation parameters of each group and corresponding parameters of a standard frame respectively;
the first position generating unit is configured to generate the position coordinates of the final human body frame based on the optimal regression offset parameter set;
the apparatus further comprises:
the human body position calculation unit is configured to obtain the position coordinates of the human body frame corresponding to each group of regression offset parameters based on each group of regression offset parameters in the regression offset parameter set;
the second position generating unit is configured to compare the position coordinates of the human body frame corresponding to each group of regression offset parameters with a preset human body frame respectively, determine the final position coordinates of the human body frame, and verify the cascade regression operation result, wherein the preset human body frame is set based on the position coordinates of the real human body frame, and the final position coordinates of the human body frame are the position coordinates closest to the preset human body frame.
7. The apparatus of claim 6, wherein the regression offset calculation unit comprises:
the first offset calculation module is configured to perform cascade regression calculation on position coordinates of a human body frame corresponding to a human body image in the image to be detected based on each evaluation threshold value in the learned evaluation threshold value vector, and obtain a regression offset parameter set corresponding to the evaluation threshold value vector by using the frame regression loss function, wherein the evaluation threshold value vector is an ordered value set of evaluation threshold values representing the image overlapping degree IOU, and the values in the ordered value set gradually increase from front to back.
8. The apparatus of claim 6, wherein the regression offset calculation unit comprises:
the second offset calculation module is configured to perform cascade regression calculation on the position coordinates of the human frame corresponding to the human body image in the image to be detected based on the learned evaluation threshold vector by using the frame regression loss function to obtain a regression offset parameter set corresponding to the evaluation threshold vector; the evaluation threshold vector is used for representing an initial evaluation threshold and a cascade step length required by cascade regression calculation, and each regression calculation in the cascade regression calculation is performed based on a regression offset parameter set obtained by the last regression calculation.
9. The apparatus of claim 6, wherein the human body position determining unit comprises:
the human body position detection module is configured to detect the image to be detected by using a depth network model obtained through training to obtain the position coordinates of a human body frame corresponding to the human body image in the image to be detected, wherein the depth network model is obtained through training by changing an evaluation threshold value in an evaluation threshold value vector in a frame regression loss function.
10. The apparatus of claim 9, wherein the depth network model of the human position detection module is trained based on:
a sample acquisition module configured to acquire a training sample set, wherein training samples in the training sample set comprise: various categories of images to be detected;
the sample training module is configured to use a deep learning method to take the various categories of images to be detected as input of a detection network, take the position coordinates of the human body frame of the human body image in the input images and the learned evaluation threshold vector as output, and train to obtain the depth network model, wherein the learning target of the evaluation threshold vector is to make the position coordinates of the output human body frame approach the position coordinates of the real human body frame.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-5.
CN202010321816.XA 2020-04-22 2020-04-22 Method and device for detecting human body position in image Active CN111523452B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010321816.XA CN111523452B (en) 2020-04-22 2020-04-22 Method and device for detecting human body position in image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010321816.XA CN111523452B (en) 2020-04-22 2020-04-22 Method and device for detecting human body position in image

Publications (2)

Publication Number Publication Date
CN111523452A CN111523452A (en) 2020-08-11
CN111523452B true CN111523452B (en) 2023-08-25

Family

ID=71910458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010321816.XA Active CN111523452B (en) 2020-04-22 2020-04-22 Method and device for detecting human body position in image

Country Status (1)

Country Link
CN (1) CN111523452B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107492067A (en) * 2017-09-07 2017-12-19 维沃移动通信有限公司 A kind of image beautification method and mobile terminal
CN109003267A (en) * 2017-08-09 2018-12-14 深圳科亚医疗科技有限公司 From the computer implemented method and system of the automatic detected target object of 3D rendering
CN109241914A (en) * 2018-09-11 2019-01-18 广州广电银通金融电子科技有限公司 A kind of Small object pedestrian detection method under complex scene
CN109815814A (en) * 2018-12-21 2019-05-28 天津大学 A kind of method for detecting human face based on convolutional neural networks
CN109858481A (en) * 2019-01-09 2019-06-07 杭州电子科技大学 A kind of Ship Target Detection method based on the detection of cascade position sensitivity
CN109886286A (en) * 2019-01-03 2019-06-14 武汉精测电子集团股份有限公司 Object detection method, target detection model and system based on cascade detectors
CN109961006A (en) * 2019-01-30 2019-07-02 东华大学 A kind of low pixel multiple target Face detection and crucial independent positioning method and alignment schemes
CN110059667A (en) * 2019-04-28 2019-07-26 上海应用技术大学 Pedestrian counting method
CN110298226A (en) * 2019-04-03 2019-10-01 复旦大学 A kind of cascade detection method of millimeter-wave image human body belongings
CN110490052A (en) * 2019-07-05 2019-11-22 山东大学 Face detection and face character analysis method and system based on cascade multi-task learning
CN110580445A (en) * 2019-07-12 2019-12-17 西北工业大学 Face key point detection method based on GIoU and weighted NMS improvement
CN110705583A (en) * 2019-08-15 2020-01-17 平安科技(深圳)有限公司 Cell detection model training method and device, computer equipment and storage medium
CN110992311A (en) * 2019-11-13 2020-04-10 华南理工大学 Convolutional neural network flaw detection method based on feature fusion
CN111008576A (en) * 2019-11-22 2020-04-14 高创安邦(北京)技术有限公司 Pedestrian detection and model training and updating method, device and readable storage medium thereof

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
JP2016006626A (en) * 2014-05-28 2016-01-14 株式会社デンソーアイティーラボラトリ Detector, detection program, detection method, vehicle, parameter calculation device, parameter calculation program, and parameter calculation method
US10032067B2 (en) * 2016-05-28 2018-07-24 Samsung Electronics Co., Ltd. System and method for a unified architecture multi-task deep learning machine for object recognition

Non-Patent Citations (1)

Title
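Improved multi-target regression real-time face detection algorithm; Wu Zhiyang et al.; Computer Engineering and Applications (《计算机工程与应用》); 2018-05-31; Vol. 54, No. 11; pp. 1-7 *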

Also Published As

Publication number Publication date
CN111523452A (en) 2020-08-11

Similar Documents

Publication Publication Date Title
CN111986178B (en) Product defect detection method, device, electronic equipment and storage medium
US20220383535A1 (en) Object Tracking Method and Device, Electronic Device, and Computer-Readable Storage Medium
CN107766839B (en) Motion recognition method and device based on 3D convolutional neural network
CN111598164B (en) Method, device, electronic equipment and storage medium for identifying attribute of target object
KR20220113829A (en) Vehicle tracking methods, devices and electronic devices
CN112270399B (en) Operator registration processing method and device based on deep learning and electronic equipment
CN110659600B (en) Object detection method, device and equipment
CN112966742A (en) Model training method, target detection method and device and electronic equipment
CN112529073A (en) Model training method, attitude estimation method and apparatus, and electronic device
CN111968203B (en) Animation driving method, device, electronic equipment and storage medium
CN111783606B (en) Training method, device, equipment and storage medium of face recognition network
US20210312799A1 (en) Detecting traffic anomaly event
CN111783621A (en) Method, device, equipment and storage medium for facial expression recognition and model training
CN111738072A (en) Training method and device of target detection model and electronic equipment
CN111783605A (en) Face image recognition method, device, equipment and storage medium
CN111563541B (en) Training method and device of image detection model
JP7270114B2 (en) Face keypoint detection method, device and electronic device
CN110717933B (en) Post-processing method, device, equipment and medium for moving object missed detection
CN112149741B (en) Training method and device for image recognition model, electronic equipment and storage medium
CN112241716B (en) Training sample generation method and device
CN116228867B (en) Pose determination method, pose determination device, electronic equipment and medium
CN111832611B (en) Training method, device, equipment and storage medium for animal identification model
CN112561879A (en) Ambiguity evaluation model training method, image ambiguity evaluation method and device
CN112749701B (en) License plate offset classification model generation method and license plate offset classification method
CN112016523B (en) Cross-modal face recognition method, device, equipment and storage medium

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant