CN114219978A - Target multi-part association method and device, terminal and computer-readable storage medium - Google Patents

Target multi-part association method and device, terminal and computer-readable storage medium

Info

Publication number
CN114219978A
CN114219978A (application CN202111362327.XA)
Authority
CN
China
Prior art keywords
processed
target
parts
features
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111362327.XA
Other languages
Chinese (zh)
Other versions
CN114219978B (en)
Inventor
于润润
潘华东
殷俊
李中振
巩海军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202111362327.XA priority Critical patent/CN114219978B/en
Publication of CN114219978A publication Critical patent/CN114219978A/en
Application granted granted Critical
Publication of CN114219978B publication Critical patent/CN114219978B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F18/253 — Pattern recognition; fusion techniques of extracted features
    • G06F18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/22 — Matching criteria, e.g. proximity measures
    • G06F18/2414 — Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G06N3/045 — Neural networks; combinations of networks
    • G06N3/084 — Learning methods; backpropagation, e.g. using gradient descent
    • Y02P90/30 — Computing systems specially adapted for manufacturing


Abstract

The invention provides a target multi-part association method and device, a terminal and a computer-readable storage medium. The target multi-part association method detects a plurality of parts to be processed of at least one target in an image to be detected; performs feature extraction on the image region corresponding to each part to be processed to obtain the part feature corresponding to each part to be processed; determines at least one part feature set to be processed according to the similarity between the acquired part features, where one part feature set contains the part features corresponding to different parts of the same target; and associates the parts to be processed corresponding to the part features contained in each part feature set to be processed. Because the method determines which part features are associated from the similarity between part features, and then determines the mutually associated parts from the associated part features, the accuracy and generalization performance of the association are improved.

Description

Target multi-part association method and device, terminal and computer-readable storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a target multi-part association method and apparatus, a terminal, and a computer-readable storage medium.
Background
In security monitoring, to improve the accuracy of target detection, the target is marked at multiple parts. For example, when the target is a person, the head, the shoulders, the body, and so on are detected separately. The detection results for different parts are mutually independent, and different parts of the same target are not associated. The existing multi-part association technology is a two-step method: first, target detection is performed to obtain all detection frames; second, the association relation between the detection frames of different parts of the same target is determined using the intersection-over-union of the detection frames or other empirical parameters. Determining the association relation of detection frames with manually selected empirical parameters generalizes poorly: it can only perform association for a fixed scene and specific parts to be processed, cannot cope with changes in the number and positions of the parts to be associated, and therefore has low practicability.
Disclosure of Invention
The invention mainly provides a target multi-part association method and device, a terminal and a computer-readable storage medium, which solve the problem of weak generalization performance in associating the multi-part detection frames of a target in the prior art.
In order to solve the technical problems, the first technical scheme adopted by the invention is as follows: provided is a target multi-part association method, including: detecting a plurality of parts to be processed of at least one target in an image to be detected; performing feature extraction on image areas corresponding to the parts to be processed in the plurality of parts to be processed to obtain part features corresponding to the parts to be processed; determining at least one to-be-processed part feature set according to the acquired similarity among the part features; wherein, one part feature set comprises part features corresponding to different parts of the same target; and associating the parts to be processed corresponding to the part features contained in each part feature set to be processed.
Determining the at least one part feature set to be processed according to the acquired similarity between the part features includes: determining, as a part feature set to be processed, a set consisting of a target part feature and those other part features, among the part features corresponding to the parts to be processed, whose similarity with the target part feature is greater than a similarity threshold; the target part feature is any one of the part features corresponding to the parts to be processed.
Detecting the plurality of parts to be processed of the at least one target in the image to be detected further includes: detecting the detection category corresponding to each of the plurality of parts to be processed. Determining, as a part feature set to be processed, the set consisting of the target part feature and the other part features whose similarity with the target part feature is greater than the similarity threshold further includes: determining candidate parts from the parts to be processed based on the detected detection categories, where a candidate part has a detection category different from that of the part to be processed corresponding to the target part feature; and determining the similarity between the part feature corresponding to each candidate part and the target part feature.
The similarity threshold is determined based on the similarity between the category feature corresponding to a first detection category and the category feature corresponding to a second detection category, where the first detection category is the detection category of the part to be processed corresponding to the target part feature, and the second detection category is the detection category corresponding to the candidate part.
Wherein, associating the parts to be processed corresponding to each part feature contained in each part feature set to be processed comprises: respectively taking each part feature set to be processed as a part feature set to be associated, and performing the following processing: associating each part feature contained in the part feature set to be associated; associating the parts to be processed corresponding to the part features contained in the part feature set to be associated according to the associated part features; and configuring the same identification information for the parts to be processed corresponding to the part features contained in the part feature set to be associated.
Detecting the plurality of parts to be processed of the at least one target in the image to be detected includes: performing feature extraction on the acquired image to be detected through a convolutional neural network to obtain a feature map, where the convolutional neural network is trained on a plurality of training sample images in which each part to be processed of a historical target, the detection category of each part to be processed, and the association relation between the parts to be processed of the historical target are labeled; and performing target detection on the feature map to obtain the plurality of parts to be processed of the at least one target.
The convolutional neural network is obtained as follows: acquiring a training sample set, where each training sample image is labeled with the parts to be processed of a historical target, the detection category of each part to be processed, and the association relation between the parts to be processed of the historical target; detecting the training sample images through an initial convolutional neural network to obtain the predicted parts of the historical target, the predicted category of each predicted part, and the association relation between the predicted parts; constructing a loss function from the labeled parts to be processed and the predicted parts corresponding to the historical target, the labeled detection categories and the predicted categories, and the labeled association relations and the predicted association relations; and iteratively training the initial convolutional neural network with the loss function to obtain the convolutional neural network.
In order to solve the above technical problems, the second technical solution adopted by the present invention is: there is provided a target multi-site association device comprising: the detection module is used for detecting a plurality of parts to be processed of at least one target in the image to be detected; the characteristic extraction module is used for extracting the characteristics of the image area corresponding to each part to be processed in the parts to be processed to obtain the part characteristics corresponding to each part to be processed; the analysis module is used for determining at least one feature set of the part to be processed according to the acquired similarity among the part features; wherein, one part feature set comprises part features corresponding to different parts of the same target; and the processing module is used for associating the parts to be processed corresponding to the part features contained in each part feature set to be processed.
In order to solve the above technical problems, the third technical solution adopted by the present invention is: there is provided a terminal comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, the processor executing the computer program to implement the steps of the above target multi-part association method.
In order to solve the technical problems, the fourth technical scheme adopted by the invention is as follows: there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps in the above-described target multi-site association method.
The invention has the beneficial effects that: different from the prior art, a target multi-part association method and device, a terminal and a computer-readable storage medium are provided, where the target multi-part association method detects a plurality of parts to be processed of at least one target in an image to be detected; performs feature extraction on the image region corresponding to each part to be processed to obtain the part feature corresponding to each part to be processed; determines at least one part feature set to be processed according to the similarity between the acquired part features, where one part feature set contains the part features corresponding to different parts of the same target; and associates the parts to be processed corresponding to the part features contained in each part feature set to be processed. Because the part features are extracted high-level features containing rich global context information, the similarity between the part features of the parts to be processed is easier to measure; determining the associated part features from this similarity reduces the amount of computation in the association process, and determining the mutually associated parts from the associated part features improves the accuracy and generalization performance of the association.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic flowchart of the target multi-part association method provided by the present invention;
FIG. 2 is a schematic flowchart of an embodiment of the target multi-part association method provided by the present invention;
FIG. 3 is a schematic flowchart of an embodiment of step S201 in the target multi-part association method provided in FIG. 2;
FIG. 4 is a schematic flowchart of the convolutional neural network in the target multi-part association method provided in FIG. 2;
FIG. 5 is a schematic block diagram of the target multi-part association apparatus provided by the present invention;
FIG. 6 is a schematic block diagram of an embodiment of a terminal provided by the present invention;
FIG. 7 is a schematic block diagram of an embodiment of a computer-readable storage medium provided by the present invention.
Detailed Description
The following describes in detail the embodiments of the present application with reference to the drawings attached hereto.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, interfaces, techniques, etc. in order to provide a thorough understanding of the present application.
The term "and/or" herein merely describes an association between associated objects, meaning that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship. Further, the term "plurality" herein means two or more.
In order to make those skilled in the art better understand the technical solution of the present invention, the target multi-part association method provided by the present invention is described in further detail below with reference to the accompanying drawings and the detailed description.
Referring to fig. 1, fig. 1 is a schematic flowchart of the target multi-part association method according to the present invention. The present embodiment provides a target multi-part association method, which includes the following steps.
S11: a plurality of portions to be processed of at least one target in an image to be detected are detected.
Specifically, feature extraction is performed on the acquired image to be detected through a convolutional neural network to obtain a feature map; the convolutional neural network is trained on a plurality of training sample images in which each part to be processed of a historical target, the detection category of each part to be processed, and the association relation between the parts to be processed of the historical target are labeled. Target detection is then performed on the feature map to obtain the plurality of parts to be processed of the at least one target.
In an alternative embodiment, the detection category corresponding to each of the plurality of parts to be processed is detected.
S12: and performing feature extraction on image areas corresponding to the parts to be processed in the plurality of parts to be processed to obtain part features corresponding to the parts to be processed.
Specifically, the image region containing each part to be processed is cropped out and convolved to obtain the part feature corresponding to that region image.
In an optional embodiment, candidate parts are determined from the parts to be processed based on the detected detection categories, where each candidate part has a detection category different from that of the part to be processed corresponding to the target part feature; the similarity between the part feature corresponding to each candidate part and the target part feature is then determined.
S13: and determining at least one to-be-processed part feature set according to the acquired similarity among the part features.
Specifically, a set consisting of a target part feature and those other part features, among the part features corresponding to the parts to be processed, whose similarity with the target part feature is greater than a similarity threshold is determined as a part feature set to be processed; the target part feature is any one of the part features corresponding to the parts to be processed. The similarity threshold is determined based on the similarity between the category feature corresponding to a first detection category and the category feature corresponding to a second detection category, where the first detection category is the detection category of the part to be processed corresponding to the target part feature, and the second detection category is the detection category corresponding to the candidate part.
S14: and associating the parts to be processed corresponding to the part features contained in each part feature set to be processed.
Specifically, associating each part feature contained in the part feature set to be associated; associating the parts to be processed corresponding to the part features contained in the part feature set to be associated according to the associated part features; and configuring the same identification information for the parts to be processed corresponding to the part features contained in the part feature set to be associated.
The target multi-part association method provided by this embodiment detects a plurality of parts to be processed of at least one target in an image to be detected; performs feature extraction on the image region corresponding to each part to be processed to obtain the part feature corresponding to each part to be processed; determines at least one part feature set to be processed according to the similarity between the acquired part features, where one part feature set contains the part features corresponding to different parts of the same target; and associates the parts to be processed corresponding to the part features contained in each part feature set to be processed. Because the part features are extracted high-level features containing rich global context information, the similarity between the part features of the parts to be processed is easier to measure; determining the associated part features from this similarity reduces the amount of computation in the association process, and determining the mutually associated parts from the associated part features improves the accuracy and generalization performance of the association.
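The grouping described in steps S12 to S14 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes cosine similarity over hypothetical D-dimensional part features and a single global threshold, whereas the patent derives the threshold from category features and learns the features with a convolutional neural network.

```python
import numpy as np

def associate_parts(features, categories, sim_threshold=0.5):
    """Group part features that belong to the same target.

    features:   (N, D) array of part feature vectors (shape is illustrative)
    categories: length-N list of detection categories (e.g. 'head', 'body')
    Returns a list of index sets; parts in one set are treated as one target.
    """
    n = len(features)
    # cosine similarity between every pair of part features
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    groups, assigned = [], set()
    for i in range(n):
        if i in assigned:
            continue
        group = {i}
        for j in range(n):
            # only compare parts with *different* detection categories,
            # as the candidate-part step above requires
            if (j != i and j not in assigned
                    and categories[j] != categories[i]
                    and sim[i, j] > sim_threshold):
                group.add(j)
        assigned |= group
        groups.append(group)
    return groups
```

Parts that land in one returned set would then be given the same identification information, as step S14 describes.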
Referring to fig. 2 to 4, fig. 2 is a schematic flowchart of an embodiment of the target multi-part association method according to the present invention; fig. 3 is a schematic flowchart of an embodiment of step S201 in the target multi-part association method provided in fig. 2; fig. 4 is a schematic flowchart of the convolutional neural network in the target multi-part association method provided in fig. 2.
This embodiment provides a target multi-part association method that facilitates tracking a target object and can further improve the accuracy of tracking the target object. The target multi-part association method includes the following steps.
S201: and training to obtain the convolutional neural network.
Specifically, referring to fig. 3, the specific steps of training the initial convolutional neural network to obtain the convolutional neural network are as follows.
2011: a training sample set is obtained.
Specifically, a plurality of images containing historical targets are acquired to form the training sample set; in each image, the parts to be processed of the historical target are labeled, the detection category of each part to be processed is labeled, and the association relation between the parts to be processed of the historical target is labeled. For example, the historical target in an image may be a person, and three parts of the person, namely the head, the head-and-shoulder region, and the human body, are labeled with a head detection frame, a head-and-shoulder detection frame, and a human body detection frame, together with the detection category of each part. To record the association relation, all detection frames belonging to different parts of the k-th historical target are labeled with the same identifier k. The number of parts per historical target may vary, which is not limited herein.
For different detection targets, corresponding training sample sets can be constructed to train the initial convolutional neural network model. If the convolutional neural network obtained by training is to detect a plurality of parts of a person, the initial convolutional neural network needs to be trained with a plurality of images containing persons.
2012: and detecting the image through an initial convolutional neural network to obtain the predicted part of the historical target, the prediction type of the predicted part and the incidence relation between the predicted part and each predicted part.
Specifically, the initial convolutional neural network includes an initial feature extraction network, an initial target detection network, and an initial attribution vector generation network. Feature extraction is performed on the images in the training sample set through the initial feature extraction network to obtain a target feature map. The parts in the target feature map are detected through the initial target detection network to obtain the predicted parts of the historical target and the predicted category of each predicted part; each predicted part is convolved through the initial attribution vector generation network to obtain a one-dimensional attribution vector corresponding to that predicted part.
It is then judged whether the predicted categories of two predicted parts are the same. If the two predicted parts have different predicted categories, the similarity between them is calculated, and the association relation between the predicted parts is derived from this similarity. Specifically, the distance between the one-dimensional attribution vectors corresponding to the two predicted parts is calculated; the distance may be a Euclidean distance, a cosine distance, or the like. When the distance between the one-dimensional attribution vectors of the two predicted parts is smaller than a preset value, the two attribution vectors are associated. Whether the one-dimensional attribution vectors of every two predicted parts are associated is judged in turn according to this method; when associated one-dimensional attribution vectors are determined, the predicted parts corresponding to them are associated, yielding the association relation between the predicted parts.
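The distance check described above can be sketched as follows. The dimensionality of the vectors, the preset value of 1.0, and the metric choice are illustrative assumptions; the text only states that Euclidean or cosine distance may be used against a preset value.

```python
import numpy as np

def vectors_associated(v1, v2, preset_value=1.0, metric="euclidean"):
    """Return True when two one-dimensional attribution vectors are
    close enough to be associated (preset_value is illustrative)."""
    if metric == "euclidean":
        dist = np.linalg.norm(v1 - v2)
    else:
        # cosine distance = 1 - cosine similarity
        dist = 1.0 - float(v1 @ v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return dist < preset_value
```

In training, this check would be run over every pair of predicted parts with different predicted categories.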
2013: and constructing a loss function through the parts to be processed and the predicted parts marked by the same target, the detection types of the marked parts to be processed and the prediction types of the predicted parts, and the incidence relations of the marked parts to be processed and the incidence relations of the predicted parts.
Specifically, error values are calculated between the labeled parts to be processed and the predicted parts of the same target, between the labeled detection categories and the predicted categories, and between the labeled association relations and the predicted association relations. In one embodiment, the loss function is a cross-entropy loss.
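A sketch of such a loss, combining a classification cross-entropy with a contrastive-style association term over attribution-vector pairs, is shown below. The exact form and weighting are not specified in the text, so every detail here (the margin, the pairing scheme, the equal weighting of the two terms) is an illustrative assumption.

```python
import numpy as np

def combined_loss(pred_probs, gt_labels, emb_pairs, same_target, margin=1.0):
    """Illustrative combined loss, not the patent's exact formulation.

    pred_probs:  (N, C) softmax category probabilities per predicted part
    gt_labels:   length-N integer array of labeled detection categories
    emb_pairs:   list of (vec_a, vec_b) attribution-vector pairs
    same_target: matching list of bools, True when the pair is labeled as
                 belonging to the same historical target
    """
    # cross-entropy between predicted and labeled categories
    n = len(gt_labels)
    cls_loss = -np.mean(np.log(pred_probs[np.arange(n), gt_labels] + 1e-12))
    # association term: pull same-target vectors together, push
    # different-target vectors at least `margin` apart
    assoc_terms = []
    for (a, b), same in zip(emb_pairs, same_target):
        d = np.linalg.norm(a - b)
        assoc_terms.append(d ** 2 if same else max(margin - d, 0.0) ** 2)
    return cls_loss + float(np.mean(assoc_terms))
```

When predictions match the labels and the embeddings separate targets well, both terms approach zero, which is what the iterative training in step 2014 drives toward.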
2014: and performing iterative training on the initial convolutional neural network by using the loss function to obtain the convolutional neural network.
Specifically, the initial convolutional neural network is iteratively trained using the error values between the labeled parts to be processed and the mutually associated predicted parts of the same target, between the labeled detection categories and the predicted categories, and between the labeled association relations and the predicted association relations, so as to obtain the convolutional neural network.
In an alternative embodiment, the results of the initial convolutional neural network are propagated backwards, and the weights of the initial convolutional neural network are corrected according to the loss values fed back by the loss function; that is, the weights in the initial feature extraction network, the initial target detection network, and the initial attribution vector generation network are corrected according to the loss value fed back by the loss function. In an optional embodiment, other parameters in the initial convolutional neural network may also be modified, so as to implement the training of the initial convolutional neural network.
A training image is input into the initial convolutional neural network, which predicts and associates the parts of the target in the image. When the error values between the labeled parts to be processed and the predicted parts of the same target, between the labeled detection categories and the predicted categories, and between the labeled association relations and the predicted association relations are all smaller than a preset threshold, which can be set as required (for example, 1% or 5%), the training of the initial convolutional neural network is stopped and the convolutional neural network is obtained. The convolutional neural network comprises a feature extraction network, a target detection network, and an attribution vector generation network.
Obtaining the convolutional neural network by training the initial convolutional neural network in this way increases the generalization of the network, encourages the convolutional neural network to learn robust high-level features, and improves the accuracy of feature association.
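The threshold-based stopping rule described above can be sketched as follows. This is a hypothetical illustration only: `toy_step` stands in for one training step of the initial convolutional neural network, and the threshold value is arbitrary; only the stopping logic mirrors the text.

```python
# Hedged sketch of "train iteratively until the loss falls below a preset
# threshold". step_fn is a stand-in for one forward/backward pass.

def train_until_threshold(step_fn, init_params, threshold=0.05, max_iters=10_000):
    """Repeat step_fn (which returns updated params and a loss value)
    until the loss falls below the preset threshold, then stop training."""
    params, loss = init_params, float("inf")
    for it in range(1, max_iters + 1):
        params, loss = step_fn(params)
        if loss < threshold:  # preset threshold, e.g. corresponding to 1% or 5%
            break
    return params, loss, it

# Stand-in "network": minimise (w - 3)^2 by gradient descent.
def toy_step(w, lr=0.1):
    w_new = w - lr * 2.0 * (w - 3.0)
    return w_new, (w_new - 3.0) ** 2

w, loss, iters = train_until_threshold(toy_step, init_params=0.0)
```

In the real setting the loss would combine the three supervision terms named above (part localisation error, category error, and association error), but the early-stopping shape is the same.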
S202: Acquiring an image to be detected.
Specifically, the image to be detected is acquired through an image acquisition device. The image to be detected contains a target object, which may be a person, an animal, or the like. The image acquisition device may be a camera or any other device capable of acquiring images. In this embodiment, the image to be detected is acquired in real time, and the target object in it is a person.
S203: Performing feature extraction on the image to be detected through a convolutional neural network to obtain a feature map.
Specifically, feature extraction is performed on the image to be detected through the feature extraction network to obtain a feature map. In one embodiment, the feature extraction network may be a down-sampling network such as ResNet, or an Hourglass network. In this embodiment, the Hourglass network is used as the feature extraction network.
S204: Performing target detection on the feature map to obtain a plurality of to-be-processed parts of at least one target.
Specifically, referring to fig. 4, the parts in the feature map are detected through the target detection network to obtain a plurality of to-be-processed parts of the target object and the detection categories of the to-be-processed parts. In one embodiment, the target detection network uses an anchor-free architecture; taking CenterNet as an example, it generates three feature blocks, namely a target heat map (Heat-map), a target frame length-width map (Wh_map), and a center-point x and y offset map (Reg_map). The target heat map gives the detection category of the part, the target frame length-width map gives the size of the region image containing the to-be-processed part, and the center-point x and y offset map gives the offset of the center point of the region image containing the to-be-processed part from the center point of the part. Together, these three feature blocks yield the region image containing the to-be-processed part and the detection category of the part within it.
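A minimal sketch of how the three feature blocks could be decoded into part detections. The array shapes, the score threshold, and the exact layout of `heat`, `wh`, and `reg` are assumptions for illustration, not the patent's implementation:

```python
import numpy as np

def decode_parts(heat, wh, reg, score_thresh=0.5):
    """heat: (C, H, W) per-category heat map; wh: (2, H, W) box sizes;
    reg: (2, H, W) sub-pixel centre offsets.
    Returns tuples (category, cx, cy, w, h, score)."""
    parts = []
    C, H, W = heat.shape
    for c in range(C):
        ys, xs = np.where(heat[c] > score_thresh)  # heat-map peaks
        for y, x in zip(ys, xs):
            cx = x + reg[0, y, x]   # grid x plus centre-point x offset
            cy = y + reg[1, y, x]   # grid y plus centre-point y offset
            bw, bh = wh[0, y, x], wh[1, y, x]  # region-image size
            parts.append((c, float(cx), float(cy), float(bw), float(bh),
                          float(heat[c, y, x])))
    return parts

# Tiny example: one detection of category 0 at grid cell (y=2, x=3).
heat = np.zeros((2, 8, 8)); heat[0, 2, 3] = 0.9
wh = np.full((2, 8, 8), 4.0)
reg = np.zeros((2, 8, 8)); reg[0, 2, 3] = 0.5
dets = decode_parts(heat, wh, reg)
# one detection: category 0, centre (3.5, 2.0), size 4x4, score 0.9
```

Real CenterNet-style decoders also apply max-pooling NMS on the heat map; that refinement is omitted here for brevity.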
S205: Segmenting the region image containing the to-be-processed part and performing convolution to obtain the part feature corresponding to the to-be-processed part.
Specifically, each to-be-processed part obtained through detection is extracted, the region image containing the to-be-processed part is segmented out, feature mapping is performed on the to-be-processed part through the attribution vector generation network, and the part feature of the to-be-processed part is extracted.
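One way the segmentation-plus-feature-mapping step can be sketched: crop the region of the feature map containing the part and pool it into a fixed-length part-feature vector. The average pooling and L2 normalisation here are simple stand-ins for the attribution vector generation network, which the patent does not specify at this level:

```python
import numpy as np

def part_feature(feature_map, x0, y0, x1, y1):
    """feature_map: (D, H, W) array from the feature extraction network.
    Crop the region image [y0:y1, x0:x1] containing the to-be-processed part,
    average-pool it to a D-dim part feature, and L2-normalise so that cosine
    similarity between part features reduces to a dot product."""
    crop = feature_map[:, y0:y1, x0:x1]
    vec = crop.mean(axis=(1, 2))
    return vec / (np.linalg.norm(vec) + 1e-12)

rng = np.random.default_rng(0)
fmap = rng.standard_normal((16, 32, 32))
feat = part_feature(fmap, 4, 4, 12, 12)  # 8x8 crop -> 16-dim unit vector
```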
S206: Judging whether the detection categories corresponding to the part feature and the target part feature are the same.
Specifically, to reduce the workload, parts having the same category need not be compared, since the same target object has exactly one part of each category. Judging whether the detection categories of the part feature and the target part feature are the same therefore serves as a preliminary screen on the possible correlation between the part feature and the target part feature.
If the detection categories respectively corresponding to the part feature and the target part feature are different, jump directly to step S207; if they are the same, jump directly to step S208.
S207: Calculating the similarity between the part feature corresponding to the candidate part and the target part feature.
Specifically, if the detection categories corresponding to the part feature and the target part feature are different, an association relationship between them is possible: the to-be-processed parts corresponding to the part feature and the target part feature may belong to the same target object. The to-be-processed parts whose detection categories differ from that of the target part feature are therefore determined as candidate parts, and the similarity between the part feature corresponding to each candidate part and the target part feature is calculated. In a specific embodiment, the feature vectors corresponding to the part feature of the candidate part and to the target part feature are obtained, and the distance between them is calculated, where the distance may be a Euclidean distance or a cosine distance.
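The two distance options named above can be written directly (illustrative only; which distance is used, and what vectors the network produces, depend on the embodiment):

```python
import numpy as np

def euclidean_distance(a, b):
    """Euclidean distance between two part-feature vectors."""
    return float(np.linalg.norm(a - b))

def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine similarity."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
d_euc = euclidean_distance(a, b)  # sqrt(2) for these unit vectors
d_cos = cosine_distance(a, b)     # orthogonal vectors -> 1.0
```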
S208: the similarity between the site feature and the target site feature is not calculated.
Specifically, if the detection categories corresponding to the part feature and the target part feature are the same, it is indicated that there is no possibility of an association relationship between the part feature and the target part feature, and the part feature and the target part feature belong to different target objects, and it is not necessary to calculate the similarity between the part feature and the target part feature.
S209: Judging whether the similarity between the part feature corresponding to the candidate part and the target part feature is greater than the corresponding similarity threshold.
Specifically, the similarity threshold is determined based on the degree of similarity between the category feature corresponding to a first detection category and the category feature corresponding to a second detection category, where the first detection category is the detection category of the to-be-processed part corresponding to the target part feature, and the second detection category is the detection category corresponding to the candidate part.
Based on the principle that the similarity between the part features of different parts of the same target object is large while the similarity between the part features of parts of different target objects is small, whether two part features have an association relationship can be judged from the similarity between them.
In a specific embodiment, based on the principle that the distance between the feature vectors of different parts of the same target object is small while the distance between the feature vectors of parts of different target objects is large, whether two part features are correlated, i.e. belong to the same target object, is judged according to the distance between their feature vectors. A preset distance is configured for each pair of detection categories. For example, the preset distance between the feature vector corresponding to a human body and the feature vector corresponding to a human head is a first threshold; the preset distance between the feature vector corresponding to a human body and the feature vector corresponding to a head-shoulder is a second threshold; and the preset distance between the feature vector corresponding to a head-shoulder and the feature vector corresponding to a human head is a third threshold. That is, when the detection categories corresponding to the two feature vectors are human body and human head, it is judged whether the calculated distance is smaller than the first threshold; when they are human body and head-shoulder, whether it is smaller than the second threshold; and when they are head-shoulder and human head, whether it is smaller than the third threshold.
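The per-category-pair thresholds can be sketched as a lookup keyed on the unordered pair of detection categories. The category names and numeric values below are illustrative assumptions, not values from the patent:

```python
# Illustrative per-category-pair distance thresholds; in practice these would
# be tuned. Keys are unordered pairs of detection categories.
PAIR_THRESHOLD = {
    frozenset({"body", "head"}): 0.6,           # "first threshold"
    frozenset({"body", "head_shoulder"}): 0.5,  # "second threshold"
    frozenset({"head_shoulder", "head"}): 0.4,  # "third threshold"
}

def may_associate(cat_a, cat_b, distance):
    """Parts of the same category never associate (one target has at most one
    part per category); otherwise compare against the pair's preset distance."""
    if cat_a == cat_b:
        return False
    return distance < PAIR_THRESHOLD[frozenset({cat_a, cat_b})]
```

Using a `frozenset` key makes the lookup symmetric, so `("body", "head")` and `("head", "body")` hit the same threshold.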
If the similarity between the part feature corresponding to the candidate part and the target part feature is greater than the corresponding similarity threshold, jump directly to step S210; otherwise, jump directly to step S211.
S210: Determining that the part feature corresponding to the candidate part has an association relationship with the target part feature.
Specifically, if the similarity between the part feature corresponding to the candidate part and the target part feature is greater than the corresponding similarity threshold, it is determined that the part feature corresponding to the candidate part has an association relationship with the target part feature.
Specifically, if the distance between the feature vector corresponding to a candidate part and the feature vector corresponding to the target part feature is smaller than the preset distance for the corresponding pair of detection categories, it is determined that the feature vector corresponding to the candidate part has an association relationship with the feature vector corresponding to the target part feature.
In one embodiment, suppose the distance between the feature vector corresponding to a candidate part and the feature vector corresponding to the target part feature is smaller than the preset distance for the corresponding pair of detection categories. To further verify the association relationship between the two, the distances between the feature vector corresponding to the target part feature and the feature vectors corresponding to the other candidate parts are calculated; it is then judged whether the distance for this candidate part is smaller than the distance between the feature vector corresponding to the target part feature and the feature vector of every other candidate part; if so, it is determined that the feature vector corresponding to this candidate part has an association relationship with the feature vector corresponding to the target part feature.
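This verification step, accepting a candidate only if it is within the preset distance and closer to the target part feature than every other candidate, can be sketched as follows (a simplified reading of the check described above, using Euclidean distance as an assumption):

```python
import numpy as np

def verified_match(target_vec, candidate_vecs, preset_distance):
    """Return the index of the candidate part whose feature vector is both
    within the preset distance of the target part feature AND closer to it
    than every other candidate's feature vector; return None otherwise."""
    dists = [float(np.linalg.norm(target_vec - v)) for v in candidate_vecs]
    best = int(np.argmin(dists))  # closest candidate wins
    return best if dists[best] < preset_distance else None

target = np.array([1.0, 0.0])
candidates = [np.array([0.9, 0.1]), np.array([-1.0, 0.0])]
idx = verified_match(target, candidates, preset_distance=0.5)  # -> 0
```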
S211: Determining that the part feature corresponding to the candidate part has no association relationship with the target part feature.
Specifically, if the similarity between the part feature corresponding to the candidate part and the target part feature is not greater than the corresponding similarity threshold, it is determined that the part feature corresponding to the candidate part has no association relationship with the target part feature. In a specific embodiment, if the distance between the feature vector corresponding to the candidate part and the feature vector corresponding to the target part feature is not smaller than the preset distance for the corresponding pair of detection categories, it is determined that the feature vector corresponding to the candidate part has no association relationship with the feature vector corresponding to the target part feature.
S212: Determining the to-be-processed part feature set.
Specifically, the part features having an association relationship are collected into a to-be-processed part feature set, each such set containing the part features corresponding to different parts of the same target. The to-be-processed parts corresponding to the part features contained in each to-be-processed part feature set are then associated with one another according to the associated part features.
S213: Configuring the same identification information for the to-be-processed parts corresponding to the part features contained in the to-be-associated part feature set.
Specifically, the to-be-processed parts corresponding to all the part features contained in the to-be-associated part feature set of a target object are marked with the same identification information and output.
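Steps S212 and S213 amount to grouping the pairwise-associated parts into per-target sets and stamping every part in a set with the same identification number. A union-find pass is one standard way to do the grouping; the data structure is an assumption, not prescribed by the patent:

```python
# Hedged sketch: group pairwise-associated parts into per-target sets and
# assign one identification number per set, via union-find.

def group_parts(n_parts, associated_pairs):
    parent = list(range(n_parts))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in associated_pairs:  # union each associated pair
        parent[find(a)] = find(b)

    # One identification number per connected group, in first-seen order.
    ids, next_id, labels = {}, 0, []
    for i in range(n_parts):
        root = find(i)
        if root not in ids:
            ids[root] = next_id
            next_id += 1
        labels.append(ids[root])
    return labels

# Parts 0, 1, 2 belong to one person; parts 3, 4 to another.
labels = group_parts(5, [(0, 1), (1, 2), (3, 4)])
# labels -> [0, 0, 0, 1, 1]
```

Every part sharing a label would then be marked with the same identification information and output together.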
The target multi-part association method provided in this embodiment detects a plurality of to-be-processed parts of at least one target in an image to be detected; performs feature extraction on the image area corresponding to each to-be-processed part to obtain the part feature corresponding to each to-be-processed part; determines at least one to-be-processed part feature set according to the similarity among the acquired part features, where one part feature set comprises the part features corresponding to different parts of the same target; and associates the to-be-processed parts corresponding to the part features contained in each to-be-processed part feature set. Because the part features are extracted from the detected to-be-processed parts, they are high-level features containing rich contextual and global information, which makes the similarity between part features easier to measure. Determining the associated part features from these similarities reduces the amount of calculation in the association process, and determining the mutually associated to-be-processed parts from the associated part features improves the association accuracy and generalization performance.
Referring to fig. 5, fig. 5 is a block diagram of a target multi-part association apparatus according to the present invention. The present embodiment provides a target multi-part association apparatus 50, which includes a detection module 51, a feature extraction module 52, an analysis module 53 and a processing module 54. The detection module 51 is configured to detect a plurality of to-be-processed parts of at least one target in an image to be detected; the feature extraction module 52 is configured to perform feature extraction on the image region corresponding to each to-be-processed part to obtain the part feature corresponding to each to-be-processed part; the analysis module 53 is configured to determine at least one to-be-processed part feature set according to the similarity between the acquired part features, wherein one part feature set comprises the part features corresponding to different parts of the same target; and the processing module 54 is configured to associate the to-be-processed parts corresponding to the part features contained in each to-be-processed part feature set.
The target multi-part association device provided in this embodiment obtains a plurality of to-be-processed parts by detecting the image to be detected and extracts part features from the to-be-processed parts. The part features are high-level features containing rich contextual and global information, which makes the similarity between the part features corresponding to the to-be-processed parts easier to measure; the part features having an association relationship are determined according to the similarity between part features, reducing the amount of calculation in the association process, and the mutually associated to-be-processed parts are determined according to the associated part features, improving the association accuracy and generalization performance.
Referring to fig. 6, fig. 6 is a schematic block diagram of an embodiment of a terminal provided in the present invention. The terminal 70 in this embodiment includes a processor 71, a memory 72, and a computer program stored in the memory 72 and executable on the processor 71. When the computer program is executed by the processor 71, the steps of the above-mentioned target multi-part association method are implemented; to avoid repetition, they are not described again here.
Referring to fig. 7, fig. 7 is a schematic block diagram of an embodiment of a computer-readable storage medium provided by the present invention.
The embodiment of the present application further provides a computer-readable storage medium 90, where the computer-readable storage medium 90 stores a computer program 901, the computer program 901 includes program instructions, and a processor executes the program instructions to implement the target multi-part association method provided in the embodiments of the present application.
The computer-readable storage medium 90 may be an internal storage unit of the computer device of the foregoing embodiment, such as a hard disk or a memory of the computer device. The computer-readable storage medium 90 may also be an external storage device of the computer device, such as a plug-in hard disk provided on the computer device, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes performed by the present specification and drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A target multi-site association method, comprising:
detecting a plurality of parts to be processed of at least one target in an image to be detected;
performing feature extraction on an image area corresponding to each part to be processed in the plurality of parts to be processed to obtain part features corresponding to each part to be processed;
determining at least one to-be-processed part feature set according to the acquired similarity among the part features; wherein one of the part feature sets comprises the part features corresponding to different parts of the same target;
and associating the parts to be processed corresponding to the part features contained in each part feature set to be processed.
2. The target multi-site correlation method of claim 1,
determining at least one to-be-processed part feature set according to the acquired similarity between the part features, wherein the determining comprises the following steps:
determining, among the part features respectively corresponding to the parts to be processed, a set consisting of the target part feature and the other part features whose similarity with the target part feature is greater than a similarity threshold value, as a part feature set to be processed; wherein,
the target part feature is any part feature in the part features corresponding to the parts to be processed.
3. The method for multi-part association of objects according to claim 2, wherein said detecting a plurality of parts to be processed of at least one object in the image to be detected, further comprises:
detecting detection types corresponding to the parts to be processed in the plurality of parts to be processed;
and, when determining, among the part features corresponding to the parts to be processed, the set consisting of the target part feature and the other part features whose similarity with the target part feature is greater than the similarity threshold as the part feature set to be processed, the method further comprises:
determining each candidate part from the parts to be processed based on the detected detection categories, wherein the detection category of each candidate part is different from the detection category of the part to be processed corresponding to the target part feature;
and determining the similarity between the part characteristic corresponding to each candidate part and the target part characteristic.
4. The target multi-site correlation method of claim 2,
the similarity threshold is determined based on a degree of similarity between a category feature corresponding to a first detection category, which is a detection category of the to-be-processed portion corresponding to the target portion feature, and a category feature corresponding to a second detection category, which is a detection category corresponding to the candidate portion.
5. The target multi-site correlation method of claim 1,
associating the parts to be processed corresponding to the part features included in each part feature set to be processed, including:
respectively taking each part feature set to be processed as a part feature set to be associated, and performing the following processing:
associating the part features contained in the part feature set to be associated;
associating the parts to be processed corresponding to the part features contained in the part feature set to be associated according to the associated part features; and
configuring the same identification information for the parts to be processed corresponding to the part features contained in the part feature set to be associated.
6. The target multi-site correlation method of claim 1,
the method for detecting the multiple parts to be processed of at least one target in the image to be detected comprises the following steps:
performing feature extraction on the acquired image to be detected through a convolutional neural network to obtain a feature map; wherein the convolutional neural network is obtained by training based on a plurality of training sample images, each training sample image being labeled with the parts to be processed of a historical target, the detection category of each part to be processed, and the association relation of the parts to be processed of the historical target;
and performing target detection on the feature map to obtain a plurality of parts to be processed of at least one target.
7. The target multi-site correlation method of claim 6, wherein the convolutional neural network is obtained by:
acquiring a training sample set, wherein each training sample image is labeled with the parts to be processed of a historical target, the detection category of each part to be processed, and the association relation of the parts to be processed of the historical target;
detecting the training sample image through an initial convolutional neural network to obtain predicted parts of the historical target, the predicted category of each predicted part, and the association relation of the predicted parts;
constructing a loss function from the labeled parts to be processed of the historical target and the corresponding predicted parts, from the labeled detection category of each part to be processed and the predicted category of the corresponding predicted part, and from the labeled association relation of the parts to be processed of the historical target and the association relation of the predicted parts;
and performing iterative training on the initial convolutional neural network by using the loss function to obtain the convolutional neural network.
8. A target multi-site association apparatus, the target multi-site association apparatus comprising:
the detection module is used for detecting a plurality of parts to be processed of at least one target in the image to be detected;
the characteristic extraction module is used for extracting the characteristics of the image area corresponding to each part to be processed in the parts to be processed to obtain the part characteristics corresponding to each part to be processed;
the analysis module is used for determining at least one feature set of the part to be processed according to the acquired similarity between the features of the parts; wherein one of the part feature sets comprises the part features corresponding to different parts of the same target;
and the processing module is used for associating the parts to be processed corresponding to the part features contained in each part feature set to be processed.
9. A terminal, characterized in that the terminal comprises a memory, a processor and a computer program stored in the memory and executable on the processor, the processor being configured to execute the computer program to implement the steps in the target multi-site association method according to any of claims 1-7.
10. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, carries out the steps of the target multi-site association method according to any one of claims 1 to 7.
CN202111362327.XA 2021-11-17 2021-11-17 Target multi-part association method and device, terminal and computer readable storage medium Active CN114219978B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111362327.XA CN114219978B (en) 2021-11-17 2021-11-17 Target multi-part association method and device, terminal and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111362327.XA CN114219978B (en) 2021-11-17 2021-11-17 Target multi-part association method and device, terminal and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN114219978A true CN114219978A (en) 2022-03-22
CN114219978B CN114219978B (en) 2023-04-07

Family

ID=80697392

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111362327.XA Active CN114219978B (en) 2021-11-17 2021-11-17 Target multi-part association method and device, terminal and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114219978B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111783517A (en) * 2020-05-13 2020-10-16 北京达佳互联信息技术有限公司 Image recognition method and device, electronic equipment and storage medium
CN111832579A (en) * 2020-07-20 2020-10-27 北京百度网讯科技有限公司 Map interest point data processing method and device, electronic equipment and readable medium
US20200410669A1 (en) * 2019-06-27 2020-12-31 Board Of Regents Of The University Of Nebraska Animal Detection Based on Detection and Association of Parts
CN112507786A (en) * 2020-11-03 2021-03-16 浙江大华技术股份有限公司 Human body multi-part detection frame association method and device, electronic device and storage medium
CN112883819A (en) * 2021-01-26 2021-06-01 恒睿(重庆)人工智能技术研究院有限公司 Multi-target tracking method, device, system and computer readable storage medium
CN113348465A (en) * 2021-02-22 2021-09-03 商汤国际私人有限公司 Method, device, equipment and storage medium for predicting relevance of object in image
CN113434718A (en) * 2021-06-29 2021-09-24 联仁健康医疗大数据科技股份有限公司 Method and device for determining associated image, electronic equipment and storage medium
CN113557546A (en) * 2021-03-17 2021-10-26 商汤国际私人有限公司 Method, device, equipment and storage medium for detecting associated object in image
CN113591567A (en) * 2021-06-28 2021-11-02 北京百度网讯科技有限公司 Target detection method, training method of target detection model and device thereof
CN113657434A (en) * 2021-07-02 2021-11-16 浙江大华技术股份有限公司 Human face and human body association method and system and computer readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
何宏周 (He Hongzhou): "Research on infant pose estimation algorithms based on deep learning", China Master's Theses Full-text Database, Engineering Science and Technology II *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant