CN112052771A - Object re-identification method and device - Google Patents

Object re-identification method and device

Info

Publication number
CN112052771A
CN112052771A
Authority
CN
China
Prior art keywords
image
target
sample
images
object sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010896120.XA
Other languages
Chinese (zh)
Inventor
蔡冠羽
张均
蒋忻洋
孙星
彭湃
郭晓威
黄小明
吴永坚
黄飞跃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010896120.XA priority Critical patent/CN112052771A/en
Publication of CN112052771A publication Critical patent/CN112052771A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an object re-identification method and device, relating to the computer vision and cloud technology fields of artificial intelligence. The method acquires a plurality of object sample images in a target area, together with a sample label and spatio-temporal information of each object sample image; constructs an undirected graph from the object sample images, the undirected graph comprising interconnected image nodes that carry the image features of the object sample images; determines spatio-temporal transition probabilities and image similarities between adjacent image nodes in the undirected graph based on a spatio-temporal probability distribution within the target area and the image features and spatio-temporal information of the object sample images; and adjusts network parameters of a neural network model based on the neural network model, the spatio-temporal transition probabilities, the image similarities, the image features of the object sample images, and the sample labels, so as to perform object re-identification on a plurality of object images to be identified through the trained neural network model. The method and device can improve the accuracy of object re-identification.

Description

Object re-identification method and device
Technical Field
The application relates to the field of artificial intelligence, in particular to an object re-identification method and device.
Background
Pedestrian re-identification (Person Re-identification) is a recognition technology in the computer vision field of artificial intelligence. It can analyze an image set to determine the target images that contain a target pedestrian, and is widely applied in security, traffic, construction, and other fields. The image set comprises a plurality of pedestrian images, each carrying the time information and camera information recorded when it was captured. Within a specific area, the time differences between pedestrian images of different pedestrians captured by specific cameras can be counted to obtain the spatio-temporal probability distribution of pedestrian movement between those cameras; in the prior art, the recognition result of pedestrian re-identification is adjusted based on the resulting spatio-temporal transition probability.
During research and practice of the prior art, the inventors of the present application found that, because the spatio-temporal probability distribution obtained through statistics has poor continuity, adjusting the recognition result with the spatio-temporal transition probability yields only a limited improvement in recognition accuracy.
Disclosure of Invention
The embodiment of the application provides an object re-identification method and device, which can effectively improve the accuracy of object re-identification.
The embodiment of the application provides an object re-identification method, which comprises the following steps:
acquiring a plurality of object sample images in a target area, and a sample label and spatio-temporal information of each object sample image;
constructing an undirected graph from the plurality of object sample images, the undirected graph comprising interconnected image nodes, the image nodes representing the object sample images and comprising image features of the object sample images;
determining spatio-temporal transition probabilities between adjacent image nodes in the undirected graph based on a spatio-temporal probability distribution of object transfer within the target area and the spatio-temporal information of the plurality of object sample images;
calculating the image similarity between adjacent image nodes in the undirected graph according to the image features of each image node;
performing object recognition on the image features of a target object sample image based on a neural network model, the spatio-temporal transition probabilities, the image similarities, and the sample labels of the plurality of object sample images, to obtain a prediction result for the target object sample image;
and adjusting network parameters of the neural network model based on the image features and sample labels of the plurality of object sample images and the prediction result, so as to perform object re-identification on a plurality of object images to be identified through the trained neural network model.
Accordingly, the present application provides an object re-recognition apparatus, comprising:
the acquisition module is used for acquiring a plurality of object sample images in a target area, and a sample label and space-time information of each object sample image;
a construction module for constructing an undirected graph from the plurality of object sample images, the undirected graph comprising interconnected image nodes, the image nodes representing object sample images, the image nodes comprising image features of the object sample images;
a determining module for determining spatio-temporal transition probabilities between adjacent image nodes in the undirected graph based on spatio-temporal probability distributions of object transitions within the target region and spatio-temporal information of the plurality of object sample images;
the calculation module is used for calculating the image similarity between adjacent image nodes in the undirected graph according to the image characteristics of each image node;
the identification module is used for carrying out object identification on the image characteristics of the target object sample image based on a neural network model, the space-time transition probability, the image similarity and the sample labels of the plurality of object sample images to obtain a prediction result of the target object sample image;
and the adjusting module is used for adjusting the network parameters of the neural network model based on the image characteristics and the sample labels of the plurality of object sample images and the prediction result so as to perform object re-identification on the plurality of object images to be identified through the trained neural network model.
In some embodiments, the object re-recognition apparatus further comprises:
and the extraction module is used for extracting the characteristics of the plurality of object sample images through a first sub-model of the neural network model to obtain the image characteristics of each object sample image.
In some embodiments, the identification module may include an extraction sub-module and an identification sub-module, wherein,
the extraction submodule is used for carrying out feature extraction on the image features of the target object sample image based on a second submodel of a neural network model, the space-time transition probability, the image similarity and the sample labels of the plurality of object sample images to obtain the target image features of the target object sample image;
and the identification submodule is used for carrying out object identification on the target object sample image according to the target image characteristics to obtain a prediction result of the target object sample image.
In some embodiments, the identification submodule may include a determination unit and an extraction unit, wherein,
the determination unit is used for determining the adjacent image nodes as candidate adjacent image nodes when the sample labels of the object sample images of the adjacent image nodes are the same;
and the extraction unit is used for performing feature extraction on the image features of the target object image based on a second sub-model of the neural network model, the space-time transition probability of the candidate adjacent image nodes in the undirected graph and the image similarity to obtain the target image features of the target object sample image.
In some embodiments, the extraction unit may be specifically configured to:
determining at least one target neighboring image node associated with the target object sample image from among the candidate neighboring image nodes of the undirected graph, the target neighboring image node representing the target object sample image and a neighboring object sample image;
determining the characteristic weight of each adjacent object sample image based on the space-time transition probability and the image similarity of all target adjacent image nodes;
and fusing the image characteristics and the characteristic weight of each adjacent object sample image and the image characteristics of the target object sample image to obtain the target image characteristics of the target object sample image.
In some embodiments, the adjustment module may include an adjustment sub-module and a re-identification sub-module, wherein,
an adjusting submodule, configured to adjust network parameters of the neural network model based on image features and sample labels of the plurality of object sample images and the prediction result;
and the re-recognition submodule is used for carrying out object re-recognition on the plurality of to-be-recognized object images through the trained neural network model.
In some embodiments, the re-identification sub-module includes an acquisition unit, an extraction unit, a calculation unit, and a re-identification unit, wherein,
an acquisition unit configured to acquire a plurality of target images to be recognized;
the extraction unit is used for extracting the characteristics of the plurality of to-be-recognized object images through the trained neural network model to obtain the target image characteristics of each to-be-recognized object image;
the calculation unit is used for calculating the target similarity between the object images to be recognized based on the space-time information and the target image characteristics of each object image to be recognized;
and the re-recognition unit is used for re-recognizing the object based on the target similarity between the images of the object to be recognized to obtain a recognition result.
In some embodiments, the computing unit may be specifically configured to:
calculating the feature similarity between the object images to be recognized according to the target image features of each object image to be recognized;
determining the space-time transition probability between the object images to be identified based on the space-time transition probability distribution and the space-time information of the object images to be identified;
and fusing the feature similarity and the space-time transition probability between the images of the objects to be identified to obtain the target similarity between the images of the objects to be identified.
In some embodiments, the spatiotemporal information includes temporal information and spatial information, and the determining module is specifically configured to:
calculating time difference information and space transfer information between adjacent image nodes according to the time information and the space information of the object sample images of the adjacent image nodes in the undirected graph;
determining a target space-time probability distribution from space-time probability distribution of object transfer in the target region according to the space transfer information, wherein the target space-time probability distribution comprises a mapping relation between time information and probability information;
and determining the space-time transition probability between adjacent image nodes in the undirected graph according to the time difference information and the target space-time probability distribution.
In some embodiments, the tuning submodule is specifically configured to:
calculating a first loss of the neural network model based on image features and sample labels of the plurality of object sample images;
calculating a second loss of the neural network model based on the prediction result and a sample label of the target object sample image;
adjusting network parameters of the neural network model in conjunction with the first loss and the second loss.
Correspondingly, the embodiment of the present application further provides a storage medium, where a computer program is stored, and the computer program is suitable for being loaded by a processor to execute any one of the object re-identification methods provided in the embodiment of the present application.
Correspondingly, the embodiment of the present application further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements any one of the object re-identification methods provided in the embodiment of the present application when executing the computer program.
The method and device acquire a plurality of object sample images in a target area, and a sample label and spatio-temporal information of each object sample image; construct an undirected graph from the plurality of object sample images, the undirected graph comprising interconnected image nodes that represent the object sample images and comprise their image features; determine the spatio-temporal transition probabilities between adjacent image nodes in the undirected graph based on a spatio-temporal probability distribution of object transfer within the target area and the spatio-temporal information of the object sample images; calculate the image similarity between adjacent image nodes from the image features of each image node; perform object recognition on the image features of a target object sample image based on the neural network model, the spatio-temporal transition probabilities, the image similarities, and the sample labels of the plurality of object sample images, to obtain a prediction result for the target object sample image; and adjust network parameters of the neural network model based on the image features and sample labels of the object sample images and the prediction result, so as to perform object re-identification on a plurality of object images to be identified through the trained neural network model.
The method thus integrates the spatio-temporal transition probability into both the training and the application of the neural network model. During training, feature extraction and model training are performed based on the sample labels, the spatio-temporal transition probability, and the image similarity, which improves the continuity of the statistically obtained spatio-temporal probability distribution; during application, the trained neural network model fuses the spatio-temporal transition probability with the image similarity and performs object re-identification with the fused target similarity, thereby effectively improving the accuracy of object re-identification.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. It is apparent that the drawings described below show only some embodiments of the present application; other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is a schematic view of a scene of an object re-identification system provided in an embodiment of the present application;
fig. 2 is a schematic flowchart of an object re-identification method according to an embodiment of the present application;
fig. 3 is another schematic flowchart of an object re-identification method according to an embodiment of the present application;
fig. 4 is a diagram illustrating an embodiment of an object re-identification method according to an embodiment of the present application;
fig. 5 is a diagram illustrating another example of an object re-identification method provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of an object re-identification apparatus according to an embodiment of the present application;
fig. 7 is another schematic structural diagram of an object re-identification apparatus according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a computer device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely with reference to the drawings in the embodiments of the present application, and it is obvious that the embodiments described in the present application are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Artificial Intelligence (AI) is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, giving machines the functions of perception, reasoning, and decision-making.
Computer Vision (CV) technology is a science that studies how to make machines "see"; more specifically, it uses cameras and computers in place of human eyes to recognize, track, and measure targets, and further processes the images so that they become more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can capture information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, simultaneous localization and mapping, and other technologies, as well as common biometric technologies such as face recognition and fingerprint recognition.
The object recognition, network parameter adjustment, and object re-identification in the embodiments of the present application fall within the computer vision field of artificial intelligence; for example, object re-identification is performed on a plurality of object images to be identified through a trained neural network model, as described in detail in the following embodiments.
The embodiments of the application provide an object re-identification method and device; in particular, the embodiments of the application may be integrated into an object re-identification apparatus.
The object re-recognition device may be integrated in a computer device, the computer device may include a terminal, a server, or the like, and the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing a cloud computing service. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.
Referring to fig. 1, the object re-identification apparatus may be integrated into a computer device such as a terminal or a server. The computer device may acquire a plurality of object sample images in a target area, and a sample label and spatio-temporal information of each object sample image; construct an undirected graph from the plurality of object sample images, the undirected graph comprising interconnected image nodes that represent the object sample images and comprise their image features; determine spatio-temporal transition probabilities between adjacent image nodes in the undirected graph based on a spatio-temporal probability distribution of object transfer within the target area and the spatio-temporal information of the object sample images; calculate the image similarity between adjacent image nodes from the image features of each image node; perform object recognition on the image features of a target object sample image based on the neural network model, the spatio-temporal transition probabilities, the image similarities, and the sample labels of the plurality of object sample images, to obtain a prediction result for the target object sample image; and adjust network parameters of the neural network model based on the image features and sample labels of the object sample images and the prediction result, so as to perform object re-identification on a plurality of object images to be identified through the trained neural network model.
It should be noted that the scene schematic diagram of the object re-identification system shown in fig. 1 is only an example, and the object re-identification system and the scene described in the embodiment of the present application are for more clearly illustrating the technical solution of the embodiment of the present application, and do not form a limitation on the technical solution provided in the embodiment of the present application.
Details are described below. In this embodiment, the object re-identification method is described in detail; the method may be integrated on a computer device. As shown in fig. 2, fig. 2 is a schematic flowchart of the object re-identification method provided in the embodiment of the present application. The object re-identification method may include:
101. a plurality of object sample images in a target area, and a sample label and spatio-temporal information of each object sample image are obtained.
The target area may be any region of limited scope, such as a mall, a residential community, a city, a street, a scenic spot, a village, or even the earth as a whole, and may contain a plurality of image capture devices. An image capture device may be fixed or movable; when it captures an object image or object sample image, the position information and time information of the capture need to be determined. For a fixed device, the position set for image capture serves as the position information of the capture; for a movable device, the position information can be determined with assistance, for example by configuring the device with a position-determining function interface.
An object sample image may be an image collected in the target area. The object may be a movable object such as a person, an animal, or an intelligent device, or an object that moves along with a movable object, such as a suitcase, a flowerpot, or a doll. The object sample images are used for training the neural network model.
the sample label may be information that uniquely identifies an object in the object sample image, and the representation of the sample label may be various, for example, the sample label may be description of main features of the object in the object sample image, such as woman 1 and blue cat 2, for convenience of storing and managing the object sample data, the sample label may also be in a form of numbers or characters, such as budug and 234e, for the convenience of storing and managing the object sample data, and a mapping relationship between the sample label and the main features of the object in the object sample image is established, and the main features may be recorded in a form of image, text, and the like.
The spatio-temporal information may include the time information and spatial information recorded when an object sample image is acquired; each object sample image corresponds to one piece of spatio-temporal information, which establishes that the object in the image was at a specific position at a specific time.
Depending on the image acquisition frequency of the capture device, the movement speed of the object within the device's capture range, and so on, the spatio-temporal information obtained for an object sample image may span a continuous period of time or space, or several time or space points. The acquired spatio-temporal information can then be simplified, for example by selecting a single time point or spatial point (position) as the spatio-temporal information of the object sample image.
Specifically, the object sample images, sample labels, and spatio-temporal information (collectively referred to below as sample data) may be obtained in multiple ways, for example by sending a data request to another computer device (e.g., a server) and receiving the sample data it returns, or by reading the sample data directly from the computer device on which the present solution is integrated.
For example, a plurality of object sample images in residential community A, and the sample label and spatio-temporal information of each object sample image, are acquired.
102. And constructing an undirected graph according to the plurality of object sample images, wherein the undirected graph comprises image nodes which are connected with each other, the image nodes represent the object sample images, and the image nodes comprise image characteristics of the object sample images.
The undirected graph may include a plurality of interconnected nodes with no direction indicated between connected nodes. Each image node represents one object sample image in a one-to-one correspondence. The undirected graph can be constructed from the sample data in various ways, such as with an adjacency matrix or an adjacency list; in practice the representation can be chosen flexibly according to the amount of sample data, and details are not repeated here.
The image features may include the content obtained by performing feature extraction on an object sample image; they may be extracted with an existing feature extraction model or algorithm, for example a backbone network model (Backbone Network).
For example, an undirected graph M may be constructed from a plurality of object sample images, where the undirected graph M includes interconnected image nodes, and each image node may include an image feature of an object sample image.
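As an illustrative sketch (not the patent's prescribed construction), the adjacency-matrix approach might look as follows in code; the k-nearest-neighbour connection rule and the NumPy representation are assumptions for the example:

```python
import numpy as np

def build_undirected_graph(features: np.ndarray, k: int = 5) -> np.ndarray:
    """Build a symmetric (undirected) adjacency matrix over N image
    nodes, connecting each node to its k most similar nodes by cosine
    similarity. The k-NN connection rule is an illustrative assumption;
    the patent only requires interconnected image nodes."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T                    # pairwise cosine similarity
    np.fill_diagonal(sim, -np.inf)             # exclude self-edges
    adj = np.zeros_like(sim, dtype=bool)
    rows = np.arange(sim.shape[0])[:, None]
    adj[rows, np.argsort(sim, axis=1)[:, -k:]] = True
    return adj | adj.T                         # undirected: symmetrize

# Example: an undirected graph M over 100 object sample images
# with 256-dimensional image features (random stand-ins here).
adj = build_undirected_graph(np.random.randn(100, 256))
```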
In some embodiments, the object re-identification method further comprises:
and performing feature extraction on the plurality of object sample images through a first sub-model of the neural network model to obtain the image features of each object sample image.
The neural network model to be trained may include at least one sub-model. For example, before feature extraction is performed on the object sample images, the first sub-model of the neural network model may be pre-constructed and its network parameters initialized; feature extraction is then performed on the object sample images through the first sub-model, which may specifically include convolution, pooling, and similar operations, finally yielding the image features of each object sample image. The first sub-model may be built on an existing network model, for example a backbone network model (Backbone Network).
For example, feature extraction is performed through a first sub-model S1 of the neural network model, so as to obtain image features of each object sample image.
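A minimal sketch of such a first sub-model, assuming a ResNet-50 backbone (the text only requires some backbone network model; the input size is also an assumption):

```python
import torch
from torchvision import models

# First sub-model sketch: a ResNet-50 backbone with its classification
# head removed, so it outputs one feature vector per object sample
# image. weights=None means random initialization, as before training.
backbone = models.resnet50(weights=None)
backbone.fc = torch.nn.Identity()            # drop the classifier head
backbone.eval()

with torch.no_grad():
    batch = torch.randn(8, 3, 256, 128)      # 8 pedestrian crops
    feats = backbone(batch)                  # (8, 2048) image features
```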
103. And determining the space-time transition probability between adjacent image nodes in the undirected graph based on the space-time probability distribution of the object transition in the target region and the space-time information of the plurality of object sample images.
The spatio-temporal probability distribution of object transfer may represent the probability distribution of the time an object needs to move between specific positions in the target area, for example the distribution of the time a pedestrian takes to walk from bubble tea shop 1 to bubble tea shop 2 in a mall; the distribution may be a mapping relationship between time information and probability information.
The spatio-temporal probability distribution of the target area can be determined in advance. In practice, the images of one object acquired at a given position (the position information recorded by the image capture device) may be multiple, each corresponding to a time point; the median of these time points can be taken as the time point at which the object was at that position.
Then, for two specific positions in the target area (e.g., position 1 and position 2), the time the object takes to move between them can be counted: time 1, when the object appears at position 1, is determined from the object image captured by the device at position 1; time 2, when it appears at position 2, is determined likewise; and the travel time between the two positions is the difference between time 1 and time 2.
Counting the travel times of many objects between position 1 and position 2 yields a set of time-period data, from which a spatio-temporal probability distribution within the target area can be obtained.
Specifically, for convenience of subsequent operations, the possible travel-time range between position 1 and position 2 (between the minimum and maximum values) can be determined from the counted time-period data and divided into equal parts according to actual requirements. For example, if the maximum is 60 seconds and the minimum is 20 seconds, the possible travel-time range spans 40 seconds, which can be divided into four equal parts, giving four interval time periods (as offsets): 0 to 10 seconds, 10 to 20 seconds, 20 to 30 seconds, and 30 to 40 seconds.
Then, the frequency count within each interval time period is obtained from the counted time-period data; for example, 20 for 0 to 10 seconds, 18 for 10 to 20 seconds, 35 for 20 to 30 seconds, and 19 for 30 to 40 seconds. The counts are converted into frequencies per interval and smoothed with a statistical method (e.g., a Gaussian kernel function) to finally obtain the spatio-temporal transition probability between position 1 and position 2.
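This statistic can be sketched in code as follows (our reading of the procedure, not the patent's implementation; the bin count and Gaussian kernel width are illustrative choices):

```python
import numpy as np

def transit_time_distribution(deltas_sec, n_bins=4, sigma=1.0):
    """Estimate the spatio-temporal probability distribution between one
    pair of positions: divide the observed travel-time range into equal
    interval time periods, convert the frequency counts to frequencies,
    then smooth with a Gaussian kernel to improve continuity."""
    deltas = np.asarray(deltas_sec, dtype=float)
    counts, edges = np.histogram(deltas, bins=n_bins)
    freq = counts / counts.sum()
    idx = np.arange(n_bins)
    kernel = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / sigma) ** 2)
    kernel /= kernel.sum(axis=1, keepdims=True)   # row-normalized smoothing
    return kernel @ freq, edges

# The example above: counts 20 / 18 / 35 / 19 over four 10-second bins
samples = np.concatenate([np.random.uniform(a, a + 10, n)
                          for a, n in [(0, 20), (10, 18), (20, 35), (30, 19)]])
probs, edges = transit_time_distribution(samples)
```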
The spatio-temporal transition probability between adjacent image nodes may be the probability that an object spends the sample time moving between the sample positions, where the sample time and sample positions are determined from the object sample images of the adjacent image nodes. Looking up the spatio-temporal probability distribution of the target area with the sample time and sample positions then gives the spatio-temporal transition probability between the adjacent image nodes.
For example, the spatio-temporal transition probabilities between adjacent image nodes in the undirected graph M are determined based on the spatio-temporal probability distribution O of object transfer within target area A (residential community A) and the spatio-temporal information of the plurality of object sample images.
In some embodiments, the spatiotemporal information includes temporal information and spatial information, and the step of determining spatiotemporal transition probabilities between adjacent image nodes in the undirected graph based on spatiotemporal probability distributions of object transitions within the target region and spatiotemporal information of the plurality of object sample images may comprise:
calculating time difference information and space transfer information between adjacent image nodes according to the time information and the space information of the object sample images of the adjacent image nodes in the undirected graph; determining target space-time probability distribution from space-time probability distribution of object transfer in the target region according to the space transfer information, wherein the target space-time probability distribution comprises a mapping relation between time information and probability information; and determining the space-time transition probability between adjacent image nodes in the undirected graph according to the time difference information and the target space-time probability distribution.
Specifically, the time information may be the time point at which the object sample image was acquired, and the spatial information the position at which it was acquired; the spatial information may be entered manually or determined from the position information of the image capture device, e.g., the device's position serves as the position of the object sample image.
The spatio-temporal probability distribution of the target area may comprise a plurality of distributions, each associated with two pieces of position information. Therefore, the time difference information and spatial transfer information between adjacent image nodes can be calculated from the time and spatial information of their object sample images; the spatial transfer information determines the target spatio-temporal probability distribution for that pair of nodes; and a lookup in the target distribution with the time difference information gives the spatio-temporal transition probability between the adjacent image nodes. Performing this operation on every pair of adjacent image nodes in the undirected graph finally yields the spatio-temporal transition probabilities between all adjacent image nodes.
For example, the spatio-temporal probability distribution O of target area A may include spatio-temporal probability distributions 1, 2, and 3. A pair of adjacent image nodes in the undirected graph M consists of image node J1, representing object sample image 1 with time information t1 and spatial information p1, and image node J2, representing object sample image 2 with time information t2 and spatial information p2. The time difference information t1-t2 and the spatial transfer information p1→p2 between J1 and J2 can be calculated; the spatial transfer information p1→p2 then determines the target spatio-temporal probability distribution, say spatio-temporal probability distribution 1; and the spatio-temporal transition probability between J1 and J2 is determined from the time difference information t1-t2 and distribution 1. Performing this operation on all adjacent image nodes in the undirected graph M yields the spatio-temporal transition probabilities between all adjacent nodes in M.
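A sketch of this lookup; the node and distribution data layouts are assumptions for the example:

```python
import numpy as np

def spatiotemporal_transition_prob(node_i, node_j, distributions):
    """Look up the spatio-temporal transition probability for one pair
    of adjacent image nodes. `distributions` maps an unordered pair of
    positions to (probs, bin_edges) as estimated above; representing a
    node as a dict with 't' and 'p' keys is an assumed layout."""
    dt = abs(node_i["t"] - node_j["t"])              # time difference info
    key = frozenset((node_i["p"], node_j["p"]))      # spatial transfer info
    probs, edges = distributions[key]                # target distribution
    k = np.searchsorted(edges, dt, side="right") - 1
    return probs[k] if 0 <= k < len(probs) else 0.0  # outside observed range

# e.g. nodes J1 and J2 captured at positions p1 / p2 and times t1 / t2:
# distributions = {frozenset(("p1", "p2")): (probs, edges)}
# prob = spatiotemporal_transition_prob({"t": 130.0, "p": "p1"},
#                                       {"t": 100.0, "p": "p2"}, distributions)
```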
104. And calculating the image similarity between adjacent image nodes in the undirected graph through the image characteristics of each image node.
Image similarity measures how similar two object sample images are and is determined by calculation. Before the image similarity is calculated, the image features can be convolved with the network parameters and the convolved features passed through an activation function, which introduces non-linearity into the neural network model; common activation functions include the Sigmoid function, the Tanh non-linear function, the Leaky ReLU function, and ReLU. For example, the formula for convolving and activating the image features may be:

$$\hat{h}_i^{l} = \sigma\left(W^{l} h_i^{l}\right)$$

where $W^{l}$ is a network parameter of the neural network model, $h_i^{l}$ is the image feature of image node $i$, $\sigma$ is the activation function, and $\hat{h}_i^{l}$ is the initial image feature obtained after the image feature is convolved and activated.
Then, the image similarity between adjacent image nodes may be calculated; for example, the calculation formula of the image similarity may be:

$$s_{i,j} = \frac{\langle \hat{h}_i^{l}, \hat{h}_j^{l} \rangle}{\lVert \hat{h}_i^{l} \rVert \, \lVert \hat{h}_j^{l} \rVert}$$

where $s_{i,j}$ is the image similarity between adjacent image nodes $i$ and $j$, and $\hat{h}_i^{l}$ and $\hat{h}_j^{l}$ are the initial image features of those nodes.
For example, the image features of each image node are convolved and activated to obtain the corresponding initial image features, and the image similarity between adjacent image nodes in the undirected graph M is then calculated from these initial image features.
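These two steps can be sketched as follows, with ReLU standing in for the unspecified activation and the cosine form taken from the similarity formula above (both are assumptions):

```python
import torch
import torch.nn.functional as F

def initial_features(h, W):
    """h_hat = sigma(W h): multiply the image features by the network
    parameter and activate. ReLU stands in for the unspecified sigma."""
    return F.relu(h @ W.T)

def pairwise_image_similarity(h_hat):
    """Cosine image similarity s_ij between all pairs of nodes,
    matching the normalized inner-product form assumed above."""
    normed = F.normalize(h_hat, dim=1)
    return normed @ normed.T

h = torch.randn(100, 2048)        # image features of 100 image nodes
W = torch.randn(512, 2048)        # network parameter W^l
s = pairwise_image_similarity(initial_features(h, W))
```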
105. And carrying out object recognition on the image features of the target object sample image based on the neural network model, the space-time transition probability, the image similarity and the sample labels of the plurality of object sample images to obtain a prediction result of the target object sample image.
The prediction result of the target object sample image is obtained by performing object recognition on its image features using the neural network model together with the spatio-temporal transition probabilities and image similarities of all adjacent image nodes in the undirected graph and the sample labels and image features of the plurality of object sample images.
The prediction result may concern the object in the target object sample image itself, or the relationship between the target object sample image and another reference image, for example a prediction that the objects in the two images are the same object.
For example, the object recognition may be performed on the image features of the target object sample image 1 based on the neural network model X to be trained, the spatio-temporal transition probabilities and image similarities of all adjacent image nodes in the undirected graph M, and the sample labels of all object sample images, so as to obtain the prediction result yy of the target object sample image.
In some embodiments, the step of performing object recognition on the image features of the target object sample image based on the neural network model, the spatio-temporal transition probability, the image similarity, and the sample labels of the plurality of object sample images to obtain the prediction result of the target object sample image may include:
performing feature extraction on the image features of the target object sample image based on a second sub-model of the neural network model, the space-time transition probability, the image similarity and the sample labels of the plurality of object sample images to obtain the target image features of the target object sample image; and carrying out object identification on the target object sample image according to the target image characteristics to obtain a prediction result of the target object sample image.
Specifically, the neural network model may include a second sub-model that performs further feature extraction on the image features; its network parameters may be those mentioned in step 104. The second sub-model completes the feature extraction using the spatio-temporal transition probabilities and image similarities between all adjacent image nodes in the undirected graph and the sample label of each object sample image, obtaining the target image features of the target object sample image. The target image features are then input to a fully-connected layer, whose output is the prediction result of the target object sample image.
For example, feature extraction is performed on the image features of the target object sample image 1 based on a second sub-model of the neural network model X, space-time transition probabilities and image similarities of all adjacent image nodes in the undirected graph M, and sample labels of all object sample images, so as to obtain target image features 1Q of the target object sample image; and carrying out object recognition on the target object sample image 1 according to the target image characteristics 1Q to obtain a prediction result yy of the target object sample image.
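A minimal sketch of the prediction step; the feature dimension and the number of identities are made-up values:

```python
import torch
import torch.nn as nn

# The target image feature produced by the second sub-model is fed to
# a fully-connected layer whose output is the prediction result
# (identity logits). 2048 and 751 are illustrative assumptions.
head = nn.Linear(2048, 751)
target_feature = torch.randn(1, 2048)      # target image feature 1Q
prediction = head(target_feature)          # prediction result yy
```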
In some embodiments, the step of performing feature extraction on the image features of the target object sample image based on the second sub-model of the neural network model, the spatio-temporal transition probability, the image similarity, and the sample labels of the plurality of object sample images to obtain the target image features of the target object sample image may include:
when the sample labels of the object sample images of the adjacent image nodes are the same, determining the adjacent image nodes as candidate adjacent image nodes; and performing feature extraction on the image features of the target object image based on the second submodel of the neural network model, the space-time transition probability of the candidate adjacent image nodes in the undirected graph and the image similarity to obtain the target image features of the target object sample image.
Specifically, the sample labels of the object sample images represented by image nodes in the undirected graph may be the same or different; different sample labels mean the objects are different individuals, while the spatio-temporal transition probability concerns the time a specific object spends moving. Therefore, only adjacent image nodes with the same sample label are needed here: by comparing the sample labels of each pair of adjacent image nodes, those with identical labels become candidate adjacent image nodes, and finally all candidate adjacent image nodes in the undirected graph are determined.
Then, feature extraction can be carried out on the initial image features of the target object image through the space-time transition probability and the image similarity of all candidate adjacent image nodes in the undirected graph, and the target image features of the target object sample image are obtained.
It should be noted that, in the present application, the steps of determining candidate adjacent image nodes in the undirected graph, calculating the image similarity between adjacent image nodes, and determining the spatio-temporal transition probability between adjacent image nodes are not restricted to any particular order; the order of the steps can be chosen flexibly according to the actual situation, and is not repeated here.
For example, when the sample labels of the object sample images of the adjacent image nodes are the same, the adjacent image nodes are determined as candidate adjacent image nodes to determine all the candidate adjacent image nodes in the undirected graph M, and feature extraction is performed on the image features of the target object image 1 based on the second sub-model of the neural network model X, the space-time transition probability of the candidate adjacent image nodes in the undirected graph M and the image similarity, so as to obtain the target image features 1Q of the target object sample image.
In some embodiments, the step of performing feature extraction on the image features of the target object image based on the second sub-model of the neural network model, the spatio-temporal transition probability of the candidate adjacent image nodes in the undirected graph and the image similarity to obtain the target image features of the target object sample image may include:
determining at least one target neighboring image node associated with the target object sample image from the candidate neighboring image nodes of the undirected graph, the target neighboring image node representing the target object sample image and the neighboring object sample image; determining the characteristic weight of each adjacent object sample image based on the space-time transition probability and the image similarity of all target adjacent image nodes; and fusing the image characteristics and the characteristic weight of each adjacent object sample image and the image characteristics of the target object sample image to obtain the target image characteristics of the target object sample image.
A pair of adjacent image nodes comprises two image nodes, each representing an object sample image; a pair of target adjacent image nodes involves the target object sample image and a neighboring object sample image. The similarity of each neighboring object sample image can then be computed from the spatio-temporal transition probabilities and image similarities of all target adjacent image node pairs, and the feature weight of each neighboring object sample image computed from that similarity. Specifically, the calculation formula of the similarity of a neighboring object sample image may be:

$$e_{i,j} = \mathbb{1}\left(y_i = y_j\right) \, s_{i,j} \, p_{i,j}$$

where $i$ and $j$ are image nodes whose initial image features $\hat{h}_i^{l}$ and $\hat{h}_j^{l}$ yield the image similarity $s_{i,j}$, $p_{i,j}$ is their spatio-temporal transition probability, and $\mathbb{1}(\cdot)$ is an indicator function whose output is 1 when the sample labels $y_i$ and $y_j$ of image nodes $i$ and $j$ are the same and 0 otherwise.
The calculation formula of the feature weight of a neighboring object sample image may be:

$$\alpha_{i,j} = \frac{\exp\left(e_{i,j}\right)}{\sum_{k \in N(i)} \exp\left(e_{i,k}\right)}$$

where $i$ is the image node of the target object sample image and $N(i)$ are the image nodes of the neighboring object sample images (i.e., $i$ and each node in $N(i)$ form a pair of target adjacent image nodes), $e_{i,j}$ is the similarity of the target adjacent image nodes, and $\alpha_{i,j}$ is the feature weight of the neighboring object sample image.
Finally, the target image features of the target object sample image can be obtained from the feature weights and initial image features of the neighboring object sample images together with the image features of the target object sample image; the calculation formula of the target image features may be:

$$h_i^{l+1} = \hat{h}_i^{l} + \sum_{j \in N(i)} \alpha_{i,j} \, \hat{h}_j^{l}$$

where $\hat{h}_i^{l}$ is the initial image feature of the target object sample image, $N(i)$ are the image nodes of the neighboring object sample images, $\alpha_{i,j}$ and $\hat{h}_j^{l}$ are the feature weight and initial image feature of each neighboring object sample image, and $h_i^{l+1}$ is the target image feature of the target object sample image.
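Putting the three formulas together, one aggregation step of the second sub-model might be sketched as follows (a reading of the formulas above, not the patent's verbatim method; the product fusion of s and p follows the assumed form of e):

```python
import torch

def second_submodel_layer(h_hat, s, p, labels, adj):
    """One feature-aggregation step of the second sub-model:
      1. keep only adjacent nodes whose sample labels match (indicator),
      2. fuse image similarity s and spatio-temporal probability p into
         edge scores e_ij (product form is an assumption),
      3. softmax each node's neighbour scores into feature weights,
      4. fuse the weighted neighbour features with the node's own
         feature to obtain the target image features.
    Shapes: h_hat (N, D); s, p (N, N); labels (N,); adj (N, N) bool."""
    same_label = labels[:, None] == labels[None, :]    # 1(y_i == y_j)
    mask = adj & same_label                            # candidate adjacent nodes
    e = (s * p).masked_fill(~mask, float("-inf"))      # e_ij
    alpha = torch.nan_to_num(torch.softmax(e, dim=1))  # isolated rows -> 0
    return h_hat + alpha @ h_hat                       # h_i + sum_j alpha_ij h_j
```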
106. And adjusting network parameters of the neural network model based on the image characteristics and the sample labels of the multiple object sample images and the prediction result so as to perform object re-identification on the multiple object images to be identified through the trained neural network model.
The process of adjusting the network parameters of the neural network model is the process of performing model training on the neural network model, and after the training is completed, object re-recognition can be performed on a plurality of images of the object to be recognized through the trained model.
For example, based on the image features and sample labels of the multiple object sample images and the prediction result of the target object sample image 1, the network parameters of the neural network model X are adjusted, so that the trained neural network model X performs object re-identification on the multiple object images to be identified.
In some embodiments, the step of performing object re-recognition on a plurality of images of an object to be recognized through the trained neural network model may include:
acquiring a plurality of images of objects to be identified; carrying out feature extraction on a plurality of to-be-recognized object images through the trained neural network model to obtain target image features of each to-be-recognized object image; calculating target similarity between the object images to be recognized based on the spatio-temporal information and the target image characteristics of each object image to be recognized; and re-identifying the object based on the target similarity between the images of the object to be identified to obtain an identification result.
When object re-identification is performed through the trained neural network model, feature extraction can be performed on the object images to be identified sequentially through the first sub-model and the second sub-model, obtaining the target image feature of each object image to be identified.
Then, in the re-identification process, a reference object image may first be determined (it may be one of the object images to be identified, or a separate image); the target similarity between each object image to be identified and the reference object image is computed; and finally, according to these target similarities, the target object images containing the same object as the reference object image are determined from among the object images to be identified, giving the recognition result of object re-identification.
For example, a plurality of object images to be identified may be obtained and a reference object image CC determined from them. Feature extraction is then performed on each object image to be identified through the trained neural network model X to obtain its target image feature; next, based on the spatio-temporal information and target image feature of each object image to be identified, the target similarity between the reference object image CC and each object image to be identified is calculated; and re-identification is performed based on these target similarities to determine the target object images whose object is the same as that in the reference object image CC.
In some embodiments, the step of calculating the target similarity between the object images to be recognized based on the spatio-temporal information and the target image feature of each object image to be recognized may include:
calculating the feature similarity between the object images to be recognized according to the target image features of each object image to be recognized; determining the space-time transition probability between the object images to be identified based on the space-time transition probability distribution and the space-time information of the object images to be identified; and fusing the feature similarity and the space-time transition probability between the images of the objects to be identified to obtain the target similarity between the images of the objects to be identified.
Specifically, the target similarity may be determined from the spatio-temporal transition probability and the feature similarity between the object images to be recognized. The spatio-temporal transition probability may be determined from the spatio-temporal information of the object images to be recognized and the spatio-temporal probability distribution of the target region; the implementation is similar to the process of determining the spatio-temporal transition probability corresponding to the object sample images, and is not described here again. The feature similarity between the object images to be recognized may be measured by a distance metric such as cosine distance or Euclidean distance.
The space-time transition probability and the feature similarity can be fused in various ways, such as weighted average, addition, or multiplication, which can be flexibly selected according to requirements in practical application. For example, the target similarity can be calculated as

$s_{i,j} = d_{i,j} \cdot p_{i,j}$

where $d_{i,j}$ is the feature similarity between the object images to be recognized, $p_{i,j}$ is the spatio-temporal transition probability between the object images to be recognized, and $s_{i,j}$ is the target similarity between the object images to be recognized; the multiplicative form is shown here as one of the fusion options just listed.
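A minimal sketch of this fusion, assuming the multiplicative form above and NumPy feature vectors (the function name and the small epsilon guard are illustrative):

```python
import numpy as np

def target_similarity(feat_i, feat_j, p_ij):
    """Fuse cosine feature similarity d_ij with the spatio-temporal transition
    probability p_ij; weighted average or addition are equally possible."""
    d_ij = float(np.dot(feat_i, feat_j)
                 / (np.linalg.norm(feat_i) * np.linalg.norm(feat_j) + 1e-12))
    return d_ij * p_ij
```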
In some embodiments, the step of "adjusting network parameters of the neural network model based on image features and sample labels of the plurality of object sample images, and the prediction result" may include:
calculating a first loss of the neural network model based on image features and sample labels of the plurality of object sample images; calculating a second loss of the neural network model based on the prediction result and the sample label of the target object sample image; and adjusting network parameters of the neural network model by combining the first loss and the second loss.
Before adjusting the network parameters of the neural network model, the loss of the neural network model needs to be calculated. The loss may include a first loss and a second loss, where the first loss may correspond to the first sub-model and the second loss to the second sub-model. After the first sub-model performs feature extraction on the object sample images to obtain image features, the first loss may be calculated as a triplet loss (Triplet Loss): it is computed from three image features corresponding to two sample labels, that is, two of the three image features share the same sample label while the third has a different one.
The second loss can be calculated from the prediction result of the target object sample image and its sample label, and the loss function can be flexibly selected according to actual requirements, for example a softmax (cross-entropy) loss function.
Then, model training may be performed in combination with the first loss and the second loss, for example, the first loss and the second loss may be fused to obtain a target loss of the neural network model, and a network parameter of the neural network model is adjusted by the target loss to complete a process of model training.
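A hedged sketch of this two-part loss in PyTorch (the margin of 0.3 and the equal 0.5 fusion weights are illustrative assumptions, not values from this application):

```python
import torch
import torch.nn as nn

triplet = nn.TripletMarginLoss(margin=0.3)   # first loss: triplet loss
cross_entropy = nn.CrossEntropyLoss()        # second loss: softmax/cross-entropy loss

def total_loss(anchor, positive, negative, logits, labels):
    # anchor and positive share a sample label; negative has a different one
    loss1 = triplet(anchor, positive, negative)
    loss2 = cross_entropy(logits, labels)    # prediction result vs. sample label
    return 0.5 * loss1 + 0.5 * loss2         # fused target loss
```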
The method can integrate the spatio-temporal transition probability into both the training process and the application process of the neural network model. In the training process, feature extraction and neural network model training can be performed based on the sample labels, the spatio-temporal transition probability, and the image similarity, which improves the continuity of the statistically obtained spatio-temporal probability distribution. In the application process, the trained neural network model can fuse the spatio-temporal transition probability with the image similarity and perform object re-identification through the fused target similarity, thereby improving the accuracy of object re-identification.
The method described in the above embodiments is further illustrated in detail by way of example.
The object re-identification method is described below by taking an object re-identification system integrated in a server as an example. As shown in fig. 3, fig. 3 is a schematic flow diagram of the object re-identification method provided in the embodiment of the present application. The object re-identification method may include:
201. A computer device obtains a plurality of object sample images within a target region, and a sample label and spatio-temporal information for each object sample image.
202. The computer device constructs an undirected graph from the plurality of object sample images, where the undirected graph includes interconnected image nodes, the image nodes represent the object sample images, the image nodes include the image features of the object sample images, and the image features are obtained by performing feature extraction on the object sample images through a first sub-model of the neural network model.
Referring to fig. 4, the first sub-model may be a backbone network, which may be a deep network model such as a Residual Network (ResNet), a Dense Convolutional Network (DenseNet), or a Neural Architecture Search Network (NASNet). After an object sample image is input into the backbone network, the image feature of that object sample image, namely Feature 1, is obtained. Then an undirected graph is constructed based on the image feature of each object sample image and processed by a Graph Convolutional Network (GCN); the undirected graph includes interconnected image nodes, and the image nodes include the image features of the object sample images.
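As an illustrative sketch of the first sub-model (ResNet-50 is assumed here; DenseNet or NASNet would substitute directly, and the input size is arbitrary):

```python
import torch
import torchvision.models as models

backbone = models.resnet50()           # backbone network (first sub-model)
backbone.fc = torch.nn.Identity()      # drop the classifier, keep pooled features

images = torch.randn(4, 3, 256, 128)   # a batch of object sample images
with torch.no_grad():
    feature1 = backbone(images)        # (4, 2048) image features ("Feature 1")
```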
203. The computer device determines spatiotemporal transition probabilities between adjacent image nodes in the undirected graph based on spatiotemporal probability distributions of object transitions within the target region and spatiotemporal information of the plurality of object sample images.
For example, the spatio-temporal probability distribution of the target region may be as shown in fig. 5, where different curves represent different specific locations in the target region. In some embodiments, the locations may be determined by cameras installed at those locations, in which case the different curves in fig. 5 correspond to different camera pairs, each curve representing the probability distribution of the time required to transition between the two cameras.
The spatiotemporal probability distribution of object transition in the target region may be updated with the change of data, for example, the spatiotemporal probability distribution is updated by the newly acquired object sample image of the target region.
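One plausible way to build and refresh such a distribution is to histogram, per camera pair, the transit times between consecutive sightings of the same labelled object; the record layout and bin sizes below are assumptions of this sketch:

```python
from collections import defaultdict
import numpy as np

def estimate_distribution(records, bin_seconds=30, max_seconds=3600):
    """records: iterable of (label, camera_id, timestamp). Returns per-camera-pair
    probability histograms; re-running on newly acquired object sample images
    updates the distribution."""
    bins = np.arange(0, max_seconds + bin_seconds, bin_seconds)
    hists = defaultdict(lambda: np.zeros(len(bins) - 1))
    by_label = defaultdict(list)
    for label, cam, t in records:
        by_label[label].append((cam, t))
    for sightings in by_label.values():
        sightings.sort(key=lambda s: s[1])
        for (cam_a, t_a), (cam_b, t_b) in zip(sightings, sightings[1:]):
            hists[(cam_a, cam_b)] += np.histogram(t_b - t_a, bins=bins)[0]
    return {pair: h / max(h.sum(), 1) for pair, h in hists.items()}, bins
```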
204. The computer device calculates the image similarity between adjacent image nodes in the undirected graph through the second sub-model of the neural network model and the image features of each image node.
205. When the sample labels of the object sample images of the adjacent image nodes are the same, the computer device determines the adjacent image nodes as candidate adjacent image nodes to obtain all candidate adjacent image nodes in the undirected graph.
206. The computer device performs feature extraction on the image features of the target object image based on the second sub-model of the neural network model and the space-time transition probability and image similarity of the candidate adjacent image nodes in the undirected graph, to obtain the target image features of the target object sample image.
For example, the second sub-model may be a graph convolutional network. Referring to fig. 4, the weight (attention) of each edge in the undirected graph may be calculated based on the same identity (i.e., the same sample label), the feature similarity (i.e., the image similarity), and the spatio-temporal probability (i.e., the spatio-temporal transition probability), so as to obtain the target image feature of the target object sample image: Feature 2.
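A minimal sketch of such edge weighting and aggregation (the additive fusion of the two cues and the softmax normalization are assumptions; the text above only states that the weight is computed from the three signals):

```python
import torch

def aggregate(features, labels, st_prob):
    """features: (N, D) node image features; labels: (N,) sample labels;
    st_prob: (N, N) spatio-temporal transition probabilities."""
    f = torch.nn.functional.normalize(features, dim=1)
    feat_sim = f @ f.t()                                   # image similarity
    same_label = labels.unsqueeze(0) == labels.unsqueeze(1)
    score = feat_sim + st_prob                             # fuse the two cues
    score = score.masked_fill(~same_label, float("-inf"))  # same-label edges only
    attn = torch.softmax(score, dim=1)                     # edge weights (attention)
    return attn @ features                                 # target features ("Feature 2")
```

The diagonal self-edge is kept, so each target node's own image feature is fused with its neighbors' features, consistent with step 206.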
207. The computer device adjusts the network parameters of the neural network model based on the image features and sample labels of the plurality of object sample images and the prediction result, to obtain the trained neural network model.
Before model training, multiple batches of object sample image data can be sampled from the object sample data set, and uniform sampling can be used to ensure that the number of object sample images corresponding to each sample label is the same in every batch.
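A sketch of such uniform sampling (often called P×K sampling in re-identification work; the function and parameter names are illustrative):

```python
import random
from collections import defaultdict

def uniform_batches(samples, p=16, k=4):
    """samples: list of (image, label); yields batches of p labels x k images,
    so every sample label contributes the same number of images per batch."""
    by_label = defaultdict(list)
    for idx, (_, label) in enumerate(samples):
        by_label[label].append(idx)
    labels = [l for l, idxs in by_label.items() if len(idxs) >= k]
    random.shuffle(labels)
    for i in range(0, len(labels) - p + 1, p):
        batch = []
        for label in labels[i:i + p]:
            batch.extend(random.sample(by_label[label], k))
        yield batch
```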
As shown in fig. 4, the loss 1 obtained through the first sub-model and the loss 2 obtained through the second sub-model may be combined to train the neural network model. The loss 1 obtained through the first sub-model may be a triplet loss; the target image feature obtained through the second sub-model is input into the fully-connected layer to obtain the corresponding prediction result, which is then combined with the sample label to obtain the loss 2.
In the process of training the neural network model, optimization algorithms such as Stochastic Gradient Descent (SGD), SGD with Momentum (Momentum SGD), and Adaptive Gradient (AdaGrad) can be used.
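For instance, an SGD-with-momentum setup might look as follows (the learning rates and the stand-in module are placeholder assumptions, not values from this application):

```python
import torch

model = torch.nn.Linear(2048, 751)   # stand-in for the backbone-plus-GCN network
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# alternative: torch.optim.Adagrad(model.parameters(), lr=0.01)

# one training step with the fused target loss (loss 1 + loss 2):
# optimizer.zero_grad(); target_loss.backward(); optimizer.step()
```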
208. The computer device performs feature extraction on the plurality of object images to be recognized through the trained neural network model, to obtain the target image feature of each object image to be recognized.
209. The computer device calculates a target similarity between the object images to be recognized based on the spatio-temporal information and the target image feature of each object image to be recognized.
210. The computer device performs object re-identification based on the target similarity between the object images to be recognized, to obtain a recognition result.
The method can integrate the spatio-temporal transition probability into both the training process and the application process of the neural network model. In the training process, feature extraction and neural network model training can be performed based on the sample labels, the spatio-temporal transition probability, and the image similarity, which improves the continuity of the statistically obtained spatio-temporal probability distribution. In the application process, the trained neural network model can fuse the spatio-temporal transition probability with the image similarity and perform object re-identification through the fused target similarity, thereby improving the accuracy of object re-identification.
In order to better implement the object re-identification method provided by the embodiments of the present application, an embodiment of the present application further provides an apparatus based on the object re-identification method. The terms have the same meanings as in the object re-identification method above, and for specific implementation details, reference can be made to the description in the method embodiments.
Fig. 6 is a schematic structural diagram of an object re-identification apparatus according to an embodiment of the present application. As shown in fig. 6, the object re-identification apparatus may include an obtaining module 301, a constructing module 302, a determining module 303, a calculating module 304, an identifying module 305, and an adjusting module 306, where,
an obtaining module 301, configured to obtain a plurality of object sample images in a target region, and a sample label and spatio-temporal information of each object sample image;
a construction module 302, configured to construct an undirected graph from a plurality of object sample images, where the undirected graph includes interconnected image nodes, the image nodes represent the object sample images, and the image nodes include image features of the object sample images;
a determining module 303, configured to determine a spatio-temporal transition probability between adjacent image nodes in the undirected graph based on spatio-temporal probability distribution of object transitions in the target region and spatio-temporal information of the plurality of object sample images;
a calculating module 304, configured to calculate an image similarity between adjacent image nodes in the undirected graph according to an image feature of each image node;
the identification module 305 is configured to perform object recognition on the image features of the target object sample image based on the neural network model, the spatio-temporal transition probability, the image similarity, and the sample labels of the plurality of object sample images, so as to obtain a prediction result of the target object sample image;
the adjusting module 306 is configured to adjust network parameters of the neural network model based on image features and sample labels of the multiple object sample images and the prediction result, so as to perform object re-identification on the multiple object images to be identified through the trained neural network model.
In some embodiments, the object re-recognition apparatus further comprises:
and the extraction module is used for extracting the characteristics of the plurality of object sample images through a first sub-model of the neural network model to obtain the image characteristics of each object sample image.
In some embodiments, referring to fig. 7, the recognition module 305 may include an extraction sub-module 3051 and a recognition sub-module 3052, wherein,
the extraction submodule 3051 is configured to perform feature extraction on image features of the target object sample image based on a second sub-model of the neural network model, the space-time transition probability, the image similarity, and sample labels of the plurality of object sample images to obtain target image features of the target object sample image;
the identification submodule 3052 is configured to perform object identification on the target object sample image according to the target image feature, so as to obtain a prediction result of the target object sample image.
In some embodiments, the identification submodule may include a determination unit and an extraction unit, wherein,
the determining unit is used for determining the adjacent image nodes as candidate adjacent image nodes when the sample labels of the object sample images of the adjacent image nodes are the same;
and the extraction unit is used for extracting the features of the image features of the target object image based on the second sub-model of the neural network model, the space-time transition probability of the candidate adjacent image nodes in the undirected graph and the image similarity to obtain the target image features of the target object sample image.
In some embodiments, the extraction unit may be specifically configured to:
determining at least one target neighboring image node associated with the target object sample image from the candidate neighboring image nodes of the undirected graph, the target neighboring image node representing the target object sample image and the neighboring object sample image;
determining the characteristic weight of each adjacent object sample image based on the space-time transition probability and the image similarity of all target adjacent image nodes;
and fusing the image characteristics and the characteristic weight of each adjacent object sample image and the image characteristics of the target object sample image to obtain the target image characteristics of the target object sample image.
In some embodiments, the adjustment module may include an adjustment sub-module and a re-identification sub-module, wherein,
the adjusting submodule is used for adjusting network parameters of the neural network model based on the image characteristics and the sample labels of the plurality of object sample images and the prediction result;
and the re-recognition submodule is used for carrying out object re-recognition on the plurality of to-be-recognized object images through the trained neural network model.
In some embodiments, the re-identification sub-module includes an acquisition unit, an extraction unit, a calculation unit, and a re-identification unit, wherein,
an acquisition unit configured to acquire a plurality of object images to be recognized;
the extraction unit is used for extracting the characteristics of a plurality of to-be-recognized object images through the trained neural network model to obtain the target image characteristics of each to-be-recognized object image;
the calculation unit is used for calculating the target similarity between the object images to be recognized based on the space-time information and the target image characteristics of each object image to be recognized;
and the re-recognition unit is used for re-recognizing the object based on the target similarity between the images of the object to be recognized to obtain a recognition result.
In some embodiments, the computing unit may be specifically configured to:
calculating the feature similarity between the object images to be recognized according to the target image features of each object image to be recognized;
determining the space-time transition probability between the object images to be identified based on the space-time transition probability distribution and the space-time information of the object images to be identified;
and fusing the feature similarity and the space-time transition probability between the images of the objects to be identified to obtain the target similarity between the images of the objects to be identified.
In some embodiments, the spatiotemporal information comprises temporal information and spatial information, and the determining module is specifically configured to:
calculating time difference information and space transfer information between adjacent image nodes according to the time information and the space information of the object sample images of the adjacent image nodes in the undirected graph;
determining target space-time probability distribution from space-time probability distribution of object transfer in the target region according to the space transfer information, wherein the target space-time probability distribution comprises a mapping relation between time information and probability information;
and determining the space-time transition probability between adjacent image nodes in the undirected graph according to the time difference information and the target space-time probability distribution.
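Reusing the histogram format from the estimation sketch earlier (an assumption of these examples, not a format defined by this application), the three steps of the determining module above could be sketched as:

```python
import numpy as np

def transition_probability(dist, bins, meta_i, meta_j):
    """meta: (camera_id, timestamp). The spatial transfer information selects the
    camera-pair distribution; the time difference selects the probability bin."""
    (cam_i, t_i), (cam_j, t_j) = meta_i, meta_j
    hist = dist.get((cam_i, cam_j))            # target spatio-temporal distribution
    if hist is None:
        return 0.0
    dt = abs(t_j - t_i)                        # time difference information
    idx = np.searchsorted(bins, dt, side="right") - 1
    if idx < 0 or idx >= len(hist):
        return 0.0
    return float(hist[idx])                    # mapped probability information
```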
In some embodiments, the tuning submodule is specifically configured to:
calculating a first loss of the neural network model based on image features and sample labels of the plurality of object sample images;
calculating a second loss of the neural network model based on the prediction result and the sample label of the target object sample image;
and adjusting network parameters of the neural network model by combining the first loss and the second loss.
In this application, the obtaining module 301 may obtain a plurality of object sample images in a target region, and the sample label and spatio-temporal information of each object sample image. The constructing module 302 constructs an undirected graph from the plurality of object sample images, where the undirected graph includes interconnected image nodes, the image nodes represent the object sample images, and the image nodes include the image features of the object sample images. The determining module 303 determines the spatio-temporal transition probabilities between adjacent image nodes in the undirected graph based on the spatio-temporal probability distribution of object transitions in the target region and the spatio-temporal information of the plurality of object sample images. The calculating module 304 calculates the image similarity between adjacent image nodes in the undirected graph from the image features of each image node. The identifying module 305 performs object recognition on the image features of the target object sample image based on the neural network model, the spatio-temporal transition probability, the image similarity, and the sample labels of the plurality of object sample images, to obtain a prediction result of the target object sample image. Finally, the adjusting module 306 adjusts the network parameters of the neural network model based on the image features and sample labels of the plurality of object sample images and the prediction result, so as to perform object re-identification on a plurality of object images to be identified through the trained neural network model.
The method can integrate the space-time transition probability into the training process and the application process of the neural network model, can perform feature extraction and neural network model training based on the sample label, the space-time transition probability and the image similarity in the training process, improve the continuity of space-time probability distribution obtained by statistics, and can integrate the space-time transition probability and the image similarity based on the trained neural network model in the application process, and perform object re-identification through the integrated target similarity, thereby improving the accuracy of object re-identification.
In addition, an embodiment of the present application further provides a computer device, where the computer device may be a terminal or a server, as shown in fig. 8, which shows a schematic structural diagram of the computer device according to the embodiment of the present application, and specifically:
the computer device may include components such as a processor 401 of one or more processing cores, memory 402 of one or more computer-readable storage media, a power supply 403, and an input unit 404. Those skilled in the art will appreciate that the computer device configuration illustrated in FIG. 8 does not constitute a limitation of computer devices, and may include more or fewer components than those illustrated, or some components may be combined, or a different arrangement of components. Wherein:
the processor 401 is a control center of the computer device, connects various parts of the entire computer device using various interfaces and lines, and performs various functions of the computer device and processes data by running or executing software programs and/or modules stored in the memory 402 and calling data stored in the memory 402, thereby monitoring the computer device as a whole. Optionally, processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor and a modem processor, wherein the application processor mainly handles operating systems, user pages, application programs, and the like, and the modem processor mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 401.
The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and data processing by operating the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to use of the computer device, and the like. Further, the memory 402 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 access to the memory 402.
The computer device further comprises a power supply 403 for supplying power to the various components, and preferably, the power supply 403 is logically connected to the processor 401 via a power management system, so that functions of managing charging, discharging, and power consumption are implemented via the power management system. The power supply 403 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
The computer device may also include an input unit 404, the input unit 404 being operable to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the computer device may further include a display unit and the like, which are not described in detail herein. Specifically, in this embodiment, the processor 401 in the computer device loads the executable file corresponding to the process of one or more application programs into the memory 402 according to the following instructions, and the processor 401 runs the application programs stored in the memory 402, thereby implementing various functions as follows:
acquiring a plurality of object sample images in a target area, and a sample label and spatio-temporal information of each object sample image; constructing an undirected graph according to the plurality of object sample images, wherein the undirected graph comprises image nodes which are connected with each other, the image nodes represent the object sample images, and the image nodes comprise image characteristics of the object sample images; determining space-time transition probability between adjacent image nodes in the undirected graph based on space-time probability distribution of object transition in the target region and space-time information of a plurality of object sample images; calculating the image similarity between adjacent image nodes in the undirected graph according to the image characteristics of each image node; performing object recognition on the image characteristics of the target object sample image based on the neural network model, the space-time transition probability, the image similarity and the sample labels of the plurality of target sample images to obtain a prediction result of the target object sample image; and adjusting network parameters of the neural network model based on the image characteristics and the sample labels of the multiple object sample images and the prediction result so as to perform object re-identification on the multiple object images to be identified through the trained neural network model.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternative implementations of the above embodiments.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by a computer program, which may be stored in a computer-readable storage medium and loaded and executed by a processor, or by related hardware controlled by the computer program.
To this end, embodiments of the present application further provide a storage medium, in which a computer program is stored, where the computer program can be loaded by a processor to execute the steps in any one of the object re-identification methods provided in the embodiments of the present application. For example, the computer program may perform the steps of:
acquiring a plurality of object sample images in a target area, and a sample label and spatio-temporal information of each object sample image; constructing an undirected graph according to the plurality of object sample images, wherein the undirected graph comprises image nodes which are connected with each other, the image nodes represent the object sample images, and the image nodes comprise image characteristics of the object sample images; determining space-time transition probability between adjacent image nodes in the undirected graph based on space-time probability distribution of object transition in the target region and space-time information of a plurality of object sample images; calculating the image similarity between adjacent image nodes in the undirected graph according to the image characteristics of each image node; performing object recognition on the image characteristics of the target object sample image based on the neural network model, the space-time transition probability, the image similarity and the sample labels of the plurality of target sample images to obtain a prediction result of the target object sample image; and adjusting network parameters of the neural network model based on the image characteristics and the sample labels of the multiple object sample images and the prediction result so as to perform object re-identification on the multiple object images to be identified through the trained neural network model.
Wherein the storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Since the computer program stored in the storage medium can execute the steps in any object re-identification method provided in the embodiments of the present application, it can achieve the beneficial effects achievable by any object re-identification method provided in the embodiments of the present application. For details, refer to the foregoing embodiments, which are not repeated here.
The object re-identification method and device provided by the embodiment of the present application are described in detail above, and a specific example is applied in the description to explain the principle and the implementation of the present application, and the description of the above embodiment is only used to help understand the method and the core idea of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. An object re-recognition method, comprising:
acquiring a plurality of object sample images in a target area, and a sample label and spatio-temporal information of each object sample image;
constructing an undirected graph from the plurality of object sample images, the undirected graph comprising interconnected image nodes, the image nodes representing object sample images, the image nodes comprising image features of the object sample images;
determining spatiotemporal transition probabilities between adjacent image nodes in the undirected graph based on spatiotemporal probability distributions of object transitions within the target region and spatiotemporal information of the plurality of object sample images;
calculating the image similarity between adjacent image nodes in the undirected graph according to the image characteristics of each image node;
performing object recognition on the image characteristics of the target object sample image based on a neural network model, the space-time transition probability, the image similarity and the sample labels of the plurality of object sample images to obtain a prediction result of the target object sample image;
and adjusting network parameters of the neural network model based on the image characteristics and the sample labels of the plurality of object sample images and the prediction result so as to perform object re-identification on the plurality of object images to be identified through the trained neural network model.
2. The method of claim 1, further comprising:
and performing feature extraction on the plurality of object sample images through a first sub-model of the neural network model to obtain the image features of each object sample image.
3. The method of claim 2, wherein the performing object recognition on the image features of the target object sample image based on the neural network model, the spatio-temporal transition probability, the image similarity, and the sample labels of the plurality of object sample images to obtain the prediction result of the target object sample image comprises:
performing feature extraction on image features of a target object sample image based on a second sub-model of a neural network model, the space-time transition probability, the image similarity and sample labels of the plurality of object sample images to obtain target image features of the target object sample image;
and carrying out object identification on the target object sample image according to the target image characteristics to obtain a prediction result of the target object sample image.
4. The method of claim 3, wherein the performing feature extraction on the image features of the target object sample image based on the second sub-model of the neural network model, the spatio-temporal transition probability, the image similarity, and the sample labels of the plurality of object sample images to obtain the target image features of the target object sample image comprises:
when the sample labels of the object sample images of the adjacent image nodes are the same, determining the adjacent image nodes as candidate adjacent image nodes;
and performing feature extraction on the image features of the target object image based on a second submodel of a neural network model, the space-time transition probability of candidate adjacent image nodes in the undirected graph and the image similarity to obtain the target image features of the target object sample image.
5. The method according to claim 4, wherein the performing feature extraction on the image features of the target object image based on the second sub-model of the neural network model, the spatio-temporal transition probability of the candidate adjacent image nodes in the undirected graph and the image similarity to obtain the target image features of the target object sample image comprises:
determining at least one target neighboring image node associated with the target object sample image from among the candidate neighboring image nodes of the undirected graph, the target neighboring image node representing the target object sample image and a neighboring object sample image;
determining the characteristic weight of each adjacent object sample image based on the space-time transition probability and the image similarity of all target adjacent image nodes;
and fusing the image characteristics and the characteristic weight of each adjacent object sample image and the image characteristics of the target object sample image to obtain the target image characteristics of the target object sample image.
6. The method of claim 1, wherein the object re-recognition of the plurality of images of the object to be recognized by the trained neural network model comprises:
acquiring a plurality of images of objects to be identified;
carrying out feature extraction on the plurality of to-be-recognized object images through the trained neural network model to obtain target image features of each to-be-recognized object image;
calculating target similarity between the object images to be recognized based on the spatio-temporal information and the target image characteristics of each object image to be recognized;
and re-identifying the object based on the target similarity between the images of the object to be identified to obtain an identification result.
7. The method according to claim 6, wherein the calculating of the target similarity between the object images to be recognized based on the spatio-temporal information and the target image features of each object image to be recognized comprises:
calculating the feature similarity between the object images to be recognized according to the target image features of each object image to be recognized;
determining the space-time transition probability between the object images to be identified based on the space-time transition probability distribution and the space-time information of the object images to be identified;
and fusing the feature similarity and the space-time transition probability between the images of the objects to be identified to obtain the target similarity between the images of the objects to be identified.
8. The method of claim 1, wherein the spatiotemporal information comprises temporal information and spatial information, and wherein determining spatiotemporal transition probabilities between neighboring image nodes in the undirected graph based on spatiotemporal probability distributions of object transitions within the target region and spatiotemporal information of the plurality of object sample images comprises:
calculating time difference information and space transfer information between adjacent image nodes according to the time information and the space information of the object sample images of the adjacent image nodes in the undirected graph;
determining a target space-time probability distribution from space-time probability distribution of object transfer in the target region according to the space transfer information, wherein the target space-time probability distribution comprises a mapping relation between time information and probability information;
and determining the space-time transition probability between adjacent image nodes in the undirected graph according to the time difference information and the target space-time probability distribution.
9. The method of claim 1, wherein the adjusting network parameters of the neural network model based on image features and sample labels of the plurality of sample images of the object and the prediction results comprises:
calculating a first loss of the neural network model based on image features and sample labels of the plurality of object sample images;
calculating a second loss of the neural network model based on the prediction result and a sample label of the target object sample image;
adjusting network parameters of the neural network model in conjunction with the first loss and the second loss.
10. An object re-recognition apparatus, comprising:
the acquisition module is used for acquiring a plurality of object sample images in a target area, and a sample label and space-time information of each object sample image;
a construction module for constructing an undirected graph from the plurality of object sample images, the undirected graph comprising interconnected image nodes, the image nodes representing object sample images, the image nodes comprising image features of the object sample images;
a determining module for determining spatio-temporal transition probabilities between adjacent image nodes in the undirected graph based on spatio-temporal probability distributions of object transitions within the target region and spatio-temporal information of the plurality of object sample images;
the calculation module is used for calculating the image similarity between adjacent image nodes in the undirected graph according to the image characteristics of each image node;
the identification module is used for carrying out object identification on the image characteristics of the target object sample image based on a neural network model, the space-time transition probability, the image similarity and the sample labels of the plurality of object sample images to obtain a prediction result of the target object sample image;
and the adjusting module is used for adjusting the network parameters of the neural network model based on the image characteristics and the sample labels of the plurality of object sample images and the prediction result so as to perform object re-identification on the plurality of object images to be identified through the trained neural network model.
CN202010896120.XA 2020-08-31 2020-08-31 Object re-identification method and device Pending CN112052771A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010896120.XA CN112052771A (en) 2020-08-31 2020-08-31 Object re-identification method and device

Publications (1)

Publication Number Publication Date
CN112052771A true CN112052771A (en) 2020-12-08

Family

ID=73607608

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010896120.XA Pending CN112052771A (en) 2020-08-31 2020-08-31 Object re-identification method and device

Country Status (1)

Country Link
CN (1) CN112052771A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112418191A (en) * 2021-01-21 2021-02-26 深圳阜时科技有限公司 Fingerprint identification model construction method, storage medium and computer equipment
CN112418191B (en) * 2021-01-21 2021-04-20 深圳阜时科技有限公司 Fingerprint identification model construction method, storage medium and computer equipment
CN112766180A (en) * 2021-01-22 2021-05-07 重庆邮电大学 Pedestrian re-identification method based on feature fusion and multi-core learning
CN112766180B (en) * 2021-01-22 2022-07-12 重庆邮电大学 Pedestrian re-identification method based on feature fusion and multi-core learning
CN113269129A (en) * 2021-06-11 2021-08-17 成都商汤科技有限公司 Identity recognition method and device, electronic equipment and storage medium
WO2022257306A1 (en) * 2021-06-11 2022-12-15 成都商汤科技有限公司 Identity identification method and apparatus, electronic device, and storage medium
CN113761263A (en) * 2021-09-08 2021-12-07 杭州海康威视数字技术股份有限公司 Similarity determination method and device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination