WO2019237870A1 - Target matching method and device, electronic device and storage medium - Google Patents
Target matching method and device, electronic device and storage medium
- Publication number
- WO2019237870A1 (PCT/CN2019/086670)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image sequence
- feature vector
- frame
- query
- candidate
- Prior art date
Classifications
- G06F16/73—Querying (information retrieval of video data)
- G06F16/532—Query formulation, e.g. graphical querying (still image data)
- G06F16/56—Information retrieval of still image data having vectorial format
- G06F16/78—Retrieval of video data characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/22—Matching criteria, e.g. proximity measures
- G06N3/04—Neural network architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06V10/40—Extraction of image or video features
- G06V10/761—Proximity, similarity or dissimilarity measures
Definitions
- The present disclosure relates to the field of computer vision technology, and in particular, to a target matching method and device, an electronic device, and a storage medium.
- Target matching refers to returning videos or images in the database that have the same target as the query video or query image.
- Target matching technology is widely used in security monitoring systems at airports, stations, campuses, and shopping malls. In the related art, however, the accuracy of target matching is low.
- The present disclosure proposes a technical solution for target matching.
- a target matching method including:
- the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence are extracted separately, where the query image sequence includes the target to be matched;
- determining a self-expression feature vector of the query image sequence and a self-expression feature vector of the candidate image sequence based on the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence;
- determining a co-expression feature vector of the query image sequence based on the feature vector of each frame in the query image sequence and the self-expression feature vector of the candidate image sequence, and determining a co-expression feature vector of the candidate image sequence based on the feature vector of each frame in the candidate image sequence and the self-expression feature vector of the query image sequence;
- determining a similarity feature vector between the query image sequence and the candidate image sequence based on the self-expression feature vector of the query image sequence, the co-expression feature vector of the query image sequence, the self-expression feature vector of the candidate image sequence, and the co-expression feature vector of the candidate image sequence;
- determining a matching result between the query image sequence and the candidate image sequence based on the similarity feature vector.
- extracting the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence includes:
- the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence are extracted by the first sub-neural network.
- the method further includes:
- the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence are subjected to dimensionality reduction through the first fully connected layer of the first sub-neural network, to obtain a first dimensionality reduction feature vector of each frame in the query image sequence and a first dimensionality reduction feature vector of each frame in the candidate image sequence.
- determining the self-expression feature vector of the query image sequence and the self-expression feature vector of the candidate image sequence includes:
- inputting the feature vector of each frame in the query image sequence and the first dimensionality reduction feature vector of each frame in the query image sequence into a second sub-neural network to determine the self-expression feature vector of the query image sequence; and inputting the feature vector of each frame in the candidate image sequence and the first dimensionality reduction feature vector of each frame in the candidate image sequence into the second sub-neural network to determine the self-expression feature vector of the candidate image sequence.
- inputting the feature vector of each frame in the query image sequence and the first dimensionality reduction feature vector of each frame in the query image sequence into the second sub-neural network to determine the self-expression feature vector of the query image sequence includes:
- determining the self-expression feature vector of the query image sequence based on the second dimensionality reduction feature vector of each frame in the query image sequence, the overall feature vector of the query image sequence, and the first dimensionality reduction feature vector of each frame in the query image sequence.
- inputting the feature vector of each frame in the candidate image sequence and the first dimensionality reduction feature vector of each frame in the candidate image sequence into the second sub-neural network to determine the self-expression feature vector of the candidate image sequence includes:
- determining the self-expression feature vector of the candidate image sequence based on the second dimensionality reduction feature vector of each frame in the candidate image sequence, the overall feature vector of the candidate image sequence, and the first dimensionality reduction feature vector of each frame in the candidate image sequence.
- determining the self-expression feature vector of the query image sequence based on the second dimensionality reduction feature vector of each frame in the query image sequence, the overall feature vector of the query image sequence, and the first dimensionality reduction feature vector of each frame in the query image sequence includes:
- calculating the correlation between the second dimensionality reduction feature vector of each frame in the query image sequence and the overall feature vector of the query image sequence through a parameterless correlation function to obtain a first correlation weight of each frame in the query image sequence; and weighting the first dimensionality reduction feature vector of each frame in the query image sequence based on the first correlation weight of each frame to obtain the self-expression feature vector of the query image sequence.
- determining the self-expression feature vector of the candidate image sequence includes: calculating the correlation between the second dimensionality reduction feature vector of each frame in the candidate image sequence and the overall feature vector of the candidate image sequence through a parameterless correlation function to obtain a first correlation weight of each frame in the candidate image sequence; and weighting the first dimensionality reduction feature vector of each frame in the candidate image sequence based on the first correlation weight of each frame to obtain the self-expression feature vector of the candidate image sequence.
- the first correlation weight includes a first normalized correlation weight
- the first normalized correlation weight is obtained by performing a normalization process on the first correlation weight.
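The self-expression computation described above can be sketched numerically. The sketch below assumes the parameterless correlation function is a plain dot product and the normalization is a softmax; the patent text only requires a correlation function with no learned parameters, so both choices, and all function and variable names, are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def self_expression(first_reduced, second_reduced):
    """Compute a self-expression feature vector for one image sequence.

    first_reduced:  (T, D1) first dimensionality-reduced per-frame features
    second_reduced: (T, D2) second dimensionality-reduced per-frame features
    """
    # Overall sequence feature: average pooling over the time dimension.
    overall = second_reduced.mean(axis=0)
    # Parameterless correlation of each frame with the overall feature
    # (dot product is an assumption).
    corr = second_reduced @ overall
    # First normalized correlation weights.
    weights = softmax(corr)
    # Weighted sum of the first dimensionality-reduced features -> (D1,).
    return weights @ first_reduced
```

The weighting lets frames that agree with the sequence-level feature contribute more, which is why the correlation function needs no learned parameters.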
- determining the co-expression feature vector of the query image sequence based on the feature vector of each frame in the query image sequence and the self-expression feature vector of the candidate image sequence, and determining the co-expression feature vector of the candidate image sequence based on the feature vector of each frame in the candidate image sequence and the self-expression feature vector of the query image sequence, includes:
- inputting the feature vector of each frame in the query image sequence, the first dimensionality reduction feature vector of each frame in the query image sequence, and the self-expression feature vector of the candidate image sequence into a third sub-neural network to obtain the co-expression feature vector of the query image sequence;
- inputting the feature vector of each frame in the candidate image sequence, the first dimensionality reduction feature vector of each frame in the candidate image sequence, and the self-expression feature vector of the query image sequence into the third sub-neural network to obtain the co-expression feature vector of the candidate image sequence.
- inputting the feature vector of each frame in the query image sequence, the first dimensionality reduction feature vector of each frame in the query image sequence, and the self-expression feature vector of the candidate image sequence into the third sub-neural network to obtain the co-expression feature vector of the query image sequence includes:
- inputting the feature vector of each frame in the candidate image sequence, the first dimensionality reduction feature vector of each frame in the candidate image sequence, and the self-expression feature vector of the query image sequence into the third sub-neural network to obtain the co-expression feature vector of the candidate image sequence includes:
- determining the co-expression feature vector of the query image sequence based on the third dimensionality reduction feature vector of each frame in the query image sequence, the self-expression feature vector of the candidate image sequence, and the first dimensionality reduction feature vector of each frame in the query image sequence includes:
- calculating the correlation between the third dimensionality reduction feature vector of each frame in the query image sequence and the self-expression feature vector of the candidate image sequence through a parameterless correlation function to obtain a second correlation weight of each frame in the query image sequence; and weighting the first dimensionality reduction feature vector of each frame in the query image sequence based on the second correlation weight of each frame to obtain the co-expression feature vector of the query image sequence.
- determining the co-expression feature vector of the candidate image sequence based on the third dimensionality reduction feature vector of each frame in the candidate image sequence, the self-expression feature vector of the query image sequence, and the first dimensionality reduction feature vector of each frame in the candidate image sequence includes: calculating the correlation between the third dimensionality reduction feature vector of each frame in the candidate image sequence and the self-expression feature vector of the query image sequence through a parameterless correlation function to obtain a second correlation weight of each frame in the candidate image sequence; and weighting the first dimensionality reduction feature vector of each frame in the candidate image sequence based on the second correlation weight of each frame to obtain the co-expression feature vector of the candidate image sequence.
- the second correlation weight includes a second normalized correlation weight
- the second normalized correlation weight is obtained by performing a normalization process on the second correlation weight.
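The co-expression step mirrors the self-expression step, except that each frame is correlated against the self-expression vector of the other sequence rather than against this sequence's own pooled feature. A minimal sketch, again assuming a dot-product correlation and softmax normalization (illustrative assumptions, as are the names):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def co_expression(first_reduced, third_reduced, other_self_expr):
    """Compute a co-expression feature vector for one sequence.

    first_reduced:   (T, D1) first dim-reduced features of this sequence
    third_reduced:   (T, D)  third dim-reduced features of this sequence
    other_self_expr: (D,)    self-expression vector of the OTHER sequence
    """
    # Parameterless correlation of each frame with the other sequence's
    # self-expression vector (dot product assumed).
    corr = third_reduced @ other_self_expr
    # Second normalized correlation weights.
    weights = softmax(corr)
    # Weighted sum of the first dimensionality-reduced features -> (D1,).
    return weights @ first_reduced
```

Cross-correlating against the other sequence emphasizes the frames most relevant to the pair being compared, rather than to each sequence in isolation.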
- determining the similarity feature vector between the query image sequence and the candidate image sequence based on the self-expression feature vector of the query image sequence, the co-expression feature vector of the query image sequence, the self-expression feature vector of the candidate image sequence, and the co-expression feature vector of the candidate image sequence includes:
- calculating the difference between the self-expression feature vector of the query image sequence and the co-expression feature vector of the candidate image sequence to obtain a first difference vector; calculating the difference between the self-expression feature vector of the candidate image sequence and the co-expression feature vector of the query image sequence to obtain a second difference vector; and obtaining the similarity feature vector of the query image sequence and the candidate image sequence based on the first difference vector and the second difference vector.
- obtaining the similarity feature vector of the query image sequence and the candidate image sequence based on the first difference vector and the second difference vector includes: calculating the sum of the first difference vector and the second difference vector, or calculating the element-wise product of the first difference vector and the second difference vector, to obtain the similarity feature vector of the query image sequence and the candidate image sequence.
- determining a matching result between the query image sequence and the candidate image sequence based on the similarity feature vector includes:
- inputting the similarity feature vector of the query image sequence and the candidate image sequence into a fourth fully connected layer to obtain a matching score between the query image sequence and the candidate image sequence, and determining the matching result between the query image sequence and the candidate image sequence based on the matching score.
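The similarity step above reduces to two difference vectors combined by a sum or an element-wise product, followed by a fully connected layer that produces a matching score. In the sketch below, the weight and bias of the fourth fully connected layer and the sigmoid mapping the score into (0, 1) are illustrative assumptions:

```python
import numpy as np

def similarity_feature(q_self, q_co, c_self, c_co, use_product=False):
    """Combine the four expression vectors into a similarity feature vector."""
    d1 = q_self - c_co   # first difference vector
    d2 = c_self - q_co   # second difference vector
    # Two variants described in the text: sum, or element-wise product.
    return d1 * d2 if use_product else d1 + d2

def matching_score(sim_vec, weight, bias):
    """Hypothetical fourth fully connected layer plus a sigmoid,
    mapping the similarity feature vector to a score in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(weight @ sim_vec + bias)))
```

A score near 1 would then be read as the two sequences containing the same target, with the final matching result decided by thresholding or ranking the scores.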
- the method further includes:
- network parameters are optimized based on the matching score of the query image sequence and the candidate image sequence, using same-pair label data and a binary cross-entropy loss function.
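A minimal sketch of the binary cross-entropy loss mentioned above, where the label is 1 when the pair shows the same target and 0 otherwise (the per-pair form; batching and the optimizer are omitted):

```python
import numpy as np

def bce_loss(score, same_pair_label, eps=1e-7):
    """Binary cross-entropy between a matching score in (0, 1)
    and a 0/1 same-pair label; eps guards against log(0)."""
    s = np.clip(score, eps, 1.0 - eps)
    return -(same_pair_label * np.log(s)
             + (1 - same_pair_label) * np.log(1.0 - s))
```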
- before extracting the feature vector of each frame in the query image sequence, the method further includes: segmenting a query video into multiple query image sequences, and segmenting a candidate video into multiple candidate image sequences.
- the method further includes:
- a matching result between the query video and the candidate video is determined based on a matching result between the query image sequence of the query video and the candidate image sequence of the candidate video.
- segmenting the query video into multiple query image sequences includes:
- the query video is segmented into multiple query image sequences according to a preset sequence length and a preset step size, where the length of each query image sequence is equal to the preset sequence length, and the number of images in which adjacent query image sequences overlap is equal to the difference between the preset sequence length and the preset step size;
- segmenting the candidate video into multiple candidate image sequences includes:
- the candidate video is segmented into multiple candidate image sequences according to the preset sequence length and the preset step size, where the length of each candidate image sequence is equal to the preset sequence length, and the number of images in which adjacent candidate image sequences overlap is equal to the difference between the preset sequence length and the preset step size.
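The sliding-window segmentation described above can be sketched as follows; `frames` may be any indexable sequence of frames, and trailing frames that do not fill a complete window are dropped in this sketch (an assumption the text does not specify):

```python
def segment_video(frames, seq_len, step):
    """Split frames into overlapping sequences of length seq_len,
    advancing by step frames each time. Adjacent sequences overlap
    in exactly seq_len - step frames."""
    return [frames[i:i + seq_len]
            for i in range(0, len(frames) - seq_len + 1, step)]
```

For example, a 10-frame video with `seq_len=4` and `step=2` yields four sequences, each sharing two frames with its neighbor.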
- determining a matching result between the query video and the candidate video based on a matching result between the query image sequence of the query video and the candidate image sequence of the candidate video includes:
- determining a matching score of each query image sequence of the query video with each candidate image sequence of the candidate video; calculating the average of the highest N matching scores, where N is a positive integer, to obtain a matching score of the query video and the candidate video; and determining the matching result between the query video and the candidate video based on the matching score of the query video and the candidate video.
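The video-level score described above is the average of the N highest sequence-pair scores. A minimal sketch:

```python
def video_matching_score(pair_scores, n):
    """Average the n highest sequence-pair matching scores
    (n a positive integer; fewer than n scores are all used)."""
    top = sorted(pair_scores, reverse=True)[:n]
    return sum(top) / len(top)
```

Averaging only the best-matching windows makes the video-level score robust to segments where the target is occluded or absent.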
- a target matching device including:
- An extraction module for extracting a feature vector of each frame in a query image sequence and a feature vector of each frame in a candidate image sequence, wherein the query image sequence includes a target to be matched;
- a first determining module configured to determine a self-expression feature vector of the query image sequence and a self-expression feature vector of the candidate image sequence based on the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence;
- a second determining module configured to determine a co-expression feature vector of the query image sequence based on the feature vector of each frame in the query image sequence and the self-expression feature vector of the candidate image sequence, and to determine a co-expression feature vector of the candidate image sequence based on the feature vector of each frame in the candidate image sequence and the self-expression feature vector of the query image sequence;
- a third determining module configured to determine a similarity feature vector between the query image sequence and the candidate image sequence based on the self-expression feature vector of the query image sequence, the co-expression feature vector of the query image sequence, the self-expression feature vector of the candidate image sequence, and the co-expression feature vector of the candidate image sequence;
- a fourth determining module is configured to determine a matching result between the query image sequence and the candidate image sequence based on the similarity feature vector.
- the extraction module is configured to:
- the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence are extracted by the first sub-neural network.
- the apparatus further includes:
- a dimensionality reduction module configured to perform dimensionality reduction processing on the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence through the first fully connected layer of the first sub-neural network to obtain A first dimensionality reduction feature vector of each frame in the query image sequence and a first dimensionality reduction feature vector of each frame in the candidate image sequence.
- the first determining module includes:
- a first determining submodule configured to input the feature vector of each frame in the query image sequence and the first dimensionality reduction feature vector of each frame in the query image sequence into a second sub-neural network to determine the self-expression feature vector of the query image sequence;
- a second determining submodule configured to input the feature vector of each frame in the candidate image sequence and the first dimensionality reduction feature vector of each frame in the candidate image sequence into the second sub-neural network to determine the self-expression feature vector of the candidate image sequence.
- the first determining submodule includes:
- a first dimensionality reduction unit configured to perform dimensionality reduction on the feature vector of each frame in the query image sequence through a second fully connected layer of the second sub-neural network to obtain the second dimensionality reduction feature vector of each frame in the query image sequence;
- a first average pooling unit configured to subject the second dimensionality reduction feature vector of each frame in the query image sequence to a time dimension average pooling process to obtain an overall feature vector of the query image sequence
- a first determining unit configured to determine the self-expression feature vector of the query image sequence based on the second dimensionality reduction feature vector of each frame in the query image sequence, the overall feature vector of the query image sequence, and the first dimensionality reduction feature vector of each frame in the query image sequence.
- the second determining submodule includes:
- a second dimensionality reduction unit configured to perform dimensionality reduction on the feature vector of each frame in the candidate image sequence through the second fully connected layer of the second sub-neural network to obtain the second dimensionality reduction feature vector of each frame in the candidate image sequence;
- a second average pooling unit configured to subject the second dimensionality reduction feature vector of each frame in the candidate image sequence to an average pooling process in the time dimension to obtain the overall feature vector of the candidate image sequence;
- a second determining unit configured to determine the self-expression feature vector of the candidate image sequence based on the second dimensionality reduction feature vector of each frame in the candidate image sequence, the overall feature vector of the candidate image sequence, and the first dimensionality reduction feature vector of each frame in the candidate image sequence.
- the first determining unit includes:
- a first calculation subunit configured to calculate the correlation between the second dimensionality reduction feature vector of each frame in the query image sequence and the overall feature vector of the query image sequence through a parameterless correlation function to obtain the first correlation weight of each frame in the query image sequence;
- a first weighting subunit configured to weight the first dimensionality reduction feature vector of each frame in the query image sequence based on the first correlation weight of each frame in the query image sequence to obtain the self-expression feature vector of the query image sequence.
- the second determining unit includes:
- a second calculation subunit configured to calculate the correlation between the second dimensionality reduction feature vector of each frame in the candidate image sequence and the overall feature vector of the candidate image sequence through a parameterless correlation function to obtain the first correlation weight of each frame in the candidate image sequence;
- a second weighting subunit configured to weight the first dimensionality reduction feature vector of each frame in the candidate image sequence based on the first correlation weight of each frame in the candidate image sequence to obtain the self-expression feature vector of the candidate image sequence.
- the first correlation weight includes a first normalized correlation weight
- the first normalized correlation weight is obtained by performing a normalization process on the first correlation weight.
- the second determining module includes:
- a third determining submodule configured to input the feature vector of each frame in the query image sequence, the first dimensionality reduction feature vector of each frame in the query image sequence, and the self-expression feature vector of the candidate image sequence into a third sub-neural network to obtain a co-expression feature vector of the query image sequence;
- a fourth determining submodule configured to input the feature vector of each frame in the candidate image sequence, the first dimensionality reduction feature vector of each frame in the candidate image sequence, and the self-expression feature vector of the query image sequence into the third sub-neural network to obtain a co-expression feature vector of the candidate image sequence.
- the third determining submodule includes:
- a third dimensionality reduction unit configured to perform dimensionality reduction on the feature vector of each frame in the query image sequence through a third fully connected layer of the third sub-neural network to obtain the third dimensionality reduction feature vector of each frame in the query image sequence;
- a third determining unit configured to obtain the co-expression feature vector of the query image sequence based on the third dimensionality reduction feature vector of each frame in the query image sequence, the self-expression feature vector of the candidate image sequence, and the first dimensionality reduction feature vector of each frame in the query image sequence;
- the fourth determining sub-module includes:
- a fourth dimensionality reduction unit configured to perform dimensionality reduction on the feature vector of each frame in the candidate image sequence through the third fully connected layer of the third sub-neural network to obtain the third dimensionality reduction feature vector of each frame in the candidate image sequence;
- a fourth determining unit configured to obtain the co-expression feature vector of the candidate image sequence based on the third dimensionality reduction feature vector of each frame in the candidate image sequence, the self-expression feature vector of the query image sequence, and the first dimensionality reduction feature vector of each frame in the candidate image sequence.
- the third determining unit includes:
- a third calculation subunit configured to calculate the correlation between the third dimensionality reduction feature vector of each frame in the query image sequence and the self-expression feature vector of the candidate image sequence through a parameterless correlation function to obtain the second correlation weight of each frame in the query image sequence;
- a third weighting subunit configured to weight the first dimensionality reduction feature vector of each frame in the query image sequence based on the second correlation weight of each frame in the query image sequence to obtain the co-expression feature vector of the query image sequence.
- the fourth determining unit includes:
- a fourth calculation subunit configured to calculate the correlation between the third dimensionality reduction feature vector of each frame in the candidate image sequence and the self-expression feature vector of the query image sequence through a parameterless correlation function to obtain the second correlation weight of each frame in the candidate image sequence;
- a fourth weighting subunit configured to weight the first dimensionality reduction feature vector of each frame in the candidate image sequence based on the second correlation weight of each frame in the candidate image sequence to obtain the co-expression feature vector of the candidate image sequence.
- the second correlation weight includes a second normalized correlation weight
- the second normalized correlation weight is obtained by performing a normalization process on the second correlation weight.
- the third determining module includes:
- a first calculation submodule configured to calculate a difference between a self-expression feature vector of the query image sequence and a collaborative expression feature vector of the candidate image sequence to obtain a first difference vector
- a second calculation submodule configured to calculate a difference between a self-expression feature vector of the candidate image sequence and a co-expression feature vector of the query image sequence to obtain a second difference vector
- a fifth determination submodule is configured to obtain a similarity feature vector of the query image sequence and the candidate image sequence based on the first difference vector and the second difference vector.
- the fifth determining submodule includes:
- a first calculation unit configured to calculate a sum of the first difference vector and the second difference vector to obtain a similarity feature vector between the query image sequence and the candidate image sequence;
- a second calculation unit is configured to calculate a product of the first difference vector and elements of corresponding bits of the second difference vector to obtain a similarity feature vector of the query image sequence and the candidate image sequence.
- the fourth determining module includes:
- a sixth determining submodule configured to input a similarity feature vector of the query image sequence and the candidate image sequence into a fourth fully connected layer to obtain a matching score between the query image sequence and the candidate image sequence;
- a seventh determination submodule is configured to determine a matching result between the query image sequence and the candidate image sequence based on a matching score of the query image sequence and the candidate image sequence.
- the apparatus further includes:
- an optimization module configured to optimize network parameters based on the matching score of the query image sequence and the candidate image sequence, using same-pair label data and a binary cross-entropy loss function.
- the apparatus further includes:
- a first segmentation module configured to segment a query video into multiple query image sequences;
- a second segmentation module configured to segment a candidate video into multiple candidate image sequences
- a fifth determining module is configured to determine a matching result between the query video and the candidate video based on a matching result between the query image sequence of the query video and the candidate image sequence of the candidate video.
- the first segmentation module is configured to:
- the query video is segmented into multiple query image sequences according to a preset sequence length and a preset step size, where the length of each query image sequence is equal to the preset sequence length, and the number of images in which adjacent query image sequences overlap is equal to the difference between the preset sequence length and the preset step size;
- the second segmentation module is configured to:
- the candidate video is segmented into multiple candidate image sequences according to the preset sequence length and the preset step size, where the length of each candidate image sequence is equal to the preset sequence length, and the number of images in which adjacent candidate image sequences overlap is equal to the difference between the preset sequence length and the preset step size.
- the fifth determining module includes:
- An eighth determining submodule configured to determine a matching score of each query image sequence of the query video and each candidate image sequence of the candidate video
- a third calculation submodule configured to calculate the average of the highest N matching scores among the matching scores of each query image sequence of the query video and each candidate image sequence of the candidate video to obtain a matching score of the query video and the candidate video, where N is a positive integer;
- a ninth determining submodule is configured to determine a matching result between the query video and the candidate video based on a matching score of the query video and the candidate video.
- an electronic device including:
- a processor; and a memory for storing processor-executable instructions, wherein the processor is configured to perform the above target matching method.
- a computer-readable storage medium having computer program instructions stored thereon, the computer program instructions realizing the above-mentioned target matching method when executed by a processor.
- In the embodiments of the present disclosure, the similarity feature vector between the query image sequence and the candidate image sequence is determined based on the self-expression feature vector of the query image sequence, the co-expression feature vector of the query image sequence, the self-expression feature vector of the candidate image sequence, and the co-expression feature vector of the candidate image sequence, and the matching result between the query image sequence and the candidate image sequence is determined based on the similarity feature vector, thereby improving the accuracy of target matching.
- FIG. 1 illustrates a flowchart of a target matching method according to an embodiment of the present disclosure.
- FIG. 2 shows an exemplary flowchart of step S12 of the target matching method according to an embodiment of the present disclosure.
- FIG. 3 illustrates an exemplary flowchart of step S121 of the target matching method according to an embodiment of the present disclosure.
- FIG. 4 illustrates an exemplary flowchart of step S122 of the target matching method according to an embodiment of the present disclosure.
- FIG. 5 illustrates an exemplary flowchart of step S1213 of the target matching method according to an embodiment of the present disclosure.
- FIG. 6 illustrates an exemplary flowchart of step S1223 of the target matching method according to an embodiment of the present disclosure.
- FIG. 7 illustrates an exemplary flowchart of step S13 of the target matching method according to an embodiment of the present disclosure.
- FIG. 8 illustrates an exemplary flowchart of step S131 of the target matching method according to an embodiment of the present disclosure.
- FIG. 9 illustrates an exemplary flowchart of step S132 of the target matching method according to an embodiment of the present disclosure.
- FIG. 10 illustrates an exemplary flowchart of step S1312 of the target matching method according to an embodiment of the present disclosure.
- FIG. 11 illustrates an exemplary flowchart of step S1322 of the target matching method according to an embodiment of the present disclosure.
- FIG. 12 illustrates an exemplary flowchart of step S14 of the target matching method according to an embodiment of the present disclosure.
- FIG. 13 illustrates an exemplary flowchart of step S15 of the target matching method according to an embodiment of the present disclosure.
- FIG. 14 illustrates an exemplary flowchart of a target matching method according to an embodiment of the present disclosure.
- FIG. 15 illustrates an exemplary flowchart of step S28 of the target matching method according to an embodiment of the present disclosure.
- FIG. 16 illustrates a block diagram of a target matching device according to an embodiment of the present disclosure.
- FIG. 17 illustrates an exemplary block diagram of a target matching device according to an embodiment of the present disclosure.
- FIG. 18 is a block diagram showing an electronic device 800 according to an exemplary embodiment.
- FIG. 19 is a block diagram of an electronic device 1900 according to an exemplary embodiment.
- "exemplary" means "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as superior or better than other embodiments.
- FIG. 1 illustrates a flowchart of a target matching method according to an embodiment of the present disclosure.
- the embodiments of the present disclosure can be applied to fields such as intelligent video analysis or security monitoring.
- the embodiments of the present disclosure can be combined with technologies such as pedestrian detection and pedestrian tracking, and can be applied to security monitoring systems at airports, stations, campuses, or shopping malls.
- the method includes steps S11 to S15.
- step S11 the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence are separately extracted, where the query image sequence includes the target to be matched.
- the query image sequence may refer to an image sequence that requires target matching.
- the candidate image sequence may refer to an image sequence in a database.
- the database may contain multiple candidate image sequences, for example, the database may contain a large number of candidate image sequences.
- the query image sequence may include only one target to be matched, or may include multiple targets to be matched.
- the image sequence in the embodiment of the present disclosure may be a video, a video fragment, or another image sequence.
- the number of frames of the query image sequence and the candidate image sequence may be different or the same.
- for example, the query image sequence contains T frames (i.e., the query image sequence contains T images), and the candidate image sequence contains R frames (i.e., the candidate image sequence contains R images), where both T and R are positive integers.
- the feature vector of each frame in the query image sequence is extracted to obtain {x_1, …, x_T}, where x_t represents the feature vector of the t-th frame in the query image sequence, 1 ≤ t ≤ T; the feature vector of each frame in the candidate image sequence is extracted to obtain {y_1, …, y_R}, where y_r represents the feature vector of the r-th frame in the candidate image sequence, 1 ≤ r ≤ R.
- in a possible implementation, extracting the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence separately includes: extracting, through a first sub-neural network, the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence.
- the first sub-neural network may be a CNN (Convolutional Neural Network).
- a convolutional neural network with the same parameters can be used to extract the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence.
- the method further includes: performing, through a first fully connected layer of the first sub-neural network, dimension reduction processing on the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence, to obtain the first dimensionality reduction feature vector of each frame in the query image sequence and the first dimensionality reduction feature vector of each frame in the candidate image sequence.
- the first dimensionality reduction feature vector of each frame in the query image sequence can be expressed as {u_1, …, u_T}, where u_t represents the first dimensionality reduction feature vector of the t-th frame in the query image sequence; the first dimensionality reduction feature vector of each frame in the candidate image sequence can be expressed as {v_1, …, v_R}, where v_r represents the first dimensionality reduction feature vector of the r-th frame in the candidate image sequence.
- for example, the dimension of the feature vector of each frame in the query image sequence is 2048, and the dimension of the first dimensionality reduction feature vector of each frame in the query image sequence is 128; similarly, the dimension of the feature vector of each frame in the candidate image sequence is 2048, and the dimension of the first dimensionality reduction feature vector of each frame in the candidate image sequence is 128.
- the first fully connected layer may be denoted as fc-0.
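A minimal NumPy sketch of such a fully connected dimensionality reduction layer (fc-0), assuming the example sizes above (2048-dimensional per-frame features reduced to 128 dimensions) and random placeholder weights rather than trained parameters:

```python
import numpy as np

# Hypothetical sizes: T frames, 2048-D per-frame features reduced to 128-D.
T, D_IN, D_OUT = 8, 2048, 128
rng = np.random.default_rng(0)

# fc-0 sketched as a single linear layer: weight matrix plus bias.
W0 = rng.standard_normal((D_IN, D_OUT)) * 0.01
b0 = np.zeros(D_OUT)

x = rng.standard_normal((T, D_IN))  # feature vector of each frame
u = x @ W0 + b0                     # first dimensionality reduction feature vectors
print(u.shape)                      # (8, 128)
```

The same layer (same weights) would be applied to both the query and candidate frames, mirroring the shared-parameter extraction described above.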
- step S12 the self-expression feature vector of the query image sequence and the self-expression feature vector of the candidate image sequence are determined based on the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence.
- for example, the self-expression feature vector of the query image sequence may be determined based on the feature vector of each frame in the query image sequence, and the self-expression feature vector of the candidate image sequence may be determined based on the feature vector of each frame in the candidate image sequence.
- the self-expression feature vector of the query image sequence may represent a feature vector determined only by the expression of the query image sequence; that is, it is determined only by the expression of the query image sequence and is unrelated to the expression of the candidate image sequence. Likewise, the self-expression feature vector of the candidate image sequence may represent a feature vector determined only by the expression of the candidate image sequence; that is, it is determined only by the expression of the candidate image sequence and is unrelated to the expression of the query image sequence.
- step S13 a cooperative expression feature vector of the query image sequence is determined based on the feature vector of each frame in the query image sequence and the self-expression feature vector of the candidate image sequence, and a cooperative expression feature vector of the candidate image sequence is determined based on the feature vector of each frame in the candidate image sequence and the self-expression feature vector of the query image sequence.
- the cooperative expression feature vector of the query image sequence may represent a feature vector determined jointly by the expression of the query image sequence and the expression of the candidate image sequence; that is, it is related both to the expression of the query image sequence and to the expression of the candidate image sequence. Similarly, the cooperative expression feature vector of the candidate image sequence may represent a feature vector determined jointly by the expression of the candidate image sequence and the expression of the query image sequence; that is, it is related both to the expression of the candidate image sequence and to the expression of the query image sequence.
- step S14 a similarity feature vector of the query image sequence and the candidate image sequence is determined based on the self-expression feature vector of the query image sequence, the cooperative expression feature vector of the query image sequence, the self-expression feature vector of the candidate image sequence, and the cooperative expression feature vector of the candidate image sequence.
- the similarity feature vector of the query image sequence and the candidate image sequence may be used to determine the degree of similarity between the query image sequence and the candidate image sequence, so as to determine whether the query image sequence matches the candidate image sequence.
- step S15 a matching result of the query image sequence and the candidate image sequence is determined based on the similarity feature vector.
- the two matching image sequences may be image sequences of the same person captured from different shooting perspectives, or may be image sequences of the same person captured from the same shooting perspective.
- the embodiments of the present disclosure determine a similarity feature vector of the query image sequence and the candidate image sequence based on the self-expression feature vector of the query image sequence, the cooperative expression feature vector of the query image sequence, the self-expression feature vector of the candidate image sequence, and the cooperative expression feature vector of the candidate image sequence, and determine the matching result between the query image sequence and the candidate image sequence based on the similarity feature vector, thereby improving the accuracy of target matching.
- FIG. 2 shows an exemplary flowchart of step S12 of the target matching method according to an embodiment of the present disclosure.
- step S12 may include steps S121 and S122.
- step S121 the feature vector of each frame in the query image sequence and the first dimensionality reduction feature vector of each frame in the query image sequence are input to the second sub-neural network to determine the self-expressing feature vector of the query image sequence.
- the second sub-neural network may be a SAN (Self-Attention Subnetwork), a self-expression sub-neural network based on the attention mechanism.
- step S122 the feature vector of each frame in the candidate image sequence and the first dimensionality reduction feature vector of each frame in the candidate image sequence are input into the second sub-neural network to determine the self-expressing feature vector of the candidate image sequence.
- for example, the feature vector of each frame in the candidate image sequence and the first dimensionality reduction feature vector of each frame in the candidate image sequence can be input into the second sub-neural network to determine the self-expression feature vector of the candidate image sequence.
- FIG. 3 illustrates an exemplary flowchart of step S121 of the target matching method according to an embodiment of the present disclosure.
- step S121 may include steps S1211 to S1213.
- step S1211 the feature vector of each frame in the query image sequence is reduced by the second fully connected layer of the second sub-neural network to obtain the second dimension-reduced feature vector of each frame in the query image sequence.
- the second dimensionality reduction feature vector of each frame in the query image sequence can be expressed as {p_1, …, p_T}, where p_t represents the second dimensionality reduction feature vector of the t-th frame in the query image sequence.
- the second fully connected layer can be denoted as fc-1.
- the dimension of the second dimensionality reduction feature vector of each frame in the query image sequence is 128 dimensions.
- step S1212 the second dimensionality reduction feature vector of each frame in the query image sequence is subjected to an average pooling process in the time dimension to obtain the overall feature vector of the query image sequence.
- the overall feature vector of the query image sequence can be expressed as P = (p_1 + … + p_T)/T, where p_t denotes the second dimensionality reduction feature vector of the t-th frame in the query image sequence.
- step S1213 the self-expression feature vector of the query image sequence is determined based on the second dimensionality reduction feature vector of each frame in the query image sequence, the overall feature vector of the query image sequence, and the first dimensionality reduction feature vector of each frame in the query image sequence.
- FIG. 4 illustrates an exemplary flowchart of step S122 of the target matching method according to an embodiment of the present disclosure.
- step S122 may include steps S1221 to S1223.
- step S1221 the feature vector of each frame in the candidate image sequence is reduced by the second fully-connected layer of the second sub-neural network to obtain the second reduced-dimensional feature vector of each frame in the candidate image sequence.
- the dimension of the second dimensionality reduction feature vector of each frame in the candidate image sequence is 128 dimensions.
- step S1222 the second dimensionality reduction feature vector of each frame in the candidate image sequence is subjected to an average pooling process in the time dimension to obtain the overall feature vector of the candidate image sequence.
- the overall feature vector of the candidate image sequence can be expressed as Q = (q_1 + … + q_R)/R, where q_r denotes the second dimensionality reduction feature vector of the r-th frame in the candidate image sequence.
- step S1223 the self-expression feature vector of the candidate image sequence is determined based on the second dimensionality reduction feature vector of each frame in the candidate image sequence, the overall feature vector of the candidate image sequence, and the first dimensionality reduction feature vector of each frame in the candidate image sequence.
- FIG. 5 illustrates an exemplary flowchart of step S1213 of the target matching method according to an embodiment of the present disclosure. As shown in FIG. 5, step S1213 may include steps S12131 and S12132.
- step S12131 the correlation between the second dimensionality-reduced feature vector of each frame in the query image sequence and the overall feature vector of the query image sequence is calculated through a parameterless correlation function to obtain the first correlation weight of each frame in the query image sequence.
- for example, the correlation between the second dimensionality-reduced feature vector of each frame in the query image sequence and the overall feature vector of the query image sequence can be calculated by the parameterless correlation function f() to obtain the first correlation weight of each frame in the query image sequence.
- for example, the parameterless correlation function f() can calculate this correlation by dot multiplication, i.e., as the dot product of the second dimensionality reduction feature vector of each frame and the overall feature vector of the query image sequence.
- the embodiment of the present disclosure is based on a self-expression mechanism, and assigns a relevant weight to each frame of the query image sequence through the query image sequence's own expression.
- step S12132 based on the first correlation weight of each frame in the query image sequence, the first dimensionality reduction feature vector of each frame in the query image sequence is weighted to obtain the self-expression feature vector of the query image sequence.
- the self-expression feature vector of the query image sequence can be expressed as x_self = a_1 u_1 + … + a_T u_T, where a_t = f(p_t, P) is the first correlation weight of the t-th frame, p_t represents the second dimensionality reduction feature vector of the t-th frame in the query image sequence, P represents the overall feature vector of the query image sequence, and u_t represents the first dimensionality reduction feature vector of the t-th frame in the query image sequence.
- the first correlation weight includes a first normalized correlation weight
- the first normalized correlation weight is obtained by performing a normalization process on the first correlation weight.
- in a possible implementation, weighting the first dimensionality reduction feature vector of each frame in the query image sequence to obtain the self-expression feature vector of the query image sequence includes: normalizing the first correlation weight of each frame in the query image sequence to obtain the first normalized correlation weight of each frame in the query image sequence; and weighting the first dimensionality reduction feature vector of each frame in the query image sequence based on the first normalized correlation weight of each frame in the query image sequence, to obtain the self-expression feature vector of the query image sequence.
- softmax may be used to normalize the first correlation weight of each frame in the query image sequence to obtain the first normalized correlation weight of each frame in the query image sequence.
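Steps S1211 to S12132 for the query image sequence can be sketched in NumPy as follows; the variable names, dimensions, and random features are illustrative assumptions, and the softmax-normalized variant of the first correlation weight is used:

```python
import numpy as np

rng = np.random.default_rng(1)
T, D = 6, 128
u = rng.standard_normal((T, D))  # first dimensionality reduction feature vectors (fc-0)
p = rng.standard_normal((T, D))  # second dimensionality reduction feature vectors (fc-1)

# Overall feature vector: average pooling of p over the time dimension.
P = p.mean(axis=0)

# First correlation weights: parameterless dot-product correlation f(p_t, P).
a = p @ P                        # shape (T,), one weight per frame

# Softmax normalization of the first correlation weights.
w = np.exp(a - a.max())
w /= w.sum()

# Self-expression feature vector: weighted sum of the first dim-reduction vectors.
x_self = w @ u                   # shape (128,)
print(x_self.shape)              # (128,)
```

The candidate image sequence's self-expression vector (steps S1221 to S1223) would follow the same computation with its own frames.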
- FIG. 6 illustrates an exemplary flowchart of step S1223 of the target matching method according to an embodiment of the present disclosure. As shown in FIG. 6, step S1223 may include steps S12231 and S12232.
- step S12231 the correlation between the second dimensionality-reduced feature vector of each frame in the candidate image sequence and the overall feature vector of the candidate image sequence is calculated through a parameterless correlation function to obtain the first correlation weight of each frame in the candidate image sequence.
- for example, the correlation between the second dimensionality-reduced feature vector of each frame in the candidate image sequence and the overall feature vector of the candidate image sequence can be calculated by the parameterless correlation function f() to obtain the first correlation weight of each frame in the candidate image sequence.
- for example, the parameterless correlation function f() can calculate this correlation by dot multiplication, i.e., as the dot product of the second dimensionality reduction feature vector of each frame and the overall feature vector of the candidate image sequence.
- the embodiment of the present disclosure is based on a self-expression mechanism, and assigns a relevant weight to each frame of the candidate image sequence through its own expression of the candidate image sequence.
- step S12232 based on the first correlation weight of each frame in the candidate image sequence, the first dimensionality reduction feature vector of each frame in the candidate image sequence is weighted to obtain the self-expression feature vector of the candidate image sequence.
- the self-expression feature vector of the candidate image sequence can be expressed as y_self = b_1 v_1 + … + b_R v_R, where b_r = f(q_r, Q) is the first correlation weight of the r-th frame, q_r represents the second dimensionality reduction feature vector of the r-th frame in the candidate image sequence, Q represents the overall feature vector of the candidate image sequence, and v_r represents the first dimensionality reduction feature vector of the r-th frame in the candidate image sequence.
- the first correlation weight includes a first normalized correlation weight
- the first normalized correlation weight is obtained by performing a normalization process on the first correlation weight.
- in a possible implementation, weighting the first dimensionality reduction feature vector of each frame in the candidate image sequence to obtain the self-expression feature vector of the candidate image sequence includes: normalizing the first correlation weight of each frame in the candidate image sequence to obtain the first normalized correlation weight of each frame in the candidate image sequence; and weighting the first dimensionality reduction feature vector of each frame in the candidate image sequence based on the first normalized correlation weight of each frame in the candidate image sequence, to obtain the self-expression feature vector of the candidate image sequence.
- softmax may be used to normalize the first correlation weight of each frame in the candidate image sequence to obtain the first normalized correlation weight of each frame in the candidate image sequence.
- FIG. 7 illustrates an exemplary flowchart of step S13 of the target matching method according to an embodiment of the present disclosure. As shown in FIG. 7, step S13 may include steps S131 and S132.
- step S131 the feature vector of each frame in the query image sequence, the first dimensionality reduction feature vector of each frame in the query image sequence, and the self-expression feature vector of the candidate image sequence are input into the third sub-neural network to obtain the cooperative expression feature vector of the query image sequence.
- the third sub-neural network may be a CAN (Collaborative Attention Subnetwork), a cooperative expression sub-neural network based on the attention mechanism.
- step S132 the feature vector of each frame in the candidate image sequence, the first dimensionality reduction feature vector of each frame in the candidate image sequence, and the self-expression feature vector of the query image sequence are input into the third sub-neural network to obtain the cooperative expression feature vector of the candidate image sequence.
- for example, the feature vector of each frame in the candidate image sequence, the first dimensionality reduction feature vector of each frame in the candidate image sequence, and the self-expression feature vector of the query image sequence can be input into the third sub-neural network to obtain the cooperative expression feature vector of the candidate image sequence.
- FIG. 8 illustrates an exemplary flowchart of step S131 of the target matching method according to an embodiment of the present disclosure. As shown in FIG. 8, step S131 may include steps S1311 and S1312.
- step S1311 the feature vector of each frame in the query image sequence is reduced by the third fully-connected layer of the third sub-neural network to obtain the third dimension-reduced feature vector of each frame in the query image sequence.
- the third dimensionality reduction feature vector of each frame in the query image sequence can be expressed as {w_1, …, w_T}, where w_t represents the third dimensionality reduction feature vector of the t-th frame in the query image sequence.
- the dimension of the third dimensionality reduction feature vector of each frame in the query image sequence is 128 dimensions.
- the third fully connected layer can be represented as fc-2.
- step S1312 the cooperative expression feature vector of the query image sequence is obtained based on the third dimensionality reduction feature vector of each frame in the query image sequence, the self-expression feature vector of the candidate image sequence, and the first dimensionality reduction feature vector of each frame in the query image sequence.
- FIG. 9 illustrates an exemplary flowchart of step S132 of the target matching method according to an embodiment of the present disclosure. As shown in FIG. 9, step S132 may include steps S1321 and S1322.
- step S1321 the feature vector of each frame in the candidate image sequence is reduced by the third fully-connected layer of the third sub-neural network to obtain the third reduced-dimensional feature vector of each frame in the candidate image sequence.
- the third dimensionality reduction feature vector of each frame in the candidate image sequence can be expressed as {z_1, …, z_R}, where z_r represents the third dimensionality reduction feature vector of the r-th frame in the candidate image sequence.
- the dimension of the third dimensionality reduction feature vector of each frame in the candidate image sequence is 128 dimensions.
- step S1322 the cooperative expression feature vector of the candidate image sequence is obtained based on the third dimensionality reduction feature vector of each frame in the candidate image sequence, the self-expression feature vector of the query image sequence, and the first dimensionality reduction feature vector of each frame in the candidate image sequence.
- FIG. 10 illustrates an exemplary flowchart of step S1312 of the target matching method according to an embodiment of the present disclosure. As shown in FIG. 10, step S1312 may include steps S13121 and S13122.
- step S13121 the correlation between the third dimensionality-reduced feature vector of each frame in the query image sequence and the self-expression feature vector of the candidate image sequence is calculated through a parameterless correlation function to obtain the second correlation weight of each frame in the query image sequence.
- for example, the second correlation weight of the t-th frame in the query image sequence can be expressed as f(w_t, y_self), where w_t represents the third dimensionality reduction feature vector of the t-th frame in the query image sequence, y_self represents the self-expression feature vector of the candidate image sequence, and f() is the parameterless correlation function.
- the embodiment of the present disclosure is based on a cooperative expression mechanism, and assigns a relevant weight to each frame of the query image sequence through the expression of the candidate image sequence and the query image sequence's own expression.
- step S13122 based on the second correlation weight of each frame in the query image sequence, the first dimensionality reduction feature vector of each frame in the query image sequence is weighted to obtain the collaborative expression feature vector of the query image sequence.
- the collaborative expression feature vector of the query image sequence can be expressed as x_co = c_1 u_1 + … + c_T u_T, where c_t = f(w_t, y_self) is the second correlation weight of the t-th frame, and u_t represents the first dimensionality reduction feature vector of the t-th frame in the query image sequence.
- the second related weight includes a second normalized related weight
- the second normalized related weight is obtained by normalizing the second related weight.
- in a possible implementation, weighting the first dimensionality reduction feature vector of each frame in the query image sequence to obtain the collaborative expression feature vector of the query image sequence includes: normalizing the second correlation weight of each frame in the query image sequence to obtain the second normalized correlation weight of each frame in the query image sequence; and weighting the first dimensionality reduction feature vector of each frame in the query image sequence based on the second normalized correlation weight of each frame in the query image sequence, to obtain the collaborative expression feature vector of the query image sequence.
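Steps S13121 and S13122 (the cooperative expression computation for the query image sequence) can be sketched analogously; the variable names and random placeholder features below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
T, D = 6, 128
u  = rng.standard_normal((T, D))  # first dim-reduction vectors of the query frames (fc-0)
w3 = rng.standard_normal((T, D))  # third dim-reduction vectors of the query frames (fc-2)
y_self = rng.standard_normal(D)   # self-expression feature vector of the candidate sequence

# Second correlation weights: dot-product correlation with the candidate's self-expression.
d = w3 @ y_self                   # shape (T,)

# Softmax normalization, then weighted sum of the first dim-reduction vectors.
c = np.exp(d - d.max())
c /= c.sum()
x_co = c @ u                      # cooperative expression feature vector, shape (128,)
print(x_co.shape)                 # (128,)
```

The candidate sequence's cooperative expression vector would use the candidate frames' vectors together with the query sequence's self-expression vector.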
- FIG. 11 illustrates an exemplary flowchart of step S1322 of the target matching method according to an embodiment of the present disclosure. As shown in FIG. 11, step S1322 may include steps S13221 and S13222.
- step S13221 the correlation between the third dimensionality-reduced feature vector of each frame in the candidate image sequence and the self-expression feature vector of the query image sequence is calculated through a parameterless correlation function to obtain the second correlation weight of each frame in the candidate image sequence.
- for example, the second correlation weight of the r-th frame in the candidate image sequence can be expressed as f(z_r, x_self), where z_r represents the third dimensionality reduction feature vector of the r-th frame in the candidate image sequence, x_self represents the self-expression feature vector of the query image sequence, and f() is the parameterless correlation function.
- the embodiment of the present disclosure is based on a cooperative expression mechanism, and assigns a relevant weight to each frame of the candidate image sequence by querying the expression of the image sequence and the own expression of the candidate image sequence.
- step S13222 based on the second correlation weight of each frame in the candidate image sequence, the first dimensionality reduction feature vector of each frame in the candidate image sequence is weighted to obtain the collaborative expression feature vector of the candidate image sequence.
- the collaborative expression feature vector of the candidate image sequence can be expressed as y_co = d_1 v_1 + … + d_R v_R, where d_r = f(z_r, x_self) is the second correlation weight of the r-th frame, and v_r represents the first dimensionality reduction feature vector of the r-th frame in the candidate image sequence.
- the second related weight includes a second normalized related weight
- the second normalized related weight is obtained by normalizing the second related weight.
- in a possible implementation, weighting the first dimensionality reduction feature vector of each frame in the candidate image sequence to obtain the collaborative expression feature vector of the candidate image sequence includes: normalizing the second correlation weight of each frame in the candidate image sequence to obtain the second normalized correlation weight of each frame in the candidate image sequence; and weighting the first dimensionality reduction feature vector of each frame in the candidate image sequence based on the second normalized correlation weight of each frame in the candidate image sequence, to obtain the collaborative expression feature vector of the candidate image sequence.
- the second sub-neural network and the third sub-neural network are based on the self-expression mechanism and the cooperative expression mechanism, and assign a relevant weight to each frame of the query image sequence and each frame of the candidate image sequence through the expression of the query image sequence and the expression of the candidate image sequence.
- the second sub-neural network and the third sub-neural network use this non-parametric self-expression and cooperative expression to implicitly align the query image sequence and the candidate image sequence, selecting more discriminative frames for expressing the two image sequences. Since the second sub-neural network and the third sub-neural network are non-parametric, the query image sequence and the candidate image sequence are allowed to have different lengths. Therefore, the target matching method provided by the embodiment of the present disclosure has high flexibility and can be widely applied.
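As a sketch of why the non-parametric attention permits different sequence lengths: the softmax-weighted pooling below (an illustrative helper, not the patent's code) produces a fixed-size vector regardless of the number of frames, so a T-frame query and an R-frame candidate yield directly comparable vectors.

```python
import numpy as np

def attention_pool(features, keys, reference):
    """Weight `features` by softmax dot-product correlation of `keys` with `reference`."""
    scores = keys @ reference               # one scalar weight per frame
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ features                     # fixed-size vector, independent of frame count

rng = np.random.default_rng(3)
D = 128
T, R = 5, 9                                 # query and candidate lengths differ
q_feat = rng.standard_normal((T, D))        # e.g. first dim-reduction vectors of query frames
q_keys = rng.standard_normal((T, D))        # e.g. second/third dim-reduction vectors
c_feat = rng.standard_normal((R, D))
c_keys = rng.standard_normal((R, D))

ref = rng.standard_normal(D)                # e.g. an overall or self-expression vector
q_vec = attention_pool(q_feat, q_keys, ref)
c_vec = attention_pool(c_feat, c_keys, ref)
print(q_vec.shape, c_vec.shape)             # both (128,)
```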
- FIG. 12 illustrates an exemplary flowchart of step S14 of the target matching method according to an embodiment of the present disclosure. As shown in FIG. 12, step S14 may include steps S141 to S143.
- step S141 the difference between the self-expression feature vector of the query image sequence and the co-expression feature vector of the candidate image sequence is calculated to obtain a first difference vector.
- for example, the first difference vector is e1 = x_self - y_co, where x_self represents the self-expression feature vector of the query image sequence and y_co represents the co-expression feature vector of the candidate image sequence.
- step S142 the difference between the self-expression feature vector of the candidate image sequence and the co-expression feature vector of the query image sequence is calculated to obtain a second difference vector.
- for example, the second difference vector is e2 = y_self - x_co, where y_self represents the self-expression feature vector of the candidate image sequence and x_co represents the co-expression feature vector of the query image sequence.
- step S143 based on the first difference vector and the second difference vector, a similarity feature vector of the query image sequence and the candidate image sequence is obtained.
- in a possible implementation, obtaining the similarity feature vector of the query image sequence and the candidate image sequence based on the first difference vector and the second difference vector includes: calculating the sum of the first difference vector and the second difference vector to obtain the similarity feature vector of the query image sequence and the candidate image sequence; for example, the similarity feature vector may be e = e1 + e2, where e1 and e2 represent the first difference vector and the second difference vector.
- in another possible implementation, obtaining the similarity feature vector of the query image sequence and the candidate image sequence based on the first difference vector and the second difference vector includes: calculating the element-wise product of the first difference vector and the second difference vector to obtain the similarity feature vector of the query image sequence and the candidate image sequence.
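Steps S141 to S143 can be sketched as follows, showing both the sum variant and the element-wise product variant of the similarity feature vector; the four input vectors are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(4)
D = 128
x_self, x_co = rng.standard_normal(D), rng.standard_normal(D)  # query: self / cooperative
y_self, y_co = rng.standard_normal(D), rng.standard_normal(D)  # candidate: self / cooperative

e1 = x_self - y_co    # first difference vector (step S141)
e2 = y_self - x_co    # second difference vector (step S142)

sim_sum  = e1 + e2    # similarity feature vector: sum variant
sim_prod = e1 * e2    # similarity feature vector: element-wise product variant
print(sim_sum.shape, sim_prod.shape)
```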
- FIG. 13 illustrates an exemplary flowchart of step S15 of the target matching method according to an embodiment of the present disclosure. As shown in FIG. 13, step S15 may include steps S151 and S152.
- step S151 the similarity feature vector of the query image sequence and the candidate image sequence is input to the fourth fully connected layer to obtain a matching score between the query image sequence and the candidate image sequence.
- the fourth fully connected layer can be represented as fc-3.
- step S152 based on the matching score of the query image sequence and the candidate image sequence, a matching result of the query image sequence and the candidate image sequence is determined.
- for example, if the matching score between the query image sequence and the candidate image sequence is greater than the score threshold, it can be determined that the matching result is that the query image sequence matches the candidate image sequence; if the matching score is less than or equal to the score threshold, it can be determined that the matching result is that the query image sequence does not match the candidate image sequence.
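An illustrative sketch of steps S151 and S152, assuming fc-3 reduces the similarity feature vector to a single score passed through a sigmoid (the patent does not specify the activation) and an illustrative threshold of 0.5:

```python
import numpy as np

rng = np.random.default_rng(5)
D = 128
sim = rng.standard_normal(D)                     # similarity feature vector

# fc-3 sketched as a 128 -> 1 linear layer followed by a sigmoid.
W3, b3 = rng.standard_normal(D) * 0.01, 0.0
score = 1.0 / (1.0 + np.exp(-(sim @ W3 + b3)))   # matching score in (0, 1)

SCORE_THRESHOLD = 0.5                            # illustrative score threshold
matched = score > SCORE_THRESHOLD                # matching result
print(float(score), bool(matched))
```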
- in a possible implementation, the method further includes: optimizing the network parameters based on the matching scores of query image sequence and candidate image sequence pairs, using same-identity pair labels and a binary cross-entropy loss function.
- for example, N represents the number of query image sequence and candidate image sequence pairs in the training set.
- in a possible implementation, the training image sequences can be segmented to generate rich query image sequence and candidate image sequence pairs, thereby effectively improving optimization efficiency and the robustness of the network model, and thus the matching accuracy.
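A minimal sketch of the binary cross-entropy objective over labeled pairs; the scores and labels below are illustrative placeholders (label 1: same identity, label 0: different identities), not data from the training procedure described above:

```python
import numpy as np

# N = 4 illustrative (query, candidate) pairs: predicted matching scores and pair labels.
scores = np.array([0.9, 0.2, 0.7, 0.1])
labels = np.array([1.0, 0.0, 1.0, 0.0])

# Binary cross-entropy averaged over the N pairs (eps guards against log(0)).
eps = 1e-12
bce = -np.mean(labels * np.log(scores + eps)
               + (1 - labels) * np.log(1 - scores + eps))
print(round(float(bce), 4))
```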
- FIG. 14 illustrates an exemplary flowchart of a target matching method according to an embodiment of the present disclosure. As shown in FIG. 14, the method may include steps S21 to S28.
- In step S21, the query video is divided into a plurality of query image sequences.
- In some embodiments, segmenting the query video into multiple query image sequences includes: segmenting the query video into multiple query image sequences according to a preset sequence length and a preset step size, where the length of each query image sequence is equal to the preset sequence length, and the number of overlapping images between adjacent query image sequences is equal to the difference between the preset sequence length and the preset step size.
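- The sliding-window segmentation described above (and applied identically to candidate videos in step S22) can be sketched as:

```python
def split_video(frames, seq_len, step):
    """Split a list of frames into overlapping sequences.

    Each sequence has length seq_len; adjacent sequences share
    (seq_len - step) frames, matching the description above.
    """
    return [frames[i:i + seq_len]
            for i in range(0, len(frames) - seq_len + 1, step)]

# Example: a 10-frame video, sequence length 4, step size 2
# => adjacent sequences overlap by 4 - 2 = 2 frames.
seqs = split_video(list(range(10)), seq_len=4, step=2)
```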
- In step S22, the candidate video is segmented into a plurality of candidate image sequences.
- In some embodiments, segmenting the candidate video into multiple candidate image sequences includes: segmenting the candidate video into multiple candidate image sequences according to a preset sequence length and a preset step size, where the length of each candidate image sequence is equal to the preset sequence length, and the number of overlapping images between adjacent candidate image sequences is equal to the difference between the preset sequence length and the preset step size.
- In step S23, the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence are extracted separately, where the query image sequence includes the target to be matched.
- For step S23, refer to the description of step S11 above.
- In step S24, the self-expression feature vector of the query image sequence and the self-expression feature vector of the candidate image sequence are determined based on the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence, respectively.
- For step S24, refer to the description of step S12 above.
- In step S25, a collaborative expression feature vector of the query image sequence is determined based on the feature vector of each frame in the query image sequence and the self-expression feature vector of the candidate image sequence, and a collaborative expression feature vector of the candidate image sequence is determined based on the feature vector of each frame in the candidate image sequence and the self-expression feature vector of the query image sequence.
- For step S25, refer to the description of step S13 above.
- In step S26, a similarity feature vector of the query image sequence and the candidate image sequence is determined based on the self-expression feature vector of the query image sequence, the collaborative expression feature vector of the query image sequence, the self-expression feature vector of the candidate image sequence, and the collaborative expression feature vector of the candidate image sequence.
- For step S26, refer to the description of step S14 above.
- In step S27, a matching result of the query image sequence and the candidate image sequence is determined based on the similarity feature vector.
- For step S27, refer to the description of step S15 above.
- In step S28, a matching result of the query video and the candidate video is determined based on the matching results between the query image sequences of the query video and the candidate image sequences of the candidate video.
- FIG. 15 illustrates an exemplary flowchart of step S28 of the target matching method according to an embodiment of the present disclosure.
- step S28 may include steps S281 to S283.
- In step S281, a matching score of each query image sequence of the query video and each candidate image sequence of the candidate video is determined.
- In step S282, the average of the highest N matching scores among the matching scores of each query image sequence of the query video and each candidate image sequence of the candidate video is calculated to obtain the matching score of the query video and the candidate video, where N is a positive integer.
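- The top-N averaging of sequence-pair scores into a video-level score can be sketched as:

```python
def video_match_score(pair_scores, n):
    """Average the N highest matching scores among all
    query-sequence / candidate-sequence pairs of two videos."""
    top = sorted(pair_scores, reverse=True)[:n]
    return sum(top) / len(top)

# Example: four sequence-pair scores, averaging the top 2.
score = video_match_score([0.1, 0.9, 0.5, 0.7], n=2)
```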
- In step S283, a matching result of the query video and the candidate video is determined based on the matching score of the query video and the candidate video.
- If the matching score between the query video and the candidate video is greater than the score threshold, it can be determined that the query video matches the candidate video; if the matching score is less than or equal to the score threshold, it can be determined that the query video does not match the candidate video.
- The target matching method provided by the embodiments of the present disclosure can filter out the more discriminative key frames in an image sequence and use multiple key frames to express the image sequence, thereby improving discrimination ability.
- The embodiments of the present disclosure propose a more effective time-domain modeling method, which captures the dynamic change information of consecutive frames and improves the expressive ability of the model.
- The embodiments of the present disclosure also propose a more effective distance measurement method, which reduces the distance between feature expressions of the same person and increases the distance between feature expressions of different persons.
- the target matching method provided by the embodiment of the present disclosure can still obtain more accurate target matching results under the conditions of poor lighting conditions, severe occlusion, poor viewing angle, or severe background interference.
- the embodiments of the present disclosure can help improve the effects of human detection and/or pedestrian tracking. Utilizing the embodiments of the present disclosure, it is possible to perform better cross-camera search and tracking of specific pedestrians (such as criminal suspects or missing children) in intelligent video surveillance.
- The present disclosure also provides a target matching device, an electronic device, a computer-readable storage medium, and a program, all of which can be used to implement any of the target matching methods provided by the present disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding records in the method section; details are not repeated here.
- FIG. 16 illustrates a block diagram of a target matching device according to an embodiment of the present disclosure.
- the device includes: an extraction module 31 for extracting a feature vector of each frame in a query image sequence and a feature vector of each frame in a candidate image sequence, wherein the query image sequence includes a target to be matched;
- the first determining module 32 is configured to determine the self-expression feature vector of the query image sequence and the self-expression feature vector of the candidate image sequence based on the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence, respectively.
- a second determining module 33 configured to determine a cooperative expression feature vector of the query image sequence based on the feature vector of each frame in the query image sequence and a self-expression feature vector of the candidate image sequence, and based on the The feature vector and the self-expressing feature vector of the query image sequence determine the cooperative expression feature vector of the candidate image sequence;
- the third determination module 34 is used to determine a similarity feature vector between the query image sequence and the candidate image sequence based on the self-expression feature vector of the query image sequence, the collaborative expression feature vector of the query image sequence, the self-expression feature vector of the candidate image sequence, and the collaborative expression feature vector of the candidate image sequence;
- a fourth determination module 35 is configured to determine a matching result between the query image sequence and the candidate image sequence based on the similarity feature vector.
- the extraction module 31 is configured to extract the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence through the first sub-neural network.
- FIG. 17 illustrates an exemplary block diagram of a target matching device according to an embodiment of the present disclosure. As shown in Figure 17:
- the apparatus further includes: a dimensionality reduction module 36, configured to perform dimension reduction processing on the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence through the first fully connected layer of the first sub-neural network, to obtain the first dimension-reduced feature vector of each frame in the query image sequence and the first dimension-reduced feature vector of each frame in the candidate image sequence.
- the first determining module 32 includes: a first determining submodule 321, configured to input the feature vector of each frame in the query image sequence and the first dimension-reduced feature vector of each frame in the query image sequence into the second sub-neural network to determine the self-expression feature vector of the query image sequence; and a second determining submodule 322, configured to input the feature vector of each frame in the candidate image sequence and the first dimension-reduced feature vector of each frame in the candidate image sequence into the second sub-neural network to determine the self-expression feature vector of the candidate image sequence.
- the first determining submodule 321 includes: a first dimension reduction unit, configured to perform dimension reduction processing on the feature vector of each frame in the query image sequence through the second fully connected layer of the second sub-neural network, to obtain the second dimension-reduced feature vector of each frame in the query image sequence; a first average pooling unit, configured to perform average pooling processing on the second dimension-reduced feature vectors of the frames in the query image sequence over the time dimension, to obtain the overall feature vector of the query image sequence; and a first determining unit, configured to determine the self-expression feature vector of the query image sequence based on the second dimension-reduced feature vector of each frame in the query image sequence, the overall feature vector of the query image sequence, and the first dimension-reduced feature vector of each frame in the query image sequence.
- the second determining submodule 322 includes: a second dimension reduction unit, configured to perform dimension reduction processing on the feature vector of each frame in the candidate image sequence through the second fully connected layer of the second sub-neural network, to obtain the second dimension-reduced feature vector of each frame in the candidate image sequence; a second average pooling unit, configured to perform average pooling processing on the second dimension-reduced feature vectors of the frames in the candidate image sequence over the time dimension, to obtain the overall feature vector of the candidate image sequence; and a second determining unit, configured to determine the self-expression feature vector of the candidate image sequence based on the second dimension-reduced feature vector of each frame in the candidate image sequence, the overall feature vector of the candidate image sequence, and the first dimension-reduced feature vector of each frame in the candidate image sequence.
- the first determining unit includes: a first calculation subunit, configured to calculate, through a parameterless correlation function, the correlation between the second dimension-reduced feature vector of each frame in the query image sequence and the overall feature vector of the query image sequence, to obtain the first correlation weight of each frame in the query image sequence; and a first weighting subunit, configured to weight the first dimension-reduced feature vector of each frame in the query image sequence based on the first correlation weight of each frame, to obtain the self-expression feature vector of the query image sequence.
- the second determining unit includes: a second calculation subunit, configured to calculate, through a parameterless correlation function, the correlation between the second dimension-reduced feature vector of each frame in the candidate image sequence and the overall feature vector of the candidate image sequence, to obtain the first correlation weight of each frame in the candidate image sequence; and a second weighting subunit, configured to weight the first dimension-reduced feature vector of each frame in the candidate image sequence based on the first correlation weight of each frame, to obtain the self-expression feature vector of the candidate image sequence.
- In some embodiments, the first correlation weight includes a first normalized correlation weight, where the first normalized correlation weight is obtained by performing normalization processing on the first correlation weight.
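- The self-expression computation described above (temporal average pooling, parameterless correlation against the overall vector, normalized weighting of the first dimension-reduced features) can be sketched as follows; the dot-product correlation and softmax normalization are assumed concrete choices for the parameterless correlation function and normalization, not details fixed by the disclosure:

```python
import numpy as np

def self_expression(feats_reduced2, feats_reduced1):
    """Self-expression feature vector of one image sequence.

    feats_reduced2: (T, d2) second dimension-reduced per-frame features
    feats_reduced1: (T, d1) first dimension-reduced per-frame features
    """
    overall = feats_reduced2.mean(axis=0)      # average pooling over time
    corr = feats_reduced2 @ overall            # first correlation weights
    w = np.exp(corr - corr.max())
    w /= w.sum()                               # first normalized correlation weights
    return w @ feats_reduced1                  # weighted sum over frames

# Example: two identical frames => uniform weights, output equals the frame.
x2 = np.array([[1.0, 2.0], [1.0, 2.0]])
x1 = np.array([[3.0, 4.0, 5.0], [3.0, 4.0, 5.0]])
out = self_expression(x2, x1)
```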
- the second determination module 33 includes: a third determination submodule 331, configured to input the feature vector of each frame in the query image sequence, the first dimension-reduced feature vector of each frame in the query image sequence, and the self-expression feature vector of the candidate image sequence into the third sub-neural network to obtain the collaborative expression feature vector of the query image sequence;
- the fourth determining submodule 332 is configured to input the feature vector of each frame in the candidate image sequence, the first dimension-reduced feature vector of each frame in the candidate image sequence, and the self-expression feature vector of the query image sequence into the third sub-neural network to obtain the collaborative expression feature vector of the candidate image sequence.
- the third determining submodule 331 includes: a third dimension reduction unit, configured to perform dimension reduction processing on the feature vector of each frame in the query image sequence through the third fully connected layer of the third sub-neural network, to obtain the third dimension-reduced feature vector of each frame in the query image sequence; and a third determining unit, configured to obtain the collaborative expression feature vector of the query image sequence based on the third dimension-reduced feature vector of each frame in the query image sequence, the self-expression feature vector of the candidate image sequence, and the first dimension-reduced feature vector of each frame in the query image sequence. The fourth determining submodule 332 includes: a fourth dimension reduction unit, configured to perform dimension reduction processing on the feature vector of each frame in the candidate image sequence through the third fully connected layer, to obtain the third dimension-reduced feature vector of each frame in the candidate image sequence; and a fourth determining unit, configured to obtain the collaborative expression feature vector of the candidate image sequence based on the third dimension-reduced feature vector of each frame in the candidate image sequence, the self-expression feature vector of the query image sequence, and the first dimension-reduced feature vector of each frame in the candidate image sequence.
- the third determining unit includes: a third calculation subunit, configured to calculate, through a parameterless correlation function, the correlation between the third dimension-reduced feature vector of each frame in the query image sequence and the self-expression feature vector of the candidate image sequence, to obtain the second correlation weight of each frame in the query image sequence; and a third weighting subunit, configured to weight the first dimension-reduced feature vector of each frame in the query image sequence based on the second correlation weight of each frame, to obtain the collaborative expression feature vector of the query image sequence.
- the fourth determining unit includes: a fourth calculation subunit, configured to calculate, through a parameterless correlation function, the correlation between the third dimension-reduced feature vector of each frame in the candidate image sequence and the self-expression feature vector of the query image sequence, to obtain the second correlation weight of each frame in the candidate image sequence; and a fourth weighting subunit, configured to weight the first dimension-reduced feature vector of each frame in the candidate image sequence based on the second correlation weight of each frame, to obtain the collaborative expression feature vector of the candidate image sequence.
- In some embodiments, the second correlation weight includes a second normalized correlation weight, where the second normalized correlation weight is obtained by normalizing the second correlation weight.
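- The collaborative-expression computation differs from self-expression only in what the frames are correlated against: the OTHER sequence's self-expression vector. A sketch, again assuming a dot-product correlation and softmax normalization as the concrete parameterless choices:

```python
import numpy as np

def co_expression(feats_reduced3, feats_reduced1, other_self_expr):
    """Collaborative expression feature vector of one sequence.

    feats_reduced3: (T, d) third dimension-reduced per-frame features
    feats_reduced1: (T, d1) first dimension-reduced per-frame features
    other_self_expr: (d,) self-expression vector of the other sequence
    """
    corr = feats_reduced3 @ other_self_expr   # second correlation weights
    w = np.exp(corr - corr.max())
    w /= w.sum()                              # second normalized correlation weights
    return w @ feats_reduced1                 # weighted sum over frames

# Example: the frame aligned with the other sequence dominates the output.
x3 = np.array([[1.0, 0.0], [0.0, 1.0]])
x1 = np.array([[1.0, 0.0], [0.0, 1.0]])
out = co_expression(x3, x1, np.array([10.0, 0.0]))
```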
- the third determination module 34 includes: a first calculation submodule 341, configured to calculate the difference between the self-expression feature vector of the query image sequence and the collaborative expression feature vector of the candidate image sequence to obtain a first difference vector; a second calculation submodule 342, configured to calculate the difference between the self-expression feature vector of the candidate image sequence and the collaborative expression feature vector of the query image sequence to obtain a second difference vector; and a fifth determination submodule 343, configured to obtain the similarity feature vector of the query image sequence and the candidate image sequence based on the first difference vector and the second difference vector.
- the fifth determination submodule 343 includes: a first calculation unit, configured to calculate the sum of the first difference vector and the second difference vector to obtain the similarity feature vector of the query image sequence and the candidate image sequence; or, a second calculation unit, configured to calculate the element-wise product of the first difference vector and the second difference vector to obtain the similarity feature vector of the query image sequence and the candidate image sequence.
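- Both combination options for the similarity feature vector can be sketched together; the argument names are illustrative:

```python
import numpy as np

def similarity_vector(q_self, q_co, c_self, c_co, mode="sum"):
    """Similarity feature vector from self/collaborative expressions.

    q_self, q_co: self- and collaborative-expression vectors of the query
    c_self, c_co: self- and collaborative-expression vectors of the candidate
    mode: "sum" adds the two difference vectors; "prod" multiplies
          them element-wise.
    """
    d1 = q_self - c_co   # first difference vector
    d2 = c_self - q_co   # second difference vector
    return d1 + d2 if mode == "sum" else d1 * d2

q_self = np.array([1.0, 2.0]); q_co = np.array([1.0, 0.0])
c_self = np.array([2.0, 2.0]); c_co = np.array([0.0, 1.0])
s_sum = similarity_vector(q_self, q_co, c_self, c_co, "sum")
s_prod = similarity_vector(q_self, q_co, c_self, c_co, "prod")
```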
- the fourth determination module 35 includes: a sixth determination submodule 351, configured to input the similarity feature vector of the query image sequence and the candidate image sequence into the fourth fully connected layer to obtain a matching score of the query image sequence and the candidate image sequence; and a seventh determination submodule 352, configured to determine a matching result of the query image sequence and the candidate image sequence based on the matching score of the query image sequence and the candidate image sequence.
- the device further includes: an optimization module 37, configured to optimize the network parameters based on the matching score of the query image sequence and the candidate image sequence, using same-pair labeled data and a binary cross-entropy loss function.
- the device further includes: a first segmentation module 38, configured to segment the query video into multiple query image sequences; a second segmentation module 39, configured to segment the candidate video into a plurality of candidate image sequences; and a fifth determination module 30, configured to determine a matching result between the query video and the candidate video based on the matching results between the query image sequences of the query video and the candidate image sequences of the candidate video.
- the first segmentation module 38 is configured to segment the query video into multiple query image sequences according to a preset sequence length and a preset step size, where the length of each query image sequence is equal to the preset sequence length, and the number of overlapping images between adjacent query image sequences is equal to the difference between the preset sequence length and the preset step size;
- the second segmentation module 39 is configured to segment the candidate video into multiple candidate image sequences according to the preset sequence length and the preset step size, where the length of each candidate image sequence is equal to the preset sequence length, and the number of overlapping images between adjacent candidate image sequences is equal to the difference between the preset sequence length and the preset step size.
- the fifth determination module 30 includes: an eighth determination submodule 301, configured to determine a matching score between each query image sequence of the query video and each candidate image sequence of the candidate video; a third calculation submodule 302, configured to calculate the average of the highest N matching scores among the matching scores of each query image sequence of the query video and each candidate image sequence of the candidate video, to obtain the matching score of the query video and the candidate video, where N is a positive integer;
- and a ninth determination submodule 303, configured to determine a matching result between the query video and the candidate video based on the matching score of the query video and the candidate video.
- The embodiments of the present disclosure determine a similarity feature vector of a query image sequence and a candidate image sequence based on the self-expression feature vector of the query image sequence, the collaborative expression feature vector of the query image sequence, the self-expression feature vector of the candidate image sequence, and the collaborative expression feature vector of the candidate image sequence, and determine the matching result between the query image sequence and the candidate image sequence based on the similarity feature vector, thereby improving the accuracy of target matching.
- An embodiment of the present disclosure also provides a computer-readable storage medium having computer program instructions stored thereon, and the computer program instructions implement the above method when executed by a processor.
- the computer-readable storage medium may be a non-volatile computer-readable storage medium.
- An embodiment of the present disclosure further provides an electronic device, including: a processor; and a memory for storing processor-executable instructions, wherein the processor is configured to perform the foregoing method.
- the electronic device may be provided as a terminal, a server, or other forms of devices.
- Fig. 18 is a block diagram showing an electronic device 800 according to an exemplary embodiment.
- the electronic device 800 may be a mobile phone, a computer, a digital broadcasting terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and other terminals.
- the electronic device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
- the processing component 802 generally controls overall operations of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
- the processing component 802 may include one or more processors 820 to execute instructions to complete all or part of the steps of the method described above.
- the processing component 802 may include one or more modules to facilitate the interaction between the processing component 802 and other components.
- the processing component 802 may include a multimedia module to facilitate the interaction between the multimedia component 808 and the processing component 802.
- the memory 804 is configured to store various types of data to support operation at the electronic device 800. Examples of these data include instructions for any application or method for operating on the electronic device 800, contact data, phone book data, messages, pictures, videos, and the like.
- the memory 804 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
- the power component 806 provides power to various components of the electronic device 800.
- the power component 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the electronic device 800.
- the multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and a user.
- the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user.
- the touch panel includes one or more touch sensors to sense touch, swipe, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure related to the touch or slide operation.
- the multimedia component 808 includes a front camera and / or a rear camera. When the electronic device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and / or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
- the audio component 810 is configured to output and / or input audio signals.
- the audio component 810 includes a microphone (MIC).
- the microphone is configured to receive an external audio signal.
- the received audio signal may be further stored in the memory 804 or transmitted via the communication component 816.
- the audio component 810 further includes a speaker for outputting audio signals.
- the I / O interface 812 provides an interface between the processing component 802 and a peripheral interface module.
- the peripheral interface module may be a keyboard, a click wheel, a button, or the like. These buttons can include, but are not limited to: a home button, a volume button, a start button, and a lock button.
- the sensor component 814 includes one or more sensors for providing various aspects of the state evaluation of the electronic device 800.
- the sensor component 814 can detect the on/off state of the electronic device 800 and the relative positioning of components, for example, the display and keypad of the electronic device 800. The sensor component 814 can also detect a change in the position of the electronic device 800 or a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and a change in the temperature of the electronic device 800.
- the sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
- the sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
- the sensor component 814 may further include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
- the communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices.
- the electronic device 800 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof.
- the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel.
- the communication component 816 further includes a near field communication (NFC) module to facilitate short-range communication.
- the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
- the electronic device 800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, to perform the above method.
- In an exemplary embodiment, a non-volatile computer-readable storage medium is also provided, such as the memory 804 including computer program instructions, which may be executed by the processor 820 of the electronic device 800 to complete the foregoing method.
- Fig. 19 is a block diagram of an electronic device 1900 according to an exemplary embodiment.
- the electronic device 1900 may be provided as a server.
- the electronic device 1900 includes a processing component 1922, which further includes one or more processors, and a memory resource represented by a memory 1932, for storing instructions executable by the processing component 1922, such as an application program.
- the application program stored in the memory 1932 may include one or more modules each corresponding to a set of instructions.
- the processing component 1922 is configured to execute instructions to perform the method described above.
- the electronic device 1900 may further include a power supply component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958.
- the electronic device 1900 can operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
- In an exemplary embodiment, a non-volatile computer-readable storage medium is also provided, such as the memory 1932 including computer program instructions, which may be executed by the processing component 1922 of the electronic device 1900 to complete the above method.
- the present disclosure may be a system, a method, and/or a computer program product.
- the computer program product may include a computer-readable storage medium having computer-readable program instructions for causing a processor to implement various aspects of the present disclosure.
- the computer-readable storage medium may be a tangible device that can hold and store instructions used by the instruction execution device.
- the computer-readable storage medium may be, for example, but not limited to, an electric storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
- A non-exhaustive list of computer-readable storage media includes: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile disc (DVD), a memory stick, a floppy disk, a mechanical encoding device such as a punched card or a raised structure in a groove having instructions stored thereon, and any suitable combination of the above.
- Computer-readable storage media as used herein are not to be interpreted as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (for example, light pulses through fiber optic cables), or electrical signals transmitted through electrical wires.
- the computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to various computing / processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and / or a wireless network.
- the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and / or edge servers.
- the network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
- Computer program instructions for carrying out operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages.
- The programming languages include object-oriented programming languages, such as Smalltalk and C++, as well as conventional procedural programming languages, such as the "C" language or similar programming languages.
- Computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
- The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or the connection can be made to an external computer (for example, through the Internet using an Internet service provider).
- electronic circuits, such as programmable logic circuits, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), may be personalized by using state information of the computer-readable program instructions.
- These electronic circuits may execute the computer-readable program instructions to implement various aspects of the present disclosure.
- These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
- These computer-readable program instructions may also be stored in a computer-readable storage medium, and these instructions cause a computer, a programmable data processing apparatus, and / or other devices to work in a specific manner.
- Thus, a computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing various aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
- Computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device, so that a series of operational steps are performed on the computer, other programmable apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable apparatus, or other device implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
- each block in the flowcharts or block diagrams may represent a module, a program segment, or a part of an instruction, which contains one or more executable instructions for implementing the specified logical functions.
- The functions noted in the blocks may also occur out of the order noted in the drawings. For example, two consecutive blocks may, in fact, be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
- each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a special-purpose hardware-based system that performs the specified functions or actions, or by a combination of special-purpose hardware and computer instructions.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Medical Informatics (AREA)
- Library & Information Science (AREA)
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims (44)
- A target matching method, comprising: extracting a feature vector of each frame in a query image sequence and a feature vector of each frame in a candidate image sequence, respectively, wherein the query image sequence contains a target to be matched; determining a self-expression feature vector of the query image sequence and a self-expression feature vector of the candidate image sequence based on the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence, respectively; determining a collaborative-expression feature vector of the query image sequence based on the feature vector of each frame in the query image sequence and the self-expression feature vector of the candidate image sequence, and determining a collaborative-expression feature vector of the candidate image sequence based on the feature vector of each frame in the candidate image sequence and the self-expression feature vector of the query image sequence; determining a similarity feature vector of the query image sequence and the candidate image sequence based on the self-expression feature vector of the query image sequence, the collaborative-expression feature vector of the query image sequence, the self-expression feature vector of the candidate image sequence, and the collaborative-expression feature vector of the candidate image sequence; and determining a matching result of the query image sequence and the candidate image sequence based on the similarity feature vector.
- The method according to claim 1, wherein extracting the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence respectively comprises: extracting the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence through a first sub-neural network.
- The method according to claim 1 or 2, wherein after extracting the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence, the method further comprises: performing dimension reduction on the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence through a first fully connected layer of the first sub-neural network, to obtain a first dimension-reduced feature vector of each frame in the query image sequence and a first dimension-reduced feature vector of each frame in the candidate image sequence.
- The method according to claim 3, wherein determining the self-expression feature vector of the query image sequence and the self-expression feature vector of the candidate image sequence based on the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence respectively comprises: inputting the feature vector of each frame in the query image sequence and the first dimension-reduced feature vector of each frame in the query image sequence into a second sub-neural network to determine the self-expression feature vector of the query image sequence; and inputting the feature vector of each frame in the candidate image sequence and the first dimension-reduced feature vector of each frame in the candidate image sequence into the second sub-neural network to determine the self-expression feature vector of the candidate image sequence.
- The method according to claim 4, wherein inputting the feature vector of each frame in the query image sequence and the first dimension-reduced feature vector of each frame in the query image sequence into the second sub-neural network to determine the self-expression feature vector of the query image sequence comprises: performing dimension reduction on the feature vector of each frame in the query image sequence through a second fully connected layer of the second sub-neural network, to obtain a second dimension-reduced feature vector of each frame in the query image sequence; performing average pooling in the temporal dimension on the second dimension-reduced feature vectors of the frames in the query image sequence, to obtain an overall feature vector of the query image sequence; and determining the self-expression feature vector of the query image sequence based on the second dimension-reduced feature vector of each frame in the query image sequence, the overall feature vector of the query image sequence, and the first dimension-reduced feature vector of each frame in the query image sequence.
- The method according to claim 4, wherein inputting the feature vector of each frame in the candidate image sequence and the first dimension-reduced feature vector of each frame in the candidate image sequence into the second sub-neural network to obtain the self-expression feature vector of the candidate image sequence comprises: performing dimension reduction on the feature vector of each frame in the candidate image sequence through the second fully connected layer of the second sub-neural network, to obtain a second dimension-reduced feature vector of each frame in the candidate image sequence; performing average pooling in the temporal dimension on the second dimension-reduced feature vectors of the frames in the candidate image sequence, to obtain an overall feature vector of the candidate image sequence; and determining the self-expression feature vector of the candidate image sequence based on the second dimension-reduced feature vector of each frame in the candidate image sequence, the overall feature vector of the candidate image sequence, and the first dimension-reduced feature vector of each frame in the candidate image sequence.
- The method according to claim 5, wherein determining the self-expression feature vector of the query image sequence based on the second dimension-reduced feature vector of each frame in the query image sequence, the overall feature vector of the query image sequence, and the first dimension-reduced feature vector of each frame in the query image sequence comprises: computing, through a parameter-free correlation function, the correlation between the second dimension-reduced feature vector of each frame in the query image sequence and the overall feature vector of the query image sequence, to obtain a first correlation weight of each frame in the query image sequence; and weighting the first dimension-reduced feature vector of each frame in the query image sequence based on the first correlation weight of each frame in the query image sequence, to obtain the self-expression feature vector of the query image sequence.
- The method according to claim 6, wherein determining the self-expression feature vector of the candidate image sequence based on the second dimension-reduced feature vector of each frame in the candidate image sequence, the overall feature vector of the candidate image sequence, and the first dimension-reduced feature vector of each frame in the candidate image sequence comprises: computing, through a parameter-free correlation function, the correlation between the second dimension-reduced feature vector of each frame in the candidate image sequence and the overall feature vector of the candidate image sequence, to obtain a first correlation weight of each frame in the candidate image sequence; and weighting the first dimension-reduced feature vector of each frame in the candidate image sequence based on the first correlation weight of each frame in the candidate image sequence, to obtain the self-expression feature vector of the candidate image sequence.
- The method according to claim 7 or 8, wherein the first correlation weight comprises a first normalized correlation weight, the first normalized correlation weight being obtained by normalizing the first correlation weight.
- The method according to any one of claims 3 to 9, wherein determining the collaborative-expression feature vector of the query image sequence based on the feature vector of each frame in the query image sequence and the self-expression feature vector of the candidate image sequence, and determining the collaborative-expression feature vector of the candidate image sequence based on the feature vector of each frame in the candidate image sequence and the self-expression feature vector of the query image sequence, comprises: inputting the feature vector of each frame in the query image sequence, the first dimension-reduced feature vector of each frame in the query image sequence, and the self-expression feature vector of the candidate image sequence into a third sub-neural network, to obtain the collaborative-expression feature vector of the query image sequence; and inputting the feature vector of each frame in the candidate image sequence, the first dimension-reduced feature vector of each frame in the candidate image sequence, and the self-expression feature vector of the query image sequence into the third sub-neural network, to obtain the collaborative-expression feature vector of the candidate image sequence.
- The method according to claim 10, wherein inputting the feature vector of each frame in the query image sequence, the first dimension-reduced feature vector of each frame in the query image sequence, and the self-expression feature vector of the candidate image sequence into the third sub-neural network to obtain the collaborative-expression feature vector of the query image sequence comprises: performing dimension reduction on the feature vector of each frame in the query image sequence through a third fully connected layer of the third sub-neural network, to obtain a third dimension-reduced feature vector of each frame in the query image sequence; and obtaining the collaborative-expression feature vector of the query image sequence based on the third dimension-reduced feature vector of each frame in the query image sequence, the self-expression feature vector of the candidate image sequence, and the first dimension-reduced feature vector of each frame in the query image sequence; and wherein inputting the feature vector of each frame in the candidate image sequence, the first dimension-reduced feature vector of each frame in the candidate image sequence, and the self-expression feature vector of the query image sequence into the third sub-neural network to obtain the collaborative-expression feature vector of the candidate image sequence comprises: performing dimension reduction on the feature vector of each frame in the candidate image sequence through the third fully connected layer of the third sub-neural network, to obtain a third dimension-reduced feature vector of each frame in the candidate image sequence; and obtaining the collaborative-expression feature vector of the candidate image sequence based on the third dimension-reduced feature vector of each frame in the candidate image sequence, the self-expression feature vector of the query image sequence, and the first dimension-reduced feature vector of each frame in the candidate image sequence.
- The method according to claim 11, wherein obtaining the collaborative-expression feature vector of the query image sequence based on the third dimension-reduced feature vector of each frame in the query image sequence, the self-expression feature vector of the candidate image sequence, and the first dimension-reduced feature vector of each frame in the query image sequence comprises: computing, through a parameter-free correlation function, the correlation between the third dimension-reduced feature vector of each frame in the query image sequence and the self-expression feature vector of the candidate image sequence, to obtain a second correlation weight of each frame in the query image sequence; and weighting the first dimension-reduced feature vector of each frame in the query image sequence based on the second correlation weight of each frame in the query image sequence, to obtain the collaborative-expression feature vector of the query image sequence.
- The method according to claim 11, wherein obtaining the collaborative-expression feature vector of the candidate image sequence based on the third dimension-reduced feature vector of each frame in the candidate image sequence, the self-expression feature vector of the query image sequence, and the first dimension-reduced feature vector of each frame in the candidate image sequence comprises: computing, through a parameter-free correlation function, the correlation between the third dimension-reduced feature vector of each frame in the candidate image sequence and the self-expression feature vector of the query image sequence, to obtain a second correlation weight of each frame in the candidate image sequence; and weighting the first dimension-reduced feature vector of each frame in the candidate image sequence based on the second correlation weight of each frame in the candidate image sequence, to obtain the collaborative-expression feature vector of the candidate image sequence.
- The method according to claim 12 or 13, wherein the second correlation weight comprises a second normalized correlation weight, the second normalized correlation weight being obtained by normalizing the second correlation weight.
- The method according to any one of claims 1 to 14, wherein obtaining the similarity feature vector of the query image sequence and the candidate image sequence based on the self-expression feature vector of the query image sequence, the collaborative-expression feature vector of the query image sequence, the self-expression feature vector of the candidate image sequence, and the collaborative-expression feature vector of the candidate image sequence comprises: computing the difference between the self-expression feature vector of the query image sequence and the collaborative-expression feature vector of the candidate image sequence, to obtain a first difference vector; computing the difference between the self-expression feature vector of the candidate image sequence and the collaborative-expression feature vector of the query image sequence, to obtain a second difference vector; and obtaining the similarity feature vector of the query image sequence and the candidate image sequence based on the first difference vector and the second difference vector.
- The method according to claim 15, wherein obtaining the similarity feature vector of the query image sequence and the candidate image sequence based on the first difference vector and the second difference vector comprises: computing the sum of the first difference vector and the second difference vector, to obtain the similarity feature vector of the query image sequence and the candidate image sequence; or computing the element-wise product of corresponding elements of the first difference vector and the second difference vector, to obtain the similarity feature vector of the query image sequence and the candidate image sequence.
- The method according to any one of claims 1 to 16, wherein determining the matching result of the query image sequence and the candidate image sequence based on the similarity feature vector comprises: inputting the similarity feature vector of the query image sequence and the candidate image sequence into a fourth fully connected layer, to obtain a matching score of the query image sequence and the candidate image sequence; and determining the matching result of the query image sequence and the candidate image sequence based on the matching score of the query image sequence and the candidate image sequence.
- The method according to claim 17, wherein after obtaining the matching score of the query image sequence and the candidate image sequence, the method further comprises: optimizing network parameters based on the matching score of the query image sequence and the candidate image sequence, using same-pair annotation data and a binary cross-entropy loss function.
- The method according to any one of claims 1 to 18, wherein before extracting the feature vector of each frame in the query image sequence, the method further comprises: splitting a query video into a plurality of query image sequences; and splitting a candidate video into a plurality of candidate image sequences; and after determining the matching result of the query image sequence and the candidate image sequence, the method further comprises: determining a matching result of the query video and the candidate video based on matching results of the query image sequences of the query video and the candidate image sequences of the candidate video.
- The method according to claim 19, wherein splitting the query video into the plurality of query image sequences comprises: splitting the query video into the plurality of query image sequences according to a preset sequence length and a preset stride, wherein the length of each query image sequence is equal to the preset sequence length, and the number of images overlapping between adjacent query image sequences is equal to the difference between the preset sequence length and the preset stride; and splitting the candidate video into the plurality of candidate image sequences comprises: splitting the candidate video into the plurality of candidate image sequences according to the preset sequence length and the preset stride, wherein the length of each candidate image sequence is equal to the preset sequence length, and the number of images overlapping between adjacent candidate image sequences is equal to the difference between the preset sequence length and the preset stride.
- The method according to claim 19 or 20, wherein determining the matching result of the query video and the candidate video based on the matching results of the query image sequences of the query video and the candidate image sequences of the candidate video comprises: determining matching scores of each query image sequence of the query video with each candidate image sequence of the candidate video; computing the average of the highest N matching scores among the matching scores of the query image sequences of the query video and the candidate image sequences of the candidate video, to obtain a matching score of the query video and the candidate video, where N is a positive integer; and determining the matching result of the query video and the candidate video based on the matching score of the query video and the candidate video.
- A target matching apparatus, comprising: an extraction module, configured to extract a feature vector of each frame in a query image sequence and a feature vector of each frame in a candidate image sequence, respectively, wherein the query image sequence contains a target to be matched; a first determining module, configured to determine a self-expression feature vector of the query image sequence and a self-expression feature vector of the candidate image sequence based on the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence, respectively; a second determining module, configured to determine a collaborative-expression feature vector of the query image sequence based on the feature vector of each frame in the query image sequence and the self-expression feature vector of the candidate image sequence, and determine a collaborative-expression feature vector of the candidate image sequence based on the feature vector of each frame in the candidate image sequence and the self-expression feature vector of the query image sequence; a third determining module, configured to determine a similarity feature vector of the query image sequence and the candidate image sequence based on the self-expression feature vector of the query image sequence, the collaborative-expression feature vector of the query image sequence, the self-expression feature vector of the candidate image sequence, and the collaborative-expression feature vector of the candidate image sequence; and a fourth determining module, configured to determine a matching result of the query image sequence and the candidate image sequence based on the similarity feature vector.
- The apparatus according to claim 22, wherein the extraction module is configured to: extract the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence through a first sub-neural network.
- The apparatus according to claim 22 or 23, wherein the apparatus further comprises: a dimension reduction module, configured to perform dimension reduction on the feature vector of each frame in the query image sequence and the feature vector of each frame in the candidate image sequence through a first fully connected layer of the first sub-neural network, to obtain a first dimension-reduced feature vector of each frame in the query image sequence and a first dimension-reduced feature vector of each frame in the candidate image sequence.
- The apparatus according to claim 24, wherein the first determining module comprises: a first determining sub-module, configured to input the feature vector of each frame in the query image sequence and the first dimension-reduced feature vector of each frame in the query image sequence into a second sub-neural network to determine the self-expression feature vector of the query image sequence; and a second determining sub-module, configured to input the feature vector of each frame in the candidate image sequence and the first dimension-reduced feature vector of each frame in the candidate image sequence into the second sub-neural network to determine the self-expression feature vector of the candidate image sequence.
- The apparatus according to claim 25, wherein the first determining sub-module comprises: a first dimension reduction unit, configured to perform dimension reduction on the feature vector of each frame in the query image sequence through a second fully connected layer of the second sub-neural network, to obtain a second dimension-reduced feature vector of each frame in the query image sequence; a first average pooling unit, configured to perform average pooling in the temporal dimension on the second dimension-reduced feature vectors of the frames in the query image sequence, to obtain an overall feature vector of the query image sequence; and a first determining unit, configured to determine the self-expression feature vector of the query image sequence based on the second dimension-reduced feature vector of each frame in the query image sequence, the overall feature vector of the query image sequence, and the first dimension-reduced feature vector of each frame in the query image sequence.
- The apparatus according to claim 25, wherein the second determining sub-module comprises: a second dimension reduction unit, configured to perform dimension reduction on the feature vector of each frame in the candidate image sequence through the second fully connected layer of the second sub-neural network, to obtain a second dimension-reduced feature vector of each frame in the candidate image sequence; a second average pooling unit, configured to perform average pooling in the temporal dimension on the second dimension-reduced feature vectors of the frames in the candidate image sequence, to obtain an overall feature vector of the candidate image sequence; and a second determining unit, configured to determine the self-expression feature vector of the candidate image sequence based on the second dimension-reduced feature vector of each frame in the candidate image sequence, the overall feature vector of the candidate image sequence, and the first dimension-reduced feature vector of each frame in the candidate image sequence.
- The apparatus according to claim 26, wherein the first determining unit comprises: a first computing sub-unit, configured to compute, through a parameter-free correlation function, the correlation between the second dimension-reduced feature vector of each frame in the query image sequence and the overall feature vector of the query image sequence, to obtain a first correlation weight of each frame in the query image sequence; and a first weighting sub-unit, configured to weight the first dimension-reduced feature vector of each frame in the query image sequence based on the first correlation weight of each frame in the query image sequence, to obtain the self-expression feature vector of the query image sequence.
- The apparatus according to claim 27, wherein the second determining unit comprises: a second computing sub-unit, configured to compute, through a parameter-free correlation function, the correlation between the second dimension-reduced feature vector of each frame in the candidate image sequence and the overall feature vector of the candidate image sequence, to obtain a first correlation weight of each frame in the candidate image sequence; and a second weighting sub-unit, configured to weight the first dimension-reduced feature vector of each frame in the candidate image sequence based on the first correlation weight of each frame in the candidate image sequence, to obtain the self-expression feature vector of the candidate image sequence.
- The apparatus according to claim 28 or 29, wherein the first correlation weight comprises a first normalized correlation weight, the first normalized correlation weight being obtained by normalizing the first correlation weight.
- The apparatus according to any one of claims 24 to 30, wherein the second determining module comprises: a third determining sub-module, configured to input the feature vector of each frame in the query image sequence, the first dimension-reduced feature vector of each frame in the query image sequence, and the self-expression feature vector of the candidate image sequence into a third sub-neural network, to obtain the collaborative-expression feature vector of the query image sequence; and a fourth determining sub-module, configured to input the feature vector of each frame in the candidate image sequence, the first dimension-reduced feature vector of each frame in the candidate image sequence, and the self-expression feature vector of the query image sequence into the third sub-neural network, to obtain the collaborative-expression feature vector of the candidate image sequence.
- The apparatus according to claim 31, wherein the third determining sub-module comprises: a third dimension reduction unit, configured to perform dimension reduction on the feature vector of each frame in the query image sequence through a third fully connected layer of the third sub-neural network, to obtain a third dimension-reduced feature vector of each frame in the query image sequence; and a third determining unit, configured to obtain the collaborative-expression feature vector of the query image sequence based on the third dimension-reduced feature vector of each frame in the query image sequence, the self-expression feature vector of the candidate image sequence, and the first dimension-reduced feature vector of each frame in the query image sequence; and the fourth determining sub-module comprises: a fourth dimension reduction unit, configured to perform dimension reduction on the feature vector of each frame in the candidate image sequence through the third fully connected layer of the third sub-neural network, to obtain a third dimension-reduced feature vector of each frame in the candidate image sequence; and a fourth determining unit, configured to obtain the collaborative-expression feature vector of the candidate image sequence based on the third dimension-reduced feature vector of each frame in the candidate image sequence, the self-expression feature vector of the query image sequence, and the first dimension-reduced feature vector of each frame in the candidate image sequence.
- The apparatus according to claim 32, wherein the third determining unit comprises: a third computing sub-unit, configured to compute, through a parameter-free correlation function, the correlation between the third dimension-reduced feature vector of each frame in the query image sequence and the self-expression feature vector of the candidate image sequence, to obtain a second correlation weight of each frame in the query image sequence; and a third weighting sub-unit, configured to weight the first dimension-reduced feature vector of each frame in the query image sequence based on the second correlation weight of each frame in the query image sequence, to obtain the collaborative-expression feature vector of the query image sequence.
- The apparatus according to claim 32, wherein the fourth determining unit comprises: a fourth computing sub-unit, configured to compute, through a parameter-free correlation function, the correlation between the third dimension-reduced feature vector of each frame in the candidate image sequence and the self-expression feature vector of the query image sequence, to obtain a second correlation weight of each frame in the candidate image sequence; and a fourth weighting sub-unit, configured to weight the first dimension-reduced feature vector of each frame in the candidate image sequence based on the second correlation weight of each frame in the candidate image sequence, to obtain the collaborative-expression feature vector of the candidate image sequence.
- The apparatus according to claim 33 or 34, wherein the second correlation weight comprises a second normalized correlation weight, the second normalized correlation weight being obtained by normalizing the second correlation weight.
- The apparatus according to any one of claims 22 to 35, wherein the third determining module comprises: a first computing sub-module, configured to compute the difference between the self-expression feature vector of the query image sequence and the collaborative-expression feature vector of the candidate image sequence, to obtain a first difference vector; a second computing sub-module, configured to compute the difference between the self-expression feature vector of the candidate image sequence and the collaborative-expression feature vector of the query image sequence, to obtain a second difference vector; and a fifth determining sub-module, configured to obtain the similarity feature vector of the query image sequence and the candidate image sequence based on the first difference vector and the second difference vector.
- The apparatus according to claim 36, wherein the fifth determining sub-module comprises: a first computing unit, configured to compute the sum of the first difference vector and the second difference vector, to obtain the similarity feature vector of the query image sequence and the candidate image sequence; or a second computing unit, configured to compute the element-wise product of corresponding elements of the first difference vector and the second difference vector, to obtain the similarity feature vector of the query image sequence and the candidate image sequence.
- The apparatus according to any one of claims 22 to 37, wherein the fourth determining module comprises: a sixth determining sub-module, configured to input the similarity feature vector of the query image sequence and the candidate image sequence into a fourth fully connected layer, to obtain a matching score of the query image sequence and the candidate image sequence; and a seventh determining sub-module, configured to determine the matching result of the query image sequence and the candidate image sequence based on the matching score of the query image sequence and the candidate image sequence.
- The apparatus according to claim 38, wherein the apparatus further comprises: an optimization module, configured to optimize network parameters based on the matching score of the query image sequence and the candidate image sequence, using same-pair annotation data and a binary cross-entropy loss function.
- The apparatus according to any one of claims 22 to 39, wherein the apparatus further comprises: a first splitting module, configured to split a query video into a plurality of query image sequences; a second splitting module, configured to split a candidate video into a plurality of candidate image sequences; and a fifth determining module, configured to determine a matching result of the query video and the candidate video based on matching results of the query image sequences of the query video and the candidate image sequences of the candidate video.
- The apparatus according to claim 40, wherein the first splitting module is configured to: split the query video into the plurality of query image sequences according to a preset sequence length and a preset stride, wherein the length of each query image sequence is equal to the preset sequence length, and the number of images overlapping between adjacent query image sequences is equal to the difference between the preset sequence length and the preset stride; and the second splitting module is configured to: split the candidate video into the plurality of candidate image sequences according to the preset sequence length and the preset stride, wherein the length of each candidate image sequence is equal to the preset sequence length, and the number of images overlapping between adjacent candidate image sequences is equal to the difference between the preset sequence length and the preset stride.
- The apparatus according to claim 40 or 41, wherein the fifth determining module comprises: an eighth determining sub-module, configured to determine matching scores of each query image sequence of the query video with each candidate image sequence of the candidate video; a third computing sub-module, configured to compute the average of the highest N matching scores among the matching scores of the query image sequences of the query video and the candidate image sequences of the candidate video, to obtain a matching score of the query video and the candidate video, where N is a positive integer; and a ninth determining sub-module, configured to determine the matching result of the query video and the candidate video based on the matching score of the query video and the candidate video.
- An electronic device, comprising: a processor; and a memory configured to store processor-executable instructions; wherein the processor is configured to perform the method according to any one of claims 1 to 21.
- A computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 21.
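The claimed pipeline can be illustrated with a minimal numerical sketch. This is not the patented implementation: the learned fully connected layers of the sub-neural networks are omitted (so the "first/second/third dimension-reduced" features are all taken to be the same raw frame vectors), the parameter-free correlation function is assumed to be a dot product, and softmax stands in for the normalization of claims 9 and 14; all function names are illustrative.

```python
import math

def mean_pool(frames):
    """Temporal average pooling -> overall feature vector of a sequence."""
    t = len(frames)
    return [sum(f[i] for f in frames) / t for i in range(len(frames[0]))]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(ws):
    # Normalized correlation weights (an assumed normalization choice)
    m = max(ws)
    e = [math.exp(w - m) for w in ws]
    s = sum(e)
    return [x / s for x in e]

def weighted_sum(weights, frames):
    return [sum(w * f[i] for w, f in zip(weights, frames))
            for i in range(len(frames[0]))]

def self_expression(frames):
    """Weight each frame by its correlation with the pooled overall vector."""
    overall = mean_pool(frames)
    weights = softmax([dot(f, overall) for f in frames])
    return weighted_sum(weights, frames)

def co_expression(frames, other_self_expr):
    """Weight each frame by its correlation with the OTHER sequence's
    self-expression vector."""
    weights = softmax([dot(f, other_self_expr) for f in frames])
    return weighted_sum(weights, frames)

def similarity_vector(query_frames, cand_frames, mode="sum"):
    """First/second difference vectors, combined by sum or element-wise
    product (claims 15-16)."""
    q_self, c_self = self_expression(query_frames), self_expression(cand_frames)
    q_co = co_expression(query_frames, c_self)
    c_co = co_expression(cand_frames, q_self)
    d1 = [a - b for a, b in zip(q_self, c_co)]
    d2 = [a - b for a, b in zip(c_self, q_co)]
    if mode == "sum":
        return [a + b for a, b in zip(d1, d2)]
    return [a * b for a, b in zip(d1, d2)]

def video_score(seq_scores, n):
    """Average of the highest n sequence-level matching scores (claim 21)."""
    return sum(sorted(seq_scores, reverse=True)[:n]) / n

query = [[1.0, 0.0], [0.8, 0.2]]
candidate = [[0.9, 0.1], [0.7, 0.3]]
sim = similarity_vector(query, candidate)
print(len(sim), round(video_score([0.9, 0.5, 0.7], 2), 2))  # prints "2 0.8"
```

In the patent itself the similarity feature vector would then pass through a fourth fully connected layer to produce the matching score; here the score is taken as given to keep the sketch parameter-free.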
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020207007917A KR20200042513A (ko) | 2018-06-15 | 2019-05-13 | 타겟 매칭 방법 및 장치, 전자 기기 및 저장 매체 |
SG11202003581YA SG11202003581YA (en) | 2018-06-15 | 2019-05-13 | Target matching method and apparatus, electronic device, and storage medium |
JP2020515878A JP6883710B2 (ja) | 2018-06-15 | 2019-05-13 | ターゲットのマッチング方法及び装置、電子機器並びに記憶媒体 |
US16/841,723 US11222231B2 (en) | 2018-06-15 | 2020-04-07 | Target matching method and apparatus, electronic device, and storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810621959.5 | 2018-06-15 | ||
CN201810621959.5A CN109145150B (zh) | 2018-06-15 | 2018-06-15 | 目标匹配方法及装置、电子设备和存储介质 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/841,723 Continuation US11222231B2 (en) | 2018-06-15 | 2020-04-07 | Target matching method and apparatus, electronic device, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019237870A1 true WO2019237870A1 (zh) | 2019-12-19 |
Family
ID=64802036
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/086670 WO2019237870A1 (zh) | 2018-06-15 | 2019-05-13 | 目标匹配方法及装置、电子设备和存储介质 |
Country Status (6)
Country | Link |
---|---|
US (1) | US11222231B2 (zh) |
JP (1) | JP6883710B2 (zh) |
KR (1) | KR20200042513A (zh) |
CN (1) | CN109145150B (zh) |
SG (1) | SG11202003581YA (zh) |
WO (1) | WO2019237870A1 (zh) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109145150B (zh) | 2018-06-15 | 2021-02-12 | 深圳市商汤科技有限公司 | 目标匹配方法及装置、电子设备和存储介质 |
CN111435432B (zh) * | 2019-01-15 | 2023-05-26 | 北京市商汤科技开发有限公司 | 网络优化方法及装置、图像处理方法及装置、存储介质 |
CN110705590B (zh) * | 2019-09-02 | 2021-03-12 | 创新先进技术有限公司 | 通过计算机执行的、用于识别车辆部件的方法及装置 |
CN110866509B (zh) | 2019-11-20 | 2023-04-28 | 腾讯科技(深圳)有限公司 | 动作识别方法、装置、计算机存储介质和计算机设备 |
KR102475177B1 (ko) * | 2020-11-04 | 2022-12-07 | 한국전자기술연구원 | 영상 처리 방법 및 장치 |
CN113243886B (zh) * | 2021-06-11 | 2021-11-09 | 四川翼飞视科技有限公司 | 一种基于深度学习的视力检测系统、方法和存储介质 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105608234A (zh) * | 2016-03-18 | 2016-05-25 | 北京京东尚科信息技术有限公司 | 图像检索方法和装置 |
CN107193983A (zh) * | 2017-05-27 | 2017-09-22 | 北京小米移动软件有限公司 | 图像搜索方法及装置 |
CN107451156A (zh) * | 2016-05-31 | 2017-12-08 | 杭州华为企业通信技术有限公司 | 一种图像再识别方法及识别装置 |
CN109145150A (zh) * | 2018-06-15 | 2019-01-04 | 深圳市商汤科技有限公司 | 目标匹配方法及装置、电子设备和存储介质 |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7921036B1 (en) * | 2002-04-30 | 2011-04-05 | Videomining Corporation | Method and system for dynamically targeting content based on automatic demographics and behavior analysis |
US10346726B2 (en) * | 2014-12-15 | 2019-07-09 | Samsung Electronics Co., Ltd. | Image recognition method and apparatus, image verification method and apparatus, learning method and apparatus to recognize image, and learning method and apparatus to verify image |
CN106326288B (zh) * | 2015-06-30 | 2019-12-03 | 阿里巴巴集团控股有限公司 | 图像搜索方法及装置 |
US9818126B1 (en) * | 2016-04-20 | 2017-11-14 | Deep Labs Inc. | Systems and methods for sensor data analysis through machine learning |
CN106649663B (zh) * | 2016-12-14 | 2018-10-16 | 大连理工大学 | 一种基于紧凑视频表征的视频拷贝检测方法 |
CN106886599B (zh) * | 2017-02-28 | 2020-03-03 | 北京京东尚科信息技术有限公司 | 图像检索方法以及装置 |
CN107862331A (zh) * | 2017-10-31 | 2018-03-30 | 华中科技大学 | 一种基于时间序列及cnn的不安全行为识别方法及系统 |
US20210264496A1 (en) * | 2020-02-21 | 2021-08-26 | Goldenspear Llc | Machine learning for rapid analysis of image data via curated customer personas |
-
2018
- 2018-06-15 CN CN201810621959.5A patent/CN109145150B/zh active Active
-
2019
- 2019-05-13 KR KR1020207007917A patent/KR20200042513A/ko not_active IP Right Cessation
- 2019-05-13 JP JP2020515878A patent/JP6883710B2/ja active Active
- 2019-05-13 SG SG11202003581YA patent/SG11202003581YA/en unknown
- 2019-05-13 WO PCT/CN2019/086670 patent/WO2019237870A1/zh active Application Filing
-
2020
- 2020-04-07 US US16/841,723 patent/US11222231B2/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105608234A (zh) * | 2016-03-18 | 2016-05-25 | 北京京东尚科信息技术有限公司 | 图像检索方法和装置 |
CN107451156A (zh) * | 2016-05-31 | 2017-12-08 | 杭州华为企业通信技术有限公司 | 一种图像再识别方法及识别装置 |
CN107193983A (zh) * | 2017-05-27 | 2017-09-22 | 北京小米移动软件有限公司 | 图像搜索方法及装置 |
CN109145150A (zh) * | 2018-06-15 | 2019-01-04 | 深圳市商汤科技有限公司 | 目标匹配方法及装置、电子设备和存储介质 |
Also Published As
Publication number | Publication date |
---|---|
SG11202003581YA (en) | 2020-05-28 |
JP2020534606A (ja) | 2020-11-26 |
US11222231B2 (en) | 2022-01-11 |
CN109145150B (zh) | 2021-02-12 |
CN109145150A (zh) | 2019-01-04 |
US20200234078A1 (en) | 2020-07-23 |
JP6883710B2 (ja) | 2021-06-09 |
KR20200042513A (ko) | 2020-04-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2019237870A1 (zh) | 目标匹配方法及装置、电子设备和存储介质 | |
TWI759722B (zh) | 神經網路訓練方法及裝置、圖像處理方法及裝置、電子設備和計算機可讀存儲介質 | |
US11120078B2 (en) | Method and device for video processing, electronic device, and storage medium | |
TWI766286B (zh) | 圖像處理方法及圖像處理裝置、電子設備和電腦可讀儲存媒介 | |
TWI747325B (zh) | 目標對象匹配方法及目標對象匹配裝置、電子設備和電腦可讀儲存媒介 | |
WO2021196401A1 (zh) | 图像重建方法及装置、电子设备和存储介质 | |
WO2021056808A1 (zh) | 图像处理方法及装置、电子设备和存储介质 | |
JP7110412B2 (ja) | 生体検出方法及び装置、電子機器並びに記憶媒体 | |
TWI738172B (zh) | 影片處理方法及裝置、電子設備、儲存媒體和電腦程式 | |
WO2021093375A1 (zh) | 检测同行人的方法及装置、系统、电子设备和存储介质 | |
TWI757668B (zh) | 網路優化方法及裝置、圖像處理方法及裝置、儲存媒體 | |
WO2021036382A1 (zh) | 图像处理方法及装置、电子设备和存储介质 | |
WO2021208666A1 (zh) | 字符识别方法及装置、电子设备和存储介质 | |
TW202109314A (zh) | 圖像處理方法及圖像處理裝置、電子設備和電腦可讀儲存媒體 | |
WO2020192113A1 (zh) | 图像处理方法及装置、电子设备和存储介质 | |
CN111582383B (zh) | 属性识别方法及装置、电子设备和存储介质 | |
WO2023115911A1 (zh) | 对象重识别方法及装置、电子设备、存储介质和计算机程序产品 | |
CN111259967A (zh) | 图像分类及神经网络训练方法、装置、设备及存储介质 | |
WO2016188065A1 (zh) | 云名片推荐方法及装置 | |
WO2021164100A1 (zh) | 图像处理方法及装置、电子设备和存储介质 | |
TWI770531B (zh) | 人臉識別方法、電子設備和儲存介質 | |
CN110110742B (zh) | 多特征融合方法、装置、电子设备及存储介质 | |
CN115422932A (zh) | 一种词向量训练方法及装置、电子设备和存储介质 | |
CN114842404A (zh) | 时序动作提名的生成方法及装置、电子设备和存储介质 | |
CN113807369A (zh) | 目标重识别方法及装置、电子设备和存储介质 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19818786 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2020515878 Country of ref document: JP Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 20207007917 Country of ref document: KR Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 23/04/2021) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19818786 Country of ref document: EP Kind code of ref document: A1 |