WO2022088909A1 - Method, apparatus, device and computer-readable storage medium for processing image files - Google Patents
Method, apparatus, device and computer-readable storage medium for processing image files
- Publication number
- WO2022088909A1 · PCT/CN2021/115209 · CN2021115209W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- file
- image
- target
- feature vector
- files
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
Definitions
- the present application relates to the technical field of artificial intelligence, and in particular, to a method, apparatus, device, and computer-readable storage medium for processing image files.
- the image file is a file containing images.
- in practice, it is often necessary to match images against files, that is, to determine which file an image hits.
- in the related art, an image is compared with all files one by one, and the file with the highest similarity is used as the hit file.
- the related art therefore requires a large number of comparisons and achieves a low hit rate, resulting in low processing efficiency for image files.
- the present application provides an image file processing method, apparatus, device, and computer-readable storage medium to solve the problems in the related art.
- the technical solutions are as follows:
- a method for processing an image file comprising:
- the first feature vector of the first image is extracted.
- multiple files in the archive are aggregated to obtain multiple inter-file classes, and the number of the multiple inter-file classes is less than the number of the multiple files.
- a target inter-file class whose degree of similarity with the first image is greater than a first threshold is determined from the multiple inter-file classes, and the number of target inter-file classes is smaller than the number of the multiple inter-file classes.
- the first feature vector is used to match against candidate files among the files included in the target inter-file class; if a target file among the candidate files is hit, image processing can be performed based on the target file.
- the present application obtains multiple inter-file classes through clustering, and then selects, from the multiple inter-file classes, a target inter-file class with a higher probability of successfully matching the first image, thereby narrowing the comparison range; the first target file is then hit by comparing the first image with the candidate files included in the target inter-file class. This avoids comparing the first image with all files one by one, which not only reduces the number of comparisons but also requires less computation and consumes less time, so the processing efficiency is high.
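The two-stage narrowing described above can be sketched as follows. This is an illustrative sketch, not the patent's actual implementation: the function names (`cosine_sim`, `match_two_stage`), the use of cosine similarity as the "degree of similarity", and the data layout are all assumptions for demonstration.

```python
import math

def cosine_sim(a, b):
    # Cosine similarity between two feature vectors (assumed metric).
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def match_two_stage(query_vec, classes, first_threshold):
    """classes: list of (class_center, [(file_id, file_center), ...]).
    Stage 1 keeps only inter-file classes similar enough to the query;
    stage 2 compares the query only against files in those classes."""
    target_classes = [c for c in classes
                     if cosine_sim(query_vec, c[0]) > first_threshold]
    best_id, best_sim = None, -1.0
    for _, files in target_classes:
        for file_id, center in files:
            s = cosine_sim(query_vec, center)
            if s > best_sim:
                best_id, best_sim = file_id, s
    return best_id, best_sim
```

Because stage 1 prunes whole inter-file classes, the number of file-level comparisons shrinks from "all files" to "files in a few classes", which is the efficiency gain the passage above describes.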
- before using the first feature vector to match against the candidate files among the files included in the target inter-file class, the method further includes: determining short feature vectors of the files included in the target inter-file class; and, according to the short feature vectors of the files included in the target inter-file class, determining, among the files included in the target inter-file class, a file whose degree of similarity with the first image is greater than a second threshold as a candidate file.
- a file with a higher degree of similarity to the first image is selected as a candidate file according to the short feature vector of each file, thereby reducing the number of matches and improving the processing efficiency.
- each file in the multiple files corresponds to multiple representative feature vectors
- using the first feature vector to match against the candidate files among the files included in the target inter-file class includes: based on the multiple representative feature vectors corresponding to each file, determining, among the candidate files, the candidate file with the greatest degree of similarity to the first feature vector, and using the determined candidate file as the target file.
- one candidate file that is most similar to the first image is determined as the hit target file, thereby ensuring the accuracy of the processing process.
- aggregating multiple files in the archive to obtain multiple inter-file classes includes: aggregating files whose degree of similarity among the multiple files is greater than a third threshold into the same inter-file class, thereby obtaining the multiple inter-file classes.
- the files belonging to the same inter-file class are relatively similar, which ensures the accuracy of the processing process.
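A minimal sketch of threshold-based aggregation follows, assuming a greedy single-pass scheme and cosine similarity; the patent does not specify the aggregation algorithm, so the strategy and names here are illustrative assumptions.

```python
import math

def cosine_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def aggregate_files(file_centers, third_threshold):
    """Greedy single-pass aggregation: each file joins the first existing
    inter-file class whose center it is similar enough to; otherwise it
    starts a new class. file_centers: {file_id: first_central_vector}."""
    classes = []  # each: {"center": vector, "members": [file_id, ...]}
    for fid, vec in file_centers.items():
        placed = False
        for c in classes:
            if cosine_sim(vec, c["center"]) > third_threshold:
                c["members"].append(fid)
                placed = True
                break
        if not placed:
            classes.append({"center": list(vec), "members": [fid]})
    return classes
```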
- the method further includes: obtaining a second image; and in response to the second image and the first image corresponding to the same object, filing the second image and the first image together in a target file.
- in response to the second image and the first image belonging to the same object, the second image and the first image are archived together to the target file, which realizes batch filing of images; this is suitable for cases where the number of images to be archived is large, and is conducive to improving processing efficiency.
- the method further includes: obtaining a third image; in response to the third image being similar to the first image, determining whether the third image matches the target file; and, in response to the third image matching the target file, archiving the third image together with the first image to the target file.
- when the third image is similar to the first image, the third image can be archived in the target file, thereby realizing batch filing of images, which is suitable for cases where there are many images to be archived.
- the method further includes: obtaining a fourth image, and performing feature extraction on the fourth image to obtain a second feature vector corresponding to the fourth image; using the second feature vector to match against first files among the multiple files, wherein the number of first files is less than the number of the multiple files, and the condition met by a first file includes: its image filing frequency is higher than a frequency threshold; and, in response to hitting a first target file among the first files, performing image processing based on the first target file.
- the present application preferentially matches the fourth image with the first files among the multiple files.
- the first files are files among the multiple files whose image filing frequency is greater than the threshold, so the probability of successfully matching the fourth image is high. If a first target file among the first files is matched successfully, the fourth image can be archived directly to the first target file without being compared with other files. Compared with comparing the fourth image with all files one by one, the method provided by the present application requires less computation and consumes less time, so the processing efficiency is higher.
- the conditions satisfied by the first file further include: the shooting area of the first file matches the shooting area of the fourth image, where the shooting area of the first file is determined based on the shooting areas of the images in the first file.
- since the shooting area of the first file also matches the shooting area of the fourth image, the possibility of hitting the first target file among the first files is increased, which reduces the amount of calculation and improves filing efficiency.
- the conditions satisfied by the first file further include: the shooting period of the first file matches the shooting period of the fourth image, where the shooting period of the first file is determined based on the shooting periods of the images in the first file.
- since the shooting period of the first file also matches the shooting period of the fourth image, it is more likely to hit the first target file among the first files, which is beneficial for reducing the amount of calculation and improving filing efficiency.
- the method further includes: obtaining a fifth image, and performing feature extraction on the fifth image to obtain a third feature vector corresponding to the fifth image; using the third feature vector to match against the first files; in response to not hitting any first file, using the third feature vector to match against second files among the multiple files, wherein the condition satisfied by a second file includes: its image filing frequency is lower than the frequency threshold; and, in response to hitting a second target file among the second files, performing image processing based on the second target file.
- the fifth image is then matched with the second file whose image filing frequency is lower than the frequency threshold, so as to realize the archiving of the fifth image.
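The two-tier matching by filing frequency described above (high-frequency first files first, then a fallback to low-frequency second files) can be sketched as follows; the dictionary layout, the cosine-similarity hit criterion, and the names are illustrative assumptions.

```python
import math

def cosine_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def match_by_frequency(query_vec, archives, freq_threshold, hit_sim):
    """archives: list of dicts with 'id', 'center', 'filing_freq'.
    Try high-frequency archives first; fall back to the rest only
    when no first-tier archive is hit."""
    first_tier = [a for a in archives if a["filing_freq"] >= freq_threshold]
    second_tier = [a for a in archives if a["filing_freq"] < freq_threshold]
    for tier in (first_tier, second_tier):
        for a in tier:
            if cosine_sim(query_vec, a["center"]) > hit_sim:
                return a["id"]  # hit: the image is archived here
    return None  # no hit in either tier
```

The design choice is that a query rarely needs to reach the second tier, so the expected number of comparisons drops when most incoming images belong to frequently updated files.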
- performing image processing based on the target file includes: archiving the first image into the target file. This implementation applies to an image archiving scenario, where archiving the first image causes the target file to be updated.
- performing image processing based on the target file includes: reading an image from the target file. This implementation applies to an image retrieval scenario, and the image read from the target file is used as the retrieval result for the first image, thereby realizing image search by image.
- an apparatus for processing an image file comprising:
- an obtaining module configured to obtain a first image, perform feature extraction on the first image, and obtain a first feature vector corresponding to the first image
- the aggregation module is used to aggregate multiple files in the archive to obtain multiple inter-file classes, and the number of multiple inter-file classes is less than the number of multiple files;
- a determination module configured to determine, from the multiple inter-file classes, a target inter-file class whose degree of similarity with the first image is greater than the first threshold, the number of target inter-file classes being less than the number of the multiple inter-file classes;
- a matching module configured to use the first feature vector to match with the candidate files in the files included in the target inter-file class
- a processing module configured to, in response to hitting a target file among the candidate files, perform image processing based on the target file.
- the determination module is further configured to determine the short feature vectors of the files included in the target inter-file class, and, according to the short feature vectors of the files included in the target inter-file class, determine, among the files included in the target inter-file class, a file whose degree of similarity with the first image is greater than the second threshold as a candidate file.
- each file in the multiple files corresponds to multiple representative feature vectors, respectively
- the matching module is configured to determine, based on the multiple representative feature vectors corresponding to each file, the candidate file with the greatest degree of similarity to the first feature vector among the candidate files, and use it as the target file.
- the aggregation module is used for aggregating files whose similarity degree is greater than a third threshold in multiple files into the same inter-file class to obtain multiple inter-file classes.
- the obtaining module is further configured to obtain the second image
- the processing module is further configured to file the second image and the first image together in the target file in response to the second image and the first image corresponding to the same object.
- the obtaining module is further configured to obtain the third image
- a determining module further configured to determine whether the third image matches the target profile in response to the third image being similar to the first image
- the processing module is further configured to file the third image and the first image together in the target file in response to the third image matching the target file.
- the obtaining module is further configured to obtain a fourth image, and perform feature extraction on the fourth image to obtain a second feature vector corresponding to the fourth image;
- the matching module is further configured to use the second feature vector to match against first files among the multiple files, wherein the number of first files is less than the number of the multiple files, and the condition met by a first file includes: its image filing frequency is higher than a frequency threshold;
- the processing module is further configured to perform image processing based on the first target file in response to hitting a first target file in the first file.
- the conditions satisfied by the first file further include: the shooting area of the first file matches the shooting area of the fourth image, where the shooting area of the first file is determined based on the shooting areas of the images in the first file.
- the conditions satisfied by the first file further include: the shooting period of the first file matches the shooting period of the fourth image, where the shooting period of the first file is determined based on the shooting periods of the images in the first file.
- the obtaining module is further configured to obtain a fifth image, perform feature extraction on the fifth image, and obtain a third feature vector corresponding to the fifth image;
- the matching module is further configured to use the third feature vector to match against the first files, and, in response to not hitting any first file, use the third feature vector to match against second files among the multiple files, wherein the condition satisfied by a second file includes: its image filing frequency is lower than the frequency threshold;
- the processing module is further configured to perform image processing based on the second target file in response to hitting a second target file in the second file.
- the processing module is configured to archive the first image to the target file.
- the processing module is used to read the image from the target archive.
- an image file processing device comprising: a transceiver, a memory and a processor.
- the transceiver, the memory, and the processor communicate with each other through an internal connection path; the memory is used for storing instructions, and the processor is used for executing the instructions stored in the memory to control the transceiver to receive and send signals; when the processor executes the instructions stored in the memory, the processor is caused to execute the method in the first aspect or any possible implementation of the first aspect.
- optionally, there are one or more processors and one or more memories.
- the memory may be integrated with the processor, or the memory may be provided separately from the processor.
- the memory can be a non-transitory memory, such as a read-only memory (ROM); it can be integrated with the processor on the same chip, or set separately on different chips. The present application does not limit the type of the memory or the manner in which the memory and the processor are arranged.
- a computer program comprising: computer program code, when the computer program code is executed by a computer, the computer program code causes the computer to execute the methods in the above aspects.
- a computer-readable storage medium stores programs or instructions. When the programs or instructions are run on a computer, the methods in the above aspects are performed.
- a chip including a processor, configured to invoke and execute instructions stored in a memory, so that a communication device on which the chip is installed executes the methods in the above aspects.
- another chip including an input interface, an output interface, a processor, and a memory, where the input interface, the output interface, the processor, and the memory are connected through an internal connection path, and the processor is used to execute code in the memory; when the code is executed, the processor performs the methods of the above aspects.
- FIG. 1 is a schematic diagram of processing an image file according to an embodiment of the present application.
- FIG. 2 is a schematic diagram of an implementation environment provided by an embodiment of the present application.
- FIG. 3 is a schematic flowchart of an image file processing method provided by an embodiment of the present application.
- FIG. 4 is a schematic flowchart of generating a short feature vector according to an embodiment of the present application.
- FIG. 5 is a schematic diagram of a segment of a representative feature vector provided by an embodiment of the present application.
- FIG. 6 is a schematic diagram of a segmented clustering provided by an embodiment of the present application.
- FIG. 7 is a schematic diagram of a code table provided by an embodiment of the present application.
- FIG. 8 is a schematic diagram of obtaining a short feature vector based on code table conversion according to an embodiment of the present application.
- FIG. 9 is a schematic diagram of a distance table provided by an embodiment of the present application.
- FIG. 10 is a schematic diagram of querying distance based on a distance table according to an embodiment of the present application.
- FIG. 11 is an overall flowchart of an image file processing method provided by an embodiment of the present application.
- FIG. 13 is a schematic flowchart of file statistics provided by an embodiment of the present application.
- FIG. 15 is a schematic diagram of a calculation acceleration provided by an embodiment of the present application.
- FIG. 16 is a schematic diagram of a calculation acceleration provided by an embodiment of the present application.
- FIG. 17 is a schematic structural diagram of an apparatus for processing an image file according to an embodiment of the present application.
- FIG. 18 is a schematic structural diagram of an image file processing device according to an embodiment of the present application.
- the processing of image archives includes image archiving and image retrieval.
- the image filing refers to: based on the feature vector of the image, the images belonging to the same object in the multiple images are classified into the same file, and multiple files are obtained.
- Objects include, but are not limited to, people, objects, artificial intelligence (AI), etc.
- the objects are, for example, vehicles.
- Image retrieval refers to: retrieving other images that belong to the same object as the image. As shown in Figure 1, through each file, services such as retrieval and comparison (for confirming the identity of the object), trajectory analysis (for generating the motion trajectory of the object), and frequency analysis can be realized, which is conducive to the realization of intelligent management of the city.
- an embodiment of the present application provides an image file processing method, which can be applied to the implementation environment shown in FIG. 2 .
- a camera node, an operation node, a management node, a computing node and a storage node are included.
- the camera node includes spherical, barrel and other cameras, which are used to capture images or record videos.
- the images to be archived or retrieved include but are not limited to: images captured by the camera node, images cropped from captured pictures, and images extracted from recorded videos, where each image includes at least one object.
- the operation node is used to interact with the user to enable the user to deploy, configure and manage image archiving tasks and image retrieval tasks.
- the management node is used to obtain images and videos from the camera node. For example, referring to FIG. 2 , the camera node uploads the images and videos to the cloud, and the management node obtains the images and videos from the cloud.
- the management node is also used to manage the computing node and the storage node in conjunction with the image archiving task and the image retrieval task. In the management process, the management node forwards images and videos to the computing nodes, and the computing nodes are used to complete the computing tasks involved in the image archiving task and the image retrieval task according to the received images and videos, so as to achieve computing acceleration.
- the storage node is used for storing the images captured by the camera node, the recorded videos and the respective images archived in the archive through the image archiving process according to the management of the management node.
- management node, computing node, and storage node in FIG. 2 are different devices, or any two or three of the management node, computing node, and storage node may also be integrated into the same device.
- an embodiment of the present application provides an image file processing method. Referring to Figure 3, the method includes the following steps.
- this embodiment does not limit the method of feature extraction, and the feature vector obtained through the feature extraction process belongs to the long feature vector.
- a deep learning method is used to perform feature extraction on the first image, so as to obtain a first feature vector corresponding to the first image.
- deep learning methods include, but are not limited to, deep neural networks (DNN).
- a file includes at least one image, and files correspond to objects one-to-one; that is, all images included in a file are images of the same object.
- in response to a file corresponding to an object whose identity can be determined, the file also corresponds to an identity feature image.
- One file corresponds to a first central feature vector, and multiple files are clustered based on the first central feature vectors corresponding to multiple files, thereby obtaining multiple inter-file classes.
- aggregating multiple files in the archive to obtain multiple inter-file classes includes: aggregating files whose degree of similarity among the multiple files is greater than a third threshold into the same inter-file class, thereby obtaining the multiple inter-file classes. That is, the files included in the same inter-file class are relatively similar.
- an inter-file class includes at least one first central feature vector, and the central feature vector of an inter-file class is determined based on the at least one first central feature vector included in that inter-file class.
- determining the degree of similarity between an inter-file class and the first image includes: determining the vector distance between the central feature vector of the inter-file class and the first feature vector of the first image, where the vector distance indicates the degree of similarity between the inter-file class and the first image; the smaller the vector distance, the higher the degree of similarity.
- the target inter-file class is determined from the multiple inter-file classes based on the vector distance corresponding to each inter-file class.
- the vector distances corresponding to the inter-file classes are sorted in ascending order, and from the resulting sequence, the inter-file classes corresponding to the first reference number of vector distances are selected as the target inter-file classes.
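The ascending-sort selection above can be sketched as follows; Euclidean distance and the function name are assumptions, since the patent does not fix the distance metric.

```python
import math

def select_target_classes(query_vec, class_centers, reference_count):
    """Rank inter-file classes by Euclidean distance between the query's
    feature vector and each class's central feature vector (smaller
    distance = more similar), then keep the first `reference_count`."""
    dists = []
    for class_id, center in class_centers.items():
        d = math.sqrt(sum((q - c) ** 2 for q, c in zip(query_vec, center)))
        dists.append((d, class_id))
    dists.sort()  # ascending: nearest classes first
    return [cid for _, cid in dists[:reference_count]]
```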
- the inter-file classes corresponding to the first reference number of vector distances in the sequence are the inter-file classes whose degree of similarity with the first image is greater than the first threshold.
- the first central feature vector corresponding to a file is determined based on the feature vectors corresponding to the images included in the file. In the case that only one image is included in a file, the feature vector corresponding to that image is used as the first central feature vector corresponding to the file. When a file includes multiple images, the representative feature vectors of the file are first determined based on the feature vectors corresponding to the multiple images, and a weighted summation is then performed on the multiple representative feature vectors to determine the first central feature vector corresponding to the file. Referring to 405 in FIG. 4, the multiple representative feature vectors of a file and the first central feature vector are together taken as the typical feature vectors of the file, and the typical feature vectors are long feature vectors.
- the feature vector of the identity feature image is also a typical feature vector. Understandably, over time, more and more images will be archived in a file, increasing the number of images in the file. Exemplarily, when the number of images added to a file exceeds a certain threshold, the typical feature vectors of the file are updated.
- determining the representative feature vector of the file based on the feature vectors corresponding to the multiple images includes: clustering the feature vectors corresponding to the images included in the file to obtain multiple categories within the file, which are called intra-file classes.
- An in-file class includes a feature vector corresponding to at least one image, and a central feature vector corresponding to the in-file class can be determined based on the feature vector included in the in-file class. After that, the central feature vectors corresponding to each in-file class are determined as multiple representative feature vectors of the file.
- this embodiment does not limit the manner of clustering within a file.
- the clustering method within the archive includes density-based spatial clustering of applications with noise (DBSCAN).
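A minimal hand-rolled DBSCAN sketch follows, for the intra-file clustering mentioned above; a production system would typically use an optimized library implementation, and the `eps` and `min_pts` values here are illustrative.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN for clustering the feature vectors within a file.
    Returns one cluster label per point; -1 marks noise."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    n = len(points)
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(n) if dist(points[i], points[j]) <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1  # noise (may later be absorbed as a border point)
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [k for k in range(n) if dist(points[j], points[k]) <= eps]
            if len(j_neighbors) >= min_pts:  # j is a core point: expand
                queue.extend(k for k in j_neighbors if labels[k] is None)
    return labels
```

Each resulting cluster corresponds to an in-file class, whose central feature vector then becomes one representative feature vector of the file.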
- next, the process of determining the first central feature vector corresponding to a file is described with an example. Suppose a file includes 100 images, that is, 100 feature vectors. Clustering these 100 feature vectors yields 10 in-file classes, each including 10 feature vectors. For each in-file class, the corresponding central feature vector can be determined, so the 10 in-file classes yield 10 central feature vectors in total. These 10 central feature vectors are used as the representative feature vectors of the file, so the file has 10 representative feature vectors. A weighted summation is then performed on the 10 representative feature vectors, and the result of the weighted summation is used as the first central feature vector corresponding to the file.
- the case of a file having 10 representative feature vectors is only an example and does not limit the number of representative feature vectors of a file.
- this embodiment further determines the first center feature vector corresponding to the file in combination with the identity feature image corresponding to the file. For example, in the case that an image is included in the file, the weighted summation result of the feature vector corresponding to the image and the feature vector corresponding to the identity feature image is used as the first central feature vector. For another example, when the file includes multiple images, a weighted summation result of multiple representative feature vectors and feature vectors corresponding to the identity feature image is used as the first central feature vector.
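The weighted summation producing the first central feature vector, optionally folding in the identity feature image's vector, can be sketched as follows. Uniform weights over the representative vectors and the 0.5 identity weight are assumptions, since the patent does not specify the weighting.

```python
def first_central_vector(representative_vecs, identity_vec=None, identity_weight=0.5):
    """Weighted sum of a file's representative feature vectors; when an
    identity feature image exists, its vector is folded in with its own
    weight (both weightings are illustrative assumptions)."""
    dim = len(representative_vecs[0])
    n = len(representative_vecs)
    # Uniform weights over representatives reduce to the mean.
    center = [sum(v[d] for v in representative_vecs) / n for d in range(dim)]
    if identity_vec is not None:
        center = [(1 - identity_weight) * c + identity_weight * i
                  for c, i in zip(center, identity_vec)]
    return center
```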
- the candidate files can be determined based on the files corresponding to the at least one first central feature vector, so that the first feature vector can be matched against the determined candidate files.
- the file corresponding to the at least one first central feature vector included in the target inter-file class is hereinafter referred to as the file included in the target inter-file class.
- the method further includes: determining a short feature vector of the profiles included in the target inter-file class. According to the short feature vector of the files included in the target inter-file class, among the files included in the target inter-file class, a file whose similarity degree with the first image is greater than the second threshold is determined as a candidate file.
- the typical feature vectors of a file, that is, the first central feature vector and the multiple representative feature vectors, are long feature vectors, and the length of a long feature vector is greater than the length of a short feature vector. Therefore, determining the short feature vectors of the files included in the target inter-file class, and then determining the similarity between those files and the first image according to the short feature vectors, is beneficial for reducing the amount of calculation and improving processing efficiency.
- a code table (codebook) is generated based on the product quantization (PQ) algorithm, so that the first central feature vector corresponding to each file included in the target inter-file class can be converted into that file's short feature vector according to the code table.
- the code table is generated based on the representative feature vector of each archive in the plurality of archives.
- the representative feature vectors of each file are segmented, and the number of segments of each representative feature vector is the same. For ease of differentiation, each segment corresponds to a different number.
- each segment with the same number is clustered to obtain a plurality of inter-segment classes, so as to generate a code table based on the central feature vector corresponding to each inter-segment class.
- since the code table is generated based on the representative feature vectors of the files, and those representative feature vectors are updated over time, the code table needs to be regenerated when the number of files whose representative feature vectors have been updated is greater than a certain threshold, and the short feature vectors are updated based on the regenerated code table.
- exemplarily, each representative feature vector is divided into 32 segments numbered S1, S2, ..., S32. The segments numbered S1 are clustered to obtain 256 inter-segment classes corresponding to S1; each inter-segment class corresponds to a central feature vector, so S1 corresponds to 256 central feature vectors in total. In the code table, the first column corresponds to S1: the 256 items included in the first column are the 256 central feature vectors corresponding to S1 and correspond to indices 0-255 in order from top to bottom. Likewise, the second column corresponds to S2: the 256 items included in the second column are the 256 central feature vectors corresponding to S2 and correspond to indices 0-255 from top to bottom. And so on for the other columns.
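- A minimal sketch of the segment-and-cluster construction of the code table described above, using a toy k-means loop; the 32-segment/256-class sizes from the text are shrunk so the sketch stays runnable, and all names are assumptions:

```python
import numpy as np

def train_code_table(vectors, n_segments=2, n_classes=2, n_iter=10, seed=0):
    """Split every representative feature vector into n_segments equal
    pieces, cluster same-numbered pieces with a tiny k-means, and keep
    each cluster's central feature vector as one code-table column.
    (The text uses 32 segments and 256 classes; small values here.)"""
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=float)
    seg_len = vectors.shape[1] // n_segments
    code_table = []  # one (n_classes, seg_len) array per segment number
    for s in range(n_segments):
        pieces = vectors[:, s * seg_len:(s + 1) * seg_len]
        centers = pieces[rng.choice(len(pieces), n_classes, replace=False)]
        for _ in range(n_iter):
            # assign each piece to its nearest center, then move the centers
            dists = np.linalg.norm(pieces[:, None, :] - centers[None], axis=2)
            labels = dists.argmin(axis=1)
            for k in range(n_classes):
                if (labels == k).any():
                    centers[k] = pieces[labels == k].mean(axis=0)
        code_table.append(centers)
    return code_table
```

Each entry of the returned list plays the role of one column (S1, S2, ...) in the code table of the text.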
- based on the generated code table, the first central feature vector of a file can be converted into a short feature vector. Specifically, the first central feature vector is segmented, with the number of segments equal to the number of columns in the code table. For any segment, the column of the code table corresponding to that segment is determined, the inter-segment class in which the segment is located is determined within that column, and the index corresponding to that inter-segment class is thereby obtained. The indices corresponding to the inter-segment classes in which the segments of the first central feature vector are located are combined to obtain the short feature vector converted from the first central feature vector.
- for example, the first central feature vector is divided into 32 segments. The column of the code table corresponding to the first segment is S1, and the index corresponding to the first segment is 1. The column of the code table corresponding to the second segment is S2; in response to the inter-segment class in which the second segment is located being the first item in S2, the index corresponding to the second segment is 0. Proceeding in this way for the remaining segments, the short feature vector corresponding to the file is obtained as (1, 0, 3, 255, ..., 2, 254).
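- The segment-by-segment index lookup above can be sketched as follows; names are assumptions, and a nearest-center search stands in for "determining the inter-segment class in which the segment is located":

```python
import numpy as np

def to_short_vector(center_vector, code_table):
    """Convert a first central feature vector into its short feature
    vector: for each segment, record the index of the nearest central
    feature vector in the matching code-table column."""
    center_vector = np.asarray(center_vector, dtype=float)
    seg_len = len(center_vector) // len(code_table)
    short = []
    for s, column in enumerate(code_table):
        piece = center_vector[s * seg_len:(s + 1) * seg_len]
        # index of the inter-segment class whose center is closest
        short.append(int(np.linalg.norm(np.asarray(column) - piece, axis=1).argmin()))
    return short
```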
- determining the degree of similarity between each file and the first image according to the short feature vector of the file includes: generating a distance table corresponding to the first image based on the code table and the first feature vector corresponding to the first image, and determining the vector distance between a file and the first image according to the distance table and the file's short feature vector. The vector distance indicates the similarity between the file and the first image, and the smaller the vector distance, the higher the indicated similarity.
- among the files included in the target inter-file class, if the similarity between a file and the first image is greater than the second threshold, that file can be used as a candidate file.
- exemplarily, the first feature vector of the first image is segmented, with the number of segments equal to the number of columns in the code table. For any segment of the first feature vector, the column of the code table corresponding to that segment is determined, the distances between the segment and each item in that column are determined, and these distances are combined into one column of the distance table. For example, referring to the code table shown in FIG. 7, for the first segment of the first feature vector, the corresponding column of the code table is S1; the 256 distances between this segment and the 256 items included in S1 are determined and combined into the first column of the distance table shown in FIG. 9.
- that is, the first item in the first column of the distance table is the distance between the first segment of the first feature vector and the first item in column S1 of the code table; the second item in the first column is the distance between the first segment and the second item in column S1; and so on, until the 256th item in the first column, which is the distance between the first segment and the 256th item in column S1. The other columns of the distance table are obtained in the same way and are not described one by one.
- the vector distance between the first feature vector and the short feature vector of the file can be determined by querying the distance table corresponding to the first image, and no calculation is required, thereby reducing the amount of calculation required in the processing process.
- the distance table corresponding to the first image includes a total of 32 columns S1-S32, the indices corresponding to each column in the distance table from top to bottom are 0-255 in sequence, and the short feature vector of the file is (1, 0, 3, 255, ..., 2, 254).
- the first value in the short feature vector corresponds to the first column S1 of the distance table; since the first value is 1, the item with index 1 in column S1 of the distance table (that is, the second item in column S1) is looked up to determine the first distance. The second value in the short feature vector corresponds to the second column S2 of the distance table; since the second value is 0, the item with index 0 in column S2 (that is, the first item in column S2) is looked up to determine the second distance. The remaining distances are obtained by querying the distance table in the same way, so that 32 distances are obtained. The 32 distances obtained from the queries are then combined to obtain the vector distance between the first feature vector and the short feature vector of the file.
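- A sketch of building the distance table for a query image and combining per-segment lookups into a vector distance; squared Euclidean distance and summation as the combination rule are assumptions, since the text only says the distances are "combined":

```python
import numpy as np

def build_distance_table(first_feature_vector, code_table):
    """One distance-table column per code-table column: the distances
    between the query's segment and every item in that column."""
    q = np.asarray(first_feature_vector, dtype=float)
    seg_len = len(q) // len(code_table)
    return [((np.asarray(col) - q[s * seg_len:(s + 1) * seg_len]) ** 2).sum(axis=1)
            for s, col in enumerate(code_table)]

def vector_distance(short_vector, distance_table):
    """Distance between the query and one file: a pure table lookup per
    segment (no per-file distance computation), combined by summation."""
    return float(sum(distance_table[s][idx] for s, idx in enumerate(short_vector)))
```

Once the table is built, scoring every file against the query costs only one lookup per segment, which is the saving described above.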
- the processing can be accelerated by a graphics processing unit (GPU). The GPU corresponds to on-chip memory, and the capacity of the on-chip memory is often small, so that the GPU cannot store all the short feature vectors and distance tables. Therefore, referring to 404 in FIG. 4, in this embodiment all the short feature vectors and distance tables are first loaded into the larger-capacity off-chip memory corresponding to a central processing unit (CPU).
- this implementation then divides all the short feature vectors and distance tables into multiple batches and imports them into the GPU batch by batch, and the short feature vectors and distance tables that have been imported into the GPU can be replaced according to actual needs. Exemplarily, the clustering algorithm indicates whether the short feature vectors and distance tables in the GPU need to be replaced.
- next, the first feature vector is used to match the candidate files. In some embodiments, each of the multiple files corresponds to multiple representative feature vectors, and using the first feature vector to match the candidate files among the files included in the target inter-file class includes: determining, based on the multiple representative feature vectors corresponding to each candidate file, the candidate file with the greatest degree of similarity to the first feature vector, and taking the determined candidate file as the target file.
- exemplarily, multiple vector distances between the first feature vector and the multiple representative feature vectors of each candidate file are determined respectively, that is, one candidate file corresponds to multiple vector distances. After that, whether a target file is hit among the candidate files is determined according to the vector distances corresponding to all the candidate files, as described in 304.
- the candidate file corresponding to the minimum vector distance is determined, so that the candidate file corresponding to the minimum vector distance is determined as the hit target file.
- weighted summation is performed on a plurality of vector distances corresponding to a candidate file to obtain a weighted summation distance corresponding to the candidate file.
- the candidate file corresponding to the minimum weighted summation distance is determined as the hit target file.
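- The weighted-summation manner above can be sketched as follows; equal weights and the helper name are illustrative assumptions (the minimum-distance manner would take the smallest entry of each list instead of the weighted sum):

```python
def hit_target_file(candidate_distances, weights=None):
    """candidate_distances: {file_id: [vector distances to that file's
    representative feature vectors]}.  Returns the candidate file with
    the smallest weighted-sum distance; equal weights by default."""
    def score(dists):
        w = weights or [1.0 / len(dists)] * len(dists)
        return sum(wi * di for wi, di in zip(w, dists))
    return min(candidate_distances, key=lambda f: score(candidate_distances[f]))
```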
- performing the image processing based on the target archive includes archiving the first image to the target archive. For example, referring to 403 in FIG. 4 , when the target file is hit, the first image is directly archived into the hit target file, and this processing method is applied to the image filing scenario.
- performing image processing based on the target archive includes: reading an image from the target archive, and the processing method is applied to an image retrieval scenario. That is to say, the image read from the target file can be used as the retrieval result of the first image, thereby realizing image search by image.
- both image archiving and image retrieval may be performed, which is not limited in this embodiment.
- the central feature vector and the representative feature vector of each profile in the plurality of profiles are determined first. Perform inter-file clustering on the central feature vector of the file according to 1105 to obtain multiple inter-file classes. According to 1106, the representative feature vector of the file is segmented and clustered to obtain a code table, and the central feature vector of the file is converted into a short feature vector of the file based on the code table.
- the distance table of the image is determined from the long feature vector of the image and the code table generated in 1106.
- the distances between the short feature vectors of these files and the long feature vector of the image are queried according to the distance table, and the distances are arranged in order from small to large.
- according to 1104, the distances between the long feature vector of the image and the representative feature vectors of the files selected in 1103 are determined, and a file is hit based on the distances determined in 1104, for example, the file corresponding to the minimum distance; the image is then filed to the hit file, or an image is read from the hit file as the retrieval result.
- the embodiment of the present application first obtains multiple inter-file classes by clustering, and then selects, from the multiple inter-file classes, a target inter-file class with a higher probability of being successfully matched with the first image, so that the candidate files for matching with the first image are determined based on the target inter-file class. Matching the first image against all files one by one is therefore avoided, which not only reduces the number of matches, but also requires less computation and consumes less time, so the processing efficiency is high.
- each retrieval is often only for one image.
- this embodiment also provides a method for batch archiving multiple images, so as to reduce the amount of calculation and improve the processing efficiency, as described below.
- the method further includes obtaining a second image, and in response to the second image corresponding to the same object as the first image, filing the second image with the first image to the target archive.
- multiple images are clustered to obtain multiple categories.
- in response to the first image and the second image being in the same category and the clustering accuracy of the category being not lower than the accuracy threshold, it is considered that the first image and the second image correspond to the same object. Since the second image and the first image correspond to the same object, the second image can be archived to the same file as the first image, that is, the target file.
- other images in the category where the first image and the second image are located can also be archived in the target file. In this way, it is ensured that each image belonging to the same object is located in the same file, thereby ensuring the accuracy of the image filing process.
- the method further includes obtaining a third image, and in response to the third image being similar to the first image, determining whether the third image matches the target profile.
- the third image is filed with the first image to the target dossier.
- multiple images are clustered to obtain multiple categories. In response to the first image and the third image being in the same category, but the clustering accuracy of this category being lower than the accuracy threshold, it is considered that the third image is similar to the first image.
- determining whether the third image matches the target file includes: determining multiple vector distances between the feature vector of the third image and the representative feature vectors of the target file, and determining that the third image matches the target file under the condition that the multiple vector distances satisfy a condition.
- the multiple vector distances satisfying the condition includes: the minimum vector distance among the multiple vector distances being less than the second distance threshold, or the weighted-sum distance of the multiple vector distances being less than the third distance threshold.
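- The condition above can be sketched as follows; the threshold values and equal weights are illustrative assumptions:

```python
def matches_target_file(vector_distances, second_distance_threshold=0.6,
                        third_distance_threshold=0.5, weights=None):
    """Either the minimum vector distance is below the second distance
    threshold, or the weighted sum of all vector distances is below the
    third distance threshold."""
    w = weights or [1.0 / len(vector_distances)] * len(vector_distances)
    weighted = sum(wi * di for wi, di in zip(w, vector_distances))
    return (min(vector_distances) < second_distance_threshold
            or weighted < third_distance_threshold)
```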
- a process of clustering multiple images to obtain multiple categories is involved.
- feature extraction is first performed on multiple images to be archived to obtain multiple feature vectors, and then the feature vectors are clustered.
- the clustering methods include, but are not limited to, K-nearest neighbors (KNN) and the K-means clustering algorithm (K-means).
- This embodiment does not limit the clustering methods.
- considering different numbers of images, this embodiment performs clustering based on different feature vectors. When the number of images is not greater than a number threshold (for example, 100 million), referring to 401 in FIG. 4, the extracted feature vectors, also called long feature vectors, are clustered directly. When the number of images is greater than the number threshold, the feature vector of each image is first converted into a short feature vector according to 402 in FIG. 4, and the short feature vectors are then clustered.
- for the process of determining the short feature vector of an image, refer to 302 above, which will not be repeated here.
- after that, clustering can be performed according to the short feature vectors, where the clustering process relies on the distance calculation between the short feature vectors of different images. Exemplarily, referring to 403 in FIG. 4, a corresponding distance table is generated for the feature vector of each image, and the distance calculation between the short feature vectors of different images is realized based on the distance tables, which speeds up clustering and is beneficial to improving the processing efficiency.
- For the process of determining the distance based on the distance table refer to 302 above, which will not be repeated here.
- any category obtained by clustering includes at least one feature vector, one category corresponds to a second central feature vector, and the second central feature vector is determined based on at least one feature vector included in the category.
- this embodiment first performs clustering on multiple images to be archived to obtain multiple categories.
- the second central feature vector of the category is used to match the candidate file, and in response to hitting the target file in the candidate file, the file comparison process is performed by referring to 1305 in FIG. 13 .
- each image in the category is archived into the target file.
- alternatively, only the images in the category that match the target file are archived into the target file, while the images in the category that do not match the target file can be re-clustered with other subsequently obtained images and then archived in batches.
- this embodiment further provides the following image file processing method, which includes the following steps.
- the conditions met by the first file include: the image filing frequency is greater than the frequency threshold.
- the image filing frequency refers to the number of times images are archived into the file within a reference time period, that is, the number of images archived into the file within the reference time period. It can be seen that the first files are the files with a high image filing frequency among the multiple files, so a first file is more likely to be successfully matched with the second feature vector corresponding to the fourth image.
- this embodiment selects, from the multiple files, the first files that have a higher probability of being matched successfully, and preferentially uses the second feature vector to match these first files, which reduces the calculation amount, shortens the matching time, and improves the processing efficiency.
- in some embodiments, the image filing frequency of a file within the reference time period is obtained by performing a weighted summation over the filing frequencies of multiple different durations. For example, the first filing frequency of a file over the past week and the second filing frequency of the file over the past day are determined, and the weighted sum of the first filing frequency and the second filing frequency is used as the total filing frequency of the file.
- the past week and the past day are only examples of different durations, and are not used to limit the duration.
- exemplarily, the weight corresponding to the first filing frequency is greater than the weight corresponding to the second filing frequency; for example, the weight corresponding to the first filing frequency is 0.6, and the weight corresponding to the second filing frequency is 0.4.
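- The weighted summation of the two filing frequencies, with the 0.6/0.4 example weights from the text, can be sketched as:

```python
def total_filing_frequency(week_frequency, day_frequency,
                           w_week=0.6, w_day=0.4):
    """Weighted sum of the filing frequencies over two durations; the
    longer duration is weighted higher, per the example weights."""
    return w_week * week_frequency + w_day * day_frequency
```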
- the frequency threshold is used to indicate the quantity, or is used to indicate the proportion.
- the frequency threshold may be an empirically determined value. Taking a frequency threshold of 50 times as an example, in response to the image filing frequency of a file being greater than 50 times, the file is determined to be a first file. Alternatively, the image filing frequencies of the multiple files are arranged in descending order to obtain a frequency sequence, and the files corresponding to the first (frequency threshold) image filing frequencies in the frequency sequence are selected as the first files; for example, if the frequency threshold is 40, the files corresponding to the first 40 image filing frequencies in the frequency sequence are selected as the first files. In the latter case, the frequency threshold can also be a fraction or a percentage: taking a frequency threshold of 4% as an example, after arranging the image filing frequencies of the multiple files in descending order, the files corresponding to the first 4% of the image filing frequencies in the frequency sequence are selected as the first files.
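- Both readings of the frequency threshold (a count of files, or a percentage of files) can be sketched as follows; the helper name is an assumption:

```python
def select_first_files(filing_frequencies, frequency_threshold, as_ratio=False):
    """filing_frequencies: {file_id: image filing frequency}.  With
    as_ratio=False the threshold is a count (take the files with the top
    N frequencies); with as_ratio=True it is a fraction (take the top
    share), matching the '40 files' and '4%' examples."""
    ordered = sorted(filing_frequencies, key=filing_frequencies.get, reverse=True)
    n = int(len(ordered) * frequency_threshold) if as_ratio else int(frequency_threshold)
    return ordered[:n]
```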
- other thresholds involved in this embodiment are also used to indicate the quantity or proportion, which will not be described in detail below.
- using the second feature vector to match the first files among the multiple files includes: determining the similarity between the second feature vector and each first file among the multiple files.
- exemplarily, each of the multiple files corresponds to a first central feature vector, and the similarity between the second feature vector and a first file includes the vector distance between the second feature vector and the first central feature vector corresponding to that first file; the smaller the vector distance, the higher the similarity between the second feature vector and the first file.
- the condition satisfied by the first file further includes: the shooting area of the first file matches the shooting area of the fourth image.
- the shooting area of the first file is determined based on the shooting region of the image in the first file.
- in some embodiments, the region to which the processing procedure is applied is divided into multiple areas, and one area includes at least one camera. If an image is captured by a camera included in an area, the shooting area of the image is that area.
- exemplarily, the administrative areas within the region to which the processing procedure is applied are directly used as the multiple areas. For example, if the region to which the processing procedure is applied is province A, and the administrative regions of province A include city A1 and city A2, then city A1 and city A2 are used as the multiple areas.
- the administrative area can also be adjusted according to the number of cameras in each administrative area to obtain multiple areas, so that the difference between the number of cameras included in any two areas is not greater than a certain threshold.
- different areas correspond to different area numbers.
- the shooting area of the image is indicated by the area number corresponding to the shooting area.
- the cameras in each area are numbered respectively, and the images captured by one camera all correspond to the number of the camera.
- in one way, a mapping relationship between area numbers and camera numbers is stored; after an image is obtained, the mapping relationship is searched according to the camera number corresponding to the image, and the area number corresponding to that camera number can thereby be obtained.
- the cameras in the same area are all configured with the area number of the area where the camera is located, and the images captured by one camera all correspond to the area number configured by the camera.
- the obtained image itself has a corresponding area number, and there is no need to search according to the above mapping relationship.
- the area number of the shooting area can be determined by reading the designated position in the camera number.
- for example, the area number is a 5-digit string and the camera number is a 10-digit string, where the first 5 of the 10 digits are the area number of the area in which the camera is located and the last 5 digits are used to distinguish different cameras in the same area. In this case, the area number corresponding to the shooting area can be determined by reading the first five digits of the camera number.
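- Reading the area number from the designated position of the camera number can be sketched as follows (the 5-of-10-digit layout follows the example above; the function name is an assumption):

```python
def area_number_of(camera_number, area_digits=5):
    """Read the area number from the designated position of the camera
    number: here, the first 5 digits of a 10-digit camera number."""
    return camera_number[:area_digits]
```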
- the shooting area of the first file is determined based on the shooting region of the image in the first file, which specifically includes the following three ways.
- in the first way of determining the shooting area, the first file includes only one image, or the first file includes multiple images that have the same shooting area; the shooting area of the image(s) is then used as the shooting area of the first file.
- in the second way of determining the shooting area, if, among all the images included in the first file, the proportion of images whose shooting area is the same area exceeds a certain ratio threshold, that area is used as the shooting area of the first file.
- for example, suppose the first file includes 100 images and the ratio threshold is 35%. If the shooting areas of more than 35 of the 100 images are area A, the shooting area of the first file includes area A. It can be understood that the shooting area of the first file may include multiple areas: if the shooting areas of more than 35 images are area A and the shooting areas of more than 35 other images are area B, then the shooting area of the first file includes both area A and area B.
- it can also be understood that the shooting area of the first file may not be determined. For example, suppose the first file includes 100 images and the ratio threshold is 35%: if the shooting areas of 33 of the images are area A, the shooting areas of 33 images are area B, and the shooting areas of 34 images are area C, then since the images captured in each area do not exceed the 35% ratio threshold, none of area A, area B, or area C can be used as the shooting area of the first file.
- in the third way of determining the shooting area, for an area, the number of images shot in that area is determined for each first file. The numbers of images are then arranged in descending order to obtain an image-count sequence, and the area is determined as the shooting area of the first files whose image counts rank before the number threshold in the sequence.
- for example, for area A, the first file A includes 30 images whose shooting area is area A, the first file B includes 20 such images, and the first file C includes 10 such images. Since the first file A has the most images shot in area A (30 images), the shooting area of the first file A is determined as area A. It can be understood that, in the third way, the shooting area of a first file may include multiple areas, and the shooting area of a first file may also not be determined.
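- The ratio-threshold way (the second way above) can be sketched as follows; names are assumptions, and the result set may contain several areas or be empty, matching the cases discussed above:

```python
from collections import Counter

def shooting_areas(image_areas, ratio_threshold=0.35):
    """Every area accounting for more than ratio_threshold of the file's
    images is a shooting area of the file."""
    counts = Counter(image_areas)
    total = len(image_areas)
    return {area for area, n in counts.items() if n > ratio_threshold * total}
```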
- matching the photographing area of the first archive with the photographing area of the fourth image means that the photographing area of the first archive includes the photographing area of the fourth image.
- for example, if the shooting area of the fourth image is area A, the shooting area of the first file 1 is area A, and the shooting area of the first file 2 is area B, then the shooting area of the first file 1 matches the shooting area of the fourth image. In addition, the shooting area of a first file may further include areas other than the shooting area of the fourth image. For example, if the shooting area of the fourth image is area A, and the shooting area of the first file 3 includes area A, area B, and area C, then the shooting area of the first file 3 also matches the shooting area of the fourth image.
- the condition satisfied by the first profile further includes: the shooting period of the first profile matches the shooting period of the fourth image.
- the shooting period of the first profile is determined based on the shooting period of the images in the first profile.
- one image has a fixed shooting time.
- the shooting time period of the first file is determined based on the shooting time period of the images in the first file, and matching the shooting time period of the first file with the shooting time period of the fourth image means that the shooting time period of the first file includes the shooting time period of the fourth image.
- exemplarily, determining the shooting period of the first file includes the following three ways.
- in the first way of determining the shooting period, the first file includes only one image, or the first file includes multiple images that have the same shooting period; the shooting period of the image(s) is then used as the shooting period of the first file.
- in the second way of determining the shooting period, in response to the shooting periods of images exceeding a certain ratio threshold, among all the images included in the first file, being the same period, that period is taken as the shooting period of the first file. For example, if the first file includes 100 images and the ratio threshold is 35%, and the shooting periods of more than 35 of the 100 images are period A, then the shooting period of the first file includes period A.
- in the third way of determining the shooting period, for a period, the number of images in each first file shot within that period, that is, the period filing frequency of each first file, is determined. The period filing frequencies are sorted in descending order, and in the obtained sequence, the period is taken as the shooting period of the first files whose period filing frequencies are greater than the first threshold. For example, for period A, if the period filing frequency of the first file A is 30, that of the first file B is 20, and that of the first file C is 10, then period A is taken as the shooting period of the first file A, because the first file A has the highest period filing frequency.
- the high-frequency archives shown in 1303 correspond to archives whose image archive frequency is greater than the frequency threshold.
- the number of images archived in the file within the reference time period is determined, so as to obtain the image filing frequency of the file. Whether the file corresponds to a shooting area within the reference period is determined according to the shooting area of each image filed in the file within the reference period.
- different files correspond to different file identifications (IDs).
- the file IDs corresponding to each file whose image filing frequency is greater than the threshold may be formed into the first file list.
- the file can also be mapped with the area number of the corresponding shooting area to form a second file list.
- in addition, the reference duration is divided into multiple time periods, and the number of images archived into a file in each time period is determined as the period filing frequency of the file. Whether a file corresponds to a shooting area within a time period is determined according to the shooting areas of the images filed into the file within that time period. Exemplarily, for any time period obtained by division, this embodiment selects, from the files included in the above first file list, the file IDs of files whose period filing frequency in this time period is greater than a certain threshold to form a third file list. Since multiple time periods are obtained through division and each time period corresponds to a third file list, multiple third file lists can be obtained.
- if a file in a third file list corresponds to a shooting area, the file may also be mapped with the area number of the corresponding shooting area to form a fourth file list.
- the shooting area of a file in the fourth file list refers to the shooting area of the file within the corresponding time period, rather than the shooting area of the file within the reference time period.
- if the first file needs to meet the first and second conditions, the area number of the shooting area of the fourth image is determined, and the files mapped with the same area number in the second file list are used as the first files; if the first file needs to meet the first and third conditions, a third file list corresponding to the shooting period of the fourth image is determined from the multiple third file lists according to the shooting period of the fourth image, and the files included in that third file list are used as the first files; if the first file needs to meet all of the conditions, the area number of the shooting area of the fourth image is determined, the fourth file lists mapped with the same area number are filtered out, a fourth file list corresponding to the shooting period of the fourth image is then determined from the filtered fourth file lists, and the files included in that fourth file list are used as the first files.
- since the four file lists are obtained through a statistical sorting process, as images continuously increase and images are continuously filed into each file, the four file lists also need to be updated.
- exemplarily, the four file lists are updated at regular intervals, or the four file lists are updated once the sum of the image archiving frequencies of the multiple files exceeds a certain threshold.
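The two update triggers above can be sketched as follows. This is a minimal illustration, not the application's implementation; the class and parameter names are assumptions.

```python
import time

class FileListUpdater:
    """Toy sketch of the two triggers: a periodic timer, or the summed
    image-archiving frequency crossing a threshold."""

    def __init__(self, interval_s=3600.0, freq_threshold=10000):
        self.interval_s = interval_s          # periodic trigger
        self.freq_threshold = freq_threshold  # frequency-sum trigger
        self.last_update = time.monotonic()
        self.archived_since_update = 0

    def record_archive(self, n_images=1):
        self.archived_since_update += n_images

    def should_update(self):
        elapsed = time.monotonic() - self.last_update
        return (elapsed >= self.interval_s
                or self.archived_since_update >= self.freq_threshold)

    def mark_updated(self):
        self.last_update = time.monotonic()
        self.archived_since_update = 0

updater = FileListUpdater(interval_s=3600.0, freq_threshold=5)
for _ in range(5):
    updater.record_archive()
print(updater.should_update())  # True: frequency-sum trigger fires
```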
- using the second feature vector to match the first files includes: determining the degree of similarity between the second feature vector and each first file, where the degree of similarity is, for example, the first vector distance between the second feature vector and the first central feature vector of the first file;
- in response to the second feature vector having the highest degree of similarity with a first target file among the first files and the degree of similarity satisfying a threshold (for example, the first vector distance between the second feature vector and the first central feature vector corresponding to the first target file is the smallest and is smaller than a first distance threshold), it is considered that the second feature vector successfully matches the first target file, that is, the first target file is hit.
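The hit rule above (nearest central feature vector, subject to a distance threshold) can be sketched as follows. The function names and the Euclidean metric are illustrative assumptions, not taken from the application.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_first_files(query_vec, central_vecs, distance_threshold):
    """central_vecs: {file_id: central feature vector of that first file}.
    Returns the hit file id, or None if no first file is hit."""
    best_id, best_dist = None, float("inf")
    for file_id, center in central_vecs.items():
        d = euclidean(query_vec, center)
        if d < best_dist:
            best_id, best_dist = file_id, d
    # the nearest file is hit only if it is also within the threshold
    return best_id if best_dist < distance_threshold else None

centers = {"file_a": [0.0, 0.0], "file_b": [1.0, 1.0]}
print(match_first_files([0.1, 0.1], centers, distance_threshold=0.5))  # file_a
print(match_first_files([5.0, 5.0], centers, distance_threshold=0.5))  # None
```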
- image processing is performed based on the first target file, including filing the fourth image into the first target file.
- the fourth image is directly archived to the hit first target file, and this processing method is applicable to the scene of image filing.
- performing image processing based on the second target file also includes reading images from the first target file, and this processing method is suitable for an image retrieval scenario.
- this embodiment preferentially matches the fourth image with the first file among the plurality of files.
- the first file is a file whose image filing frequency is greater than the threshold among the multiple files, and the probability of successful matching with the fourth image is high. If the first target file in the first file is successfully matched, the image processing can be performed directly based on the first target file without the need to compare with other files.
- the processing method provided by the embodiment of the present application not only has a high hit rate but also requires fewer comparisons, less computation, and less time, so the processing efficiency is higher.
- the method further includes the following steps.
- a third feature vector is used to match a second file among the plurality of files.
- the conditions satisfied by the second file include: the image filing frequency is lower than the frequency threshold.
- this embodiment uses the third feature vector to sequentially match each of the second files.
- this embodiment may also refer to the inter-file-class-based matching method in 301-304: the files other than the first files among the multiple files are clustered to obtain multiple inter-file classes.
- if the condition satisfied by the second file only includes the image filing frequency being lower than the frequency threshold, the files other than the first files among the multiple files are equivalent to the second files.
- otherwise, the files other than the first files among the multiple files include the second files and also include files other than the second files.
- an inter-file class whose degree of similarity with the fifth image is greater than a certain threshold is selected, so as to determine the second files based on the files included in the selected inter-file class.
- all files included in the selected inter-file category are determined as the second file.
- alternatively, the short feature vectors of the files included in the selected inter-file class are determined.
- a distance table of the fifth image is generated based on the code table, the vector distance between the short feature vector of each file and the third feature vector of the fifth image is determined according to the distance table, and the files whose vector distances are less than a certain threshold are determined as the second files.
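The distance-table idea above can be sketched in the spirit of product quantization: the query vector is compared once against every codeword of each sub-codebook, and a short feature vector (a list of codeword indices) is then scored by table lookups only. The tiny codebooks and squared-distance metric are illustrative assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_distance_table(query, codebooks, sub_dim):
    """codebooks[m][k] is codeword k of sub-codebook m."""
    table = []
    for m, codebook in enumerate(codebooks):
        sub_q = query[m * sub_dim:(m + 1) * sub_dim]
        table.append([sq_dist(sub_q, code) for code in codebook])
    return table

def lookup_distance(short_vec, table):
    # "distance combination": sum the per-sub-vector distances from the table
    return sum(table[m][k] for m, k in enumerate(short_vec))

codebooks = [[[0.0, 0.0], [1.0, 1.0]],   # sub-codebook 0
             [[0.0, 0.0], [2.0, 2.0]]]   # sub-codebook 1
query = [0.0, 0.0, 2.0, 2.0]
table = build_distance_table(query, codebooks, sub_dim=2)
print(lookup_distance([0, 1], table))  # 0.0: query matches codewords 0 and 1
```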
- the third feature vector is used to match the second profile.
- for the matching process, refer to the long-feature comparison process shown at 1304 in FIG. 13. For each second file, multiple vector distances between the third feature vector and the multiple representative feature vectors of the second file are determined respectively; that is, one second file corresponds to multiple vector distances. After that, whether a second target file among the second files is hit is determined according to the vector distances corresponding to all the second files; see 1206 for details.
- for example, the smallest vector distance among all the vector distances is determined, and the second file corresponding to the smallest vector distance is determined as the hit second target file.
- weighted summation is performed on a plurality of vector distances corresponding to a second file to obtain a weighted summation distance corresponding to the second file.
- the second file corresponding to the minimum weighted sum distance is determined as the hit second target file.
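The weighted-summation rule above can be sketched as follows: each second file has several representative feature vectors, the distances from the query to all of them are combined by a weighted sum, and the file with the minimum weighted-sum distance is the hit. Uniform weights are an illustrative assumption; the application does not specify the weighting.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def weighted_sum_match(query, files, weights=None):
    """files: {file_id: [representative feature vectors]}.
    Returns the file id with the minimum weighted-sum distance."""
    best_id, best_score = None, float("inf")
    for file_id, reps in files.items():
        w = weights or [1.0 / len(reps)] * len(reps)  # uniform by default
        score = sum(wi * euclidean(query, r) for wi, r in zip(w, reps))
        if score < best_score:
            best_id, best_score = file_id, score
    return best_id

files = {
    "file_a": [[0.0, 0.0], [0.2, 0.2]],
    "file_b": [[3.0, 3.0], [4.0, 4.0]],
}
print(weighted_sum_match([0.1, 0.1], files))  # file_a
```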
- image processing is performed based on the second target file, including at least one of filing the fifth image to the second target file and reading the image from the second target file.
- the fifth image does not hit the second file.
- the fifth image can be processed again according to the steps described above.
- the fifth image is clustered with other images obtained subsequently, so as to perform batch processing.
- the above descriptions in 1204-1206 are aimed at the case where a fifth image does not hit the first file.
- the typical feature vector includes a central feature vector and multiple representative feature vectors, and the typical feature vector belongs to the long feature vector of the file.
- a plurality of files are clustered to obtain a plurality of inter-file classes, and each inter-file class corresponds to a central feature vector.
- referring to 1410, a code table is generated, the central feature vector of each file is converted into a short feature vector based on the code table, and the short feature vector of each file can be stored in the file library.
- high-frequency files are obtained, wherein the high-frequency files correspond to the files whose image filing frequency is greater than the frequency threshold in the above description.
- after multiple images to be archived are obtained, image archiving begins. Referring to 1401, feature extraction is performed on each image to obtain the long feature vector of the image. Referring to 1402, different clustering methods are selected according to the number of images. If the number of long feature vectors is not greater than a threshold, clustering is performed directly based on the long feature vectors. If the number of long feature vectors is greater than the threshold, the long feature vector of each image is first converted into a short feature vector of the image based on the code table generated in 1410, and clustering is then performed based on the short feature vectors. Either of the two clustering methods yields multiple classes, each class has a central feature vector, and each class is compared as follows.
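The branching in step 1402 above can be sketched as follows. The quantizer here is a toy per-dimension rounding that stands in for the code-table conversion; it and the function names are illustrative assumptions.

```python
def quantize(long_vec, step=1.0):
    """Toy code-table stand-in: map each component to a codeword index."""
    return tuple(int(round(x / step)) for x in long_vec)

def choose_cluster_input(long_vecs, magnitude_threshold):
    """Small batches are clustered on long features directly; large batches
    are first converted to short feature vectors."""
    if len(long_vecs) <= magnitude_threshold:
        return "long", long_vecs
    return "short", [quantize(v) for v in long_vecs]

mode, _ = choose_cluster_input([[0.9, 2.1], [1.1, 1.9]], magnitude_threshold=10)
print(mode)  # long
mode, data = choose_cluster_input([[0.9, 2.1]] * 20, magnitude_threshold=10)
print(mode, data[0])  # short (1, 2)
```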
- the central feature vectors of the classes are compared with the central feature vectors of the high-frequency files.
- if the clustering accuracy meets the requirements, each image corresponding to the class and the long feature vector of the image are archived into the hit file through 1408. If the clustering accuracy does not meet the requirements, according to 1407, the long feature vectors of the images in the class are compared one by one with the central feature vectors of the hit high-frequency files.
- the images that are successfully compared are archived and stored in the archive through 1408, and the images that are not successfully compared are returned to 1402, waiting to be re-clustered with other subsequently obtained images, and then re-archived.
- in 1404, the distances between the central feature vector of the class and the central feature vectors of the inter-file classes are determined. The distances are sorted from small to large, the inter-file classes corresponding to the first X distances are selected, and the process goes to 1405. In 1405, the short feature vectors of the files included in the inter-file classes selected in 1404 are obtained, and the distances between the central feature vector of the class and these short feature vectors are determined. The distances are sorted from small to large, and the files corresponding to the first Y distances are selected.
- the representative feature vectors of the files selected in 1405 are obtained, and the distances between the central feature vector of the class and these representative feature vectors are determined, so as to hit a file according to the obtained distances; for example, the file corresponding to the minimum distance is taken as the hit file.
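The 1404-1406 funnel above can be condensed into one sketch: keep the X nearest inter-file classes, then the Y nearest files by short feature (approximated here by a plain vector), then hit the file whose representative vector is nearest. All data structures and names are illustrative assumptions.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def funnel_search(center, classes, x, y):
    """classes: list of (class_center, {file_id: (short_vec, rep_vecs)})."""
    # step 1404: keep the X inter-file classes nearest to the class center
    top_classes = sorted(classes, key=lambda c: euclidean(center, c[0]))[:x]
    files = {}
    for _, cls_files in top_classes:
        files.update(cls_files)
    # step 1405: keep the Y files whose short feature vector is nearest
    top_files = sorted(files.items(),
                       key=lambda kv: euclidean(center, kv[1][0]))[:y]
    # step 1406: hit the file with the nearest representative feature vector
    best_id, best_dist = None, float("inf")
    for file_id, (_, rep_vecs) in top_files:
        for rep in rep_vecs:
            d = euclidean(center, rep)
            if d < best_dist:
                best_id, best_dist = file_id, d
    return best_id

classes = [
    ([0.0, 0.0], {"f1": ([0.1, 0.1], [[0.0, 0.1]]),
                  "f2": ([0.9, 0.9], [[1.0, 1.0]])}),
    ([9.0, 9.0], {"f3": ([9.0, 9.0], [[9.0, 9.0]])}),
]
print(funnel_search([0.2, 0.2], classes, x=1, y=2))  # f1
```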
- 1411 is triggered to perform statistics on the frequency of image archiving of each archive, so as to facilitate subsequent update of high-frequency archives.
- this embodiment accelerates the calculation process, so as to shorten the time required for the processing process and improve the processing efficiency.
- exemplarily, this embodiment accelerates the calculation through an intellectual property (IP) core.
- the IP core is established based on the HardQ algorithm, and the HardQ algorithm is adjusted based on the PQ algorithm.
- the calculation related to the short feature vectors in this embodiment includes but is not limited to: distance query (obtaining a distance by querying the distance table), distance combination (combining multiple distances queried from the distance table to obtain the distance between different short feature vectors), and distance sorting (sorting multiple distances in a certain order).
- this embodiment achieves calculation acceleration by providing a distance calculation operator.
- the functions of the distance calculation operator include, but are not limited to, calculating cosine distance, calculating Euclidean distance, and sorting distances.
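A pure-Python stand-in for the distance calculation operator described above (the real operator runs on an AI kernel) can compute cosine distance, Euclidean distance, and sort a batch of distances. The function names are illustrative assumptions.

```python
import math

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 minus cosine similarity; 0 for parallel vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

def sorted_distances(query, vectors, metric=euclidean_distance):
    """Return (index, distance) pairs sorted from nearest to farthest."""
    pairs = [(i, metric(query, v)) for i, v in enumerate(vectors)]
    return sorted(pairs, key=lambda p: p[1])

vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(sorted_distances([1.0, 0.0], vecs)[0])              # (0, 0.0)
print(round(cosine_distance([1.0, 0.0], [0.0, 1.0]), 3))  # 1.0
```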
- the distance calculation operator is an operator based on the AI kernel.
- exemplarily, this embodiment provides the IP core and the distance calculation operator through a field programmable gate array (FPGA) as shown in FIG. 15, and the FPGA is developed through a software development kit (SDK).
- the IP core and the distance calculation operator can also be provided through an application specific integrated circuit (ASIC) as shown in FIG. 16 .
- the present application also provides an image file processing apparatus.
- the apparatus is used to execute the image file processing method shown in FIG. 3 and FIG. 12 through the modules shown in FIG. 17 .
- the image file processing apparatus provided by this application includes the following modules.
- the obtaining module 1701 is configured to obtain a first image, perform feature extraction on the first image, and obtain a first feature vector corresponding to the first image.
- the steps performed by the obtaining module 1701 refer to the descriptions in 301, 1201 and 1204 above, which will not be repeated here.
- the aggregation module 1702 is configured to aggregate multiple files in the archive to obtain multiple inter-file classes, and the number of the multiple inter-file classes is less than the number of the multiple files. For the steps performed by the aggregation module 1702, refer to the description in 302 above, which will not be repeated here.
- the determining module 1703 is configured to determine, from the plurality of inter-file classes, target inter-file classes whose degree of similarity to the first image is greater than a first threshold, the number of the target inter-file classes being smaller than the number of the multiple inter-file classes. For the steps performed by the determining module 1703, refer to the description in 302 above, which will not be repeated here.
- the matching module 1704 is configured to use the first feature vector to perform matching with candidate files among the files included in the target inter-file class. For the steps performed by the matching module 1704, refer to the descriptions in 303, 1202, and 1205 above, which will not be repeated here.
- the processing module 1705 is configured to, in response to hitting a target file among the candidate files, perform image processing based on the target file. For the steps performed by the processing module 1705, refer to the descriptions in 304, 1203, and 1206 above, which will not be repeated here.
- the determining module 1703 is further configured to determine the short feature vectors of the files included in the target inter-file class, and to determine, according to the short feature vectors of the files included in the target inter-file class, a file whose degree of similarity with the first image is greater than the second threshold as a candidate file.
- each of the multiple files corresponds to a plurality of representative feature vectors, and the matching module 1704 is configured to determine, based on the multiple representative feature vectors corresponding to each candidate file, the candidate file with the greatest degree of similarity to the first feature vector as the target file.
- the aggregation module 1702 is configured to aggregate the files whose similarity degree is greater than the third threshold in the multiple files into the same inter-file class to obtain multiple inter-file classes.
- the obtaining module 1701 is further configured to obtain the second image
- the processing module 1705 is further configured to file the second image and the first image together in the target file in response to the second image and the first image corresponding to the same object.
- the obtaining module 1701 is further configured to obtain a third image
- the determining module 1703 is further configured to determine whether the third image matches the target profile in response to the third image being similar to the first image;
- the processing module 1705 is further configured to, in response to the matching of the third image with the target file, archive the third image and the first image into the target file together.
- the obtaining module 1701 is further configured to obtain a fourth image, perform feature extraction on the fourth image, and obtain a second feature vector corresponding to the fourth image;
- the matching module 1704 is further configured to use the second feature vector to match the first files among the multiple files, wherein the number of the first files is less than the number of the multiple files, and the conditions met by the first file include: the image filing frequency is higher than the frequency threshold;
- the processing module 1705 is further configured to perform image processing based on the first target file in response to hitting a first target file in the first file.
- the condition satisfied by the second file further includes: the shooting area of the second file matches the shooting area of the second image, where the shooting area of the second file is determined based on the shooting areas of the images in the second file.
- the condition satisfied by the second file further includes: the shooting period of the second file matches the shooting period of the second image, where the shooting period of the second file is determined based on the shooting periods of the images in the second file.
- the obtaining module 1701 is further configured to obtain a fifth image, perform feature extraction on the fifth image, and obtain a third feature vector corresponding to the fifth image;
- the matching module 1704 is further configured to use the third feature vector to match the first files, and, in response to the first files not being hit, use the third feature vector to match the second files among the multiple files, wherein the conditions met by the second file include: the image archiving frequency is lower than the frequency threshold;
- the processing module 1705 is further configured to perform image processing based on the second target file in response to hitting a second target file in the second file.
- in an exemplary embodiment, the processing module 1705 is configured to archive the first image into the target file.
- in an exemplary embodiment, the processing module 1705 is configured to read an image from the target file.
- the present application also provides an image file processing device, the device including a communication interface and a processor; optionally, the device further includes a memory.
- the communication interface, the memory, and the processor communicate with each other through an internal connection path. The memory is used for storing instructions, and the processor is used for executing the instructions stored in the memory to control the communication interface to receive and send signals. When the processor executes the instructions stored in the memory, the processor is caused to execute any one of the exemplary image file processing methods provided in this application.
- FIG. 18 shows a schematic structural diagram of an exemplary image file processing device 1800 of the present application.
- the image file processing device 1800 shown in FIG. 18 is used to execute the operations involved in the above-described image file processing methods shown in FIGS. 3 and 12 .
- the image file processing device 1800 is, for example, a server, a server cluster composed of multiple servers, or a cloud computing service center.
- the image file processing device 1800 includes at least one processor 1801 , memory 1803 and at least one communication interface 1804 .
- the processor 1801 is, for example, a general-purpose CPU, a digital signal processor (DSP), a network processor (NP), a GPU, a neural-network processing unit (NPU), a data processing unit (DPU), a microprocessor, one or more integrated circuits or application-specific integrated circuits (ASICs) for implementing the solutions of the present application, a programmable logic device (PLD) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
- PLD is, for example, a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof. It may implement or execute the various logical blocks, modules and circuits described in connection with this disclosure.
- a processor may also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a DSP and a microprocessor, and the like.
- the image file processing device 1800 further includes a bus 1802 .
- the bus 1802 is used to transfer information between the various components of the image archive processing device 1800.
- the bus 1802 may be a peripheral component interconnect (PCI for short) bus or an extended industry standard architecture (EISA for short) bus or the like.
- the bus 1802 can be divided into an address bus, a data bus, a control bus, and the like. For ease of presentation, only one thick line is shown in FIG. 18, but it does not mean that there is only one bus or one type of bus.
- the memory 1803 is, for example, a read-only memory (ROM) or other types of storage devices that can store static information and instructions, or a random access memory (RAM) or other types of storage devices that can store information and instructions.
- the memory 1803 may also be another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disk storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
- the memory 1803 exists independently, for example, and is connected to the processor 1801 through the bus 1802 .
- the memory 1803 may also be integrated with the processor 1801.
- the communication interface 1804 uses any transceiver-like device for communicating with other devices or a communication network, which may be an Ethernet, a radio access network (RAN), a wireless local area network (WLAN), or the like.
- Communication interface 1804 may include a wired communication interface and may also include a wireless communication interface.
- the communication interface 1804 may be an Ethernet (Ethernet) interface, such as a Fast Ethernet (Fast Ethernet, FE) interface, a Gigabit Ethernet (Gigabit Ethernet, GE) interface, an Asynchronous Transfer Mode (Asynchronous Transfer Mode, ATM) interface, a WLAN interface, a cellular network communication interface, or a combination thereof.
- the Ethernet interface can be an optical interface, an electrical interface or a combination thereof.
- the communication interface 1804 may be used by the image archive processing device 1800 to communicate with other devices.
- the processor 1801 may include one or more CPUs, such as CPU0 and CPU1 as shown in FIG. 18 . Each of these processors can be a single-core processor or a multi-core processor.
- a processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (eg, computer program instructions).
- the image archive processing device 1800 may include multiple processors, such as the processor 1801 and the processor 1805 as shown in FIG. 18 . Each of these processors can be a single-core processor or a multi-core processor.
- a processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (eg, computer program instructions).
- the memory 1803 is used to store the program code 1810 for executing the solutions of the present application, and the processor 1801 can execute the program code 1810 stored in the memory 1803 . That is, the image file processing device 1800 can implement the image file processing method provided by the method embodiment through the processor 1801 and the program code 1810 in the memory 1803 . One or more software modules may be included in the program code 1810. Optionally, the processor 1801 itself may also store program codes or instructions for executing the solutions of the present application.
- the image file processing device 1800 of the present application may correspond to the device for executing the above method.
- the image archive processing device 1800 can perform all or part of the steps in the method embodiments.
- the image file processing apparatus 1800 may also correspond to the apparatus shown in FIG. 17 , and each functional module in the apparatus shown in FIG. 17 is implemented by software of the image file processing apparatus 1800 .
- the functional modules included in the apparatus shown in FIG. 17 are generated after the processor 1801 of the image file processing device 1800 reads the program code 1810 stored in the memory 1803 .
- the steps of the image file processing method shown in FIG. 3 and FIG. 12 are completed by hardware integrated logic circuits or software instructions in the processor of the image file processing device 1800 .
- the steps in combination with the method embodiments disclosed in this application may be directly embodied as being executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
- the software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art.
- the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps of the above method embodiments in combination with its hardware. To avoid repetition, details are not described here.
- the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
- a general purpose processor may be a microprocessor or any conventional processor or the like. It should be noted that the processor may be a processor supporting an advanced RISC machine (ARM) architecture.
- the above-mentioned memory may include read-only memory and random access memory, and provide instructions and data to the processor.
- the memory may also include non-volatile random access memory.
- the memory may also store device type information.
- the memory may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
- the non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically programmable Erase programmable read-only memory (electrically EPROM, EEPROM) or flash memory.
- Volatile memory may be random access memory (RAM), which acts as an external cache. By way of example and not limitation, many forms of RAM are available.
- for example, static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous link dynamic random access memory (SLDRAM), and direct rambus random access memory (DR RAM).
- the present application provides a computer program.
- the processor or the computer can execute the corresponding steps and/or processes in the foregoing method embodiments.
- the embodiment of the present application provides a computer program (product), the computer program (product) includes: computer program code, when the computer program code is executed by a computer, the computer program code enables the computer to execute any of the methods provided by the above exemplary implementations.
- Embodiments of the present application provide a computer-readable storage medium, where the computer-readable storage medium stores programs or instructions, and when the programs or instructions are run on a computer, the methods provided by any of the foregoing exemplary implementations are executed.
- An embodiment of the present application provides a chip, including a processor configured to call from a memory and execute instructions stored in the memory, so that a communication device equipped with the chip executes the method provided by any of the foregoing exemplary implementations.
- An embodiment of the present application provides another chip, including an input interface, an output interface, a processor, and a memory, where the input interface, the output interface, the processor, and the memory are connected through an internal connection path. The processor is configured to execute code in the memory, and when the code is executed, the processor performs the method provided by any of the above exemplary implementations.
- a computer program product includes one or more computer instructions.
- the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
- Computer instructions may be stored in or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transmitted from a website site, computer, server, or data center over a wire (e.g.
- a computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server, data center, or the like that contains one or more of the available mediums integrated.
- Useful media may be magnetic media (eg, floppy disks, hard disks, magnetic tapes), optical media (eg, DVDs), or semiconductor media (eg, solid state drives), and the like.
- computer program code or related data may be carried by any suitable carrier to enable a device, apparatus or processor to perform the various processes and operations described above.
- suitable carriers include computer readable media and the like.
- the disclosed systems, devices and methods may be implemented in other manners.
- the devices described above are only illustrative.
- the division of the modules is only a logical function division; in actual implementation, there may be other division methods. For example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the shown or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or modules, and may also be electrical, mechanical or other forms of connection.
- modules described as separate components may or may not be physically separated, and the components shown as modules may or may not be physical modules, that is, may be located in one place, or may be distributed to multiple network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of the present application.
- each functional module in each embodiment of the present application may be integrated into one processing module, or each module may exist physically alone, or two or more modules may be integrated into one module.
- the above-mentioned integrated modules can be implemented in the form of hardware, and can also be implemented in the form of software function modules.
- the words "first", "second", and so on are used to distinguish identical or similar items with basically the same function. It should be understood that there is no logical or temporal dependency between "first", "second", and "nth", and that neither the number nor the execution order is limited. It should also be understood that, although the terms first, second, etc. may be used to describe various elements, these elements should not be limited by the terms. These terms are only used to distinguish one element from another. For example, a first device could be termed a second device, and similarly a second device could be termed a first device, without departing from the scope of the various examples. Both the first device and the second device may be communication devices, and in some cases may be separate and distinct devices.
- the size of the sequence numbers of the processes does not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and does not constitute any limitation on the implementation process of the present application.
- determining B according to A does not mean that B is only determined according to A, and B may also be determined according to A and other information.
- references throughout the specification to "one embodiment", "an embodiment", and "one possible implementation" mean that a particular feature, structure, or characteristic associated with that embodiment or implementation is included in at least one embodiment of the present application. Thus, appearances of "in one embodiment", "in an embodiment", or "one possible implementation" in various places throughout the specification are not necessarily referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
Abstract
一种图像档案的处理技术,属于人工智能技术领域,该处理技术适用于图像检索或者图像归档等场景。其中,将多个档案聚合为多个档间类,在多个档间类中选择与图像的相似程度较高的一部分档间类。之后,图像仅与这部分相似程度较高的档间类所包括的档案进行比对即可,不仅缩小了比对范围,而且命中率较高,从而提高了处理效率。
Description
本申请涉及人工智能技术领域,特别涉及图像档案的处理方法、装置、设备及计算机可读存储介质。
随着人工智能技术领域的发展,图像档案处理逐渐受到广泛关注。其中,图像档案是包含有图像的档案。在档案的处理过程中,往往需要针对图像进行档案的命中。
相关技术中,将图像与所有档案一一进行比对,从而将相似度最高的档案作为命中的档案。然而,相关技术中所需的比对次数较多、命中率较低,从而导致图像档案的处理效率较低。
发明内容
本申请提供了一种图像档案的处理方法、装置、设备及计算机可读存储介质,以解决相关技术中存在的问题,技术方案如下:
第一方面,提供了一种图像档案的处理方法,方法包括:
在获取第一图像之后,提取该第一图像的第一特征向量。并且,对档案库中的多个档案进行聚合,得到多个档间类,且多个档间类的数量小于多个档案的数量。之后,从多个档间类中确定与第一图像的相似程度大于第一阈值的目标档间类,目标档间类的数量小于多个档间类的数量。接着,使用第一特征向量与目标档间类包括的档案中的备选档案进行匹配。如果命中备选档案中的一个目标档案,则能够基于目标档案进行图像处理。
本申请通过聚类得到多个档间类,再从多个档间类中选择与第一图像的匹配成功可能性较高的目标档间类,从而缩小了比对范围,之后通过第一图像与目标档间类中包括的备选档案之间比对命中第一目标档案。由此,避免了将第一图像与所有档案一一进行比对,不仅减少了比对次数,而且所需的计算量较小、所耗费的时间较短,因而处理效率较高。
在一种可能的实现方式中,使用第一特征向量与目标档间类包括的档案中的备选档案进行匹配之前,方法还包括:确定目标档间类包括的档案的短特征向量;根据目标档间类包括的档案的短特征向量,在目标档间类包括的档案中将与第一图像之间的相似程度大于第二阈值的档案确定为备选档案。
在目标档间类包括的档案中,根据各个档案的短特征向量选择与第一图像相似程度较高档案作为备选档案,从而减少了匹配次数,提高了处理效率。
在一种可能的实现方式中,多个档案中的各个档案分别对应多个代表特征向量,使用第一特征向量与目标档间类包括的档案中的备选档案进行匹配,包括:基于各个档案对应的多个代表特征向量,在备选档案中确定与第一特征向量之间的相似程度最大的备选档案,所确定的备选档案用于作为目标档案。
在备选档案中,将与第一图像最为相似的一个备选档案确定为命中的目标档案,从而保证了处理过程的准确率。
在一种可能的实现方式中,对档案库中的多个档案进行聚合,得到多个档间类,包括:将多个档案中相似程度大于第三阈值的档案聚合为同一档间类,得到多个档间类。
属于同一档间类的各个档案之间较为相似,保证了处理过程的准确率。
在一种可能的实现方式中,方法还包括:获得第二图像;响应于第二图像与第一图像对应同一对象,将第二图像与第一图像一并归档至目标档案。
如果第二图像与第一图像属于同一对象,则将第二图像与第一图像一并归档至目标档案,实现了图像的批量归档,适用于待归档的图像数量较多的情况,有利于提高处理效率。
在一种可能的实现方式中,方法还包括:获得第三图像;响应于第三图像与第一图像相似,确定第三图像是否与目标档案相匹配;响应于第三图像与目标档案相匹配,将第三图像与第一图像一并归档至目标档案。
在第三图像与第一图像相似的情况下,如果第三图像与第一图像所在的目标档案相匹配,则能够将第三图像归档至目标档案中,从而实现了图像的批量归档,适用于待归档图像较多的情况。
在一种可能的实现方式中,方法还包括:获得第四图像,对第四图像进行特征提取,得到第四图像对应的第二特征向量;使用第二特征向量与多个档案中的第一档案进行匹配,其中,第一档案的数量小于多个档案的数量,第一档案满足的条件包括:图像归档频率高于频率阈值;响应于命中第一档案中的一个第一目标档案,基于第一目标档案进行图像处理。
本申请优先将第四图像与多个档案中的第一档案进行匹配。其中,第一档案是多个档案中图像归档频率大于阈值的档案,与第四图像匹配成功的可能性较高。如果成功匹配了第一档案中的第一目标档案,则第四图像能够直接归档至第一目标档案,而无需再与其他档案进行比对。相比于第四图像与所有档案一一进行比对的方式,本申请所提供的方法所需的计算量较小、所耗费的时间较短,因而处理效率较高。
在一种可能的实现方式中,第一档案满足的条件还包括:第一档案的拍摄区域与第四图像的拍摄区域相匹配,第一档案的拍摄区域基于第一档案中的图像的拍摄区域确定。
除图像归档频率大于频率阈值之外,第一档案的拍摄区域也与第四图像的拍摄区域相匹配,从而提高了在第一档案中命中第一目标档案的可能性,有利于减小计算量、提高归档效率。
在一种可能的实现方式中,第一档案满足的条件还包括:第一档案的拍摄时段与第四图像的拍摄时段相匹配,第一档案的拍摄时段基于第一档案中的图像的拍摄时段确定。
除图像归档频率大于频率阈值之外,第一档案的拍摄时段也与第四图像的拍摄时段相匹配,因而更有可能在第一档案中命中第一目标档案,从而有利于减小计算量、提高归档效率。
在一种可能的实现方式中,方法还包括:获得第五图像,对第五图像进行特征提取,得到第五图像对应的第三特征向量;使用第三特征向量与第一档案进行匹配;响应于未命中第一档案,使用第三特征向量与多个档案中的第二档案进行匹配,其中,第二档案满足的条件包括:图像归档频率低于频率阈值;响应于命中第二档案中的一个第二目标档案,基于第二目标档案进行图像处理。
在第五图像未命中任何第一档案的情况下,再将第五图像与图像归档频率低于频率阈值的第二档案进行匹配,以便于实现第五图像的归档。
在一种可能的实现方式中,基于目标档案进行图像处理,包括:将第一图像归档至目标档案。该实现方式应用于图像归档场景,第一图像的归档使得第一目标档案发生更新。
在一种可能的实现方式中,基于目标档案进行图像处理,包括:从目标档案中读取图像。该实现方式应用于图像检索场景,从第一目标档案中读取到的图像作为第一图像的检索结果,从而实现以图搜图。
第二方面,提供了一种图像档案的处理装置,该装置包括:
获得模块,用于获得第一图像,对第一图像进行特征提取,得到第一图像对应的第一特征向量;
聚合模块,用于对档案库中的多个档案进行聚合,得到多个档间类,多个档间类的数量小于多个档案的数量;
确定模块,用于从多个档间类中确定与第一图像的相似程度大于第一阈值的目标档间类,目标档间类的数量小于多个档间类的数量;
匹配模块,用于使用第一特征向量与目标档间类包括的档案中的备选档案进行匹配;
处理模块,用于响应于命中备选档案中的一个目标档案,基于目标档案进行图像处理。
在一种可能的实现方式中,确定模块,还用于确定目标档间类包括的档案的短特征向量;根据目标档间类包括的档案的短特征向量,在目标档间类包括的档案中将与第一图像之间的相似程度大于第二阈值的档案确定为备选档案。
在一种可能的实现方式中,多个档案中的各个档案分别对应多个代表特征向量,匹配模块,用于基于各个档案对应的多个代表特征向量,在备选档案中确定与第一特征向量之间的相似程度最大的备选档案,所确定的备选档案用于作为目标档案。
在一种可能的实现方式中,聚合模块,用于将多个档案中相似程度大于第三阈值的档案聚合为同一档间类,得到多个档间类。
在一种可能的实现方式中,获得模块,还用于获得第二图像;
处理模块,还用于响应于第二图像与第一图像对应同一对象,将第二图像与第一图像一并归档至目标档案。
在一种可能的实现方式中,获得模块,还用于获得第三图像;
确定模块,还用于响应于第三图像与第一图像相似,确定第三图像是否与目标档案相匹配;
处理模块,还用于响应于第三图像与目标档案相匹配,将第三图像与第一图像一并归档至目标档案。
在一种可能的实现方式中,获得模块,还用于获得第四图像,对第四图像进行特征提取,得到第四图像对应的第二特征向量;
匹配模块,还用于使用第二特征向量与多个档案中的第一档案进行匹配,其中,第一档案的数量小于多个档案的数量,第一档案满足的条件包括:图像归档频率高于频率阈值;
处理模块,还用于响应于命中第一档案中的一个第一目标档案,基于第一目标档案进行图像处理。
在一种可能的实现方式中,第一档案满足的条件还包括:第一档案的拍摄区域与第四图像的拍摄区域相匹配,第一档案的拍摄区域基于第一档案中的图像的拍摄区域确定。
在一种可能的实现方式中,第一档案满足的条件还包括:第一档案的拍摄时段与第四图像的拍摄时段相匹配,第一档案的拍摄时段基于第一档案中的图像的拍摄时段确定。
在一种可能的实现方式中,获得模块,还用于获得第五图像,对第五图像进行特征提取,得到第五图像对应的第三特征向量;
匹配模块,还用于使用第三特征向量与第一档案进行匹配;响应于未命中第一档案,使用第三特征向量与多个档案中的第二档案进行匹配,其中,第二档案满足的条件包括:图像归档频率低于频率阈值;
处理模块,还用于响应于命中第二档案中的一个第二目标档案,基于第二目标档案进行图像处理。
在一种可能的实现方式中,处理模块,用于将第一图像归档至目标档案。
在一种可能的实现方式中,处理模块,用于从目标档案中读取图像。
其中,第二方面或第二方面的任一种可能的实现方式所具备的技术效果可以参见第一方面或第一方面的任一种可能的实现方式所具备的技术效果,此处不再进行赘述。
第三方面,提供了一种图像档案的处理设备,该设备包括:收发器、存储器和处理器。其中,该收发器、该存储器和该处理器通过内部连接通路互相通信,该存储器用于存储指令,该处理器用于执行该存储器存储的指令,以控制收发器接收信号,并控制收发器发送信号,并且当该处理器执行该存储器存储的指令时,使得该处理器执行第一方面或第一方面的任一种可能的实施方式中的方法。
可选地,处理器为一个或多个,存储器为一个或多个。
可选地,存储器可以与处理器集成在一起,或者存储器与处理器分离设置。
在具体实现过程中,存储器可以为非瞬时性(non-transitory)存储器,例如只读存储器(read only memory,ROM),其可以与处理器集成在同一块芯片上,也可以分别设置在不同的芯片上,本申请对存储器的类型以及存储器与处理器的设置方式不做限定。
第四方面,提供了一种计算机程序(产品),计算机程序(产品)包括:计算机程序代码,当计算机程序代码被计算机运行时,使得计算机执行上述各方面中的方法。
第五方面,提供了一种计算机可读存储介质,计算机可读存储介质存储程序或指令,当程序或指令在计算机上运行时,上述各方面中的方法被执行。
第六方面,提供了一种芯片,包括处理器,用于从存储器中调用并运行存储器中存储的指令,使得安装有芯片的通信设备执行上述各方面中的方法。
第七方面,提供另一种芯片,包括:输入接口、输出接口、处理器和存储器,输入接口、输出接口、处理器以及存储器之间通过内部连接通路相连,处理器用于执行存储器中的代码,当代码被执行时,处理器用于执行上述各方面中的方法。
图1为本申请实施例提供的一种图像档案的处理示意图;
图2为本申请实施例提供的一种实施环境的示意图;
图3为本申请实施例提供的一种图像档案的处理方法的流程示意图;
图4为本申请实施例提供的一种生成短特征向量的流程示意图;
图5为本申请实施例提供的一种代表特征向量的分段示意图;
图6为本申请实施例提供的一种分段聚类的示意图;
图7为本申请实施例提供的一种码表的示意图;
图8为本申请实施例提供的一种基于码表转换得到短特征向量的示意图;
图9为本申请实施例提供的一种距离表的示意图;
图10为本申请实施例提供的一种基于距离表查询距离的示意图;
图11为本申请实施例提供的一种图像档案的处理方法的整体流程图;
图12为本申请实施例提供的一种图像档案的处理方法的流程图;
图13为本申请实施例提供的一种档案统计的流程示意图;
图14为本申请实施例提供的一种图像档案的处理方法的整体流程图;
图15为本申请实施例提供的一种计算加速的示意图;
图16为本申请实施例提供的一种计算加速的示意图;
图17为本申请实施例提供的一种图像档案的处理装置的结构示意图;
图18为本申请实施例提供的一种图像档案的处理设备的结构示意图。
本申请的实施方式部分使用的术语仅用于对本申请的具体实施例进行解释,而非旨在限定本申请。
随着城市建设的不断推进,基于图像的相关技术以及业务逐渐受到人们的广泛关注,图像档案的处理便是其中一种。图像档案的处理包括图像归档以及图像检索。其中,图像归档是指:基于图像的特征向量将多个图像中属于同一对象的图像归入同一档案中,得到多个档案。对象包括但不限于人、物体以及人工智能(artificial intelligence,AI)等等,物体例如为车辆。图像检索是指:检索与图像属于同一对象的其他图像。如图1所示,通过各个档案,能够实现检索比对(用于确认对象身份)、轨迹分析(用于生成对象的运动轨迹)以及频率分析等业务,从而有利于实现城市的智能化管理。
在档案的处理过程中,往往需要针对图像进行档案的命中。相关技术中,将图像与所有档案一一进行比对,从而将相似度最高的档案作为命中的档案。
然而,在实际应用中,档案的数量往往较多。相关技术中提供的方法所需的计算量较大、所耗费的时间较长,从而导致处理效率较低。
对此,本申请实施例提供了一种图像档案的处理方法,该方法可以应用于图2所示的实施环境中。图2中,包括摄像节点、操作节点、管理节点、计算节点和存储节点。其中,摄像节点包括球型、筒型等摄像机,用于抓拍图像或者录制视频。在本实施例中,待归档或待检索的图像包括但不限于:摄像节点抓拍到的图像、从抓拍到的图像中截取的图像和从录制的视频中截取的图像,图像中包括至少一个对象。操作节点用于与用户交互,以使得用户能够进行图像归档任务以及图像检索任务的部署、配置及管理。管理节点用于从摄像节点获得图像及视频,例如,参见图2,摄像节点将图像及视频上传至云端,管理节点从云端获得图像及视频。管理节点还用于结合图像归档任务以及图像检索任务管理计算节点以及存储节点。在管理过程中,管理节点向计算节点转发图像及视频,计算节点用于根据接收到的图像及视频完成图像归档任务以及图像检索任务中涉及的计算任务,实现计算加速。存储节点用于根据管理节点的管理存储摄像节点拍摄的图像、录制的视频和通过图像归档过程归档至档案中的各个图像。
示例性地,图2中管理节点、计算节点和存储节点分别为不同设备,或者,管理节点、计算节点和存储节点中的任意两个或三个也可以集成于同一设备中。
基于图2所示的实施环境,本申请实施例提供了一种图像档案的处理方法。参见图3,该方法包括如下的步骤。
301,获得第一图像,对第一图像进行特征提取,得到第一图像对应的第一特征向量。
其中,本实施例不对特征提取的方式进行限定,通过特征提取过程得到的特征向量属于长特征向量。示例性地,本实施例通过深度学习方法对第一图像进行特征提取,从而得到第一图像对应的第一特征向量。例如,深度学习方法包括但不限于深度神经网络(deep neural networks,DNN)。需要说明的是,本实施例中对除第一图像之外的其他图像进行特征提取的方式均参见对第一图像进行特征提取的方式,后文不再进行赘述。
302,对档案库中的多个档案进行聚合,得到多个档间类,从多个档间类中确定与第一图像的相似程度大于第一阈值的目标档间类,多个档间类的数量小于多个档案的数量,目标档间类的数量小于多个档间类的数量。
其中,一个档案中包括至少一张图像,档案与对象一一对应。也就是说,一个档案中包括的所有图像均为同一对象对应的图像。示例性地,响应于一个档案对应的对象是能够确定身份的,则该档案还对应有身份特征图像。
一个档案对应有一个第一中心特征向量,基于多个档案对应的第一中心特征向量对多个档案进行聚类,从而得到多个档间类。在示例性实施例中,对档案库中的多个档案进行聚合,得到多个档间类,包括:将多个档案中相似程度大于第三阈值的档案聚合为同一档间类,得到多个档间类。也就是说,同一档间类中包括的各个档案之间较为相似。一个档间类中包括至少一个第一中心特征向量,一个档间类的中心特征向量基于该档间类包括的至少一个第一中心特征向量确定。示例性地,对于一个档间类而言,确定该档间类与第一图像的相似程度包括:确定该档间类的中心特征向量与第一图像的第一特征向量之间的向量距离,该向量距离用于指示档间类与第一图像的相似程度,向量距离越小则指示的相似程度越高。相应地,本实施例基于各个档间类对应的向量距离从多个档间类中确定出目标档间类。示例性地,按照从小到大的顺序对各个档间类对应的向量距离进行排序,在排序得到的序列中选择前参考数量个向量距离对应的档间类作为目标档间类。能够理解的是,序列中前参考数量个向量距离对应的档间类即为与第一图像的相似程度大于第一阈值的档间类。
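上述“按向量距离从小到大排序、取前参考数量个档间类作为目标档间类”的选择过程,可以用如下示意性的 Python 草图说明(仅为在若干假设下的示意,其中的函数名与数据结构均为假设,并非本申请的实际实现):

```python
import math

def select_target_classes(query_vec, class_centers, top_x):
    # query_vec: 第一图像的第一特征向量
    # class_centers: {档间类ID: 该档间类的中心特征向量}
    # 向量距离越小, 指示的相似程度越高;
    # 按距离从小到大排序, 取前 top_x 个档间类作为目标档间类
    ranked = sorted(class_centers.items(),
                    key=lambda kv: math.dist(query_vec, kv[1]))
    return [cid for cid, _ in ranked[:top_x]]
```

其中 top_x 对应文中的“前参考数量个”;实际实现中也可以改为按第一阈值过滤相似程度。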
一个档案对应的第一中心特征向量基于该档案包括的图像对应的特征向量确定。在一个档案中仅包括一张图像的情况下,将该图像对应的特征向量作为该档案对应的第一中心特征向量。或者,在一个档案中包括多张图像的情况下,首先基于多张图像对应的特征向量确定档案的代表特征向量,再对多个代表特征向量进行加权求和,从而确定档案对应的第一中心特征向量。其中,参见图4中的405,一个档案的多个代表特征向量和一个第一中心特征向量共同作为该档案的典型特征向量,典型特征向量属于长特征向量。并且,响应于档案包括身份特征图像,则身份特征图像的特征向量也属于典型特征向量。能够理解的是,随着时间的增加,会有越来越多的图像被归档至档案中,从而使得档案中图像的数量不断增加。示例性地,在一个档案中增加的图像数量超过一定阈值的情况下,则对该档案的典型特征向量进行更新。
其中,基于多张图像对应的特征向量确定档案的代表特征向量,包括:对档案包括的图像对应的特征向量进行聚类,从而在档案内部得到多个类别,称为档内类。一个档内类包括至少一个图像对应的特征向量,基于档内类包括的特征向量能够确定档内类对应的中心特征向量。之后,将各个档内类对应的中心特征向量确定为档案的多个代表特征向量。其中,本实施例不对档案内部进行聚类的方式进行限定。示例性地,档案内部进行聚类的方式包括基于密度的噪声应用空间聚类(density-based spatial clustering of applications with noise,DBSCAN)。
以档案包括100张图像为例,对确定档案对应的第一中心特征向量的过程进行说明。首先,通过DBSCAN方式对100张图像对应的特征向量,也就是100个特征向量进行聚类,得到10个档内类。其中,每个档内类中包括10个特征向量,基于这10个特征向量能够确定档内类对应的中心特征向量,则10个档内类共能确定10个中心特征向量。接着,将10个档内类对应的10个中心特征向量作为档案的代表特征向量,则档案具有10个代表特征向量。之后,再对10个代表特征向量进行加权求和,从而将加权求和的结果作为档案对应的第一中心特征向量。当然,一个档案具有10个代表特征向量的情况仅为举例,不用于对档案的代表特征向量的数量造成限定。
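对多个代表特征向量加权求和得到第一中心特征向量的步骤,可以草拟如下(权重的具体取法文中未给出,此处假设为等权,仅作示意):

```python
def archive_center_vector(representative_vecs, weights=None):
    # representative_vecs: 档案的多个代表特征向量(即各档内类的中心特征向量)
    # 对代表特征向量进行加权求和, 得到档案对应的第一中心特征向量;
    # 未指定权重时假设等权(即取平均)
    n = len(representative_vecs)
    if weights is None:
        weights = [1.0 / n] * n
    dim = len(representative_vecs[0])
    return [sum(w * v[i] for w, v in zip(weights, representative_vecs))
            for i in range(dim)]
```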
示例性地,在档案对应的对象已确定身份的情况下,本实施例还结合档案对应的身份特征图像确定档案对应的第一中心特征向量。例如,在档案中包括一张图像的情况下,将该图像对应的特征向量与身份特征图像对应的特征向量的加权求和结果作为第一中心特征向量。又例如,在档案中包括多张图像的情况下,将多个代表特征向量与身份特征图像对应的特征向量的加权求和结果作为第一中心特征向量。
303,使用第一特征向量与目标档间类包括的档案中的备选档案进行匹配。
在确定目标档间类之后,由于目标档间类包括至少一个第一中心特征向量,因而可以基于至少一个第一中心特征向量对应的档案确定备选档案,从而使用第一特征向量与确定的备选档案进行匹配。为便于描述,下文将目标档间类包括的至少一个第一中心特征向量对应的档案称为目标档间类包括的档案。
示例性地,本实施例将目标档间类包括的所有档案均确定为备选档案。或者,考虑到处理效率问题,本实施例还可以进一步从目标档间类包括的档案中筛选出一部分与第一图像更为相似的档案作为备选档案。因此,使用第一特征向量与目标档间类包括的档案中的备选档案进行匹配之前,方法还包括:确定目标档间类包括的档案的短特征向量。根据目标档间类包括的档案的短特征向量,在目标档间类包括的档案中将与第一图像之间的相似程度大于第二阈值的档案确定为备选档案。根据上文说明可知,档案的典型特征向量,也就是第一中心特征向量以及多个代表特征向量均属于长特征向量,长特征向量的长度大于短特征向量的长度。因此,通过确定目标档间类包括的档案的短特征向量,再根据该短特征向量确定目标档间类包括的档案与第一图像之间的相似程度,有利于减少计算量、提高处理效率。
在确定短特征向量的过程中,则基于乘积量化(product quantization,PQ)算法生成码表(codebook),从而根据码表将目标档间类包括的档案对应的第一中心特征向量分别转换为档案的短特征向量。码表基于多个档案中各个档案的代表特征向量生成。在生成过程中,参见图4中的406,首先将各个档案的代表特征向量分段,各个代表特征向量的段数相同。为便于区分,各段分别对应不同的编号。之后,在不同代表特征向量之间,对具有相同编号的各段进行聚类,得到多个段间类,从而基于各个段间类对应的中心特征向量生成码表。其中,由于码表基于各个档案的代表特征向量生成,而各个档案的代表特征向量是会发生更新的,因而在更新了代表特征向量的档案数量大于一定阈值的情况下,则需要重新生成码表,并基于重新生成的码表更新短特征向量。
例如,参见图5,将各个档案的代表特征向量均分为32段,每段长度为8比特(bit)。其中,每段具有不同的编号S1、S2、……、S32。之后,参见图6,对编号为S1的各段聚类得到S1对应的256个段间类,每个段间类对应有中心特征向量,则S1共对应256个中心特征向量。其他各段以此类推,不再进行赘述。接着,将各段对应的中心特征向量组成图7所示的码表。在图7中,第一列对应S1,第一列包括的256项分别为S1对应的256个中心特征向量。为便于区分,第一列包括的256项从上至下依次对应索引0-255。相应地,第二列对应S2,第二列包括的256项分别为S2对应的256个中心特征向量,且第二列包括的256项从上至下依次对应索引0-255。其他列以此类推。
基于生成的码表,能够将档案的第一中心特征向量转换为短特征向量。在转换过程中,对第一中心特征向量进行分段,段数与码表中的列数相同。对于第一中心特征向量中的任一段,确定码表中与该段对应的一列,并在该列中确定该段所在的段间类,从而能够确定该段所在的段间类对应的索引。之后,将第一中心特征向量中的各段所在的段间类对应的索引进行组合,即可得到基于该第一中心特征向量转换的短特征向量。例如,以图8所示的包括32列的码表为例,则将第一中心特征向量分为32段。其中,第1段在码表中对应的一列为S1,响应于第1段所在的段间类为S1中的第2项,则第1段对应的索引为1。相应地,第2段在码表中对应的一列为S2,响应于第2段所在的段间类为S2中的第1项,则第2段对应的索引为0。以此类推,响应于第一中心特征向量的第3段位于S3中的第4项(索引为3)、第一中心特征向量的第4段位于S4中的第256项(索引为255)、……、第31段位于S31中的第3项(索引为2)、第32段位于S32中的第255项(索引为254),则能够得到该档案对应的短特征向量为(1,0,3,255,……,2,254)。
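上述基于码表将长特征向量转换为短特征向量的过程,可以用如下示意性的 Python 草图说明(“该段所在的段间类”在此处假设为与该段距离最近的段间类中心,段数与每段中心数不限于文中的 32 与 256):

```python
import math

def to_short_vector(center_vec, codebook):
    # codebook[s]: 第 s 段对应的一组段间类中心向量(对应码表中的一列)
    # 将第一中心特征向量按码表列数分段, 每段取距离最近的中心的索引,
    # 各段索引组合即为该档案的短特征向量
    seg_count = len(codebook)
    seg_len = len(center_vec) // seg_count
    short = []
    for s in range(seg_count):
        seg = center_vec[s * seg_len:(s + 1) * seg_len]
        dists = [math.dist(seg, c) for c in codebook[s]]
        short.append(dists.index(min(dists)))
    return short
```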
进一步地,根据档案的短特征向量确定各个档案与第一图像之间的相似程度,包括:基于码表以及第一图像对应的第一特征向量生成第一图像对应的距离表,根据距离表以及该档案的短特征向量确定档案与第一图像之间的向量距离,该向量距离用于指示档案与第一图像之间的相似程度,向量距离越小则指示的相似程度越高。在目标档间类包括的各个档案之中,如果一个档案与第一图像之间的相似程度大于第二阈值,则可以将该档案作为备选档案。
在生成距离表的过程中,对第一图像的第一特征向量进行分段,段数与码表中的列数相同。对于第一特征向量中的任一段,确定码表中与该段对应的一列,并确定该段与该列中的各项之间的多个距离,将多个距离组合为距离表中的一列。例如,参见图7所示的码表,对于第一特征向量中的第1段,码表中与该段对应的一列为S1。确定该段与S1包括的256项之间的256个距离,将256个距离组合为图9所示的距离表中的第一列。则,距离表第一列中的第1项是第一特征向量的第1段与码表中S1列的第一项之间的距离,距离表第一列中的第2项是第一特征向量的第1段与码表中S1列的第2项之间的距离,以此类推,距离表第一列中的第256项是第一特征向量的第1段与码表中S1列的第256项之间的距离。对于距离表中的其他列不再一一进行赘述。
在生成距离表之后,通过查询第一图像对应的距离表便能够确定第一特征向量与档案的短特征向量之间的向量距离,无需进行计算,从而减小了处理过程所需的计算量。例如,参见图10,第一图像对应的距离表中包括S1-S32共32列,距离表中的每列从上至下对应的索引依次为0-255,档案的短特征向量为(1,0,3,255,……,2,254)。短特征向量中的第一个数值对应距离表中的第一列S1,由于第一个数值为1,因而查找距离表中S1列索引为1的一项(即S1列的第2项),确定第一个距离。短特征向量中的第二个数值对应距离表中的第二列S2,由于第二个数值为0,因而查找距离表中S2列索引为0的一项(即S2列的第1项),确定第二个距离。对于短特征向量中的其它数值,均按照相同的方式查询距离表得到距离,从而能够得到32个距离。接着,对查询到的32个距离进行组合,便能够得到第一特征向量与档案的短特征向量之间的向量距离。
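距离表的生成与查表过程可以草拟如下(文中未限定“组合”的具体方式,此处假设为对各段距离求和,仅为示意):

```python
import math

def build_distance_table(query_vec, codebook):
    # 对第一特征向量按码表列数分段,
    # 预先计算每段与码表对应列中各中心之间的距离, 组成距离表
    seg_count = len(codebook)
    seg_len = len(query_vec) // seg_count
    return [[math.dist(query_vec[s * seg_len:(s + 1) * seg_len], c)
             for c in codebook[s]]
            for s in range(seg_count)]

def lookup_distance(short_vec, table):
    # 按短特征向量中各段的索引逐段查询距离表,
    # 再将查到的各段距离组合(此处假设为求和), 得到近似向量距离
    return sum(table[s][idx] for s, idx in enumerate(short_vec))
```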
在实际应用中,与短特征向量以及距离表相关的计算由图形处理器(graphics processing unit,GPU)完成。其中,GPU对应片上(on-chip)内存,片上内存的容量往往较小,从而使得GPU无法存储所有的短特征向量以及距离表。因此,参见图4中的404,本实施例首先将所有的短特征向量以及距离表导入(load)中央处理器(central processing unit,CPU)对应的、容量较大的片外(off-chip)内存中,当需要使用某个或某些短特征向量以及距离表进行计算时,再从CPU中将需要使用的短特征向量以及距离表导入GPU中。
示例性地,本实施例将所有的短特征向量以及距离表分为多个批次导入CPU。当然,对于已导入CPU的短特征向量以及距离表,还可以根据实际需求进行替换。例如,本实施例由聚类算法指示是否需要对CPU中的短特征向量以及距离表进行替换。
在确定出备选档案之后,使用第一特征向量与备选档案进行匹配。在示例性实施例中,多个档案中的各个档案分别对应多个代表特征向量,使用第一特征向量与目标档间类包括的档案中的备选档案进行匹配,包括:基于各个档案对应的多个代表特征向量,在备选档案中确定与第一特征向量之间的相似程度最大的备选档案,所确定的备选档案用于作为目标档案。匹配过程中,对于各个备选档案,分别确定第一特征向量与该备选档案的多个代表特征向量之间的多个向量距离,也就是一个备选档案对应多个向量距离。之后,根据所有备选档案对应的向量距离确定是否命中备选档案中的一个目标档案,详见304中的说明。
304,响应于命中备选档案中的一个目标档案,基于目标档案进行图像处理。
示例性地,在所有备选档案对应的向量距离中,确定最小向量距离对应的备选档案,从而将最小向量距离对应的备选档案确定为命中的目标档案。或者,对一个备选档案对应的多个向量距离进行加权求和,得到该备选档案对应的加权求和距离。之后,将最小加权求和距离对应的备选档案确定为命中的目标档案。当然,在实际应用中,还可能存在第一图像未命中备选档案的情况。此种情况下,可以重新按照上述说明的步骤进行处理。
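上述在备选档案中命中目标档案的判定,可以草拟如下(此处采用“对多个向量距离加权求和”的第二种方式,并假设等权平均,仅为示意):

```python
import math

def hit_target_archive(query_vec, candidates):
    # candidates: {备选档案ID: 该档案的多个代表特征向量}
    # 对每个备选档案, 计算第一特征向量与其各代表特征向量之间的向量距离
    # 并加权求和(此处假设等权平均), 将加权求和距离最小的备选档案作为命中的目标档案
    best_id, best_dist = None, float("inf")
    for aid, reps in candidates.items():
        agg = sum(math.dist(query_vec, r) for r in reps) / len(reps)
        if agg < best_dist:
            best_id, best_dist = aid, agg
    return best_id
```

实际实现中还可以再对最小距离设置阈值,距离不满足阈值时视为未命中。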
在示例性实施例中,基于目标档案进行图像处理,包括:将第一图像归档至目标档案。例如,参见图4中的403,在命中了目标档案的情况下直接将第一图像归档至命中的目标档案中,该处理方式应用于图像归档场景。或者,在示例性实施例中,基于目标档案进行图像处理,包括:从目标档案中读取图像,该处理方式应用于图像检索场景。也就是说,从目标档案中读取的图像可以作为第一图像的检索结果,从而实现以图搜图。当然,对于第一图像而言,也可以既进行图像归档,又进行图像检索,本实施例对此不进行限定。
参见图11,对基于档间类进行图像归档及图像检索的整体流程进行说明。
在开始进行处理之前,首先确定多个档案中各个档案的中心特征向量和代表特征向量。按照1105对档案的中心特征向量进行档间聚类,得到多个档间类。按照1106对档案的代表特征向量进行分段聚类,从而得到码表,基于码表将档案的中心特征向量转换为档案的短特征向量。
在获得图像之后,通过1101对图像进行特征提取,得到图像的长特征向量。再进入1102确定图像的长特征向量与各个档间类的中心特征向量之间的距离,对距离按照从小到大的顺序进行排列,选择前X个距离对应的档间类并转入1103。在1103中,根据图像的长特征向量以及1106中生成的码表确定图像的距离表。之后,对于1102中选择出的档间类包括的各个档案,根据距离表查询这些档案的短特征向量与图像的长特征向量之间的距离,对距离按照从小到大的顺序进行排列,选择前Y个距离对应的档案。接着,在1104中,确定图像的长特征向量与1103中选择出的档案的代表特征向量之间的距离,基于1104中确定的距离命中一个档案,例如命中最小距离对应的档案,从而将图像归档至命中的档案中,或者从命中的档案中读取图像作为检索结果。
在301-304中,本申请实施例首先聚类得到多个档间类,再从多个档间类中选择与第一图像的匹配成功可能性较高的目标档间类,从而基于目标档间类确定用于与第一图像进行匹配的备选档案。由此,避免了将第一图像与所有档案一一进行匹配,不仅减少了匹配次数,而且所需的计算量较小、所耗费的时间较短,因而处理效率较高。
以上介绍了使用一张图像的特征向量与备选档案进行匹配的过程。在图像检索场景中,每次检索往往仅针对一张图像。而在图像归档过程中,可能存在待归档的图像数量较多的情况。如果将对于各个待归档的图像依次按照上述说明进行匹配,则所需的计算量较大。对此,本实施例还提供了对多张图像进行批量归档的方式,用于减小计算量、提高处理效率,说明如下。
在示例性实施例中,方法还包括:获得第二图像,响应于第二图像与第一图像对应同一对象,将第二图像与第一图像一并归档至目标档案。示例性地,本实施例对多张图像进行聚类,得到多个类别。响应于第一图像与第二图像位于同一类别中,且该类别的聚类精度不低于精度阈值,则认为第一图像与第二图像对应同一对象。由于第二图像与第一图像对应同一对象,因而可以将第二图像与第一图像归档至相同的档案,即目标档案。并且,第一图像与第二图像所在的类别中的其他图像也均能够归档至目标档案。通过此种方式,保证了属于同一对象的各张图像均位于同一档案中,从而保证了图像归档过程的准确性。
在示例性实施例中,方法还包括:获得第三图像,响应于第三图像与第一图像相似,确定第三图像是否与目标档案相匹配。响应于第三图像与目标档案相匹配,将第三图像与第一图像一并归档至目标档案。示例性地,本实施例对多张图像进行聚类,得到多个类别,响应于第一图像与第三图像位于同一类别中,但该类别的聚类精度低于精度阈值,则认为第三图像与第一图像相似。
在聚类精度低于精度阈值的情况下,则需要进一步确认第三图像是否与第一图像所在的目标档案相匹配。如果匹配,再将第三图像与第一图像归档至相同的档案,即目标档案。通过此种方式,避免了将属于不同对象的图像归档至相同的档案中,从而保证了图像归档过程的准确性。示例性地,确定第三图像是否与目标档案相匹配,包括:确定第三图像的特征向量与目标档案的各个代表特征向量之间的多个向量距离,在多个向量距离满足条件的情况下确定第三图像与目标档案相匹配。其中,多个向量距离满足条件,包括:多个向量距离中的最小向量距离小于第二距离阈值,或者,多个向量距离的加权求和距离小于第三距离阈值。
在对第二图像以及第三图像进行说明的过程中,均涉及了对多张图像进行聚类,从而得到多个类别的过程。在聚类过程中,首先对多张待归档的图像进行特征提取,得到多个特征向量,再对特征向量进行聚类。示例性地,聚类方式包括但不限于K近邻算法(k nearest neighbor,KNN)以及K均值聚类算法(k-means clustering algorithm,K-means),本实施例不对聚类方式进行限定。示例性地,根据图像数量的不同,本实施例基于不同的特征向量进行聚类。其中,在图像数量不大于数量阈值(例如1亿)的情况下,参见图4,直接将通过401提取得到的特征向量进行聚类,该特征向量也称为长特征向量。或者,在图像数量大于数量阈值的情况下,则按照图4中的402将各个图像的特征向量分别转换为短特征向量,再对短特征向量进行聚类。其中,将图像的特征向量转换为短特征向量的说明参见上文302,此处不再进行赘述。在得到图像的短特征向量之后,便能够根据短特征向量进行聚类。其中,聚类过程依赖于不同图像的短特征向量之间的距离计算。示例性地,参见图4中的403,本实施例中针对各个图像的特征向量分别生成对应的距离表,基于距离表实现不同图像的短特征向量之间的距离计算,以便于加快聚类速度,从而有利于提高处理效率。基于距离表确定距离的过程参见上文302,此处不再进行赘述。
其中,聚类得到的任一类别包括至少一个特征向量,一个类别对应有第二中心特征向量,第二中心特征向量是基于该类别包括的至少一个特征向量确定的。示例性地,在实际应用中,本实施例首先对待归档的多张图像进行聚类,得到多个类别。对于任一类别而言,使用该类别的第二中心特征向量与备选档案进行匹配,响应于命中备选档案中的目标档案,则参见图13中的1305进行档案比对过程。在档案比对过程中,响应于一个类别的聚类精度不低于精度阈值,则将该类别中的各个图像均归档至目标档案中。或者,响应于一个类别的聚类精度低于精度阈值,则将该类别中与目标档案匹配的图像归档至目标档案中。而对于该类别中与目标档案不匹配的图像,可以与后续获得的其他图像重新聚类,再进行批量归档。
以上介绍了图像与多个档案聚类得到的档间类进行匹配的情况。在示例性实施例中,参见图12,本实施例还提供如下的图像档案的处理方法,该方法包括如下的步骤。
1201,获得第四图像,对第四图像进行特征提取,得到第四图像对应的第二特征向量。
其中,特征提取过程参见图13中的1301。对第四图像进行特征提取的方式参见301中的说明,此处不再进行赘述。
1202,使用第二特征向量与多个档案中的第一档案进行匹配,第一档案的数量小于多个档案的数量。
其中,第一档案满足的条件包括:图像归档频率大于频率阈值。对于任一档案而言,图像归档频率是指该档案在参考时长内的图像归档次数,或者说参考时长内被归档至该档案中的图像的数量。能够看出,第一档案是多个档案中图像归档频率较高的那部分档案,因而第一档案与第四图像对应的第二特征向量匹配成功的可能性也较大。相比于相关技术中使用图像的特征向量与多个档案一一进行匹配的方式,本实施例从多个档案中选择了部分匹配成功可能性较大的第一档案,优先使用第二特征向量与这些第一档案进行匹配,从而减小了计算量、缩短了匹配时间,进而提高了处理效率。
示例性地,一个档案在参考时长内的图像归档次数通过对多个不同时长内的归档频率进行加权求和得到。例如,确定一个档案在过去一周内的第一归档频率,以及该档案在过去一天内的第二归档频率,将第一归档频率和第二归档频率的加权求和值作为该档案的总归档频率。其中,过去一周及过去一天仅为不同时长的举例,不用于对时长造成限定。示例性地,在多个不同时长内,时长越长则对应的权重越大。因此,上文中的第一归档频率对应的权重大于第二归档频率对应的权重。例如,第一归档频率对应的权重为0.6,第二归档频率对应的权重为0.4。
需要说明的是,频率阈值用于指示数量,或者用于指示占比。对于前者,频率阈值可以是根据经验确定的数值。以频率阈值为50次为例,响应于一个档案的图像归档频率大于50次,则确定该档案为第一档案。或者,按照从大到小的顺序对多个档案的图像归档频率进行排列,得到频率序列。在频率序列中,选择前频率阈值个图像归档频率对应的档案作为第一档案。例如,频率阈值为40个,则选择频率序列中的前40个图像归档频率对应的档案作为第一档案。对于后者,频率阈值可以是分数或者百分率。以频率阈值为百分率4%为例,则按照从大到小的顺序对多个档案的图像归档频率进行排列之后,在得到的频率序列中选择前4%个图像归档频率对应的档案作为第一档案。另外,本实施例中涉及的其他阈值也均用于指示数量或占比,后文不再进行赘述。
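图像归档频率的加权求和以及按“频率阈值指示数量”选出第一档案的过程,可以草拟如下(0.6/0.4 为文中举例的权重,函数名均为假设):

```python
def total_archiving_frequency(weekly_count, daily_count,
                              w_week=0.6, w_day=0.4):
    # 对不同时长内的归档频率进行加权求和;
    # 时长越长对应的权重越大, 0.6/0.4 为文中举例的取值
    return w_week * weekly_count + w_day * daily_count

def select_first_archives(freqs, top_n):
    # freqs: {档案ID: 图像归档频率}
    # 按从大到小排列得到频率序列, 取前 top_n 个档案作为第一档案
    ranked = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)
    return [aid for aid, _ in ranked[:top_n]]
```

若频率阈值指示占比,将 top_n 换算为档案总数乘以该百分率即可。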
示例性地,使用第二特征向量与多个档案中的第一档案进行匹配,包括:确定第二特征向量与多个档案中的第一档案之间的相似度。示例性地,多个档案中的各个档案分别对应有一个第一中心特征向量,则第二特征向量与第一档案之间的相似度包括:第二特征向量与第一档案对应的第一中心特征向量之间的向量距离。其中,向量距离越小,则说明第二特征向量与第一档案之间的相似度越高。
示例性地,第一档案满足的条件还包括:第一档案的拍摄区域与第四图像的拍摄区域相匹配。其中,第一档案的拍摄区域基于第一档案中的图像的拍摄区域确定。
其中,参见图13中的1306,处理过程所应用的区域能够划分为多个区域。一个区域中包括至少一个摄像机,对于一张图像(第四图像或者第一档案中的任一图像)而言,该图像由哪个区域包括的摄像机拍摄得到,则该图像的拍摄区域即为哪个区域。在示例性实施例中,对处理过程所应用的区域进行划分时,将处理过程所应用的区域内各个行政区域直接作为多个区域。例如,处理过程所应用的区域为A省,A省的行政区域包括A1市以及A2市,则将A1市以及A2市作为多个区域。或者,还能够根据各个行政区域内的摄像机数量对行政区域进行调整,得到多个区域,以使得任意两个区域包括的摄像机数量之间的差值不大于一定阈值。以行政区域A1市包括B1个摄像机、行政区域A2市包括B2个摄像机为例,如果(B1-B2)大于一定阈值,则在A1市中减少包含有B3个摄像机的区域A11,在A2市中增加该区域A11,从而划分得到的两个区域为:区域(A1-A11),包括的摄像机数量为(B1-B3);区域(A2+A11),包括的摄像机数量为(B2+B3),且(B1-B3)-(B2+B3)不大于一定阈值。
示例性地,不同区域对应不同的区域编号。对于一张图像,该图像的拍摄区域通过拍摄区域对应的区域编号来指示。其中,本实施例对各个区域内的摄像机分别进行编号,一个摄像机拍摄的图像均对应该摄像机的编号。另外,存储区域编号与摄像机编号的映射关系,则在获得图像之后,根据图像对应的摄像机编号查找映射关系,便能够得到摄像机编号对应的区域编号。或者,同一区域内的摄像机均配置有该摄像机所在区域的区域编号,则一个摄像机拍摄的图像均对应该摄像机配置的区域编号。此种情况下,获得的图像本身对应有区域编号,无需再根据上述映射关系进行查找。又或者,令图像对应的摄像机编号中的指定位置包括摄像机所在区域的区域编号,则通过读取摄像机编号中的指定位置便能够确定拍摄区域的区域编号。例如,区域编号为5位字符串,摄像机编号为10位字符串,10位中的前5位为摄像机所在区域的区域编号,后5位用于区分同一区域内的不同摄像机。此种情况下,读取摄像机编号中的前5位便能够确定拍摄区域对应的区域编号。
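上述根据摄像机编号确定拍摄区域编号的三种方式中,查映射关系与读取编号指定位置这两种可以草拟如下(前 5 位为区域编号的约定取自文中举例):

```python
def region_of_image(camera_id, mapping=None, region_len=5):
    # 确定图像拍摄区域的区域编号:
    # 若存在区域编号与摄像机编号的映射关系, 则根据摄像机编号查找映射;
    # 否则按文中举例的约定, 读取摄像机编号中的前 region_len 位作为区域编号
    if mapping is not None and camera_id in mapping:
        return mapping[camera_id]
    return camera_id[:region_len]
```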
进一步地,以上说明了图像的拍摄区域的确定方式。对于第一档案而言,第一档案的拍摄区域基于第一档案中的图像的拍摄区域确定,具体包括如下的三种方式。
第一种确定拍摄区域的方式:第一档案中仅包括一张图像,或者第一档案中包括多张图像,且多张图像具有相同的拍摄区域,则将图像的拍摄区域作为第一档案的拍摄区域。
第二种确定拍摄区域的方式:响应于第一档案包括的所有图像中,超过一定比例阈值的图像的拍摄区域均为同一区域,则将该区域作为第一档案的拍摄区域。例如,第一档案中包括100张图像且比例阈值为35%,如果100张图像中有超过35张图像的拍摄区域均为A区域,则第一档案的拍摄区域包括A区域。能够理解的是,第一档案的拍摄区域可能包括多个区域。例如,第一档案中包括100张图像且比例阈值为35%,如果100张图像中有超过35张图像的拍摄区域为A区域,另外还有超过35张图像的拍摄区域为B区域,则第一档案的拍摄区域包括A区域以及B区域。
当然,在第二种方式中,还可能无法确定第一档案的拍摄区域。例如,第一档案中包括100张图像且比例阈值为35%,如果100张图像中有33张图像的拍摄区域为A区域、33张图像的拍摄区域为B区域、34张图像的拍摄区域为C区域,则由于拍摄于各个区域中的图像均未超过35%的比例阈值,因而A区域、B区域以及C区域均不能作为第一档案的拍摄区域。
第三种确定拍摄区域的方式:对于一个区域而言,确定各个第一档案中拍摄于该区域内的图像数量。之后,按照从大到小的顺序对图像数量进行排列,得到图像序列。在图像序列中,将图像数量大于数量阈值的第一档案的拍摄区域确定为该区域。例如,第一档案A中包括30张拍摄区域为A区域的图像,第一档案B中包括20张拍摄区域为A区域的图像,第一档案C中包括10张拍摄区域为A区域的图像。由于第一档案A中拍摄于A区域的图像最多(30张),因而将第一档案A的拍摄区域确定为A区域。能够理解的是,在第三种方式中,第一档案的拍摄区域可能包括多个区域,也可能无法确定第一档案的拍摄区域。
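其中第二种确定拍摄区域的方式(按占比超过比例阈值筛选区域)可以草拟如下(35% 的阈值取自文中举例):

```python
from collections import Counter

def archive_regions(image_regions, ratio_threshold=0.35):
    # image_regions: 第一档案中各图像的拍摄区域列表
    # 统计各区域的图像占比, 占比超过比例阈值的区域作为档案的拍摄区域;
    # 结果可能包括多个区域, 也可能为空(即无法确定拍摄区域)
    total = len(image_regions)
    counts = Counter(image_regions)
    return sorted(r for r, c in counts.items() if c / total > ratio_threshold)
```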
基于对图像的拍摄区域以及第一档案的拍摄区域的说明,示例性地,第一档案的拍摄区域与第四图像的拍摄区域相匹配是指:第一档案的拍摄区域中包括第四图像的拍摄区域。例如,第四图像的拍摄区域为区域A,第一档案1的拍摄区域为区域A,第一档案2的拍摄区域为区域B,则第一档案1的拍摄区域与第四图像的拍摄区域相匹配。需要说明的是,第一档案的拍摄区域中还可以包括第四图像的拍摄区域之外的其他区域。例如,第四图像的拍摄区域为区域A,第一档案3的拍摄区域包括区域A、区域B和区域C,则第一档案3的拍摄区域也与第四图像的拍摄区域相匹配。
示例性地,第一档案满足的条件还包括:第一档案的拍摄时段与第四图像的拍摄时段相匹配。第一档案的拍摄时段基于第一档案中的图像的拍摄时段确定。
其中,一张图像(第四图像或者第一档案中的任一图像)具有固定的拍摄时间。在本实施例中,图像归档频率是参考时长内的图像归档次数,参见图13中的1306,通过对该参考时长进行划分能够得到时段。例如,参考时长为一周,时段为两小时,则一周中共包括7×12=84个时段。一张图像的拍摄时间位于哪个时段,则可以将哪个时段确定为该图像的拍摄时段。以一周中包括84个时长为2小时的时段为例,如果一张图像的拍摄时间为周一上午9:00,则该拍摄时间位于周一上午8:00-10:00这一时段,也就是84个时段中的第5个时段。
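将拍摄时间映射到时段编号的计算可以草拟如下(假设周一为一周的第 1 天、时段从 1 开始计数,与文中“周一上午 9:00 位于第 5 个时段”的举例一致):

```python
def shot_time_slot(weekday, hour, slot_hours=2):
    # weekday: 0 表示周一, 6 表示周日; hour: 0-23
    # 将一周按每 slot_hours 小时划分, 共 7 * 24 // slot_hours 个时段,
    # 返回拍摄时间所在的时段编号(从 1 开始)
    slots_per_day = 24 // slot_hours
    return weekday * slots_per_day + hour // slot_hours + 1
```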
第一档案的拍摄时段是基于第一档案中的图像的拍摄时段确定的,第一档案的拍摄时段与第四图像的拍摄时段相匹配是指:第一档案的拍摄时段中包括第四图像的拍摄时段。示例性地,确定第一档案的拍摄时段包括如下三种方式。
第一种确定拍摄时段的方式:第一档案中仅包括一张图像,或者第一档案中包括多张图像,且多张图像具有相同的拍摄时段,则将图像的拍摄时段作为第一档案的拍摄时段。
第二种确定拍摄时段的方式:响应于第一档案包括的所有图像中,超过一定比例阈值的图像的拍摄时段均为同一时段,则将该时段作为第一档案的拍摄时段。例如,第一档案中包括100张图像且比例阈值为35%,如果100张图像中有超过35张图像的拍摄时段为A时段,则第一档案的拍摄时段包括A时段。
第三种确定拍摄时段的方式:对于一个时段而言,确定各个第一档案中拍摄于该时段内的图像数量,也就是各个第一档案的时段归档频率。按照从大到小的顺序对时段归档频率进行排列。在得到的序列中,将该时段作为时段归档频率大于一定阈值的第一档案的拍摄时段。例如,在A时段中,第一档案A的时段归档频率为30,第一档案B的时段归档频率为20,第一档案C的时段归档频率为10,则将A时段作为第一档案A的拍摄时段,原因在于第一档案A的时段归档频率最大。
示例性地,参见图13中的1308,本实施例在图像归档过程开始之前,对多个档案进行统计整理,以便于在图像归档过程开始之后,能够按照图13中的1303基于第四图像快速获得用于进行匹配的第一档案,从而提高处理效率。其中,1303示出的高频档案对应于上述图像归档频率大于频率阈值的档案。以多个档案中的一个档案为例,确定参考时长内归档至该档案中的图像数量,从而得到该档案的图像归档频率。根据参考时长内归档至该档案中的各图像的拍摄区域确定该档案在参考时长内是否对应有拍摄区域。示例性地,不同档案分别对应不同的档案标识(identification,ID),本实施例可以将图像归档频率大于阈值的各个档案对应的档案ID组成第一档案列表。并且,参见图13中的1307,如果第一档案列表中的一个档案对应有拍摄区域,还可以将该档案与对应的拍摄区域的区域编号进行映射,形成第二档案列表。
另外,还将参考时长划分为多个时段,确定每个时段内归档至该档案中的图像数量,作为该档案的时段归档频率。根据该时段内归档至该档案中的图像的拍摄区域确定该档案在该时段内是否对应有拍摄区域。示例性地,对于划分得到的任一个时段,本实施例从上述第一档案列表包括的各个档案中,选择在该时段内的时段归档频率大于一定阈值的档案对应的档案ID组成第三档案列表。由于通过划分得到了多个时段,每个时段对应一个第三档案列表,因而能够得到多个第三档案列表。进一步地,如果一个第三档案列表中的一个档案对应有拍摄区域,还可以将该档案与对应的拍摄区域的区域编号进行映射,形成第四档案列表。示例性地,参见图13中的1307,第四档案列表中一个档案的拍摄区域是指该档案在对应的时段内的拍摄区域,而不是该档案在参考时长内的拍摄区域。
对于第一档案所需满足的三个条件,将图像归档频率大于频率阈值作为条件一,将与第四图像的拍摄区域相匹配作为条件二,将与第四图像的拍摄时段相匹配作为条件三,对基于上述四个档案列表快速确定第一档案的过程进行说明:
如果第一档案需要满足条件一,则将第一档案列表包括的各个档案作为第一档案;
如果第一档案需要满足条件一和条件二,则确定第四图像的拍摄区域的区域编号,将第二档案列表中映射有相同区域编号的档案作为第一档案;
如果第一档案需要满足条件一和条件三,则确定第四图像的拍摄时段,在多个第三档案列表中确定与第四图像的拍摄时段相对应的一个第三档案列表,将这个第三档案列表包括的档案作为第一档案;
如果第一档案需要满足条件一、条件二和条件三,则确定第四图像的拍摄区域的区域编号,从第四档案列表中筛选出映射有相同区域编号的第四档案列表,再从筛选出的第四档案列表中确定与第四图像的拍摄时段相对应的一个第四档案列表,将这个第四档案列表包括的档案作为第一档案。
当然,在通过一次统计整理过程得到四个档案列表之后,随着图像的不断增加以及各个档案中图像的不断归档,还需要对四个档案列表进行更新。示例性地,本实施例每隔一定时间进行一次四个档案列表的更新,或者在多个档案的图像归档频率之和超过一定阈值时进行一次四个档案列表的更新。
1203,响应于命中第一档案中的一个第一目标档案,基于第一目标档案进行图像处理。
根据1202中的说明可知,使用第二特征向量与第一档案进行匹配包括:确定第二特征向量与第一档案之间的相似度,该相似度例如为第二特征向量与第一档案对应的第一中心特征向量之间的第一向量距离。相应地,响应于第二特征向量与第一档案中的第一目标档案的相似度最高且满足阈值,例如第二特征向量与该第一目标档案对应的第一中心特征向量之间的第一向量距离最小且小于第一距离阈值,则认为第二特征向量与第一目标档案匹配成功,也就是命中了该第一目标档案。
在命中第一目标档案之后,基于第一目标档案进行图像处理,包括将第四图像归档至第一目标档案。例如,参见图13中的1303,在命中了第一目标档案的情况下直接将第四图像归档至命中的第一目标档案,该处理方式适用于图像归档场景。或者,基于第一目标档案进行图像处理也包括从第一目标档案中读取图像,该处理方式适用于图像检索场景。另外,对于在第一档案中未能命中第一目标档案的情况,参见后文1204-1206中的说明。
在1201-1203中,本实施例优先将第四图像与多个档案中的第一档案进行匹配。其中,第一档案是多个档案中图像归档频率大于阈值的档案,与第四图像匹配成功的可能性较高。如果成功匹配了第一档案中的第一目标档案,则能够直接基于第一目标档案进行图像处理,而无需再与其他档案进行比对。相比于第四图像与所有档案一一进行比对的方式,本申请实施例所提供的处理方式不仅命中率高,而且所需的比对次数较少、计算量较小、所耗费的时间较短,因而处理效率较高。
以上介绍了能够命中第一档案中的第一目标档案的情况。而在实际应用中,还可能存在未命中第一档案的情况。在此种情况下,需要将待归档或待检索的图像与多个档案中除第一档案之外的其他档案进行匹配。因此,参见图12,方法还包括如下的步骤。
1204,获得第五图像,对第五图像进行特征提取,得到第五图像对应的第三特征向量。
其中,对第五图像进行特征提取的方式参见301中的说明,此处不再进行赘述。
1205,使用第三特征向量与多个档案中的第一档案进行匹配。响应于未命中第一档案,使用第三特征向量与多个档案中的第二档案进行匹配。
使用第三特征向量与多个档案中的第一档案进行匹配的方式参见1202中的说明,此处不再进行赘述。示例性地,在第三特征向量与各个第一档案的相似度均不满足阈值,例如第三特征向量与各个第一档案对应的第一中心特征向量之间的向量距离均不小于上述第一距离阈值的情况下,则认为未命中第一档案。因此,使用第三特征向量与多个档案中的第二档案进行匹配。其中,第二档案满足的条件包括:图像归档频率低于频率阈值。
示例性地,本实施例使用第三特征向量与各个第二档案依次进行匹配。或者,参见图13中1304示出的档间类比对过程,在示例性实施例中,本实施例还可以参考301-304中基于档间类的匹配方式,对多个档案中除第一档案之外的其他档案进行聚类,从而得到多个档间类。其中,在第一档案满足的条件仅包括图像归档频率高于频率阈值的情况下,多个档案中除第一档案之外的其他档案等同于第二档案。而在第一档案还满足其他条件的情况下,多个档案中除第一档案之外的其他档案包括第二档案,还包括第二档案之外的档案。在多个档间类中,选择与第五图像的相似程度大于一定阈值的档间类,从而基于选择出的档间类包括的档案确定出第二档案。
示例性地,本实施例将选择出的档间类包括的所有档案均确定为第二档案。或者,示例性地,参见图13中1304示出的短特征比对过程,确定选择出的档间类包括的档案的短特征向量,确定方式参见上文303中的说明。之后,基于码表生成第五图像的距离表,根据距离表确定各个档案的短特征向量与第五图像的第三特征向量之间的向量距离,将向量距离小于一定阈值的档案确定为第二档案。
在确定出第二档案之后,使用第三特征向量与第二档案进行匹配。匹配过程中,参见图13中1304示出的长特征比对过程,对于各个第二档案,分别确定第三特征向量与该第二档案的多个代表特征向量之间的多个向量距离,也就是一个第二档案对应多个向量距离。之后,根据所有第二档案对应的向量距离确定是否命中第二档案中的一个第二目标档案,详见1206中的说明。
1206,响应于命中第二档案中的一个第二目标档案,则基于第二目标档案进行图像处理。
示例性地,在所有第二档案对应的向量距离中,确定最小向量距离对应的第二档案,从而将最小向量距离对应的第二档案确定为命中的第二目标档案。或者,对一个第二档案对应的多个向量距离进行加权求和,得到该第二档案对应的加权求和距离。之后,将最小加权求和距离对应的第二档案确定为命中的第二目标档案。在命中第二目标档案之后,基于第二目标档案进行图像处理,包括将第五图像归档至第二目标档案以及从第二目标档案中读取图像中的至少一种。
当然,在实际应用中,还可能存在第五图像未命中第二档案的情况。此种情况下,可以重新按照上述说明的各步骤对该第五图像进行处理。或者,等待后续获得其他图像之后,对第五图像与后续获得的其他图像进行聚类,从而进行批量处理。另外,需要说明的是,上述1204-1206中的说明针对于一张第五图像未命中第一档案的情况。对于对多张图像进行聚类,并对一个类别对应的图像进行批量归档的情况,则将1204-1206说明中第五图像对应的第三特征向量替换为类别对应的第二中心特征向量,此处不再进行赘述。
接下来,参见图14,以对多张图像进行批量归档为例,对本申请实施例提供的图像归档方法的整体流程进行说明。
在开始进行图像归档之前,首先按照1409确定多个档案中各个档案的典型特征向量,典型特征向量包括一个中心特征向量以及多个代表特征向量,典型特征向量属于档案的长特征向量。按照1412对多个档案进行聚类,得到多个档间类,档间类对应有中心特征向量。按照1410生成码表,基于码表将各个档案的中心特征向量转换为短特征向量,各个档案的短特征向量可以存储于档案库。按照1411统计得到高频档案,其中,高频档案对应于上述说明中图像归档频率大于频率阈值的档案。
在获得多张待归档的图像之后,开始进行图像归档。参见1401,对图像进行特征提取,得到图像的长特征向量。参见1402,根据图像数量的不同选择不同的聚类方式。其中,如果长特征向量的量级不大于阈值,则直接基于长特征向量进行聚类。如果长特征向量的量级大于阈值,则基于1410生成的码表将图像的长特征向量转换为图像的短特征向量,再基于图像的短特征向量进行聚类。两种聚类方式中的任一种均能够得到多个类别,每个类别具有中心特征向量,每个类别均按照如下的说明进行比对。
在1403中,使用类别的中心特征向量与高频档案的中心特征向量进行比对。比对过程中,首先选择与类别的拍摄时段和/或拍摄区域匹配的高频档案进行比对,再选择其他的高频档案进行比对。如果命中一个高频档案,则转入1407进行聚类精度的确定。在1407中,如果聚类精度满足要求,则将类别对应的各个图像以及图像的长特征向量通过1408归档并存储于档案库。如果聚类精度不满足要求,则按照1407将类别中各个图像的长特征向量逐个与命中的高频档案的中心特征向量进行比对。比对成功的图像通过1408归档并存储于档案库,比对不成功的图像则返回1402,等待与后续获得的其他图像重新进行聚类,再重新进行归档。
如果未命中任何高频档案,则转入1404与档间类进行比对。在1404中,确定类别的中心特征向量与档间类的中心特征向量之间的距离。对距离按照从小到大进行排序,选择前X个距离对应的档间类并转入1405。在1405中,获得1404中选择出的档间类包括的档案的短特征向量,再确定类别的中心特征向量与这些短特征向量之间的距离。对距离按照从小到大进行排序,选择前Y个距离对应的档案。在1406中,获得1405中选择出的档案的代表特征向量,确定类别的中心特征向量与这些代表特征向量之间的多个距离,从而根据得到的多个距离命中一个档案,例如将最小距离对应的档案作为命中的档案。接着,转入1407进行聚类精度的确定,具体过程参见上文对于1407的说明,此处不再进行赘述。
另外,在图14所示的过程中,响应于通过1408进行了归档入库,则触发1411进行各个档案的图像归档频率的统计,以便于后续进行高频档案的更新。
根据以上说明可知,在图像归档过程中涉及大量的、不同特征向量之间的计算。在示例性实施例中,本实施例针对计算过程进行加速,以便于缩短处理过程所需的时间、提高处理效率。
其中,对于短特征向量相关的计算,本实施例通过知识产权(intellectual property,IP)内核(core)固化计算逻辑,以实现短特征向量相关的计算加速。其中,IP内核基于HardQ算法建立,HardQ算法基于PQ算法调整得到。如图15所示,本实施例中短特征向量相关的计算包括但不限于:距离查询(通过查询距离表获得距离)、距离组合(对从距离表中查询到的多个距离进行组合,得到不同短特征向量之间的距离)以及距离排序(按照一定顺序对多个距离进行排序)等等。
对于长特征向量相关的计算,例如计算图像的特征向量与档案的代表特征向量之间的距离,本实施例通过提供距离计算算子来实现计算加速。其中,该距离计算算子的作用包括但不限于:计算余弦距离、计算欧式距离以及进行距离排序等等。另外,如图16所示,距离计算算子是基于AI内核的算子。
示例性地,本实施例通过如图15所示的现场可编程逻辑门阵列(field programmable gate array,FPGA)提供IP内核以及距离计算算子,FPGA由软件开发工具包(software development kit,SDK)开发得到。或者,还可以通过如图16所示的专用集成电路(application specific integrated circuit,ASIC)提供IP内核以及距离计算算子。
以上介绍了本申请的图像档案的处理方法,与上述方法对应,本申请还提供图像档案的处理装置。该装置用于通过图17所示的各个模块执行上述图3及图12中所示的图像档案的处理方法。如图17所示,本申请提供的图像档案的处理装置包括如下几个模块。
获得模块1701,用于获得第一图像,对第一图像进行特征提取,得到第一图像对应的第一特征向量。获得模块1701所执行的步骤参见上文301、1201以及1204中的说明,此处不再进行赘述。
聚合模块1702,用于对档案库中的多个档案进行聚合,得到多个档间类,多个档间类的数量小于多个档案的数量。聚合模块1702所执行的步骤参见上文302中的说明,此处不再进行赘述。
确定模块1703,用于从多个档间类中确定与第一图像的相似程度大于第一阈值的目标档间类,目标档间类的数量小于多个档间类的数量。确定模块1703所执行的步骤参见上文302中的说明,此处不再进行赘述。
匹配模块1704,用于使用第一特征向量与目标档间类包括的档案中的备选档案进行匹配。匹配模块1704所执行的步骤参见上文303、1202和1205中的说明,此处不再进行赘述。
处理模块1705,用于响应于命中备选档案中的一个目标档案,基于目标档案进行图像处理。处理模块1705所执行的步骤参见上文304、1203和1206中的说明,此处不再进行赘述。
在示例性实施例中,确定模块1703,还用于确定目标档间类包括的档案的短特征向量;根据目标档间类包括的档案的短特征向量,在目标档间类包括的档案中将与第一图像之间的相似程度大于第二阈值的档案确定为备选档案。
在示例性实施例中,多个档案中的各个档案分别对应多个代表特征向量,匹配模块1704,用于基于各个档案对应的多个代表特征向量,在备选档案中确定与第一特征向量之间的相似程度最大的备选档案,所确定的备选档案用于作为目标档案。
在示例性实施例中,聚合模块1702,用于将多个档案中相似程度大于第三阈值的档案聚合为同一档间类,得到多个档间类。
在示例性实施例中,获得模块1701,还用于获得第二图像;
处理模块1705,还用于响应于第二图像与第一图像对应同一对象,将第二图像与第一图像一并归档至目标档案。
在示例性实施例中,获得模块1701,还用于获得第三图像;
确定模块1703,还用于响应于第三图像与第一图像相似,确定第三图像是否与目标档案相匹配;
处理模块1705,还用于响应于第三图像与目标档案相匹配,将第三图像与第一图像一并归档至目标档案。
在示例性实施例中,获得模块1701,还用于获得第四图像,对第四图像进行特征提取,得到第四图像对应的第二特征向量;
匹配模块1704,还用于使用第二特征向量与多个档案中的第一档案进行匹配,其中,第一档案的数量小于多个档案的数量,第一档案满足的条件包括:图像归档频率高于频率阈值;
处理模块1705,还用于响应于命中第一档案中的一个第一目标档案,基于第一目标档案进行图像处理。
在示例性实施例中,第一档案满足的条件还包括:第一档案的拍摄区域与第四图像的拍摄区域相匹配,第一档案的拍摄区域基于第一档案中的图像的拍摄区域确定。
在示例性实施例中,第一档案满足的条件还包括:第一档案的拍摄时段与第四图像的拍摄时段相匹配,第一档案的拍摄时段基于第一档案中的图像的拍摄时段确定。
在示例性实施例中,获得模块1701,还用于获得第五图像,对第五图像进行特征提取,得到第五图像对应的第三特征向量;
匹配模块1704,还用于使用第三特征向量与第一档案进行匹配;响应于未命中第一档案,使用第三特征向量与多个档案中的第二档案进行匹配,其中,第二档案满足的条件包括:图像归档频率低于频率阈值;
处理模块1705,还用于响应于命中第二档案中的一个第二目标档案,基于第二目标档案进行图像处理。
在示例性实施例中,处理模块1705,用于将第一图像归档至目标档案。
在示例性实施例中,处理模块1705,用于从目标档案中读取图像。
应理解的是,上述图17提供的装置在实现其功能时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将设备的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的装置与方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
本申请提供了一种图像档案的处理设备,该设备包括:通信接口和处理器,可选的,该通信设备还包括存储器。其中,该通信接口、该存储器和该处理器通过内部连接通路互相通信,该存储器用于存储指令,该处理器用于执行该存储器存储的指令,以控制通信接口接收信号,并控制通信接口发送信号,并且当该处理器执行该存储器存储的指令时,使得该处理器执行本申请所提供的任一种示例性的图像档案的处理方法。
参见图18,图18示出了本申请一示例性的图像档案的处理设备1800的结构示意图。图18所示的图像档案的处理设备1800用于执行上述图3及图12所示的图像档案的处理方法所涉及的操作。该图像档案的处理设备1800例如是一台服务器、由多台服务器组成的服务器集群,或者是一个云计算服务中心等。
如图18所示,图像档案的处理设备1800包括至少一个处理器1801、存储器1803以及至少一个通信接口1804。
处理器1801例如是通用CPU、数字信号处理器(digital signal processor,DSP)、网络处理器(network processer,NP)、GPU、神经网络处理器(neural-network processing units,NPU)、数据处理单元(Data Processing Unit,DPU)、微处理器或者一个或多个用于实现本申请方案的集成电路或专用集成电路(application-specific integrated circuit,ASIC),可编程逻辑器件(programmable logic device,PLD)或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。PLD例如是复杂可编程逻辑器件(complex programmable logic device,CPLD)、现场可编程逻辑门阵列(field-programmable gate array,FPGA)、通用阵列逻辑(generic array logic,GAL)或其任意组合。其可以实现或执行结合本申请公开内容所描述的各种逻辑方框、模块和电路。处理器也可以是实现计算功能的组合,例如包括一个或多个微处理器组合,DSP和微处理器的组合等等。
可选的,图像档案的处理设备1800还包括总线1802。总线1802用于在图像档案的处理设备1800的各组件之间传送信息。总线1802可以是外设部件互连标准(peripheral component interconnect,简称PCI)总线或扩展工业标准结构(extended industry standard architecture,简称EISA)总线等。总线1802可以分为地址总线、数据总线、控制总线等。为便于表示,图18中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
存储器1803例如是只读存储器(read-only memory,ROM)或可存储静态信息和指令的其它类型的存储设备,又如是随机存取存储器(random access memory,RAM)或者可存储信息和指令的其它类型的动态存储设备,又如是电可擦可编程只读存储器(electrically erasable programmable read-only Memory,EEPROM)、只读光盘(compact disc read-only memory,CD-ROM)或其它光盘存储、光碟存储(包括压缩光碟、激光碟、光碟、数字通用光碟、蓝光光碟等)、磁盘存储介质或者其它磁存储设备,或者是能够用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计算机存取的任何其它介质,但不限于此。存储器1803例如是独立存在,并通过总线1802与处理器1801相连接。存储器1803也可以和处理器1801集成在一起。
通信接口1804使用任何收发器一类的装置,用于与其它设备或通信网络通信,通信网络可以为以太网、无线接入网(radio access network,RAN)或无线局域网(wireless local area network,WLAN)等。通信接口1804可以包括有线通信接口,还可以包括无线通信接口。具体的,通信接口1804可以为以太(Ethernet)接口,如:快速以太(Fast Ethernet,FE)接口、千兆以太(Gigabit Ethernet,GE)接口,异步传输模式(Asynchronous Transfer Mode,ATM)接口,WLAN接口,蜂窝网络通信接口或其组合。以太网接口可以是光接口,电接口或其组合。在本申请的一些实施方式中,通信接口1804可以用于图像档案的处理设备1800与其他设备进行通信。
在具体实现中,作为一些实施方式,处理器1801可以包括一个或多个CPU,如图18中所示的CPU0和CPU1。这些处理器中的每一个可以是一个单核处理器,也可以是一个多核处理器。这里的处理器可以指一个或多个设备、电路、和/或用于处理数据(例如计算机程序指令)的处理核。
在具体实现中,作为一些实施方式,图像档案的处理设备1800可以包括多个处理器,如图18中所示的处理器1801和处理器1805。这些处理器中的每一个可以是一个单核处理器,也可以是一个多核处理器。这里的处理器可以指一个或多个设备、电路、和/或用于处理数据(如计算机程序指令)的处理核。
在一些实施方式中,存储器1803用于存储执行本申请方案的程序代码1810,处理器1801可以执行存储器1803中存储的程序代码1810。也即是,图像档案的处理设备1800可以通过处理器1801以及存储器1803中的程序代码1810,来实现方法实施例提供的图像档案的处理方法。程序代码1810中可以包括一个或多个软件模块。可选地,处理器1801自身也可以存储执行本申请方案的程序代码或指令。
在具体实施过程中,本申请的图像档案的处理设备1800可对应于用于执行上述方法的设备,图像档案的处理设备1800中的处理器1801读取存储器1803中的指令,使图18所示的图像档案的处理设备1800能够执行方法实施例中的全部或部分步骤。
图像档案的处理设备1800还可以对应于上述图17所示的装置,图17所示的装置中的每个功能模块采用图像档案的处理设备1800的软件实现。换句话说,图17所示的装置包括的功能模块为图像档案的处理设备1800的处理器1801读取存储器1803中存储的程序代码1810后生成的。
其中,图3及图12所示的图像档案的处理方法的各步骤通过图像档案的处理设备1800的处理器中的硬件的集成逻辑电路或者软件形式的指令完成。结合本申请所公开的方法实施例的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法实施例的步骤,为避免重复,这里不再详细描述。
应理解的是,上述处理器可以是中央处理器(Central Processing Unit,CPU),还可以是其他通用处理器、数字信号处理器(digital signal processing,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现场可编程门阵列(field-programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者是任何常规的处理器等。值得说明的是,处理器可以是支持进阶精简指令集机器(advanced RISC machines,ARM)架构的处理器。
进一步地,在一种可选的实施例中,上述存储器可以包括只读存储器和随机存取存储器,并向处理器提供指令和数据。存储器还可以包括非易失性随机存取存储器。例如,存储器还可以存储设备类型的信息。
该存储器可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用。例如,静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(dynamic random access memory,DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。
本申请提供了一种计算机程序,当计算机程序被计算机执行时,可以使得处理器或计算机执行上述方法实施例中对应的各个步骤和/或流程。
本申请实施例提供了一种计算机程序(产品),计算机程序(产品)包括:计算机程序代码,当计算机程序代码被计算机运行时,使得计算机执行上述任一种示例性实施所提供的方法。
本申请实施例提供了一种计算机可读存储介质,计算机可读存储介质存储程序或指令,当程序或指令在计算机上运行时,上述任一种示例性实施所提供的方法被执行。
本申请实施例提供了一种芯片,包括处理器,用于从存储器中调用并运行存储器中存储的指令,使得安装有芯片的通信设备执行上述任一种示例性实施所提供的方法。
本申请实施例提供另一种芯片,包括:输入接口、输出接口、处理器和存储器,输入接口、输出接口、处理器以及存储器之间通过内部连接通路相连,处理器用于执行存储器中的代码,当代码被执行时,处理器用于执行上述任一种示例性实施所提供的方法。
在上述实施方式中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行计算机程序指令时,全部或部分地产生按照本申请的流程或功能。计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线)或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如固态硬盘)等。
在本申请的上下文中,计算机程序代码或者相关数据可以由任意适当载体承载,以使得设备、装置或者处理器能够执行上文描述的各种处理和操作。载体的示例包括计算机可 读介质等等。
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and modules described above may be found in the corresponding processes in the foregoing methods and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the devices described above are merely illustrative; the division into modules is merely a division of logical functions, and other divisions are possible in actual implementation; for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or modules, and may be electrical, mechanical, or in other forms.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present application.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing module, each module may exist physically alone, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
In the present application, the terms "first", "second", and the like are used to distinguish identical or similar items having substantially the same role and function. It should be understood that there is no logical or temporal dependency among "first", "second", and "nth", and that these terms impose no limitation on quantity or execution order. It should also be understood that although the following description uses the terms first, second, and so on to describe various elements, these elements should not be limited by the terms; the terms are only used to distinguish one element from another. For example, without departing from the scope of the various examples, a first device could be termed a second device, and similarly, a second device could be termed a first device. Both the first device and the second device may be communication devices and, in some cases, may be separate and distinct devices.
It should also be understood that, in the various embodiments of the present application, the sequence numbers of the processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the present application.
In the present application, the term "at least one" means one or more, and the term "multiple" means two or more. The terms "system" and "network" are often used interchangeably herein.
It should be understood that determining B based on A does not mean determining B based only on A; B may also be determined based on A and other information.
It should also be understood that references throughout the specification to "one embodiment", "an embodiment", or "a possible implementation" mean that a particular feature, structure, or characteristic related to the embodiment or implementation is included in at least one embodiment of the present application. Therefore, appearances of "in one embodiment", "in an embodiment", or "a possible implementation" in various places throughout the specification do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
The above embodiments are merely intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of the technical features therein; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.
Claims (26)
- An image archive processing method, characterized in that the method comprises: obtaining a first image, and performing feature extraction on the first image to obtain a first feature vector corresponding to the first image; aggregating multiple archives in an archive library to obtain multiple inter-archive classes, and determining, from the multiple inter-archive classes, target inter-archive classes whose degree of similarity to the first image is greater than a first threshold, wherein the number of the multiple inter-archive classes is smaller than the number of the multiple archives, and the number of the target inter-archive classes is smaller than the number of the multiple inter-archive classes; matching the first feature vector against candidate archives among the archives included in the target inter-archive classes; and in response to hitting a target archive among the candidate archives, performing image processing based on the target archive.
- The method according to claim 1, characterized in that, before the matching of the first feature vector against the candidate archives among the archives included in the target inter-archive classes, the method further comprises: determining short feature vectors of the archives included in the target inter-archive classes; and, according to the short feature vectors of the archives included in the target inter-archive classes, determining, among those archives, archives whose degree of similarity to the first image is greater than a second threshold as the candidate archives.
- The method according to claim 1 or 2, characterized in that each of the multiple archives corresponds to multiple representative feature vectors, and the matching of the first feature vector against the candidate archives among the archives included in the target inter-archive classes comprises: based on the multiple representative feature vectors corresponding to each archive, determining, among the candidate archives, the candidate archive with the greatest degree of similarity to the first feature vector, the determined candidate archive serving as the target archive.
- The method according to any one of claims 1-3, characterized in that the aggregating of the multiple archives in the archive library to obtain the multiple inter-archive classes comprises: aggregating archives among the multiple archives whose degree of similarity is greater than a third threshold into the same inter-archive class, to obtain the multiple inter-archive classes.
- The method according to any one of claims 1-4, characterized in that the method further comprises: obtaining a second image; and, in response to the second image and the first image corresponding to the same object, archiving the second image together with the first image into the target archive.
- The method according to any one of claims 1-4, characterized in that the method further comprises: obtaining a third image; in response to the third image being similar to the first image, determining whether the third image matches the target archive; and, in response to the third image matching the target archive, archiving the third image together with the first image into the target archive.
- The method according to any one of claims 1-6, characterized in that the method further comprises: obtaining a fourth image, and performing feature extraction on the fourth image to obtain a second feature vector corresponding to the fourth image; matching the second feature vector against first archives among the multiple archives, wherein the number of the first archives is smaller than the number of the multiple archives, and the conditions satisfied by the first archives include: an image archiving frequency higher than a frequency threshold; and, in response to hitting a first target archive among the first archives, performing image processing based on the first target archive.
- The method according to claim 7, characterized in that the conditions satisfied by the first archives further include: the capture region of the first archive matches the capture region of the fourth image, the capture region of the first archive being determined based on the capture regions of the images in the first archive.
- The method according to claim 7 or 8, characterized in that the conditions satisfied by the first archives further include: the capture period of the first archive matches the capture period of the fourth image, the capture period of the first archive being determined based on the capture periods of the images in the first archive.
- The method according to any one of claims 7-9, characterized in that the method further comprises: obtaining a fifth image, and performing feature extraction on the fifth image to obtain a third feature vector corresponding to the fifth image; matching the third feature vector against the first archives; in response to missing the first archives, matching the third feature vector against second archives among the multiple archives, wherein the conditions satisfied by the second archives include: an image archiving frequency lower than the frequency threshold; and, in response to hitting a second target archive among the second archives, performing image processing based on the second target archive.
- The method according to any one of claims 1-10, characterized in that the performing of image processing based on the target archive comprises: archiving the first image into the target archive.
- The method according to any one of claims 1-10, characterized in that the performing of image processing based on the target archive comprises: reading an image from the target archive.
- An image archive processing apparatus, characterized in that the apparatus comprises: an obtaining module configured to obtain a first image and perform feature extraction on the first image to obtain a first feature vector corresponding to the first image; an aggregation module configured to aggregate multiple archives in an archive library to obtain multiple inter-archive classes, the number of the multiple inter-archive classes being smaller than the number of the multiple archives; a determining module configured to determine, from the multiple inter-archive classes, target inter-archive classes whose degree of similarity to the first image is greater than a first threshold, the number of the target inter-archive classes being smaller than the number of the multiple inter-archive classes; a matching module configured to match the first feature vector against candidate archives among the archives included in the target inter-archive classes; and a processing module configured to, in response to hitting a target archive among the candidate archives, perform image processing based on the target archive.
- The apparatus according to claim 13, characterized in that the determining module is further configured to: determine short feature vectors of the archives included in the target inter-archive classes; and, according to the short feature vectors of the archives included in the target inter-archive classes, determine, among those archives, archives whose degree of similarity to the first image is greater than a second threshold as the candidate archives.
- The apparatus according to claim 13 or 14, characterized in that each of the multiple archives corresponds to multiple representative feature vectors, and the matching module is configured to, based on the multiple representative feature vectors corresponding to each archive, determine, among the candidate archives, the candidate archive with the greatest degree of similarity to the first feature vector, the determined candidate archive serving as the target archive.
- The apparatus according to any one of claims 13-15, characterized in that the aggregation module is configured to aggregate archives among the multiple archives whose degree of similarity is greater than a third threshold into the same inter-archive class, to obtain the multiple inter-archive classes.
- The apparatus according to any one of claims 13-16, characterized in that the obtaining module is further configured to obtain a second image; and the processing module is further configured to, in response to the second image and the first image corresponding to the same object, archive the second image together with the first image into the target archive.
- The apparatus according to any one of claims 13-16, characterized in that the obtaining module is further configured to obtain a third image; the determining module is further configured to, in response to the third image being similar to the first image, determine whether the third image matches the target archive; and the processing module is further configured to, in response to the third image matching the target archive, archive the third image together with the first image into the target archive.
- The apparatus according to any one of claims 13-18, characterized in that the obtaining module is further configured to obtain a fourth image and perform feature extraction on the fourth image to obtain a second feature vector corresponding to the fourth image; the matching module is further configured to match the second feature vector against first archives among the multiple archives, wherein the number of the first archives is smaller than the number of the multiple archives, and the conditions satisfied by the first archives include: an image archiving frequency higher than a frequency threshold; and the processing module is further configured to, in response to hitting a first target archive among the first archives, perform image processing based on the first target archive.
- The apparatus according to claim 19, characterized in that the conditions satisfied by the first archives further include: the capture region of the first archive matches the capture region of the fourth image, the capture region of the first archive being determined based on the capture regions of the images in the first archive.
- The apparatus according to claim 19 or 20, characterized in that the conditions satisfied by the first archives further include: the capture period of the first archive matches the capture period of the fourth image, the capture period of the first archive being determined based on the capture periods of the images in the first archive.
- The apparatus according to any one of claims 19-21, characterized in that the obtaining module is further configured to obtain a fifth image and perform feature extraction on the fifth image to obtain a third feature vector corresponding to the fifth image; the matching module is further configured to match the third feature vector against the first archives and, in response to missing the first archives, match the third feature vector against second archives among the multiple archives, wherein the conditions satisfied by the second archives include: an image archiving frequency lower than the frequency threshold; and the processing module is further configured to, in response to hitting a second target archive among the second archives, perform image processing based on the second target archive.
- The apparatus according to any one of claims 13-22, characterized in that the processing module is configured to archive the first image into the target archive.
- The apparatus according to any one of claims 13-22, characterized in that the processing module is configured to read an image from the target archive.
- An image archive processing device, characterized in that the device comprises a memory and a processor; the memory stores at least one instruction, and the at least one instruction is loaded and executed by the processor to implement the image archive processing method according to any one of claims 1-12.
- A computer-readable storage medium, characterized in that the storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the image archive processing method according to any one of claims 1-12.
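Editorial note: taken together, claims 1, 3, and 4 describe a coarse-to-fine lookup — archives are first aggregated into inter-archive classes by a third similarity threshold, a query feature vector is then compared only against classes above the first threshold, and the target archive is the most similar candidate within those classes. The following is a minimal sketch of that two-stage procedure; the choice of cosine similarity, the greedy single-pass aggregation, and the use of a single representative vector per archive are illustrative assumptions, not requirements of the claims.

```python
import numpy as np


def cosine_similarity(a, b):
    # The claims only require a "degree of similarity";
    # cosine similarity is an illustrative choice.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def aggregate_archives(archives, third_threshold):
    """Claim 4 (sketch): merge archives whose similarity exceeds the
    third threshold into the same inter-archive class, greedily."""
    classes = []
    for name, vector in archives.items():
        for cls in classes:
            if cosine_similarity(vector, cls["centroid"]) > third_threshold:
                cls["members"].append(name)
                break
        else:
            classes.append({"centroid": vector, "members": [name]})
    return classes


def match_image(feature, archives, classes, first_threshold):
    """Claims 1 and 3 (sketch): the coarse step keeps only classes whose
    similarity to the query exceeds the first threshold; the fine step
    matches the query against the member (candidate) archives and
    returns the most similar one as the target archive."""
    best_name, best_sim = None, -1.0
    for cls in classes:
        if cosine_similarity(feature, cls["centroid"]) <= first_threshold:
            continue  # prune: every archive in this class is skipped
        for name in cls["members"]:
            sim = cosine_similarity(feature, archives[name])
            if sim > best_sim:
                best_name, best_sim = name, sim
    return best_name, best_sim
```

Because dissimilar classes are pruned before any per-archive comparison, the query touches far fewer archives than a flat scan, which is the efficiency effect the claims aim at (fewer inter-archive classes than archives, fewer target classes than classes).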
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011158362 | 2020-10-26 | ||
CN202011158362.5 | 2020-10-26 | ||
CN202110119296.9 | 2021-01-28 | ||
CN202110119296.9A CN114510587A (zh) | 2020-10-26 | 2021-01-28 | Image archive processing method, apparatus, device, and computer-readable storage medium
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022088909A1 (zh) | 2022-05-05 |
Family
ID=81383524
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/115209 WO2022088909A1 (zh) | 2020-10-26 | 2021-08-30 | 图像档案的处理方法、装置、设备及计算机可读存储介质 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2022088909A1 (zh) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090290798A1 (en) * | 2005-08-31 | 2009-11-26 | Toyota Jidosha Kabushiki Kaisha | Image search method and device |
US20130259324A1 (en) * | 2012-04-03 | 2013-10-03 | Chung Hua University | Method for face recognition |
CN111144332A (zh) * | 2019-12-30 | 2020-05-12 | 深圳云天励飞技术有限公司 | Picture clustering and archiving method, apparatus, and electronic device
CN111488894A (zh) * | 2019-01-25 | 2020-08-04 | 华为技术有限公司 | Archive merging method and apparatus
CN111695441A (zh) * | 2020-05-20 | 2020-09-22 | 平安科技(深圳)有限公司 | Image document processing method, apparatus, and computer-readable storage medium
- 2021-08-30 WO PCT/CN2021/115209 patent/WO2022088909A1/zh active Application Filing
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10896164B2 | Sample set processing method and apparatus, and sample querying method and apparatus | |
JP6662990B2 (ja) | System and method for modeling an object network | |
Pedronette et al. | Multimedia retrieval through unsupervised hypergraph-based manifold ranking | |
JP2013519152A (ja) | Method and system for text classification | |
US20100241615A1 | Mitigation of obsolescence for archival services | |
JPWO2013129580A1 (ja) | Approximate nearest neighbor search device, approximate nearest neighbor search method, and program therefor | |
TWI769665B (zh) | Target data updating method, electronic device, and computer-readable storage medium | |
CN113918753A (zh) | Artificial-intelligence-based image retrieval method and related device | |
JP7108784B2 (ja) | Data storage method, data acquisition method, and device | |
Chakraborty et al. | A novel shot boundary detection system using hybrid optimization technique | |
TW202217597A (zh) | Incremental image clustering method, electronic device, and computer storage medium | |
CN108062384A (zh) | Data retrieval method and apparatus | |
CN112069875B (zh) | Face image classification method and apparatus, electronic device, and storage medium | |
CN109218366A (zh) | K-means-based popularity-aware cloud storage method for surveillance video | |
CN112947860A (zh) | Hierarchical storage and scheduling method for distributed data replicas | |
TW202109312A (zh) | Image feature extraction and network training method, electronic device, and computer-readable storage medium | |
CN115878824B (zh) | Image retrieval system, method, and apparatus | |
Doshi et al. | LANNS: a web-scale approximate nearest neighbor lookup system | |
WO2022063150A1 (zh) | Data storage method and apparatus, and data query method and apparatus | |
WO2022088909A1 (zh) | Image archive processing method, apparatus, device, and computer-readable storage medium | |
US11599577B2 | System and method for content-hashed object storage | |
Guo et al. | Event recognition in personal photo collections using hierarchical model and multiple features | |
WO2015176840A1 | Offline, hybrid and hybrid with offline image recognition | |
CN115146103A (zh) | Image retrieval method and apparatus, computer device, storage medium, and program product | |
CN110059148A (zh) | Accurate search method for spatial keyword queries applied to electronic maps |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21884641 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21884641 Country of ref document: EP Kind code of ref document: A1 |