CN111860627A - Feature comparison method, system, device and medium - Google Patents

Feature comparison method, system, device and medium

Info

Publication number
CN111860627A
CN111860627A
Authority
CN
China
Prior art keywords
feature
similarity
scalar data
range
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010640869.8A
Other languages
Chinese (zh)
Inventor
Wang Pengfei (王鹏飞)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202010640869.8A priority Critical patent/CN111860627A/en
Publication of CN111860627A publication Critical patent/CN111860627A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a feature comparison method, comprising the steps of: constructing a feature base library; calculating the pairwise similarity among the plurality of feature vectors corresponding to each object in the feature base library to obtain the similarity range corresponding to each object; taking the union of the similarity ranges of the multiple objects to obtain a threshold range of similarity; calculating the similarity between the feature vector to be compared and each feature vector in the feature base library; sorting the feature vectors corresponding to the similarities within the threshold range; and returning the object IDs respectively corresponding to the feature vectors to the user based on the sorting. The invention also discloses a system, a computer device, and a readable storage medium. In the proposed scheme, similarities are computed among all feature vectors of the same object in the feature base library, so that an accurate similarity threshold range is determined, and abnormally high similarity values caused by anomalous conditions are prevented from distorting the actual comparison result.

Description

Feature comparison method, system, device and medium
Technical Field
The present invention relates to the field of feature comparison, and in particular, to a feature comparison method, system, device, and storage medium.
Background
Rapid comparison refers to computing the similarity between features extracted in real time and a feature base library, selecting the features with the highest similarity, and returning their indices. Many similarity measures exist, such as Euclidean distance, the Pearson correlation coefficient, and cosine similarity; at their core they all reduce to vector multiply and add operations. The feature base library consists of large-scale biometric features extracted in advance, such as face feature vectors and voiceprint feature vectors; as the capability of and demand for computing equipment keep growing, base libraries have expanded to tens of millions or even hundreds of millions of entries. In real-time feature comparison, the vectors a client requests must be compared one by one against a huge base library on a single device, so the response latency is considerable.
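As a concrete illustration (not part of the patent text), the similarity measures named above reduce to simple vector arithmetic; a minimal NumPy sketch:

```python
import numpy as np

def cosine_similarity(a, b):
    # Normalised dot product: the multiply-and-add core noted above.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a, b):
    return float(np.linalg.norm(a - b))

a, b = np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 3.0])
print(round(cosine_similarity(a, b), 6), euclidean_distance(a, b))  # -> 1.0 0.0
```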
Most existing schemes run on CPU devices, extracting feature data in combination with big data platforms or search engines. Features are usually stored as single-precision fixed-dimension vectors, followed by an M:N comparison (M being the number of real-time features and N the number of base-library features). Such schemes can store data on hard disk and therefore have high storage redundancy. Many mature algorithms exist for the similarity and top-N sorting stages of feature comparison, for example computing Euclidean distances and screening with bitonic sorting to obtain the top-N results. These schemes can meet user requirements to some extent and, combined with CPU computing equipment, can be deployed for comparisons at the scale of millions to billions. Their technical flow is: 1. store the base-library feature vectors in memory; 2. compare the current real-time feature against all base-library features; 3. quicksort the comparison results. However, CPU-based feature comparison has limited computational throughput, and for large-scale data comparison its long latency degrades the overall efficiency of the comparison system.
Feature comparison schemes based on GPU-CUDA also exist; their basic flow is similar to the CPU flow, but the GPU's strong floating-point capability and parallel architecture greatly reduce the number of computing devices required for the same task, provide high floating-point throughput, and allow multiple comparison tasks to run in parallel. However, because such GPU schemes store vectors in single precision and video memory is limited, the number of tasks a single device can handle drops, and hand-written similarity and sorting CUDA kernels become inefficient and slow when facing a huge feature base library.
Disclosure of Invention
In view of the above, in order to overcome at least one aspect of the above problems, an embodiment of the present invention provides a feature comparison method, including the steps of:
constructing a feature base library;
calculating the pairwise similarity among a plurality of feature vectors corresponding to each object in the feature base library to obtain the similarity range corresponding to each object;
taking the union of the similarity ranges of the multiple objects to obtain a threshold range of similarity;
calculating the similarity between the feature vector to be compared and each feature vector in the feature base library;
sorting the feature vectors corresponding to the similarities within the threshold range;
and returning the object IDs respectively corresponding to the feature vectors to the user based on the sorting.
In some embodiments, building a base library of features further comprises:
respectively extracting feature vectors from a plurality of images of the object by using the trained neural network model;
and reducing the precision of the extracted plurality of feature vectors to a target precision.
In some embodiments, reducing the precision of the extracted plurality of feature vectors to a target precision further comprises:
sorting scalar data constituting the feature vector;
respectively calculating the data densities corresponding to the first M scalar data and the last M scalar data according to the range value corresponding to the target precision;
determining a first scaling value by dividing the scalar data with the maximum data density in the first M scalar data by the maximum value in the range values corresponding to the target precision, and determining a second scaling value by dividing the scalar data with the maximum data density in the last M scalar data by the minimum value in the range values corresponding to the target precision;
and taking the larger value of the first scaling value and the second scaling value as a final scaling value, and performing quantization processing on the plurality of feature vectors by using the final scaling value so as to reduce the precision of the plurality of feature vectors to the target precision.
In some embodiments, the calculating the data density corresponding to the first M scalar data and the last M scalar data according to the range value corresponding to the target precision further includes:
subtracting the maximum value in the range values corresponding to the target precision from each scalar data in the first M scalar data to obtain an intermediate value;
respectively counting the number of each scalar data in the first M scalar data and the scalar data in the corresponding intermediate value range to determine the data density according to the number; or,
subtracting the minimum value in the range value corresponding to the target precision from each scalar data in the last M scalar data to obtain an intermediate value;
counting the number of each scalar data in the last M scalar data and the scalar data in the corresponding intermediate value range respectively to determine the data density according to the number.
In some embodiments, quantizing the plurality of feature vectors using the final scaled value, further comprising:
dividing each scalar data constituting the feature vector by the scaling value.
In some embodiments, calculating a similarity between the feature vector to be compared and each feature vector in the feature base library further comprises:
and calculating, using the cuBLAS library, the similarity between the feature vector to be compared and each feature vector in the feature base library.
In some embodiments, taking the union of the similarity ranges of the plurality of objects to obtain the threshold range of similarity further comprises:
adjusting the threshold range.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides a feature comparison system, including:
a build module configured to build a base library of features;
a first calculation module, configured to calculate similarity between each two of a plurality of feature vectors corresponding to each object in the feature base library to obtain a similarity range corresponding to each object;
a threshold determination module configured to take the union of the similarity ranges of a plurality of objects to obtain a threshold range of similarity;
a second calculation module configured to calculate a similarity between the feature vector to be compared and each feature vector in the feature base library;
a sorting module configured to sort the feature vectors whose calculated similarity falls within the threshold range;
a returning module configured to return object IDs respectively corresponding to the feature vectors to a user based on the sorting.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides a computer apparatus, including:
at least one processor; and
a memory storing a computer program operable on the processor, wherein the processor executes the program to perform any of the steps of the feature comparison method as described above.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program, which when executed by a processor performs the steps of any of the feature comparison methods described above.
The invention has one of the following beneficial technical effects: according to the scheme provided by the invention, the similarity is calculated for all the feature vectors of the same object in the feature base library, so that the accurate similarity threshold range is determined, and the influence of overlarge similarity caused by abnormal conditions on the actual comparison result is prevented.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a feature comparison method provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a feature comparison system provided in an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a computer device provided in an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two entities or parameters that share a name but are not identical. "First" and "second" are merely for convenience of description and should not be construed as limiting the embodiments of the present invention, and the following embodiments do not repeat this note.
According to an aspect of the present invention, an embodiment of the present invention provides a feature comparison method, as shown in fig. 1, which may include the steps of:
S1, constructing a feature base library;
S2, calculating the pairwise similarity among the plurality of feature vectors corresponding to each object in the feature base library to obtain the similarity range corresponding to each object;
S3, taking the union of the similarity ranges of the multiple objects to obtain a threshold range of similarity;
S4, calculating the similarity between the feature vector to be compared and each feature vector in the feature base library;
S5, sorting the feature vectors corresponding to the similarities within the threshold range;
S6, returning the object IDs corresponding to a number of feature vectors (e.g., the top N, where N may be 10 or 15, etc.) to the user based on the sorting.
In the proposed scheme, similarities are computed among all feature vectors of the same object in the feature base library, so that an accurate similarity threshold range is determined, and abnormally high similarity values caused by anomalous conditions are prevented from distorting the actual comparison result.
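The S1–S6 flow above can be sketched end to end. This is one illustrative reading, not the patent's implementation: it assumes cosine similarity, toy 2-D vectors, and that the threshold range is the envelope of the per-object similarity ranges:

```python
import numpy as np

def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def threshold_range(base):
    # S2/S3: pairwise similarities per object, then the envelope of all ranges.
    lo, hi = 1.0, -1.0
    for vecs in base.values():
        sims = [cos_sim(vecs[i], vecs[j])
                for i in range(len(vecs)) for j in range(i + 1, len(vecs))]
        lo, hi = min(lo, min(sims)), max(hi, max(sims))
    return lo, hi

def compare(query, base, top_n=10):
    # S4-S6: score the query against every base vector, keep only scores
    # inside the learned range (scores above it count as anomalies), sort.
    lo, hi = threshold_range(base)
    hits = sorted(((cos_sim(query, v), oid)
                   for oid, vecs in base.items() for v in vecs
                   if lo <= cos_sim(query, v) <= hi), reverse=True)
    return [oid for _, oid in hits[:top_n]]

def unit(deg):  # toy 2-D "feature vector" parameterised by angle
    r = np.deg2rad(deg)
    return np.array([np.cos(r), np.sin(r)])

base = {"A": [unit(0), unit(10), unit(20)],
        "B": [unit(90), unit(102), unit(114)]}
print(compare(unit(5), base))  # -> ['A']
```

Note how the query's near-duplicate matches (similarity above the learned range) are filtered out, matching the stated goal of excluding anomalously high similarities.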
In some embodiments, in step S1, building a feature base library further includes:
S11, respectively extracting feature vectors from the plurality of images of the object by using the trained neural network model;
and S12, reducing the precision of the extracted plurality of feature vectors to the target precision.
Specifically, feature vectors of different object images (for example, face images against different backgrounds) can be extracted using a trained model. The resulting feature dimension generally ranges from 128 to 1024 (for example, 512), and the precision may be single precision. In general, the higher the vector dimension, the better the object features are represented, but the computation required both to extract the features and to compare them later grows significantly. The set of feature vectors obtained for the different objects constitutes the feature base library; for example, the base library may hold 12 million objects at 512 dimensions each. In this embodiment, the feature vectors may be further processed to half precision (fp16) or int8 precision, with int8 preferred. The base library imported into GPU video memory is prepared through this processing: converting fp32 precision to int8 precision can theoretically reduce the video memory footprint by 3/4, and in practice reduces it by about 3/5.
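A back-of-the-envelope check of the storage figures (illustrative arithmetic, not from the patent text), for the 12-million-object, 512-dimension example:

```python
# Illustrative memory footprint for a 12-million x 512-dim base library.
objects, dims = 12_000_000, 512
fp32_bytes = objects * dims * 4   # single precision, 4 bytes per scalar
int8_bytes = objects * dims * 1   # quantized, 1 byte per scalar
print(round(fp32_bytes / 2**30, 1), "GiB ->", round(int8_bytes / 2**30, 1), "GiB")
# -> 22.9 GiB -> 5.7 GiB (the theoretical 3/4 reduction)
```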
In some embodiments, step S12, reducing the precision of the extracted plurality of feature vectors to the target precision, further includes:
S121, sorting the scalar data forming the feature vectors;
S122, respectively calculating the data densities corresponding to the first M scalar data and the last M scalar data according to the range values corresponding to the target precision;
S123, determining a first scaling value by dividing the scalar data with the maximum data density among the first M scalar data by the maximum of the range values corresponding to the target precision, and determining a second scaling value by dividing the scalar data with the maximum data density among the last M scalar data by the minimum of the range values corresponding to the target precision;
S124, using the larger of the first scaling value and the second scaling value as the final scaling value, and quantizing the plurality of feature vectors with the final scaling value so as to reduce their precision to the target precision.
Specifically, for example, two extracted feature vectors may be A = (a1, a2, …, a512) and B = (b1, b2, …, b512); the scalar data constituting A and B are a1, a2, …, a512, b1, b2, …, b512, i.e., 1024 values in total. These 1024 values are then sorted by magnitude, the first M and the last M values are selected, and the data density and scaling value are computed from the range values corresponding to the target precision. Alternatively, all of the values may be marked on a coordinate axis, and the axis swept M steps from the maximum value and from the minimum value toward the origin to obtain M values on each side; the data density and scaling value are then computed from those values using the range values corresponding to the target precision.
In some embodiments, in step S122, calculating data densities corresponding to the first M pieces of scalar data and the last M pieces of scalar data according to the range values corresponding to the target precision, respectively, further includes:
S1221, subtracting the maximum of the range values corresponding to the target precision from each scalar value among the first M scalar data to obtain an intermediate value;
S1222, counting, for each scalar value among the first M scalar data, the number of scalar data within the range between it and its corresponding intermediate value, so as to determine the data density from that number; or,
S1223, subtracting the minimum of the range values corresponding to the target precision from each scalar value among the last M scalar data to obtain an intermediate value;
S1224, counting, for each scalar value among the last M scalar data, the number of scalar data within the range between it and its corresponding intermediate value, so as to determine the data density from that number.
Specifically, for each value m among the first M scalar data, the data density may be obtained as the number of data points lying within the range from (m − max) to m, where max is the maximum of the range values of the target precision (or, equivalently, as that count divided by max). Alternatively, for each value n among the last M scalar data, the data density is obtained as the number of data points lying within the range from n to (n − min), where min is the minimum of the range values of the target precision (or as that count divided by the magnitude of min).
For example, when the target precision is int8 (range −128 to 127), among the sorted 1024 scalar values constituting the feature vectors A and B, a1 to aM are the first M values and b1 to bM are the last M values. The 1024 values are counted within the ranges [a1 − 128, a1], [a2 − 128, a2], …, [aM − 128, aM], and the scalar value ax corresponding to the largest count is divided by 128 to obtain the first scaling value. Likewise, the 1024 values are counted within the ranges [b1, b1 − (−128)], [b2, b2 − (−128)], …, [bM, bM − (−128)], and the scalar value bx corresponding to the largest count is divided by −128 to obtain the second scaling value.
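The window counting in this example can be sketched as follows (a hypothetical helper with toy numbers, not real feature data):

```python
import numpy as np

def density_counts(scalars, candidates, span=128):
    # For each candidate c among the first-M scalars, count how many of the
    # scalars fall inside the window [c - span, c]; for the last-M (negative)
    # side the window would be [c, c + span] instead.
    scalars = np.asarray(scalars)
    return [int(((scalars >= c - span) & (scalars <= c)).sum())
            for c in candidates]

data = [200.0, 150.0, 120.0, 100.0, 90.0, 10.0]      # toy scalar data
print(density_counts(data, candidates=[200.0, 150.0]))  # -> [5, 4]
```

Here the candidate 200.0 is the denser choice (5 scalars within one int8 span), so it would be the one divided by 128 to form the first scaling value.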
In some embodiments, in step S124, performing quantization processing on the plurality of feature vectors by using the final scaling value, further includes:
dividing each scalar data constituting the feature vector by the scaling value.
Specifically, after the larger value of the first scaling value and the second scaling value is selected, all the feature vectors are quantized by using the larger scaling value.
For example, quantizing the feature vector A = (a1, a2, …, a512) yields A = (a1/scaling value, a2/scaling value, …, a512/scaling value).
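Putting the pieces together, a hedged sketch of the whole quantization step (the `hi`/`lo` bounds, tie handling, and rounding mode are assumptions; the claims leave them unspecified):

```python
import numpy as np

def quantize_int8(vectors, m=8, hi=127, lo=-128):
    """Sketch of the density-based int8 quantization described above.

    Hypothetical reading of the claims: the scale comes from the densest
    extreme scalar rather than the global extremum, so a few outliers do
    not dominate the mapping.
    """
    data = np.sort(np.concatenate([np.ravel(v) for v in vectors]))[::-1]
    first_m, last_m = data[:m], data[-m:]

    def density(span_lo, span_hi):
        # Number of scalars falling inside one int8 span of the candidate.
        return int(((data >= span_lo) & (data <= span_hi)).sum())

    a_x = max(first_m, key=lambda c: density(c - hi, c))   # densest of first M
    b_x = max(last_m, key=lambda c: density(c, c - lo))    # densest of last M
    scale = max(a_x / hi, b_x / lo)                        # larger scaling value
    quantized = [np.clip(np.round(v / scale), lo, hi).astype(np.int8)
                 for v in vectors]
    return quantized, scale

vecs = [np.array([100.0, 60.0, -60.0, -100.0]),
        np.array([90.0, 40.0, -40.0, -90.0])]
q, s = quantize_int8(vecs, m=2)
print(q[0].tolist(), round(s, 4))  # -> [127, 76, -76, -127] 0.7874
```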
Because GPU video memory is limited, space must be saved when facing a feature base library with a large data volume, so low-precision storage can be adopted. The usual low-precision quantization method maps the maximum value of the high-precision format (e.g., fp32) to the maximum value of the low-precision format (e.g., int8), so its scale (scaling value) is a fixed constant (the fp32 maximum divided by the int8 maximum), which can cause a large loss after reducing precision. The low-precision quantization used in this scheme instead derives the scale from the magnitude of the feature data in the scenario at hand rather than using a fixed value, so precision is reduced without incurring a large loss.
In some embodiments, in step S2, the pairwise similarity among the plurality of feature vectors corresponding to each object in the feature base library is calculated to obtain the similarity range for that object. Specifically, the feature base library contains multiple feature vectors for each of multiple objects, and the feature vectors of the same object are similar to one another, so the pairwise similarities among the feature vectors of the same object can be calculated to obtain a similarity range. For example, feature vectors 1 through 5 of face A might yield a similarity range of 0.8 to 0.88, and feature vectors 1 through 5 of face B might yield 0.81 to 0.89; proceeding in this way gives the corresponding similarity range for every object.
In some embodiments, in step S3, the merging the similarity ranges of the plurality of objects to obtain a threshold range of similarity, further includes:
adjusting the threshold range.
Specifically, after the similarity ranges of all objects are obtained, their union is taken (for example, the union for faces A and B is 0.80 to 0.89). The union may then be finely adjusted and widened to serve as the similarity threshold range.
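A minimal sketch of taking and widening the union, using the faces A/B figures from the text (the 0.01 margin is an assumed fine-tuning value; the description leaves the adjustment unspecified):

```python
def union_threshold(ranges, margin=0.01):
    # Envelope of the per-object similarity ranges, widened symmetrically
    # by a small margin, as the "fine adjustment and expansion" step.
    lo = min(r[0] for r in ranges) - margin
    hi = max(r[1] for r in ranges) + margin
    return lo, hi

# Faces A and B from the text: 0.8-0.88 and 0.81-0.89 -> about (0.79, 0.90).
print(union_threshold([(0.80, 0.88), (0.81, 0.89)]))
```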
It should be noted that the feature base library must be continuously updated with feature vectors of the objects captured in different scenes, so the threshold range must be continuously updated as well: after new feature vectors are added to the feature base library, the similarity range is updated following the procedure above.
In some embodiments, in step S4, calculating a similarity between the feature vector to be compared and each feature vector in the feature base library, further includes:
and calculating, using the cuBLAS library, the similarity between the feature vector to be compared and each feature vector in the feature base library.
Specifically, the feature vectors imported into the base library can be combined by dimension into a matrix, and the feature vectors to be compared combined by dimension into another matrix; a low-precision matrix multiplication, which is equivalent to cross-correlating the vectors, then outputs the similarity of each real-time vector against every base-library vector, with result dimensions (number of real-time vectors) × (number of base-library vectors). Converting the base-library vectors and the vectors to be compared into a matrix-times-matrix similarity computation in this combined form fully exploits the GPU's parallel architecture and its efficient matrix library (cuBLAS), greatly improving computational efficiency. Feature combination means assembling the base-library features into a matrix of size dimension × ID_NUM (ID_NUM being the number of IDs) according to the vector dimension, and assembling the feature vectors to be compared into a matrix of size dimension × CONCURRENT_NUM according to the dimension and the concurrency.
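As a CPU stand-in for the cuBLAS GEMM (an illustration using NumPy in place of the GPU library), the feature-combination idea looks like this: stacking vectors column-wise turns the whole comparison into one matrix multiply:

```python
import numpy as np

# Stack vectors into (dims x ID_NUM) and (dims x CONCURRENT_NUM) matrices,
# mirroring the combination described above (sizes are illustrative).
dims, id_num, concurrent = 512, 1000, 4
base = np.random.rand(dims, id_num).astype(np.float32)
queries = np.random.rand(dims, concurrent).astype(np.float32)

# L2-normalise the columns so the inner products are cosine similarities.
base /= np.linalg.norm(base, axis=0, keepdims=True)
queries /= np.linalg.norm(queries, axis=0, keepdims=True)

sims = queries.T @ base   # similarity matrix, (CONCURRENT_NUM x ID_NUM)
print(sims.shape)         # -> (4, 1000)
```

On the GPU the same product would be issued through cuBLAS (with int8 inputs for the quantized base library), but the shape of the computation is identical.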
In some embodiments, in step S5, the feature vectors whose calculated similarity falls within the threshold range are sorted; specifically, Thrust library sorting and bitonic sorting (as a GPU kernel) may be adopted to greatly improve computational efficiency.
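For reference, the compare-exchange network behind bitonic sorting can be sketched on the CPU (every inner-loop pass is a data-independent stage that maps onto parallel GPU threads; this is an illustration, not the patent's kernel):

```python
def bitonic_sort(a):
    """In-place ascending bitonic sort; length must be a power of two.

    Each pass of the inner loop is a data-independent compare-exchange
    stage, which is what makes the network map well onto GPU threads.
    """
    n = len(a)
    assert n and n & (n - 1) == 0, "length must be a power of two"
    k = 2
    while k <= n:                      # size of the bitonic sequences
        j = k >> 1
        while j:                       # compare-exchange distance
            for i in range(n):
                l = i ^ j
                if l > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[l]) == ascending:
                        a[i], a[l] = a[l], a[i]
            j >>= 1
        k <<= 1
    return a

sims = [0.62, 0.91, 0.15, 0.83, 0.40, 0.77, 0.05, 0.99]
print(bitonic_sort(sims))  # -> [0.05, 0.15, 0.4, 0.62, 0.77, 0.83, 0.91, 0.99]
```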
In the proposed scheme, similarities are computed among all feature vectors of the same object in the feature base library, so that an accurate similarity threshold range is determined, and abnormally high similarity values caused by anomalous conditions are prevented from distorting the actual comparison result. Reducing the precision of the feature vectors also solves the problem that, when comparing against a massive base library, limited storage space prevents all base-library features from being loaded at once; it reduces computation latency and improves global sorting efficiency. It further addresses the fact that, while hand-designed similarity and quicksort CUDA kernels perform well on specific tasks, their stability, throughput, and efficiency risk degrading in the face of large data volumes.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides a feature comparison system 400, as shown in fig. 2, including:
a build module 401, the build module 401 configured to build a library of feature bases;
A first calculating module 402, where the first calculating module 402 is configured to calculate similarity between each two of a plurality of feature vectors corresponding to each object in the feature bottom library to obtain a similarity range corresponding to each object;
a threshold determination module 403, the threshold determination module 403 configured to merge the similarity ranges of the plurality of objects to obtain a threshold range of similarity;
a second calculation module 404, wherein the second calculation module 404 is configured to calculate a similarity between the feature vector to be compared and each feature vector in the feature bottom library;
a sorting module 405, where the sorting module 405 is configured to sort the feature vectors corresponding to the calculated similarity within the threshold range;
a returning module 406, wherein the returning module 406 is configured to return the object IDs respectively corresponding to the feature vectors to the user based on the sorting.
Based on the same inventive concept, according to another aspect of the present invention, as shown in fig. 3, an embodiment of the present invention further provides a computer apparatus 501, comprising:
at least one processor 520; and
a memory 510 storing a computer program 511 operable on the processor, wherein the processor 520, when executing the program, performs the steps of any of the above feature comparison methods.
Based on the same inventive concept, according to another aspect of the present invention, as shown in fig. 4, an embodiment of the present invention further provides a computer-readable storage medium 601, where the computer-readable storage medium 601 stores computer program instructions 610, and the computer program instructions 610, when executed by a processor, perform the steps of any of the above feature comparison methods.
Finally, it should be noted that, as will be understood by those skilled in the art, all or part of the processes of the methods of the above embodiments may be implemented by a computer program to instruct related hardware to implement the methods. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a Random Access Memory (RAM), or the like. The embodiments of the computer program may achieve the same or similar effects as any of the above-described method embodiments.
In addition, the apparatuses, devices, and the like disclosed in the embodiments of the present invention may be various electronic terminal devices, such as a mobile phone, a Personal Digital Assistant (PDA), a tablet computer (PAD), a smart television, and the like, or may be a large terminal device, such as a server, and the like, and therefore the scope of protection disclosed in the embodiments of the present invention should not be limited to a specific type of apparatus, device. The client disclosed by the embodiment of the invention can be applied to any one of the electronic terminal devices in the form of electronic hardware, computer software or a combination of the electronic hardware and the computer software.
Furthermore, the method disclosed according to an embodiment of the present invention may also be implemented as a computer program executed by a CPU, and the computer program may be stored in a computer-readable storage medium. The computer program, when executed by the CPU, performs the above-described functions defined in the method disclosed in the embodiments of the present invention.
Further, the above method steps and system elements may also be implemented using a controller and a computer readable storage medium for storing a computer program for causing the controller to implement the functions of the above steps or elements.
Further, it should be appreciated that the computer-readable storage media (e.g., memory) herein can be either volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory. By way of example, and not limitation, nonvolatile memory can include Read Only Memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM), which can act as external cache memory. By way of example and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The storage devices of the disclosed aspects are intended to comprise, without being limited to, these and other suitable types of memory.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments of the present invention.
The various illustrative logical blocks, modules, and circuits described in connection with the disclosure herein may be implemented or performed with the following components designed to perform the functions herein: a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of these components. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP, and/or any other such configuration.
The steps of a method or algorithm described in connection with the disclosure herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
In one or more exemplary designs, the functions may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.
The numbers of the embodiments disclosed above are for description only and do not indicate the relative merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps of implementing the above embodiments may be performed by hardware, or by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is exemplary only and is not intended to suggest that the scope of the disclosure of the embodiments of the invention, including the claims, is limited to these examples; within the spirit of the embodiments of the invention, technical features in the above embodiments or in different embodiments may also be combined, and many other variations of the different aspects of the embodiments exist as described above, which are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.

Claims (10)

1. A feature comparison method, comprising the steps of:
constructing a characteristic base library;
calculating the similarity between every two of a plurality of feature vectors corresponding to each object in the feature base library to obtain the similarity range corresponding to each object;
taking a union of the similarity ranges of the plurality of objects to obtain a threshold range of the similarity;
calculating the similarity between the feature vector to be compared and each feature vector in the feature base library;
sorting the feature vectors corresponding to similarities within the threshold range;
and returning the object IDs respectively corresponding to the feature vectors to the user based on the sorting.
2. The method of claim 1, wherein building a base library of features further comprises:
respectively extracting feature vectors from a plurality of images of the object by using the trained neural network model;
and reducing the precision of the extracted plurality of feature vectors to a target precision.
3. The method of claim 2, wherein reducing the precision of the extracted plurality of feature vectors to a target precision further comprises:
sorting scalar data constituting the feature vector;
respectively calculating the data densities corresponding to the first M scalar data and the last M scalar data according to the range value corresponding to the target precision;
determining a first scaling value by dividing the scalar data with the maximum data density in the first M scalar data by the maximum value in the range values corresponding to the target precision, and determining a second scaling value by dividing the scalar data with the maximum data density in the last M scalar data by the minimum value in the range values corresponding to the target precision;
and taking the larger of the first scaling value and the second scaling value as a final scaling value, and performing quantization processing on the plurality of feature vectors using the final scaling value so as to reduce the precision of the plurality of feature vectors to the target precision.
4. The method of claim 3, wherein calculating the data densities corresponding to the first M scalar data and the last M scalar data respectively according to the range values corresponding to the target precision further comprises:
subtracting the maximum value in the range values corresponding to the target precision from each scalar data in the first M scalar data to obtain an intermediate value;
respectively counting the number of each scalar data in the first M scalar data and the scalar data in the corresponding intermediate value range to determine the data density according to the number; or,
subtracting the minimum value in the range value corresponding to the target precision from each scalar data in the last M scalar data to obtain an intermediate value;
counting the number of each scalar data in the last M scalar data and the scalar data in the corresponding intermediate value range respectively to determine the data density according to the number.
5. The method of claim 3, wherein quantizing the plurality of feature vectors using the final scaled value, further comprising:
dividing each scalar data constituting the feature vector by the scaling value.
6. The method of claim 1, wherein calculating a similarity between the feature vector to be compared and each feature vector in the feature base library further comprises:
and calculating the similarity between the feature vector to be compared and each feature vector in the feature base library using the cuBLAS library.
7. The method of claim 1, wherein the similarity ranges of the plurality of objects are merged to obtain a threshold range of similarities, further comprising:
adjusting the threshold range.
8. A feature comparison system, comprising:
a build module configured to build a base library of features;
a first calculation module, configured to calculate similarity between each two of a plurality of feature vectors corresponding to each object in the feature base library to obtain a similarity range corresponding to each object;
a threshold determination module configured to take a union of the similarity ranges of a plurality of objects to obtain a threshold range of similarity;
a second calculation module configured to calculate a similarity between the feature vector to be compared and each feature vector in the feature base library;
a sorting module configured to sort the feature vectors corresponding to the calculated similarities within the threshold range;
a returning module configured to return object IDs respectively corresponding to the feature vectors to a user based on the sorting.
9. A computer device, comprising:
at least one processor; and
memory storing a computer program operable on the processor, wherein the processor executes the program to perform the steps of the method according to any of claims 1-7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, is adapted to carry out the steps of the method according to any one of claims 1 to 7.
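For illustration only, the comparison flow of claim 1 (derive a similarity threshold range as the union of per-object pairwise similarities in the base library, score a query vector against every library vector, keep scores inside the range, sort, and return object IDs) might be sketched as follows. The claims do not fix a similarity measure or data layout; cosine similarity, the dict-based library, and all function names here are assumptions:

```python
import numpy as np

def cosine_similarity(a, b):
    # Assumed similarity measure; the claims leave the metric unspecified.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def similarity_threshold_range(base_library):
    """base_library: dict mapping object ID -> list of feature vectors.
    Returns the union of per-object pairwise-similarity ranges [lo, hi]."""
    lo, hi = np.inf, -np.inf
    for vecs in base_library.values():
        sims = [cosine_similarity(vecs[i], vecs[j])
                for i in range(len(vecs)) for j in range(i + 1, len(vecs))]
        if sims:
            lo, hi = min(lo, min(sims)), max(hi, max(sims))
    return lo, hi

def compare(query, base_library, lo, hi):
    """Score the query against every library vector, keep similarities
    inside [lo, hi], sort descending, and return the object IDs."""
    scored = []
    for obj_id, vecs in base_library.items():
        for v in vecs:
            s = cosine_similarity(query, v)
            if lo <= s <= hi:
                scored.append((s, obj_id))
    scored.sort(reverse=True)
    return [obj_id for _, obj_id in scored]
```

With two objects that each have two near-duplicate vectors, the union range is tight, so only close-but-not-identical matches pass the filter.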
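Claims 3–5 describe choosing a quantization scale from the extreme scalars of the feature vectors and then dividing by it. The translated density computation of claim 4 is ambiguous, so this sketch substitutes a simple fixed-width neighbor count as the "data density" and uses the int8 range [-128, 127] as the target precision; `M`, `window`, and the function names are assumptions, not part of the claims:

```python
import numpy as np

def choose_scale(data, M=4, qmin=-128, qmax=127, window=0.05):
    """Hedged reading of claims 3-4: among the M largest and M smallest
    scalars, pick the candidate with the most neighbors (a crude density
    estimate), divide the top candidate by qmax and the bottom candidate
    by qmin, and keep the larger scaling value."""
    s = np.sort(data)
    top, bottom = s[-M:], s[:M]

    def densest(cands):
        counts = [np.sum(np.abs(data - x) <= window) for x in cands]
        return cands[int(np.argmax(counts))]

    scale_hi = densest(top) / qmax
    scale_lo = densest(bottom) / qmin
    return max(scale_hi, scale_lo)

def quantize(vec, scale):
    # Claim 5: divide each scalar by the final scaling value
    # (rounding and clipping to the int8 range added here).
    return np.clip(np.round(vec / scale), -128, 127).astype(np.int8)
```

Choosing the larger of the two scaling values guarantees that whichever tail of the distribution is wider still maps inside the quantized range.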
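Claim 6 computes the query-versus-library similarities with the cuBLAS library, i.e. as a single GPU matrix multiply (a GEMM such as `cublasSgemm`). A CPU stand-in for that batched computation, using NumPy's matmul in place of the cuBLAS call, might look like the following; row-normalization (making the inner product a cosine similarity) is an assumption:

```python
import numpy as np

def batch_similarity(queries, library):
    """Similarity of each query against every library vector as one
    matrix multiply -- the same (Q, D) x (D, N) GEMM shape a cuBLAS
    call would evaluate on the GPU.
    queries: (Q, D), library: (N, D) -> (Q, N) similarity matrix."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    l = library / np.linalg.norm(library, axis=1, keepdims=True)
    return q @ l.T
```

Batching all queries into one GEMM is what makes a BLAS backend attractive here: one library call replaces Q×N scalar similarity evaluations.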
CN202010640869.8A 2020-07-06 2020-07-06 Feature comparison method, system, device and medium Withdrawn CN111860627A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010640869.8A CN111860627A (en) 2020-07-06 2020-07-06 Feature comparison method, system, device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010640869.8A CN111860627A (en) 2020-07-06 2020-07-06 Feature comparison method, system, device and medium

Publications (1)

Publication Number Publication Date
CN111860627A true CN111860627A (en) 2020-10-30

Family

ID=73152177

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010640869.8A Withdrawn CN111860627A (en) 2020-07-06 2020-07-06 Feature comparison method, system, device and medium

Country Status (1)

Country Link
CN (1) CN111860627A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112446816A (en) * 2021-02-01 2021-03-05 成都点泽智能科技有限公司 Video memory dynamic data storage method and device and server
CN112735437A (en) * 2020-12-15 2021-04-30 厦门快商通科技股份有限公司 Voiceprint comparison method, system and device and storage mechanism


Similar Documents

Publication Publication Date Title
CN110555450B (en) Face recognition neural network adjusting method and device
CN110413255B (en) Artificial neural network adjusting method and device
CN112203095B (en) Video motion estimation method, device, equipment and computer readable storage medium
CN110555508A (en) Artificial neural network adjusting method and device
CN111968150B (en) Weak surveillance video target segmentation method based on full convolution neural network
US20220217343A1 (en) Point cloud partition methods, encoder, and decoder
CN110149529B (en) Media information processing method, server and storage medium
CN111860627A (en) Feature comparison method, system, device and medium
CN111222005B (en) Voiceprint data reordering method and device, electronic equipment and storage medium
CN109447023B (en) Method for determining image similarity, and method and device for identifying video scene switching
JP7099163B2 (en) Learning device and learning method
Zhang et al. Lightweight and progressively-scalable networks for semantic segmentation
JP2023510945A (en) Scene identification method and apparatus, intelligent device, storage medium and computer program
CN115630236A (en) Global fast retrieval positioning method of passive remote sensing image, storage medium and equipment
Lin et al. Robust fisher codes for large scale image retrieval
CN114155388B (en) Image recognition method and device, computer equipment and storage medium
WO2016110125A1 (en) Hash method for high dimension vector, and vector quantization method and device
CN113704565B (en) Learning type space-time index method, device and medium based on global interval error
CN116610840A (en) Similar data searching method, system and electronic equipment
CN117648495B (en) Data pushing method and system based on cloud primary vector data
CN116306709A (en) Data processing method, medium and electronic equipment
Li et al. A hierarchical retrieval method based on hash table for audio fingerprinting
CN112200275B (en) Artificial neural network quantification method and device
CN113177627B (en) Optimization system, retraining system, method thereof, processor and readable medium
CN111382287A (en) Picture searching method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20201030