CN114168081A - High-dimensional feature storage method and device, storage medium and electronic equipment - Google Patents

Info

Publication number
CN114168081A
Authority
CN
China
Prior art keywords
feature
stored
features
storage
distance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111500602.XA
Other languages
Chinese (zh)
Inventor
钟凯
杨娟
韩志均
李骁
陈琳莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd
Priority to CN202111500602.XA
Publication of CN114168081A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure belongs to the technical field of computers and relates to a high-dimensional feature storage method and device, a storage medium, and electronic equipment. The method comprises the following steps: acquiring a feature to be stored, and determining a feature space corresponding to the feature to be stored; determining feature values of the feature to be stored in each feature dimension, and determining reference features in the feature space according to the feature values; and calculating a first feature distance between the feature to be stored and the reference features, and establishing a mapping relation between the feature to be stored and a storage area according to the first feature distance, so as to store the features to be stored in partitions based on the mapping relation. Because the features to be stored are partitioned according to their first feature distances to the reference features, the pressure that arises when all features to be stored are kept on one node is reduced, and the efficiency of subsequent feature retrieval is improved.

Description

High-dimensional feature storage method and device, storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a high-dimensional feature storage method, a high-dimensional feature storage apparatus, a computer-readable storage medium, and an electronic device.
Background
With the development of computer technology, the sample features extracted in practical application scenarios are generally high-dimensional and very numerous. To support feature retrieval, these high-dimensional and numerous features need to be stored.
In the prior art, these features are usually stored on the same computer node, which increases the storage pressure on that node.
In view of the above, there is a need in the art to develop a new method and apparatus for storing high-dimensional features.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The present disclosure is directed to a high-dimensional feature storage method, a high-dimensional feature storage apparatus, a computer-readable storage medium, and an electronic device, so as to overcome, at least to some extent, the excessive storage pressure on a computer node that arises in the related art from storing all features on one computer node.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to a first aspect of embodiments of the present invention, there is provided a high-dimensional feature storage method, the method including: acquiring a feature to be stored, and determining a feature space corresponding to the feature to be stored; determining feature values of the feature to be stored in each feature dimension, and determining reference features in the feature space according to the feature values; and calculating a first feature distance between the feature to be stored and the reference feature, and establishing a mapping relation between the feature to be stored and a storage area according to the first feature distance, so as to store the features to be stored in partitions based on the mapping relation.
In an exemplary embodiment of the present invention, the determining a reference feature in the feature space according to the feature value includes: performing a calculation on the feature values corresponding to the same feature dimension to obtain a feature calculation result, and dividing the feature dimensions according to a preset number of reference features to obtain that number of dimension division results; and searching in the feature space to obtain a reference feature, wherein, in the reference feature, the feature value in a target feature dimension is consistent with the feature calculation result, and the target feature dimension corresponds to the dimension division result.
In an exemplary embodiment of the present invention, the establishing a mapping relationship between the feature to be stored and the storage area according to the first feature distance includes: determining the first feature distances corresponding to the same feature dimension, and calculating, from these first feature distances, a feature distance range for that feature dimension; acquiring a preset division interval, and dividing the feature distance range according to the preset division interval, taking the target reference feature as a starting point, to obtain distance division results, wherein the target reference feature corresponds to the same feature dimension; determining, among the distance division results, the target distance division result to which the first feature distance belongs, so as to establish a partition mapping relation between a partition identifier and the first feature distance for that feature dimension, wherein the partition identifier corresponds to the target distance division result; and determining the partition identifiers corresponding to all the feature dimensions as the area identifier corresponding to the feature to be stored, and establishing a mapping relation between the feature to be stored and a storage area according to the area identifier.
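By way of illustration only, the partition mapping described above might be sketched as follows. The sketch assumes one first feature distance per reference feature, assumes that the distance bands are counted outward from distance 0 at the reference feature, and uses floor division by the preset division interval; the function names `partition_id` and `area_identifier` are hypothetical and do not come from the disclosure.

```python
def partition_id(distance, interval):
    """Index of the distance band containing `distance`.

    Assumption: bands start at 0 (the reference feature itself) and each
    band is `interval` wide, so band k covers [k*interval, (k+1)*interval).
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return int(distance // interval)


def area_identifier(distances, interval):
    """Combine the per-reference partition identifiers into one area identifier."""
    return tuple(partition_id(d, interval) for d in distances)


# First feature distances of one feature to three reference features.
first_distances = [0.4, 2.3, 5.0]
print(area_identifier(first_distances, interval=1.0))  # (0, 2, 5)
```

Features whose area identifiers are equal would then map to the same storage area.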
In an exemplary embodiment of the present invention, the establishing, according to the area identifier, a mapping relationship between the feature to be stored and a storage area includes: searching for an adjacent area identifier having an identifier adjacency relation with the area identifier, wherein the partition identifier of the adjacent area identifier in at least one feature dimension is adjacent to the partition identifier in the same feature dimension of the area identifier; and determining the features to be stored corresponding to the area identifier and the features to be stored corresponding to the adjacent area identifier as target features to be stored, so as to establish a mapping relation between the target features to be stored and the storage area.
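As a hedged sketch of the adjacency relation, the example below enumerates the area identifiers that differ from a given one by exactly 1 in a single position. This is the simplest reading of "adjacent in at least one feature dimension"; identifiers that differ in several positions at once are deliberately left out, and the tuple representation and the name `adjacent_area_ids` are assumptions made for illustration.

```python
def adjacent_area_ids(area_id):
    """Area identifiers that differ from `area_id` by +/-1 in exactly one position.

    Assumption: an area identifier is a tuple of non-negative partition
    identifiers, one per reference feature.
    """
    neighbours = []
    for k, pid in enumerate(area_id):
        for delta in (-1, 1):
            new_pid = pid + delta
            if new_pid < 0:
                continue  # there is no band below band 0
            neighbours.append(area_id[:k] + (new_pid,) + area_id[k + 1:])
    return neighbours


print(adjacent_area_ids((0, 2)))  # [(1, 2), (0, 1), (0, 3)]
```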
In an exemplary embodiment of the present invention, the storing the features to be stored in partitions based on the mapping relation includes: storing the target features to be stored that have a mapping relation with the same storage area on the same computer node.
In an exemplary embodiment of the invention, the method further comprises: acquiring an object to be identified, and extracting a feature to be identified corresponding to the object to be identified; calculating a second feature distance between the feature to be identified and the reference feature, and determining a target storage area having an identification mapping relation with the feature to be identified in all the storage areas according to the second feature distance; and determining the features to be stored belonging to the target storage area, and determining target storage features similar to the objects to be identified in the features to be stored so as to determine the storage objects corresponding to the target storage features as identification results.
In an exemplary embodiment of the present invention, the determining, in all the storage areas, a target storage area having an identification mapping relationship with the feature to be identified includes: determining, among the storage areas, a first storage area to which the second feature distance belongs, and determining the area identifier corresponding to the first storage area; and searching for the adjacent area identifier having the identifier adjacency relation with the area identifier, and determining a second storage area corresponding to the adjacent area identifier, so as to determine the first storage area and the second storage area as target storage areas.
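The identification flow described above can be sketched, under stated assumptions, as a lookup restricted to the first storage area and its adjacent areas. The sketch assumes Euclidean distance, bands counted from 0 in steps of a fixed interval, adjacency defined as a difference of 1 in one position, and an in-memory dictionary standing in for the storage areas; all function names are hypothetical.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def area_identifier(feature, refs, interval):
    # One band index per reference feature (assumed: bands start at distance 0).
    return tuple(int(euclidean(feature, r) // interval) for r in refs)


def adjacent_area_ids(area_id):
    out = []
    for k, pid in enumerate(area_id):
        for delta in (-1, 1):
            if pid + delta >= 0:
                out.append(area_id[:k] + (pid + delta,) + area_id[k + 1:])
    return out


def build_store(features, refs, interval):
    """Group stored features by area identifier (a dict stands in for the nodes)."""
    store = {}
    for f in features:
        store.setdefault(area_identifier(f, refs, interval), []).append(f)
    return store


def query(store, refs, interval, q):
    """Return the most similar stored feature found in the target storage areas."""
    first = area_identifier(q, refs, interval)
    candidates = []
    for aid in [first] + adjacent_area_ids(first):
        candidates.extend(store.get(aid, []))
    if not candidates:
        return None
    return min(candidates, key=lambda f: euclidean(f, q))


refs = [[1.0, 0.0], [0.0, 1.0]]
stored = [[0.2, 0.1], [3.0, 3.0], [0.9, 1.2]]
store = build_store(stored, refs, interval=1.0)
print(query(store, refs, 1.0, [1.1, 1.0]))  # [0.9, 1.2]
```

Only the candidates in the target storage areas are compared with the feature to be identified, which is the source of the retrieval saving described in the disclosure; a feature lying outside those areas is not examined.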
According to a second aspect of embodiments of the present invention, there is provided a high-dimensional feature storage apparatus, the apparatus including a first determining module, a second determining module, and a storage module, wherein the first determining module is configured to acquire a feature to be stored and determine a feature space corresponding to the feature to be stored; the second determining module is configured to determine feature values of the feature to be stored in each feature dimension and determine reference features in the feature space according to the feature values; and the storage module is configured to calculate a first feature distance between the feature to be stored and the reference feature, and establish a mapping relation between the feature to be stored and a storage area according to the first feature distance, so as to store the features to be stored in partitions based on the mapping relation.
According to a third aspect of embodiments of the present invention, there is provided an electronic apparatus including: a processor and a memory; wherein the memory has stored thereon computer readable instructions which, when executed by the processor, implement the high dimensional feature storage method of any of the exemplary embodiments described above.
According to a fourth aspect of embodiments of the present invention, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the high-dimensional feature storage method in any of the exemplary embodiments described above.
As can be seen from the foregoing technical solutions, the high-dimensional feature storage method, the high-dimensional feature storage apparatus, the computer storage medium, and the electronic device in the exemplary embodiments of the present invention have at least the following advantages and positive effects:
in the method and the device provided by the exemplary embodiment of the disclosure, on one hand, the to-be-stored feature and the reference feature are calculated to obtain the first feature distance, and based on this, the to-be-stored feature is stored in a partition manner, so that the problem of overlarge storage pressure of computer nodes caused by storing all the to-be-stored features on the same computer node is avoided; on the other hand, as the features to be stored are stored in a partition mode, in the subsequent process of searching or identifying the features, all the features to be stored do not need to be compared, and only the features to be stored in the corresponding storage areas need to be compared, the time for obtaining the search result or the identification result is shortened, and the searching or identifying efficiency is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.
FIG. 1 schematically illustrates a flow chart of a high-dimensional feature storage method in an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow chart of determining the reference feature in the high-dimensional feature storage method in an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart of establishing a mapping relationship between a feature to be stored and a storage area in the high-dimensional feature storage method in an embodiment of the present disclosure;
FIG. 4 schematically illustrates a distance division result in the high-dimensional feature storage method in an embodiment of the present disclosure;
FIG. 5 schematically illustrates a distance division result when the number of reference features is 3 in the high-dimensional feature storage method in an embodiment of the present disclosure;
FIG. 6 schematically illustrates a flow chart of establishing a mapping relationship between a feature to be stored and a storage area in the high-dimensional feature storage method in an embodiment of the present disclosure;
FIG. 7 schematically illustrates a flow chart of determining a recognition result in the high-dimensional feature storage method in an embodiment of the present disclosure;
FIG. 8 schematically illustrates a flow chart of determining a target storage area in the high-dimensional feature storage method in an embodiment of the present disclosure;
FIG. 9 schematically illustrates, in one feature dimension, the region in which the feature most similar to a feature to be identified may be stored in the high-dimensional feature storage method in an embodiment of the present disclosure;
FIG. 10 schematically illustrates the structure of a high-dimensional feature storage apparatus in an embodiment of the present disclosure;
FIG. 11 schematically illustrates an electronic device for implementing the high-dimensional feature storage method in an embodiment of the present disclosure;
FIG. 12 schematically illustrates a computer-readable storage medium for implementing the high-dimensional feature storage method in an embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and the like. In other instances, well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
The terms "a," "an," "the," and "said" are used in this specification to denote the presence of one or more elements/components/parts/etc.; the terms "comprising" and "having" are intended to be inclusive and mean that there may be additional elements/components/etc. other than the listed elements/components/etc.; the terms "first" and "second", etc. are used merely as labels, and are not limiting on the number of their objects.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities.
In order to solve the problems in the related art, the present disclosure provides a high-dimensional feature storage method. Fig. 1 shows a flow diagram of a high-dimensional feature storage method, as shown in fig. 1, the high-dimensional feature storage method at least includes the following steps:
Step S110: acquiring a feature to be stored, and determining a feature space corresponding to the feature to be stored.
Step S120: determining feature values of the feature to be stored in each feature dimension, and determining reference features in the feature space according to the feature values.
Step S130: calculating a first feature distance between the feature to be stored and the reference feature, and establishing a mapping relation between the feature to be stored and a storage area according to the first feature distance, so as to store the features to be stored in partitions based on the mapping relation.
In the method and the device provided by the exemplary embodiment of the disclosure, on one hand, the to-be-stored feature and the reference feature are calculated to obtain the first feature distance, and based on this, the to-be-stored feature is stored in a partition manner, so that the problem of overlarge storage pressure of computer nodes caused by storing all the to-be-stored features on the same computer node is avoided; on the other hand, as the features to be stored are stored in a partition mode, in the subsequent process of searching or identifying the features, all the features to be stored do not need to be compared, and only the features to be stored in the corresponding storage areas need to be compared, the time for obtaining the search result or the identification result is shortened, and the searching or identifying efficiency is improved.
The following describes each step of the high-dimensional feature storage method in detail.
In step S110, a feature to be stored is acquired, and a feature space corresponding to the feature to be stored is determined.
In an exemplary embodiment of the present disclosure, the feature to be stored is a high-dimensional feature that is subsequently to be stored in partitions. Specifically, the feature to be stored may be an image feature, a voiceprint feature, or a feature extracted from a signal, which is not particularly limited in this exemplary embodiment. The feature space is the abstract space corresponding to the features to be stored, and it contains many features to be stored with different feature values.
For example, 10 features to be stored are acquired, and a feature space A corresponding to these 10 features to be stored is determined.
In the present exemplary embodiment, a feature space corresponding to a feature to be stored is determined, which facilitates subsequent determination of a reference feature in the feature space.
In step S120, feature values of the features to be stored in the respective feature dimensions are determined, and the reference feature is determined in the feature space according to the feature values.
In an exemplary embodiment of the present disclosure, the feature to be stored may be a feature having 3 feature dimensions, 4 feature dimensions, 5 feature dimensions, or an even higher number of feature dimensions, and each feature to be stored is composed of its feature values in the respective feature dimensions.
The reference feature is a feature used to divide the feature space; dividing the feature space by the reference features yields a plurality of partitions. There are usually at least two reference features, and the distances between every two of them are close. In addition, the numbers of features to be stored in the partitions should be kept as equal as possible.
For example, the feature to be stored is a feature having 6 feature dimensions, feature values of the feature to be stored in a first feature dimension, a second feature dimension, a third feature dimension, a fourth feature dimension, a fifth feature dimension, and a sixth feature dimension are respectively determined, and three reference features are determined in a feature space according to the feature values.
In an alternative embodiment, FIG. 2 shows a schematic flow chart of determining the reference feature in the high-dimensional feature storage method. As shown in FIG. 2, the method at least includes the following steps: in step S210, a calculation is performed on the feature values corresponding to the same feature dimension to obtain a feature calculation result, and the feature dimensions are divided according to the preset number of reference features to obtain that number of dimension division results.
The feature calculation result refers to a result of calculating feature values of all features to be stored in the same feature dimension, and specifically, the feature calculation result may be a result of calculating an average value of the feature values of all features to be stored in the same feature dimension, or may be a result of calculating a median of the feature values of all features to be stored in the same feature dimension, which is not particularly limited in this exemplary embodiment.
The preset reference feature number is a preset quantity value of the reference feature that needs to be determined, generally, the preset reference feature number may be 3 or 2, and may also be adjusted along with the change of the quantity of the feature dimensions of the feature to be stored, which is not particularly limited in this exemplary embodiment.
The dimension division result refers to a result obtained by dividing the feature dimension according to the preset reference feature number, for example, if the feature dimension of the feature to be stored is 6, and the preset reference feature number is 3, then there are 3 division results obtained by dividing the feature dimension, the first division result is a first feature dimension of the feature to be stored and a second feature dimension of the feature to be stored, the second division result is a third feature dimension of the feature to be stored and a fourth feature dimension of the feature to be stored, and similarly, the third division result is a fifth feature dimension of the feature to be stored and a sixth feature dimension of the feature to be stored.
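The dimension division in this example can be sketched as splitting the dimension indices into contiguous groups. The sketch below is illustrative only: it uses zero-based indices, and the handling of a dimension count that is not divisible by the number of reference features (earlier groups receive one extra dimension) is an assumption, since the text only gives the evenly divisible case. The name `divide_dimensions` is hypothetical.

```python
def divide_dimensions(num_dims, num_refs):
    """Split dimension indices 0..num_dims-1 into num_refs contiguous groups."""
    if num_refs <= 0 or num_refs > num_dims:
        raise ValueError("need 1 <= num_refs <= num_dims")
    base, extra = divmod(num_dims, num_refs)
    groups, start = [], 0
    for k in range(num_refs):
        # Assumption: the first `extra` groups take one additional dimension.
        size = base + (1 if k < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


# 6 feature dimensions, 3 preset reference features.
print(divide_dimensions(6, 3))  # [[0, 1], [2, 3], [4, 5]]
```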
For example, formula (1) is a formula for calculating the feature calculation result.
[Formula (1): equation image BDA0003402471050000081, not reproduced in the text]
Wherein M represents the number of features to be stored, N represents the number of feature dimensions of the features to be stored, and f_{i,j} represents the feature value of the i-th feature to be stored in the j-th dimension. On this basis, the feature calculation result of the features to be stored in each feature dimension can be calculated.
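Since formula (1) itself is not legible in this text, the following sketch assumes one of the two options named in the description: the arithmetic mean over the M features in each dimension j, that is, $\bar{f}_j = \frac{1}{M}\sum_{i=1}^{M} f_{i,j}$, with the median as the alternative. The function name and the sample values are hypothetical.

```python
from statistics import mean, median


def feature_calculation_results(features, how="mean"):
    """Return one summary value per feature dimension.

    features: list of M features, each a list of N feature values.
    how: "mean" or "median" (the two options named in the description).
    """
    if not features:
        raise ValueError("at least one feature is required")
    reducer = {"mean": mean, "median": median}[how]
    # zip(*features) walks the dimensions j, yielding f[0][j], ..., f[M-1][j].
    return [reducer(column) for column in zip(*features)]


features = [
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [5.0, 9.0, 1.0],
]
print(feature_calculation_results(features))            # [3.0, 5.0, 3.0]
print(feature_calculation_results(features, "median"))  # [3.0, 4.0, 3.0]
```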
Assuming that the features to be stored have 6 feature dimensions and the preset number of reference features is 3, there are 3 dimension division results: the first consists of the first and second feature dimensions of the feature to be stored, the second consists of the third and fourth feature dimensions, and the third consists of the fifth and sixth feature dimensions.
In step S220, a search is performed in the feature space to obtain a reference feature, wherein, in the reference feature, the feature value in a target feature dimension is consistent with the feature calculation result, and the target feature dimension corresponds to the dimension division result.
The reference feature is a feature in the feature space. The reference feature contains a feature value that is consistent with the feature calculation result, and this feature value corresponds to a target feature dimension, which is a feature dimension in the dimension division result.
For example, if the feature dimension of the feature to be stored is 6, and the number of the preset reference features is 3, there are 3 dimension division results, the first division result is a first feature dimension of the feature to be stored and a second feature dimension of the feature to be stored, the second division result is a third feature dimension of the feature to be stored and a fourth feature dimension of the feature to be stored, and similarly, the third division result is a fifth feature dimension of the feature to be stored and a sixth feature dimension of the feature to be stored.
Further, using formula (1), the feature calculation results in the different feature dimensions can be calculated. Specifically, the feature calculation result is A in the first feature dimension, B in the second, C in the third, D in the fourth, E in the fifth, and F in the sixth.
Based on this, in the first reference feature, the feature value in the first feature dimension is consistent with the feature calculation result A, the feature value in the second feature dimension is consistent with the feature calculation result B, and the feature values in the remaining feature dimensions may be 0. In the second reference feature, the feature value in the third feature dimension is consistent with the feature calculation result C, the feature value in the fourth feature dimension is consistent with the feature calculation result D, and the feature values in the remaining feature dimensions may be 0. In the third reference feature, the feature value in the fifth feature dimension is consistent with the feature calculation result E, the feature value in the sixth feature dimension is consistent with the feature calculation result F, and the feature values in the remaining feature dimensions may be 0.
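The construction of the three reference features in this example can be sketched as follows. The numeric values standing in for A to F are placeholders chosen for illustration, the remaining dimensions are set to 0 as the text permits, and the function name `build_reference_features` is hypothetical.

```python
def build_reference_features(calc_results, groups):
    """Build one reference feature per dimension group.

    Each reference feature copies the feature calculation results in its own
    group of dimensions and is 0 in all remaining dimensions.
    """
    refs = []
    for group in groups:
        ref = [0.0] * len(calc_results)
        for j in group:
            ref[j] = calc_results[j]
        refs.append(ref)
    return refs


calc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # placeholder values for A, B, C, D, E, F
groups = [[0, 1], [2, 3], [4, 5]]      # the three dimension division results
for ref in build_reference_features(calc, groups):
    print(ref)
# [1.0, 2.0, 0.0, 0.0, 0.0, 0.0]
# [0.0, 0.0, 3.0, 4.0, 0.0, 0.0]
# [0.0, 0.0, 0.0, 0.0, 5.0, 6.0]
```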
In this exemplary embodiment, the reference feature is determined in the feature space according to the feature values of the feature to be stored. This helps to subsequently determine the storage area that has a mapping relationship with the feature to be stored, so that the features to be stored are stored in partitions rather than all on the same computer node. It also provides a basis for efficient subsequent feature retrieval.
In step S130, a first feature distance between the feature to be stored and the reference feature is calculated, and a mapping relationship between the feature to be stored and the storage area is established according to the first feature distance, so that the feature to be stored is stored in partitions based on the mapping relationship.
In the exemplary embodiment of the present disclosure, the first feature distance reflects the similarity between the feature to be stored and the reference feature. Specifically, the first feature distance may be calculated using the Euclidean distance formula, the Manhattan distance formula, the cosine distance formula, or the Hamming distance, which is not particularly limited in this exemplary embodiment.
After the first feature distance is obtained, a mapping relation between the features to be stored and the storage area can be established according to the first feature distance, so that the features to be stored are stored in the storage area with the mapping relation.
For example, formula (2) shows one formula for calculating the first feature distance.
[Formula (2): equation image not reproduced in this text]
Here X is the feature to be stored, Y is the reference feature, n is the number of feature dimensions of the feature to be stored and of the reference feature, x_i is the feature value of the feature to be stored in the i-th feature dimension, and y_i is the feature value of the reference feature in the i-th feature dimension.
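Because the image of formula (2) is not available in this text, the following Python sketch assumes it is the Euclidean distance, d(X, Y) = sqrt(sum over i from 1 to n of (x_i - y_i)^2). That choice is an assumption: it is the first of the options listed above and matches the symbols X, Y, n, x_i and y_i, but the other listed distances would fit the same symbols. The function name is hypothetical.

```python
import math
from typing import Sequence


def euclidean_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Distance between a feature x and a reference feature y.

    Assumption: formula (2) is the Euclidean distance.
    """
    if len(x) != len(y):
        raise ValueError("both features must have the same number of dimensions")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))
```

A smaller result means the feature lies closer to the reference feature, which is how the later retrieval example compares calculation results.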
After the first feature distance is obtained based on formula (2), mapping relationships between different features to be stored and different storage areas may be established according to the magnitude of the first feature distance. Assuming that a mapping relationship is established between the feature A to be stored and the storage area A1, and between the feature B to be stored and the storage area C1, the feature A to be stored is stored in the storage area A1 and the feature B to be stored is stored in the storage area C1.
In an alternative embodiment, fig. 3 is a schematic flow chart of a method for establishing a mapping relationship between a feature to be stored and a storage area in the high-dimensional feature storage method. As shown in fig. 3, the method includes at least the following steps. In step S310, the first feature distances corresponding to the same feature dimension are determined, and a feature distance range in that feature dimension is calculated from those first feature distances.
After the first feature distances are calculated, the first feature distances of all the features to be stored in the same feature dimension are determined, and the feature distance range in that feature dimension is determined from them, that is, the maximum value and the minimum value of the first feature distance in that feature dimension are determined.
For example, the first feature distances include the feature distances calculated for different features to be stored in different feature dimensions. Assuming that the features to be stored have 3 feature dimensions, it is necessary to determine the first feature distances of the different features to be stored in the first feature dimension, in the second feature dimension, and in the third feature dimension.
Based on this, the feature distance ranges of the first feature distance in the three feature dimensions can be obtained.
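Step S310 can be sketched as follows. This is an illustrative sketch, not the disclosed implementation: it assumes one reference feature per feature dimension (as in fig. 5), assumes the Euclidean distance, and uses the hypothetical function name `distance_ranges`.

```python
import math
from typing import List, Sequence, Tuple


def distance_ranges(
    features: Sequence[Sequence[float]],
    references: Sequence[Sequence[float]],
) -> List[Tuple[float, float]]:
    """Return (minimum, maximum) of the first feature distance per reference feature."""
    ranges = []
    for ref in references:
        # First feature distances of all features to this reference feature.
        dists = [
            math.sqrt(sum((a - b) ** 2 for a, b in zip(f, ref)))
            for f in features
        ]
        ranges.append((min(dists), max(dists)))
    return ranges
```

Each returned pair is the feature distance range that is divided into intervals in the next step.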
In step S320, a preset division interval is obtained, and the feature distance range is divided at the preset division interval with the target reference feature as the starting point to obtain a distance division result, where the target reference feature corresponds to the same feature dimension.
The preset division interval is an interval value used to divide the feature distance range into a distance division result. It is worth noting that the starting point for dividing a feature distance range corresponds to the same feature dimension as that range.
For example, fig. 4 is a schematic diagram of a distance division result in which the same feature dimension is the first feature dimension of the feature to be stored. As shown in fig. 4, the feature 410 is the reference feature A corresponding to the first feature dimension, the preset division interval is d1, and the coordinate 420 represents the feature distance range; the distance division result 430 is obtained by dividing the feature distance range at the interval d1 with the reference feature A as the starting point.
In step S330, the target distance division result to which the first feature distance belongs is determined among the distance division results, so as to establish a partition mapping relationship between a partition identifier and the first feature distance in the same feature dimension, where the partition identifier corresponds to the target distance division result.
The partition identifier is the identifier corresponding to the target distance division result, and the target distance division result is the distance division result to which the first feature distance belongs. A partition mapping relationship between the partition identifier and the first feature distance can then be established, that is, a mapping relationship between the first feature distance and the distance division result to which it belongs.
For example, table 1 shows the partition mapping relationship between the partition identifier and the first feature distance. The first column of table 1 gives the partition identifier and the second column gives the range to which the first feature distance belongs. The second row (the first data row) indicates that when the first feature distance is greater than or equal to 0 and less than d1, a partition mapping relationship is established between the first feature distance and the partition identifier d0. The other rows of table 1 are read in the same way.
Table 1
Partition identifier    Range of the first feature distance
d0                      0 ≤ first feature distance < d1
d1                      d1 ≤ first feature distance < d2
d2                      d2 ≤ first feature distance < d3
d3                      d3 ≤ first feature distance < d4
d4                      d4 ≤ first feature distance < d5
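The lookup described by table 1 can be sketched in Python. The sketch assumes the boundaries are evenly spaced multiples of the preset division interval, so that the k-th boundary equals k times the interval and the partition index is the integer part of distance / interval. That uniform spacing and the name `partition_index` are assumptions made for illustration.

```python
import math


def partition_index(distance: float, interval: float) -> int:
    """Return k such that k * interval <= distance < (k + 1) * interval.

    Index 0 corresponds to the row d0 of table 1, index 1 to d1, and so on.
    """
    if interval <= 0:
        raise ValueError("the preset division interval must be positive")
    if distance < 0:
        raise ValueError("a distance cannot be negative")
    return math.floor(distance / interval)
```

For instance, with an interval of 2.0, a distance of 1.99 falls in partition 0 and a distance of exactly 2.0 falls in partition 1, matching the half-open ranges in the table.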
In step S340, the partition identifiers corresponding to all the feature dimensions are determined as the region identifier corresponding to the feature to be stored, and a mapping relationship between the feature to be stored and the storage area is established according to the region identifier.
The region identifier consists of the partition identifiers in the different feature dimensions. After the region identifier of the feature to be stored is determined, a mapping relationship between the feature to be stored and a storage area can be established, where the storage area corresponds to the region identifier.
For example, fig. 5 is a schematic diagram of the distance division results when there are 3 reference features. As shown in fig. 5, the feature 510 is the reference feature corresponding to the first feature dimension, the feature 520 is the reference feature corresponding to the second feature dimension, and the feature 530 is the reference feature corresponding to the third feature dimension. A straight line 540 represents the feature distance range in the corresponding feature dimension, and the labels on the intervals of the straight line 540 represent partition identifiers. Specifically, the first and second partition identifiers corresponding to the reference feature 510 may be a1 and a2, those corresponding to the reference feature 520 may be b1 and b2, and those corresponding to the reference feature 530 may be c1 and c2; the other partition identifiers are numbered in the same way.
Based on this, the region identifier corresponding to the feature 550 to be stored is (a5, b7, c4), and a mapping relationship between the feature 550 to be stored and the region identifier (a5, b7, c4) is established.
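Combining the per-dimension partition identifiers into a region identifier can be sketched as follows. The sketch makes several assumptions for illustration: the Euclidean distance, one reference feature per feature dimension, a single division interval shared by all dimensions, and integer indices in place of labels such as a5 or b7. The function name `region_identifier` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def region_identifier(
    feature: Sequence[float],
    references: Sequence[Sequence[float]],
    interval: float,
) -> Tuple[int, ...]:
    """Return one partition index per reference feature, as a tuple."""
    if interval <= 0:
        raise ValueError("the preset division interval must be positive")
    ids = []
    for ref in references:
        # First feature distance to this reference feature.
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(feature, ref)))
        # Partition index of that distance within its distance range.
        ids.append(math.floor(d / interval))
    return tuple(ids)
```

The resulting tuple plays the role of the region identifier and can be used directly as a dictionary key for the storage area.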
In this exemplary embodiment, the mapping relationship between the features to be stored and the storage area is established according to the region identifier. This facilitates the subsequent partitioned storage of the features to be stored, avoids storing all the features to be stored on the same computer node, reduces the storage pressure on the computer nodes, and helps improve the efficiency of subsequent feature retrieval.
In this exemplary embodiment, fig. 6 is a schematic flowchart of a method for establishing a mapping relationship between a feature to be stored and a storage area in the high-dimensional feature storage method. As shown in fig. 6, the method includes at least the following steps. In step S610, an adjacent region identifier that has an identifier adjacency relationship with the region identifier is searched for, where the partition identifier of the adjacent region identifier in at least one feature dimension is adjacent to the partition identifier of the region identifier in the same feature dimension.
The adjacent region identifier is itself a region identifier, one that is adjacent to the current region identifier. Specifically, the adjacency relationship means that the partition identifier of the adjacent region identifier in at least one feature dimension must be adjacent to the partition identifier of the region identifier in the same feature dimension.
For example, as shown in fig. 5, assume that the region identifier is (a5, b4, c6), where a5 is the partition identifier in the first feature dimension, b4 is the partition identifier in the second feature dimension, and c6 is the partition identifier in the third feature dimension. In the first feature dimension, the partition identifiers adjacent to a5 are a4 and a6; similarly, in the second feature dimension, the partition identifiers adjacent to b4 are b3 and b5, and in the third feature dimension, the partition identifiers adjacent to c6 are c5 and c7.
Based on this, there are 3^M adjacent region identifiers, where M is the number of feature dimensions. Specifically, the adjacent region identifiers in which the partition identifier in only one feature dimension is adjacent to the partition identifier of the region identifier in the same feature dimension are (a4, b4, c6), (a6, b4, c6), (a5, b3, c6), (a5, b5, c6), (a5, b4, c5), and (a5, b4, c7). The adjacent region identifiers in which the partition identifiers in two feature dimensions are adjacent include, for example, (a4, b3, c6) and (a4, b5, c6). Similarly, the adjacent region identifiers in which the partition identifiers in all three feature dimensions are adjacent include, for example, (a4, b3, c5).
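The enumeration of adjacent region identifiers can be sketched with `itertools.product`. Choosing an offset of -1, 0, or +1 in each of the M feature dimensions gives 3^M combinations; one of them (all offsets 0) is the region identifier itself, so this sketch skips it and returns at most 3^M - 1 neighbours. Integer partition indices starting at 0 and the function name `adjacent_region_ids` are assumptions made for illustration.

```python
from itertools import product
from typing import List, Tuple


def adjacent_region_ids(region_id: Tuple[int, ...]) -> List[Tuple[int, ...]]:
    """List region identifiers adjacent to region_id in at least one dimension."""
    neighbours = []
    for offsets in product((-1, 0, 1), repeat=len(region_id)):
        if not any(offsets):
            continue  # all offsets are 0: this is region_id itself
        candidate = tuple(i + o for i, o in zip(region_id, offsets))
        if min(candidate) < 0:
            continue  # assumed: partition indices start at 0
        neighbours.append(candidate)
    return neighbours
```

For the identifier (5, 4, 6), which mirrors (a5, b4, c6), the result contains the six single-dimension neighbours such as (4, 4, 6) as well as multi-dimension neighbours such as (4, 3, 5).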
In step S620, the features to be stored corresponding to the region identifier and the features to be stored corresponding to the adjacent region identifier are determined as target features to be stored, so as to establish a mapping relationship between the target features to be stored and the storage area.
That is, the target features to be stored include both the features to be stored corresponding to the region identifier and the features to be stored corresponding to the adjacent region identifier, and the mapping relationship is established between these target features to be stored and the storage area.
For example, the features to be stored corresponding to the region identifier include a feature A to be stored and a feature B to be stored, and the features to be stored corresponding to the adjacent region identifier include a feature C to be stored. The target features to be stored therefore include the features A, B, and C to be stored, and a mapping relationship between these target features to be stored and the storage area is established.
In this exemplary embodiment, the target features to be stored include the features to be stored corresponding to the region identifier and those corresponding to the adjacent region identifier, and the mapping relationship between the target features to be stored and the storage area is established on this basis. This avoids searching only the features to be stored corresponding to the region identifier while ignoring those corresponding to the adjacent region identifier, and improves the accuracy of subsequent feature retrieval.
In an optional embodiment, performing partitioned storage of the features to be stored based on the mapping relationship includes: storing the target features to be stored that have a mapping relationship with the same storage area on the same computer node.
The computer nodes are nodes that store the features to be stored, and target features to be stored that have mapping relationships with different storage areas may be stored on different computer nodes.
For example, the target to-be-stored features having a mapping relationship with the storage area a include the to-be-stored feature C1 and the to-be-stored feature C2, the target to-be-stored features having a mapping relationship with the storage area B include the to-be-stored feature C3 and the to-be-stored feature C4, the to-be-stored feature C1 and the to-be-stored feature C2 are stored in the computer node D1, and the to-be-stored feature C3 and the to-be-stored feature C4 are stored in the computer node D2.
In the exemplary embodiment, different target to-be-stored features having mapping relations with different storage areas are stored in different computer nodes, so that the situation that the storage pressure of the computer nodes is too large due to the fact that all the to-be-stored features are stored in one computer node is avoided.
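The placement of storage areas on computer nodes can be sketched as below. The text does not say how a storage area is assigned to a node, so the round-robin assignment over sorted region identifiers is purely an assumption, as are the function names. For brevity the sketch groups features by their own region identifier only and leaves out the features of adjacent regions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

RegionId = Tuple[int, ...]


def group_by_region(region_of: Dict[str, RegionId]) -> Dict[RegionId, List[str]]:
    """Group feature names by the region identifier they map to."""
    groups: Dict[RegionId, List[str]] = defaultdict(list)
    for name, region in region_of.items():
        groups[region].append(name)
    return dict(groups)


def assign_nodes(regions: Sequence[RegionId], nodes: Sequence[str]) -> Dict[RegionId, str]:
    """Assumed policy: spread storage areas over nodes in round-robin order."""
    if not nodes:
        raise ValueError("at least one computer node is required")
    return {
        region: nodes[i % len(nodes)]
        for i, region in enumerate(sorted(regions))
    }


groups = group_by_region({"C1": (0, 1), "C2": (0, 1), "C3": (2, 0), "C4": (2, 0)})
placement = assign_nodes(list(groups), ["D1", "D2"])
```

With these illustrative inputs, C1 and C2 share one storage area placed on D1, and C3 and C4 share another placed on D2, which mirrors the example in the text.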
In an alternative embodiment, fig. 7 is a schematic flow chart of determining a recognition result in the high-dimensional feature storage method. As shown in fig. 7, the method includes at least the following steps. In step S710, an object to be recognized is acquired, and a feature to be recognized corresponding to the object to be recognized is extracted.
The object to be recognized refers to an object that needs to be compared with features to be stored in different storage areas, specifically, the object to be recognized may be a picture, a segment of sound, or a signal.
For example, if the object to be recognized is a picture a, the extracted feature to be recognized is a picture feature corresponding to the picture a.
In step S720, a second feature distance between the feature to be recognized and the reference feature is calculated, and a target storage area having a recognition mapping relation with the feature to be recognized is determined in all the storage areas according to the second feature distance.
The second feature distance reflects the similarity between the feature to be recognized and the reference feature. Specifically, it may be calculated using the Euclidean distance formula, the Manhattan distance formula, the cosine distance formula, or the Hamming distance, which is not particularly limited in this exemplary embodiment.
The target storage area is the storage area that has a recognition mapping relationship with the feature to be recognized. The recognition mapping relationship means that the features to be stored in the target storage area are the ones compared with the feature to be recognized, and the recognition result is obtained from the comparison result.
For example, the second feature distance is calculated using formula (2), where X in formula (2) is the feature to be recognized. The calculated second feature distance includes a second feature distance corresponding to the first feature dimension, a second feature distance corresponding to the second feature dimension, and a second feature distance corresponding to the third feature dimension, so the second feature distance may be written as (A, B, C). As shown in fig. 5, assuming that A belongs to a1, B belongs to b3, and C belongs to c7, the target storage area having a recognition mapping relationship with the feature to be recognized, determined according to the region identifier (a1, b3, c7), is a storage area D.
In step S730, features to be stored belonging to the target storage area are determined, and a target storage feature similar to the object to be recognized is determined among the features to be stored, so that the storage object corresponding to the target storage feature is determined as a recognition result.
Specifically, the similarity may be calculated using the Euclidean distance formula, the Manhattan distance formula, the cosine distance formula, or the Hamming distance, which is not particularly limited in this exemplary embodiment.
For example, the feature to be recognized is a feature extracted from a picture, and the features to be stored belonging to the target storage area are a feature A to be stored and a feature B to be stored. Formula (2) is applied to the feature A to be stored and the feature to be recognized to obtain a calculation result C, and to the feature B to be stored and the feature to be recognized to obtain a calculation result D. If the calculation result C is smaller than the calculation result D, the similarity between the feature A to be stored and the feature to be recognized is higher. Based on this, the storage object corresponding to the feature A to be stored, that is, the picture corresponding to the feature A to be stored, is determined as the recognition result.
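The comparison in this example can be sketched as a nearest-candidate search. The sketch assumes the Euclidean distance stands in for formula (2); the function name `most_similar` and the use of names as keys for stored features are illustrative choices.

```python
import math
from typing import Dict, Sequence


def most_similar(query: Sequence[float], candidates: Dict[str, Sequence[float]]) -> str:
    """Return the name of the stored feature with the smallest distance to query."""
    if not candidates:
        raise ValueError("the target storage area holds no features")

    def distance(name: str) -> float:
        stored = candidates[name]
        return math.sqrt(sum((q - s) ** 2 for q, s in zip(query, stored)))

    # The smallest distance corresponds to the highest similarity.
    return min(candidates, key=distance)
```

Only the features in the target storage area are passed in as `candidates`, so the comparison cost depends on the size of that area rather than on the total number of stored features.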
In this exemplary embodiment, after the features to be stored are stored in partitions, feature retrieval can be performed only over the features to be stored in the target storage area rather than over all the features to be stored, which improves the efficiency of feature retrieval.
In an alternative embodiment, fig. 8 is a schematic flow chart of a method for determining a target storage area in the high-dimensional feature storage method. As shown in fig. 8, the method includes at least the following steps. In step S810, a first storage area to which the second feature distance belongs is determined among the storage areas, and the region identifier corresponding to the first storage area is determined.
The first storage area is the storage area to which the second feature distance belongs.
For example, as shown in fig. 5, if the second feature distance belongs to a region A in the first feature dimension and the partition identifier corresponding to the region A is a4, belongs to a region B in the second feature dimension and the partition identifier corresponding to the region B is b3, and belongs to a region C in the third feature dimension and the partition identifier corresponding to the region C is c2, then the region identifier corresponding to the first storage area is (a4, b3, c2).
In step S820, an adjacent area identifier having an identifier adjacent relationship with the area identifier is searched, and a second storage area corresponding to the adjacent area identifier is determined, so that the first storage area and the second storage area are determined as target storage areas.
The adjacent region identifier must be searched for because the features to be stored under the region identifier are not necessarily the features most similar to the feature to be recognized.
Fig. 9 is a schematic diagram, in one feature dimension, of the regions in which the feature most similar to the feature to be recognized may be stored. As shown in fig. 9, the regions between different circles are regions that store features to be stored. Assume that the feature 910 is the feature to be recognized; it is located in the region 920, that is, the first storage area to which the second feature distance belongs is determined to be the region 920.
If only the features to be stored in the first storage area 920 are searched, the feature 930 is obtained. However, the feature 950 to be stored, which belongs to the region 940, is evidently closer to the feature 910 to be recognized, that is, its similarity is greater. Therefore, when determining the target storage area, it is not sufficient to determine only the first storage area 920; a second storage area adjacent to the first storage area must also be determined to ensure the accuracy of the recognition result.
Specifically, to determine the second storage area, the adjacent region identifier that has an identifier adjacency relationship with the region identifier is determined first, and the second storage area corresponding to the adjacent region identifier is then determined. The first storage area and the second storage area together are used as the target storage area.
For example, if the region identifier is (a4, b3, c2), 27 adjacent region identifiers can be determined, for example (a3, b3, c2), (a3, b2, c2), and (a5, b4, c1), and 27 second storage areas corresponding to the 27 adjacent region identifiers can then be determined. The target storage area therefore includes 28 storage areas, specifically one first storage area corresponding to the region identifier (a4, b3, c2) and the 27 second storage areas corresponding to the 27 adjacent region identifiers.
In the exemplary embodiment, the target storage area includes the first storage area and the second storage area, which avoids the situation that the target area includes only the first storage area, and improves the accuracy of the determined recognition result.
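The selection of target storage areas for a query can be sketched as follows. The sketch assumes storage areas are kept in a dictionary keyed by integer region identifiers and that only areas which actually hold features need to be returned; both points, and the name `target_storage_areas`, are illustrative assumptions.

```python
from itertools import product
from typing import Dict, List, Tuple

RegionId = Tuple[int, ...]


def target_storage_areas(
    region_id: RegionId,
    stored: Dict[RegionId, List[str]],
) -> List[RegionId]:
    """Return the first storage area and its adjacent areas that exist in stored."""
    targets = []
    # Offsets of -1, 0, +1 per dimension cover region_id itself (all zeros)
    # and every adjacent region identifier.
    for offsets in product((-1, 0, 1), repeat=len(region_id)):
        candidate = tuple(i + o for i, o in zip(region_id, offsets))
        if candidate in stored:
            targets.append(candidate)
    return targets
```

Features in regions that are not adjacent to the query's region, such as a far-away region (9, 9, 9) for a query in (4, 3, 2), are never visited.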
In the method and the device provided by the exemplary embodiment of the disclosure, on one hand, the to-be-stored feature and the reference feature are calculated to obtain the first feature distance, and based on this, the to-be-stored feature is stored in a partition manner, so that the problem of overlarge storage pressure of computer nodes caused by storing all the to-be-stored features on the same computer node is avoided; on the other hand, as the features to be stored are stored in a partition mode, in the subsequent process of searching or identifying the features, all the features to be stored do not need to be compared, and only the features to be stored in the corresponding storage areas need to be compared, the time for obtaining the search result or the identification result is shortened, and the searching or identifying efficiency is improved.
The following describes the high-dimensional feature storage method in the embodiment of the present disclosure in detail with reference to an application scenario.
Five features to be stored are acquired, and a feature space A corresponding to them is determined. The feature values of the 5 features to be stored in each feature dimension are determined, 3 reference features are determined in the feature space according to those feature values, and the first feature distances between each of the 5 features to be stored and each of the 3 reference features are calculated.
And establishing a mapping relation between the 5 to-be-stored characteristics and the storage area according to the first characteristic distance so as to store the to-be-stored characteristics based on the mapping relation.
In the application scenario, on one hand, the to-be-stored features and the reference features are calculated to obtain the first feature distance, and based on the first feature distance, the to-be-stored features are stored in a partition mode, so that the problem of overlarge storage pressure of computer nodes caused by the fact that all the to-be-stored features are stored on the same computer node is solved; on the other hand, as the features to be stored are stored in a partition mode, in the subsequent process of searching or identifying the features, all the features to be stored do not need to be compared, and only the features to be stored in the corresponding storage areas need to be compared, the time for obtaining the search result or the identification result is shortened, and the searching or identifying efficiency is improved.
Further, in an exemplary embodiment of the present disclosure, a high dimensional feature storage device is also provided. Fig. 10 shows a schematic structure of a high-dimensional feature storage apparatus, and as shown in fig. 10, the high-dimensional feature storage apparatus 1000 may include: a first determination module 1010, a second determination module 1020, and a storage module 1030. Wherein:
a first determining module 1010 configured to acquire a feature to be stored and determine a feature space corresponding to the feature to be stored; a second determining module 1020 configured to determine feature values of the features to be stored in the feature dimensions, and determine a reference feature in the feature space according to the feature values; the storage module 1030 is configured to calculate the to-be-stored feature and the reference feature to obtain a first feature distance, and establish a mapping relationship between the to-be-stored feature and the storage area according to the first feature distance, so as to perform partition storage on the to-be-stored feature based on the mapping relationship.
The details of the high-dimensional feature storage apparatus 1000 are already described in detail in the corresponding high-dimensional feature storage method, and therefore will not be described herein again.
It should be noted that although several modules or units of the high-dimensional feature storage apparatus 1000 are mentioned in the above detailed description, such partitioning is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided and embodied by a plurality of modules or units.
In addition, in an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
An electronic device 1100 according to such an embodiment of the invention is described below with reference to fig. 11. The electronic device 1100 shown in fig. 11 is only an example and should not bring any limitations to the function and the scope of use of the embodiments of the present invention.
As shown in fig. 11, electronic device 1100 is embodied in the form of a general purpose computing device. The components of the electronic device 1100 may include, but are not limited to: the at least one processing unit 1110, the at least one memory unit 1120, a bus 1130 connecting different system components (including the memory unit 1120 and the processing unit 1110), and a display unit 1140.
Wherein the storage unit stores program code that is executable by the processing unit 1110 to cause the processing unit 1110 to perform steps according to various exemplary embodiments of the present invention as described in the above section "exemplary methods" of the present specification.
The storage unit 1120 may include readable media in the form of volatile storage units, such as a random access memory unit (RAM)1121 and/or a cache memory unit 1122, and may further include a read-only memory unit (ROM) 1123.
The storage unit 1120 may also include a program/utility 1124 having a set (at least one) of program modules 1125, such program modules 1125 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination of which, may include an implementation of a network environment.
Bus 1130 may be representative of one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 1100 may also communicate with one or more external devices 1170 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 1100, and/or any devices (e.g., router, modem, etc.) that enable the electronic device 1100 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 1150. Also, the electronic device 1100 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the internet) via the network adapter 1160. As shown, the network adapter 1160 communicates with the other modules of the electronic device 1100 over the bus 1130. It should be appreciated that although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 1100, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, aspects of the invention may also be implemented in the form of a program product comprising program code means for causing a terminal device to carry out the steps according to various exemplary embodiments of the invention described in the above-mentioned "exemplary methods" section of the present description, when said program product is run on the terminal device.
Referring to FIG. 12, a program product 1200 for implementing the above method according to an embodiment of the present invention is described. It may take the form of a portable compact disc read-only memory (CD-ROM) that includes program code, and may be run on a terminal device such as a personal computer. However, the program product of the present invention is not limited in this regard; in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer-readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (10)

1. A method for storing high-dimensional features, the method comprising:
acquiring a feature to be stored, and determining a feature space corresponding to the feature to be stored;
determining feature values of the features to be stored on each feature dimension, and determining reference features in the feature space according to the feature values;
and calculating a first feature distance between the features to be stored and the reference features, and establishing a mapping relationship between the features to be stored and a storage area according to the first feature distance, so as to store the features to be stored in partitions based on the mapping relationship.
2. The method according to claim 1, wherein the determining a reference feature in the feature space according to the feature value comprises:
performing a calculation on the feature values corresponding to the same feature dimension to obtain a feature calculation result, and dividing the feature dimensions according to a preset number of reference features to obtain that preset number of dimension division results;
searching the feature space to obtain a reference feature; wherein, in the reference feature, the feature value corresponding to one of the target feature dimensions is consistent with the feature calculation result, and the target feature dimensions correspond to the dimension division results.
3. The method according to claim 2, wherein the establishing a mapping relationship between the feature to be stored and a storage area according to the first feature distance comprises:
determining the first feature distances corresponding to the same feature dimension, and calculating, from the first feature distances, a feature distance range in that feature dimension;
acquiring a preset division interval, and dividing the feature distance range at the preset division interval, taking a target reference feature as the starting point, to obtain distance division results; wherein the target reference feature corresponds to the same feature dimension;
determining, among the distance division results, a target distance division result to which the first feature distance belongs, so as to establish a partition mapping relationship between a partition identifier and the first feature distance in the same feature dimension; wherein the partition identifier corresponds to the target distance division result;
and determining the partition identifiers corresponding to all the feature dimensions as a region identifier corresponding to the features to be stored, and establishing the mapping relationship between the features to be stored and the storage area according to the region identifier.
4. The method according to claim 3, wherein the establishing a mapping relationship between the feature to be stored and a storage region according to the region identifier comprises:
searching for an adjacent region identifier that has an identifier adjacency relationship with the region identifier; wherein the partition identifier of the adjacent region identifier in at least one feature dimension is adjacent to the partition identifier of the region identifier in the same feature dimension;
and determining the features to be stored corresponding to the region identifier and the features to be stored corresponding to the adjacent region identifier as target features to be stored, so as to establish a mapping relationship between the target features to be stored and the storage area.
5. The method according to claim 4, wherein the partitioning and storing the features to be stored based on the mapping relationship comprises:
storing the target features to be stored that have a mapping relationship with the same storage area on the same computer node.
6. The high-dimensional feature storage method of claim 4, further comprising:
acquiring an object to be identified, and extracting a feature to be identified corresponding to the object to be identified;
calculating a second feature distance between the feature to be identified and the reference feature, and determining a target storage area having an identification mapping relation with the feature to be identified in all the storage areas according to the second feature distance;
and determining the features to be stored belonging to the target storage area, and determining target storage features similar to the objects to be identified in the features to be stored so as to determine the storage objects corresponding to the target storage features as identification results.
7. The method according to claim 6, wherein the determining a target storage area having an identification mapping relationship with the feature to be identified in all the storage areas comprises:
determining, among the storage areas, a first storage area to which the second feature distance belongs, and determining the region identifier corresponding to the first storage area;
and searching for the adjacent region identifier that has an identifier adjacency relationship with the region identifier, and determining a second storage area corresponding to the adjacent region identifier, so as to determine the first storage area and the second storage area as the target storage areas.
8. A high-dimensional feature storage device, comprising:
a first determining module configured to acquire a feature to be stored and determine a feature space corresponding to the feature to be stored;
a second determining module configured to determine feature values of the feature to be stored in each feature dimension and determine reference features in the feature space according to the feature values; and
a storage module configured to calculate a first feature distance between the feature to be stored and the reference features, and establish a mapping relationship between the feature to be stored and a storage area according to the first feature distance, so as to store the feature to be stored in partitions based on the mapping relationship.
9. An electronic device, comprising:
a processor;
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the high dimensional feature storage method of any one of claims 1-7 via execution of the executable instructions.
10. A computer-readable storage medium on which a computer program is stored, which, when being executed by a processor, implements the high-dimensional feature storage method of any one of claims 1 to 7.
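As a rough illustration only, the partitioned storage and lookup recited in claims 1, 3, 4, 6 and 7 might be sketched as follows. The claims do not fix the per-dimension statistic, the distance metric, or how a reference feature is filled outside its target dimensions, so this sketch assumes the per-dimension mean, Euclidean distance, and zeros elsewhere. All function names (`reference_features`, `region_id`, `adjacent_ids`, `build_storage`, `query`) and the sample data are hypothetical and do not come from the patent.

```python
import math
from collections import defaultdict
from itertools import product


def reference_features(features, num_refs):
    """Build num_refs reference features from per-dimension means (assumed statistic)."""
    dims = len(features[0])
    means = [sum(f[d] for f in features) / len(features) for d in range(dims)]
    refs = []
    for k in range(num_refs):
        # Dimension group k is a contiguous slice of the dimensions (assumed split).
        lo, hi = k * dims // num_refs, (k + 1) * dims // num_refs
        # Mean on the group's dimensions, 0.0 elsewhere (assumption).
        refs.append([means[d] if lo <= d < hi else 0.0 for d in range(dims)])
    return refs


def region_id(feature, refs, interval):
    """One partition identifier per reference feature: floor(distance / interval)."""
    return tuple(int(math.dist(feature, ref) // interval) for ref in refs)


def adjacent_ids(rid):
    """Region ids that differ from rid by at most 1 per coordinate, excluding rid."""
    neighbours = []
    for offsets in product((-1, 0, 1), repeat=len(rid)):
        candidate = tuple(r + o for r, o in zip(rid, offsets))
        if any(offsets) and min(candidate) >= 0:
            neighbours.append(candidate)
    return neighbours


def build_storage(named_features, refs, interval):
    """Map each region id to the (name, feature) pairs that fall into it."""
    storage = defaultdict(list)
    for name, feature in named_features.items():
        storage[region_id(feature, refs, interval)].append((name, feature))
    return storage


def query(feature, storage, refs, interval):
    """Scan the query's own region and its adjacent regions for the closest stored item."""
    rid = region_id(feature, refs, interval)
    candidates = []
    for r in [rid] + adjacent_ids(rid):
        candidates.extend(storage.get(r, []))
    if not candidates:
        return None
    return min(candidates, key=lambda item: math.dist(feature, item[1]))[0]


# Hypothetical two-dimensional sample data, used only to exercise the sketch.
stored = {"a": [0.0, 0.0], "b": [0.2, 0.1], "c": [5.0, 5.0], "d": [5.1, 4.9]}
refs = reference_features(list(stored.values()), num_refs=2)
storage = build_storage(stored, refs, interval=1.0)
result = query([5.05, 5.0], storage, refs, interval=1.0)
print(result)
```

The sketch keeps each stored feature in exactly one region and widens the search to adjacent regions at query time. Claims 4 and 5 additionally describe grouping a region's features with those of its adjacent regions on the same computer node, which this single-process example does not model. Because only the query's region and its neighbours are scanned, the lookup is approximate rather than an exhaustive nearest-neighbour search.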

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111500602.XA CN114168081A (en) 2021-12-09 2021-12-09 High-dimensional feature storage method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN114168081A true CN114168081A (en) 2022-03-11

Family

ID=80484937

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111500602.XA Pending CN114168081A (en) 2021-12-09 2021-12-09 High-dimensional feature storage method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN114168081A (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060136402A1 (en) * 2004-12-22 2006-06-22 Tsu-Chang Lee Object-based information storage, search and mining system method
US20100266168A1 (en) * 2007-06-22 2010-10-21 Warwick Warp Limited Fingerprint matching method and apparatus
CN108334551A (en) * 2017-12-29 2018-07-27 谷米科技有限公司 Date storage method and system, data query method and system
CN109299088A (en) * 2018-08-22 2019-02-01 中国平安人寿保险股份有限公司 Mass data storage means, device, storage medium and electronic equipment
CN110309328A (en) * 2018-03-14 2019-10-08 深圳云天励飞技术有限公司 Date storage method, device, electronic equipment and storage medium
CN110427377A (en) * 2019-08-02 2019-11-08 北京博睿宏远数据科技股份有限公司 Data processing method, device, equipment and storage medium
CN112115134A (en) * 2020-08-04 2020-12-22 北京金山云网络技术有限公司 Data storage method and device, electronic equipment and storage medium
CN112269789A (en) * 2020-11-16 2021-01-26 北京百度网讯科技有限公司 Method and device for storing data and method and device for reading data
CN113111038A (en) * 2021-03-31 2021-07-13 北京达佳互联信息技术有限公司 File storage method, device, server and storage medium
CN113392240A (en) * 2021-06-11 2021-09-14 江苏云从曦和人工智能有限公司 Biological characteristic storage optimization method and device, electronic equipment and storage medium
CN113625967A (en) * 2021-07-26 2021-11-09 深圳市汉云科技有限公司 Data storage method, data query method and server
CN113672175A (en) * 2021-08-09 2021-11-19 浙江大华技术股份有限公司 Distributed object storage method, device and equipment and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination