WO2023065696A1 - Nearest neighbor search method and apparatus, terminal, and storage medium - Google Patents


Info

Publication number
WO2023065696A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
dimensional
value
low
module
Prior art date
Application number
PCT/CN2022/099850
Other languages
French (fr)
Chinese (zh)
Inventor
张号逵
胡文泽
王孝宇
Original Assignee
深圳云天励飞技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳云天励飞技术股份有限公司 filed Critical 深圳云天励飞技术股份有限公司
Publication of WO2023065696A1 publication Critical patent/WO2023065696A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • G06F16/24534Query rewriting; Transformation
    • G06F16/24549Run-time optimisation

Definitions

  • the present application belongs to the field of computer technology, and in particular relates to a nearest neighbor search method, device, terminal and storage medium.
  • the main goal of the approximate nearest neighbor search algorithm is to retrieve multiple data feature vectors most similar to a given query object from a database containing a large number of data feature vectors under a certain similarity measure criterion.
  • Approximate nearest neighbor search is the basis of information retrieval and is widely used in various search engines and recommendation systems. How to quickly and accurately implement approximate nearest neighbor search under the condition of limited hardware cost has always been a research hotspot in the field of information retrieval.
  • the approximate nearest neighbor search algorithm based on the index graph is to iteratively approach the query object along the boundary line in the pre-configured relative neighbor graph (RNG).
  • This algorithm only needs to calculate the similarity between the feature vector of the query object and the data feature vectors along the approximate route, which can significantly improve the retrieval speed.
  • This type of method is the most widely used approximate nearest neighbor search algorithm in recent years, and has been applied in many practical scenarios.
  • Embodiments of the present application provide a nearest neighbor search method, device, terminal, and storage medium, which can improve the efficiency of RNG construction while ensuring the accuracy of nearest neighbor search.
  • the first aspect of the embodiment of the present application provides a nearest neighbor search method, including:
  • a nearest neighbor search is performed based on the low-dimensional neighbor graph and the target feature to obtain a reference object with the closest distance to the target object.
  • In some embodiments, the high-dimensional neighbor relationship is the high-dimensional Euclidean distance between the sample high-dimensional features associated with each two sample objects among the plurality of sample objects, and the low-dimensional neighbor relationship is the low-dimensional Euclidean distance between the sample low-dimensional features associated with each two sample objects among the plurality of sample objects.
  • The loss function of the feature compression network is a function obtained based on the error value between the high-dimensional Euclidean distance and the low-dimensional Euclidean distance corresponding to it, and the weight value associated with the high-dimensional Euclidean distance, wherein the weight value is related to the magnitude of its associated high-dimensional Euclidean distance.
  • the feature compression network includes a compression module, a projection module, and a global optimization module
  • the compression module includes a first linear mapping module, a second linear mapping module, and a feature compression module
  • Inputting the reference high-dimensional features of the reference object into the feature compression network to obtain the reference low-dimensional features output by the feature compression network includes: inputting the reference high-dimensional features into the feature compression module, the first linear mapping module and the projection module to obtain the first feature output by the feature compression module, the second feature output by the first linear mapping module, and at least one third feature output by the projection module, wherein the dimensions of the first feature, the second feature and each third feature are the same as the dimension of the reference low-dimensional feature; inputting the first feature, the second feature and the at least one third feature into the global optimization module to obtain a fourth feature and at least one fifth feature output by the global optimization module; and inputting the fourth feature and the at least one fifth feature into the compression module to obtain the reference low-dimensional features output by the compression module.
  • In some embodiments, the global optimization module includes at least one encoder. Inputting the first feature, the second feature and the at least one third feature to the global optimization module to obtain the fourth feature and at least one fifth feature output by the global optimization module includes: composing the first feature and the at least one third feature into a first vector, and inputting the first vector into the first encoder of the at least one encoder to obtain the second vector output by the first encoder based on the multi-attention head mechanism and the linear mapping layer, wherein the second vector includes a sixth feature corresponding to the first feature and at least one seventh feature in one-to-one correspondence with the at least one third feature; adding the sixth feature in the second vector to the second feature to obtain an eighth feature; replacing the sixth feature in the second vector with the eighth feature to obtain a third vector; and inputting the third vector into the second encoder of the at least one encoder, and so on, until the target vector output by the last encoder of the at least one encoder is obtained.
  • In some embodiments, the calculation process of the output value of the multi-attention head mechanism includes: determining the first input value, the second input value and the third input value of the multi-attention head mechanism based on the first vector; respectively performing mapping processing on the first input value, the second input value and the third input value to obtain the first mapping value corresponding to the first input value, the second mapping value corresponding to the second input value, and the third mapping value corresponding to the third input value, wherein the dimension of the first mapping value and the dimension of the second mapping value are both lower than the dimension of the third mapping value; using the first mapping value and the second mapping value to calculate the combination parameter of the third mapping value; and using the third mapping value and the combination parameter to calculate the output value of the multi-attention head mechanism.
  • In some embodiments, performing the nearest neighbor search based on the low-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object includes: replacing each reference low-dimensional feature in the low-dimensional neighbor graph with its corresponding reference high-dimensional feature to obtain the high-dimensional neighbor graph corresponding to the low-dimensional neighbor graph; and performing the nearest neighbor search based on the high-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
  • The feature compression unit is used to input the reference high-dimensional features of the reference object into the feature compression network to obtain the reference low-dimensional features output by the feature compression network, wherein the loss function of the feature compression network is a function obtained based on the high-dimensional neighbor relationship of the sample object and the low-dimensional neighbor relationship of the sample object;
  • a neighbor graph construction unit configured to use the reference low-dimensional features to build a low-dimensional neighbor graph
  • a feature acquisition unit configured to acquire the target feature of the target object
  • a nearest neighbor search unit is configured to perform a nearest neighbor search based on the low-dimensional neighbor graph and the target feature to obtain a reference object with the closest distance to the target object.
  • The third aspect of the embodiments of the present application provides a terminal, including a memory, a processor, and a computer program stored in the memory and operable on the processor, wherein the steps of the above method are implemented when the processor executes the computer program.
  • a fourth aspect of the embodiments of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps of the foregoing method are implemented.
  • the fifth aspect of the embodiments of the present application provides a computer program product, which, when the computer program product runs on a terminal, enables the terminal to execute the steps of the method.
  • In the embodiment of the present application, the reference high-dimensional features of the reference object are input into the feature compression network to obtain the reference low-dimensional features output by the network, the low-dimensional neighbor graph is established using the reference low-dimensional features, then the target feature of the target object is acquired, and the nearest neighbor search is performed based on the low-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
  • Since the loss function of the feature compression network is a function obtained based on the high-dimensional neighbor relationship of the sample object and the low-dimensional neighbor relationship of the sample object, using the trained feature compression network to reduce the dimensionality of the reference high-dimensional features can avoid the loss of neighbor relationship information between features caused by reducing dimensionality directly through a dimensionality reduction algorithm, thereby improving search accuracy.
  • building a low-dimensional neighbor graph based on the reduced-dimensional reference low-dimensional features can reduce the time consumption of building RNG and improve the efficiency of RNG construction.
  • FIG. 1 is a schematic diagram of the implementation flow of a nearest neighbor search method provided by an embodiment of the present application
  • Fig. 2 is a schematic diagram of the nearest neighbor search provided by the embodiment of the present application.
  • Fig. 3 is a schematic structural diagram of a feature compression network provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a specific implementation flow of step S101 provided by the embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of an encoder provided in an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a nearest neighbor search device provided in an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a terminal provided by an embodiment of the present application.
  • the approximate nearest neighbor search algorithm based on the index graph needs to pre-build a high-precision RNG.
  • The time required to construct an RNG grows rapidly with the dimension of the feature vector data.
  • the problem that it takes too long to construct the RNG seriously limits the scope of application of the approximate nearest neighbor search algorithm based on the index graph.
  • this application proposes a nearest neighbor search method.
  • The feature compression network is obtained by training with a loss function based on the high-dimensional neighbor relationship of the sample object and the low-dimensional neighbor relationship of the sample object. Using the trained feature compression network to reduce the dimensionality of the reference high-dimensional features can avoid the loss of neighbor relationship information between features caused by reducing dimensionality directly through a dimensionality reduction algorithm, thereby improving search accuracy; it can also reduce the time consumption of building the RNG and improve the efficiency of RNG construction.
  • Figure 1 shows a schematic diagram of the implementation process of a nearest neighbor search method provided by the embodiment of the present application. The method can be executed by terminals such as smart devices, set-top boxes, servers, and satellite wireless devices, and can be applied to situations where it is necessary to improve the efficiency of RNG construction while ensuring the accuracy of the nearest neighbor search.
  • the above nearest neighbor search method may include the following steps S101 to S104.
  • Step S101 input the reference high-dimensional features of the reference object into the feature compression network to obtain the reference low-dimensional features output by the feature compression network.
  • the reference object refers to the object used to construct the RNG in the database, and the type of the reference object can be adjusted according to the actual situation, and generally can be an image or the like.
  • The terminal can extract the reference high-dimensional features of the reference object through a feature extraction algorithm, and use the trained feature compression network to compress the reference high-dimensional features of the reference object into the reference low-dimensional features.
  • the network structure of the feature extraction algorithm and the feature compression network can be set according to the actual situation.
  • the loss function of the feature compression network is a function based on the high-dimensional neighbor relationship of the sample object and the low-dimensional neighbor relationship of the sample object.
  • the sample object is the object used to train the feature compression network.
  • The high-dimensional neighbor relationship refers to the neighbor relationship between the sample high-dimensional features associated with every two sample objects among the sample objects, and the low-dimensional neighbor relationship refers to the neighbor relationship between the sample low-dimensional features associated with every two sample objects among the sample objects.
  • the present application can use the loss function related to the neighbor relationship of the sample object to train the feature compression network to be trained until the feature compression network converges to obtain a trained feature compression network.
  • this application does not limit the algorithm used for model training, for example, it can be implemented by using a gradient descent algorithm.
  • the conventional loss function is generally a function established based on the error between the high-dimensional features of the sample and the low-dimensional features of the sample.
  • the loss function is constructed based on the high-dimensional neighbor relationship and the low-dimensional neighbor relationship.
  • Compared with performing dimensionality reduction directly through a conventional dimensionality reduction algorithm, or through a feature compression network trained with a conventional loss function, this allows the sample low-dimensional features to retain more neighbor relationship information. Since the search process of the nearest neighbor search algorithm is realized by using the neighbor relationships between features, the method provided by the present application improves the integrity of the neighbor relationships during compression, thereby improving search accuracy.
  • Step S102 using the reference low-dimensional features to establish a low-dimensional neighbor graph.
  • For example, the low-dimensional neighbor graph can be built using algorithms such as HNSW (Hierarchical Navigable Small World) or NSG (Navigating Spreading-out Graph).
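As a simplified illustration of neighbor-graph construction over reference low-dimensional features, the sketch below builds a brute-force k-nearest-neighbor graph; HNSW and NSG additionally prune edges and add navigation structure. The function name and toy data are illustrative, not from the patent.

```python
import numpy as np

def build_knn_graph(features, k=2):
    """Build a simple k-nearest-neighbor graph over low-dimensional features.

    Returns an adjacency dict: index -> list of the k nearest neighbor
    indices by Euclidean distance (a stand-in for HNSW/NSG construction).
    """
    n = len(features)
    # Pairwise Euclidean distances, shape (n, n).
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude self-edges
    return {i: list(np.argsort(dist[i])[:k]) for i in range(n)}

# Two well-separated clusters of reference low-dimensional features.
feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
graph = build_knn_graph(feats, k=1)
```

Because the graph is built in the low-dimensional space, the O(n²) distance computation here is cheaper than in the original high-dimensional space, which is the efficiency gain the patent targets.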
  • Step S103 acquiring target features of the target object.
  • the target object refers to the object to be queried, and its type is the same as that of the reference object and the sample object.
  • the above-mentioned target features may refer to target high-dimensional features or target low-dimensional features of the target object.
  • step S104 the nearest neighbor search is performed based on the low-dimensional neighbor graph and the target feature to obtain the reference object with the closest distance to the target object.
  • The terminal can perform feature extraction on the target object to obtain the target high-dimensional features of the target object, then input the target high-dimensional features into the aforementioned feature compression network to obtain the target low-dimensional features output by the network, and use the obtained target low-dimensional features as the target feature of the target object.
  • One or more reference low-dimensional features with the closest Euclidean distance to the target feature in the low-dimensional neighbor graph can be searched, and the reference objects associated with the searched one or more reference low-dimensional features are taken as the reference objects closest to the target object.
  • The terminal can also replace each reference low-dimensional feature of the low-dimensional neighbor graph with its corresponding reference high-dimensional feature to obtain the high-dimensional neighbor graph corresponding to the low-dimensional neighbor graph, and perform the nearest neighbor search based on the high-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
  • the terminal may perform feature extraction on the target object to obtain target high-dimensional features of the target object, and use the obtained target high-dimensional features as target features of the target object.
  • Using the neighbor relationships between the reference high-dimensional features recorded in the high-dimensional neighbor graph to perform the nearest neighbor search, one or more reference high-dimensional features with the closest Euclidean distance to the target feature in the high-dimensional neighbor graph can be searched, and the reference objects associated with the searched one or more reference high-dimensional features are taken as the reference objects closest to the target object.
  • Obtaining the high-dimensional neighbor graph corresponding to the low-dimensional neighbor graph and using the high-dimensional neighbor graph to carry out the nearest neighbor search allows the terminal to use the Euclidean distance between the reference high-dimensional features and the target high-dimensional feature when performing the search. Compared with the Euclidean distance between the reference low-dimensional features and the target low-dimensional feature, this can improve the accuracy of the search.
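The idea of shortlisting candidates with low-dimensional distances and then scoring them with high-dimensional distances can be sketched as follows; the data and the `search` function are hypothetical, and a linear scan stands in for walking the neighbor graph.

```python
import numpy as np

# Hypothetical data: 4 reference objects with high-dim (4-D) and low-dim (2-D) features.
high = np.array([[1.0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
low = np.array([[1.0, 0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])

def search(query_high, query_low, n_candidates=2):
    # Step 1: shortlist candidates by low-dimensional distance (stands in for
    # traversing the low-dimensional neighbor graph).
    cand = np.argsort(np.linalg.norm(low - query_low, axis=1))[:n_candidates]
    # Step 2: score candidates by high-dimensional distance, i.e. each
    # low-dimensional feature is replaced by its reference high-dimensional
    # feature before the final distance comparison.
    best = cand[np.argmin(np.linalg.norm(high[cand] - query_high, axis=1))]
    return int(best)
```

The final ranking uses the full-precision high-dimensional distances, which is why this variant can be more accurate than ranking by low-dimensional distances alone.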
  • For example, the target features of the target image can be extracted, one or more reference high-dimensional features with the closest Euclidean distance to the target features can then be searched in the high-dimensional neighbor graph, the reference images associated with the searched one or more reference high-dimensional features are taken as the reference images closest to the target image, and the scene to which the target image belongs can then be determined as the scene of the reference image closest to the target image.
  • As described above, the reference high-dimensional features of the reference object are input into the feature compression network to obtain the reference low-dimensional features output by the network, the low-dimensional neighbor graph is established using the reference low-dimensional features, then the target feature of the target object is acquired, and the nearest neighbor search is performed based on the low-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
  • Because the loss function of the feature compression network is a function obtained based on the high-dimensional neighbor relationship of the sample object and the low-dimensional neighbor relationship of the sample object, using the trained feature compression network to reduce the dimensionality of the reference high-dimensional features can avoid the loss of neighbor relationship information between features caused by reducing dimensionality directly through a dimensionality reduction algorithm, thereby improving search accuracy.
  • building a low-dimensional neighbor graph based on the reduced-dimensional reference low-dimensional features can reduce the time consumption of building RNG and improve the efficiency of RNG construction.
  • the terminal may construct the feature compression network shown in FIG. 3 .
  • the feature compression network may include a compression module 31 , a projection module 32 and a global optimization module 33 .
  • the compression module 31 specifically includes a first linear mapping module 311 , a second linear mapping module 312 and a feature compression module 313 .
  • step S101 may specifically include the following steps S401 to S402.
  • Step S401 input the reference high-dimensional features into the feature compression module, the first linear mapping module and the projection module, and obtain the first feature output by the feature compression module, the second feature output by the first linear mapping module, and at least one third feature output by the projection module.
  • After the terminal inputs the reference high-dimensional features into the feature compression module 313, the first feature with dimension d_out can be obtained.
  • The first linear mapping module 311 may contain a linear mapping function parameter. Similarly, after the terminal inputs the reference high-dimensional features into the first linear mapping module 311, the second feature cp(x) with dimension d_out can be obtained.
  • The dimension of the first feature, the second feature and each third feature is the same as that of the reference low-dimensional feature, all of which are d_out.
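The patent does not disclose the internal layers of these modules, so the following sketch uses plain linear maps as stand-ins; the sizes, weight names, and `compress_forward` function are all illustrative assumptions. It only shows the shape contract: the first, second, and third features all have dimension d_out.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n_proj = 8, 3, 2  # illustrative sizes, not from the patent

# Hypothetical parameters standing in for each module's mapping.
W_compress = rng.normal(size=(d_in, d_out))       # feature compression module 313
W_linear1 = rng.normal(size=(d_in, d_out))        # first linear mapping module 311
W_proj = rng.normal(size=(n_proj, d_in, d_out))   # projection module 32 (n_proj outputs)

def compress_forward(x):
    first = x @ W_compress            # first feature, dim d_out
    second = x @ W_linear1            # second feature, dim d_out
    thirds = [x @ W for W in W_proj]  # n_proj third features, each dim d_out
    return first, second, thirds

x = rng.normal(size=d_in)             # a reference high-dimensional feature
first, second, thirds = compress_forward(x)
```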
  • Step S402 inputting the first feature, the second feature and at least one third feature to the global optimization module to obtain a fourth feature and at least one fifth feature output by the global optimization module.
  • the global optimization module 33 may include at least one encoder.
  • the above-mentioned terminal may combine the first feature output by the feature compression module 313 and at least one third feature output by the projection module 32 to obtain the first vector.
  • the first vector is input to the first encoder 331 of the at least one encoder to obtain a second vector output by the first encoder 331 based on the multi-attention head mechanism and the linear mapping layer.
  • The second vector includes a sixth feature corresponding to the first feature, and at least one seventh feature in one-to-one correspondence with the at least one third feature.
  • the target vector includes a fourth feature and at least one fifth feature.
  • Fig. 5 shows the structure of a single encoder of the present application.
  • a single encoder can include a multi-head attention module as well as an encoder mapping module.
  • the above-mentioned multi-head attention mechanism module adopts a multi-head attention mechanism.
  • The process for the terminal to calculate the output value of the multi-attention head mechanism may specifically include: determining the first input value Q, the second input value K and the third input value V of the multi-attention head mechanism based on the first vector; respectively performing mapping processing on Q, K and V to obtain the first mapping value corresponding to the first input value Q, the second mapping value corresponding to the second input value K, and the third mapping value corresponding to the third input value V; using the first mapping value and the second mapping value to calculate the combination parameter of the third mapping value; and using the third mapping value and the combination parameter to calculate the output value head_i(Q, K, V) of the multi-attention head mechanism.
  • the dimension of the first mapping value and the dimension of the second mapping value are both lower than the dimension of the third mapping value.
  • The mapping processing on the first input value Q may adopt a linear mapping, where e denotes the expansion coefficient and h_n the preset number of attention heads. The mapping processing on the second input value K can likewise be performed by linear mapping to obtain the second mapping value, and the mapping processing on the third input value V can be performed by linear mapping to obtain the third mapping value. It can be seen that after the linear mapping, the dimension of the first mapping value and the dimension of the second mapping value are lower than the dimension of the third mapping value.
  • Specifically, the terminal can divide the product of the first mapping value and the second mapping value by the square root of the preset feature dimension value, input the obtained quotient into the softmax function, and use the output value of the softmax function as the combination parameter of the above-mentioned third mapping value.
  • Afterwards, the terminal can combine the outputs of the attention heads into a vector through the mapping function.
  • The encoder mapping module may consist of a Linear_BN layer and a Linear_ABN layer, wherein the Linear_BN layer includes a linear mapping layer and a BN layer, and the Linear_ABN layer consists of a linear mapping layer, an activation function and a BN layer in sequence.
  • the encoder mapping module is used to map the output value of the multi-attention head mechanism to the feature output by the encoder.
  • The conventional multi-attention head mechanism generally requires the dimensions of Q, K, and V to be the same. In the embodiment of this application, since Q and K are only used to provide information on how to combine V, the terminal can reduce the dimensions of Q and K and use the reduced Q, the reduced K, and V at its original dimension to calculate the output value of the multi-attention head mechanism. This reduces the amount of softmax computation during training and improves the calculation speed of the encoder, thus improving the training efficiency of the model.
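A minimal single-head sketch of this idea: Q and K are mapped to a lower dimension d_qk than V (d_v), so the softmax over QKᵀ is computed from smaller matrices while V keeps its original dimension. The weight shapes and random parameters are illustrative assumptions, not the patent's exact parameterization.

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def reduced_qk_attention(x, d_qk, d_v, seed=0):
    """One attention head with reduced Q/K dimension.

    Q and K only decide how to combine V, so mapping them to d_qk < d_v
    shrinks the QK^T / softmax computation without changing the output dim.
    """
    n, d = x.shape
    rng = np.random.default_rng(seed)
    Wq = rng.normal(size=(d, d_qk))   # first mapping (reduced)
    Wk = rng.normal(size=(d, d_qk))   # second mapping (reduced)
    Wv = rng.normal(size=(d, d_v))    # third mapping (full dimension)
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    # Combination parameters: softmax of QK^T scaled by sqrt(d_qk).
    A = softmax(Q @ K.T / np.sqrt(d_qk))
    return A @ V                      # head output, shape (n, d_v)

x = np.ones((4, 8))                   # a toy first vector of 4 features
out = reduced_qk_attention(x, d_qk=2, d_v=8)
```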
  • Step S403 input the fourth feature and at least one fifth feature into the compression module to obtain reference low-dimensional features output by the compression module.
  • the terminal may add the fourth feature and at least one fifth feature, and input the obtained feature into the second linear mapping module 312 of the compression module 31 to obtain the reference low-dimensional feature
  • the second linear mapping module 312 may include a linear mapping function parameter
  • The eighth feature is obtained by adding the sixth feature output by the encoder to the second feature output by the first linear mapping module, and the eighth feature is then used as part of the input to the next encoder. In this way, during encoder iteration, the input value of each encoder does not deviate too much from the actual feature value, which improves the convergence speed of the feature compression network.
  • After completing the construction of the feature compression network, the terminal can use the loss function obtained based on the high-dimensional neighbor relationship of the sample object and the low-dimensional neighbor relationship of the sample object to train the feature compression network.
  • The above-mentioned high-dimensional neighbor relationship can be the high-dimensional Euclidean distance between the sample high-dimensional features associated with every two sample objects among the multiple sample objects, and the above-mentioned low-dimensional neighbor relationship is the low-dimensional Euclidean distance between the sample low-dimensional features associated with every two sample objects among the multiple sample objects.
  • the loss function of the feature compression network is a function based on the error value between the high-dimensional Euclidean distance and the low-dimensional Euclidean distance corresponding to the high-dimensional Euclidean distance, and the weight value associated with the high-dimensional Euclidean distance .
  • the value of the weight value is related to the magnitude of the associated high-dimensional Euclidean distance.
  • The process of the terminal calculating the loss value loss of the above loss function may specifically include calculating a weighted error between each pair's high-dimensional Euclidean distance and the corresponding low-dimensional Euclidean distance, of the form

    loss = Σ_{i=1}^{m} Σ_{j=1}^{m} ω_ij · ( ‖f(x_i) − f(x_j)‖₂ − ‖x_i − x_j‖₂ )²

    where m represents the total number of sample objects; ‖f(x_i) − f(x_j)‖₂ represents the high-dimensional Euclidean distance; f(x_i) represents the sample high-dimensional feature associated with the i-th sample object; f(x_j) represents the sample high-dimensional feature associated with the j-th sample object; ‖x_i − x_j‖₂ represents the low-dimensional Euclidean distance; x_i represents the sample low-dimensional feature associated with the i-th sample object; x_j represents the sample low-dimensional feature associated with the j-th sample object; and ω_ij represents the weight value associated with the high-dimensional Euclidean distance.
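A hedged sketch of this loss computation: the source specifies a weighted error between each pair's high- and low-dimensional Euclidean distances, and the squared-error form and the `neighbor_loss` name below are assumptions.

```python
import numpy as np

def neighbor_loss(high_feats, low_feats, weights):
    """Weighted error between all pairwise high-dimensional Euclidean
    distances and the corresponding low-dimensional Euclidean distances.

    high_feats: (m, d_high) sample high-dimensional features f(x_i)
    low_feats:  (m, d_low) sample low-dimensional features x_i
    weights:    (m, m) weight values w_ij
    """
    d_high = np.linalg.norm(high_feats[:, None] - high_feats[None, :], axis=-1)
    d_low = np.linalg.norm(low_feats[:, None] - low_feats[None, :], axis=-1)
    return float((weights * (d_high - d_low) ** 2).sum())

# If the low-dimensional features preserve all pairwise distances exactly,
# the loss is zero: both pairs below are at distance 5.
high = np.array([[0.0, 0.0], [3.0, 4.0]])
low = np.array([[0.0], [5.0]])
loss_val = neighbor_loss(high, low, np.ones((2, 2)))
```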
  • the process for the terminal to calculate the above weight value ω_ij may include: obtaining the first hyperparameter α and the second hyperparameter β, and calculating the average value of all the high-dimensional Euclidean distances; then computing the negative of the natural logarithm of the quotient of the high-dimensional Euclidean distance and that average; determining the maximum between this value and the second hyperparameter; and taking the minimum between the first hyperparameter and that maximum as the weight value associated with the high-dimensional Euclidean distance, i.e. ω_ij = min(α, max(β, −ln(d_ij / d_mean))).
  • the first hyperparameter α is greater than the second hyperparameter β.
  • the specific values of α and β can be set according to the actual situation; in practical applications, α can be set to 2 and β can be set to 0.01.
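The clamped-log weight described above can be sketched as follows. This is a minimal sketch: the function name and the use of Python's `math.log` are my own, and α = 2, β = 0.01 are the example values given in the text.

```python
import math

def pair_weight(d_ij, d_mean, alpha=2.0, beta=0.01):
    """Weight for one pair of samples: the negative natural logarithm of
    d_ij / d_mean, clamped into [beta, alpha] via the max/min steps
    described in the text."""
    return min(alpha, max(beta, -math.log(d_ij / d_mean)))
```

With these defaults, pairs much closer than the average distance saturate at α, while pairs farther than the average saturate at β.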
  • when the high-dimensional Euclidean distance d_ij is small, the associated weight value ω_ij will be α, or at least greater than β; when d_ij is large, ω_ij will be β. Consequently, in the loss function, pairs of sample high-dimensional features with a small high-dimensional Euclidean distance receive a higher weight.
  • training the feature compression network with the above weight formula and loss function makes the correspondence between the high-dimensional and low-dimensional neighbor relationships more accurate for features with small high-dimensional Euclidean distances; that is, after two reference high-dimensional features with a smaller high-dimensional Euclidean distance are input into the trained feature compression network, the neighbor relationship information between the two resulting reference low-dimensional features is more complete.
  • the purpose of the nearest neighbor search algorithm is to find the features closest to the target feature; this method therefore preserves the neighbor relationship information of close features more completely, further improving search accuracy.
  • the terminal can also quantize the low-dimensional features output by the feature compression network into low-dimensional integer vectors through scalar quantization, and use the quantized low-dimensional integer vectors as the reference low-dimensional features for constructing the RNG.
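The scalar quantization step can be sketched as follows. The text does not specify the exact scheme, so this uses a common per-dimension uniform mapping to 8-bit integers; the function name and numpy usage are illustrative assumptions.

```python
import numpy as np

def scalar_quantize(feats):
    """Uniform per-dimension scalar quantization of float feature vectors
    into uint8 integer vectors. Returns the quantized vectors plus the
    per-dimension offset and scale needed to approximately invert the map."""
    lo = feats.min(axis=0)
    hi = feats.max(axis=0)
    # Avoid division by zero for constant dimensions.
    scale = np.where(hi > lo, (hi - lo) / 255.0, 1.0)
    q = np.round((feats - lo) / scale).astype(np.uint8)
    return q, lo, scale
```

The integer vectors are then used in place of the float low-dimensional features when building the neighbor graph, reducing memory and distance-computation cost.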
  • FIG. 6 is a schematic structural diagram of a nearest neighbor search apparatus 600 provided in an embodiment of the present application, and the nearest neighbor search apparatus 600 is configured on a terminal.
  • the nearest neighbor search device 600 may include:
  • the feature compression unit 601 is used to input the reference high-dimensional features of the reference object into the feature compression network to obtain the reference low-dimensional features output by the feature compression network.
  • the loss function of the feature compression network is a function obtained based on the high-dimensional neighbor relationship of the sample objects and the low-dimensional neighbor relationship of the sample objects;
  • a neighbor graph construction unit 602 configured to construct a low-dimensional neighbor graph using the reference low-dimensional features;
  • a feature acquisition unit 603 configured to acquire the target feature of the target object;
  • the nearest neighbor search unit 604 is configured to perform a nearest neighbor search based on the low-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
  • the above-mentioned high-dimensional neighbor relationship is the high-dimensional Euclidean distance between the sample high-dimensional features associated with every two sample objects among the multiple sample objects, and the above-mentioned low-dimensional neighbor relationship is the low-dimensional Euclidean distance between the sample low-dimensional features associated with every two sample objects among the multiple sample objects.
  • the loss function of the above feature compression network is a function based on the error value between each high-dimensional Euclidean distance and its corresponding low-dimensional Euclidean distance, and on the weight value associated with that high-dimensional Euclidean distance, where the value of the weight is related to the magnitude of the high-dimensional Euclidean distance associated with it.
  • the calculation process of the loss value of the above loss function includes: calculating the error value between each high-dimensional Euclidean distance and its corresponding low-dimensional Euclidean distance; calculating the weight value associated with each high-dimensional Euclidean distance, and using each weight value to weight and accumulate the corresponding error values to obtain an accumulated value; and dividing the accumulated value by the square of the total number of sample objects to obtain the loss value of the loss function.
  • the calculation process of the above weight value includes: obtaining a first hyperparameter and a second hyperparameter, where the first hyperparameter is greater than the second hyperparameter; calculating the average value of the high-dimensional Euclidean distances; computing the negative of the natural logarithm of the quotient of the high-dimensional Euclidean distance and the average; determining the maximum value between that negative value and the second hyperparameter; and using the minimum value between the first hyperparameter and that maximum as the weight value associated with the high-dimensional Euclidean distance.
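Taken together, the loss computation summarized above might look like the following sketch. It assumes the "error value" is the squared difference between corresponding pairwise distances; the function names and the numpy pairwise-distance construction are my own.

```python
import numpy as np

def compression_loss(high_feats, low_feats, alpha=2.0, beta=0.01):
    """Weighted squared errors between pairwise high-dimensional and
    low-dimensional Euclidean distances, accumulated and divided by the
    square of the sample count m, with the clamped-log weights described
    in the text."""
    m = high_feats.shape[0]
    d_hi = np.linalg.norm(high_feats[:, None, :] - high_feats[None, :, :], axis=-1)
    d_lo = np.linalg.norm(low_feats[:, None, :] - low_feats[None, :, :], axis=-1)
    d_mean = d_hi.mean()
    with np.errstate(divide="ignore"):
        # log(0) on the diagonal gives inf, which the clip maps to alpha;
        # the matching error term there is zero, so it does not contribute.
        w = np.clip(-np.log(d_hi / d_mean), beta, alpha)
    err = (d_hi - d_lo) ** 2
    return (w * err).sum() / m**2
```

If the low-dimensional distances perfectly match the high-dimensional ones, the loss is zero; any mismatch is penalized more heavily for close pairs.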
  • the above-mentioned feature compression network includes a compression module, a projection module, and a global optimization module
  • the compression module includes a first linear mapping module, a second linear mapping module, and a feature compression module.
  • the above-mentioned feature compression unit 601 can be specifically used to: input the reference high-dimensional feature into the feature compression module, the first linear mapping module and the projection module, to obtain the first feature output by the feature compression module, the second feature output by the first linear mapping module, and at least one third feature output by the projection module, where the dimensions of the first feature, the second feature and each third feature are the same as the dimension of the reference low-dimensional feature; input the first feature, the second feature and the at least one third feature into the global optimization module to obtain a fourth feature and at least one fifth feature output by the global optimization module; and input the fourth feature and the at least one fifth feature into the compression module to obtain the reference low-dimensional feature output by the compression module.
  • the above-mentioned global optimization module includes at least one encoder.
  • the above-mentioned feature compression unit 601 can be specifically configured to: form the first feature and the at least one third feature into a first vector, and input the first vector into the first encoder of the at least one encoder to obtain a second vector output based on the multi-attention-head mechanism and the linear mapping layer, where the second vector includes a sixth feature corresponding to the first feature and at least one seventh feature corresponding one-to-one to the at least one third feature; add the sixth feature corresponding to the first feature in the second vector to the second feature to obtain an eighth feature; replace the sixth feature in the second vector with the eighth feature to obtain a third vector; input the third vector into the second encoder of the at least one encoder, and so on, until the target vector output by the last encoder of the at least one encoder is obtained, the target vector including the fourth feature and the at least one fifth feature.
  • the above calculation process includes: determining the first input value, the second input value and the third input value of the multi-attention-head mechanism based on the first vector; performing mapping processing on the first input value, the second input value and the third input value respectively, to obtain a first mapping value corresponding to the first input value, a second mapping value corresponding to the second input value, and a third mapping value corresponding to the third input value, where the dimensions of the first mapping value and the second mapping value are lower than the dimension of the third mapping value; using the first mapping value and the second mapping value to calculate combination parameters for the third mapping value; and using the third mapping value and the combination parameters to calculate the output value of the multi-attention-head mechanism.
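A minimal sketch of one head of this attention variant, assuming the three input values play query/key/value roles and the combination parameters are softmax attention weights (both are assumptions; the patent names none of these). The query and key mappings have a lower dimension than the value mapping, which makes the combination parameters cheaper to compute.

```python
import numpy as np

def low_dim_attention_head(x, wq, wk, wv):
    """x: (n, d). wq, wk: (d, d_qk); wv: (d, d_v) with d_qk < d_v assumed.
    Returns an (n, d_v) output: each row is a combination of the value
    mappings, weighted by parameters derived from the query/key mappings."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])      # pairwise combination scores
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # combination parameters
    return attn @ v                               # weighted sum of value mappings
```

In a full multi-head module, several such heads would run in parallel and their outputs would pass through the linear mapping layer mentioned in the text.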
  • the above-mentioned nearest neighbor search unit 604 can also be specifically configured to: replace each reference low-dimensional feature in the low-dimensional neighbor graph with its corresponding reference high-dimensional feature, to obtain the high-dimensional neighbor graph corresponding to the low-dimensional neighbor graph; and then perform the nearest neighbor search based on the high-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
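A toy sketch of this replace-then-search step: the neighbor graph's edges come from the low-dimensional features, but the search walks it using distances evaluated on the corresponding high-dimensional features. The adjacency-dict layout, names, and the greedy termination rule are illustrative assumptions, not the patent's prescribed algorithm.

```python
import numpy as np

def search_high_dim(graph, high_feats, query, start=0):
    """Greedy best-first walk: from the current node, move to whichever
    neighbor is closer to the query in high-dimensional space; stop at a
    local optimum. graph: dict node_id -> list of neighbor node ids."""
    cur = start
    cur_d = np.linalg.norm(high_feats[cur] - query)
    while True:
        best, best_d = cur, cur_d
        for nb in graph[cur]:
            d = np.linalg.norm(high_feats[nb] - query)
            if d < best_d:
                best, best_d = nb, d
        if best == cur:          # no neighbor is closer: local optimum
            return cur
        cur, cur_d = best, best_d
```

Because only the stored features change (low-dimensional replaced by high-dimensional), the graph topology built cheaply in low dimension is reused unchanged.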
  • the specific working process of the nearest neighbor search device 600 may refer to the corresponding process of the method described in FIG. 1 to FIG. 5 , which will not be repeated here.
  • FIG. 7 is a schematic diagram of a terminal provided in an embodiment of the present application.
  • the terminal 7 may include: a processor 70, a memory 71, and a computer program 72 stored in the memory 71 and operable on the processor 70, such as a nearest neighbor search program.
  • when the processor 70 executes the computer program 72, it implements the steps in the above embodiments of the nearest neighbor search method, for example, steps S101 to S104 shown in FIG. 1.
  • when the processor 70 executes the computer program 72, it also realizes the functions of each module/unit in the above-mentioned device embodiments, such as the feature compression unit 601, the neighbor graph construction unit 602, the feature acquisition unit 603 and the nearest neighbor search unit 604 shown in FIG. 6.
  • the computer program can be divided into one or more modules/units, and the one or more modules/units are stored in the memory 71 and executed by the processor 70 to complete the present application.
  • the one or more modules/units may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program in the terminal.
  • the computer program can be divided into: a feature compression unit, a neighbor graph construction unit, a feature acquisition unit and a nearest neighbor search unit.
  • the functions of each unit are as follows: the feature compression unit is used to input the reference high-dimensional features of the reference object into the feature compression network to obtain the reference low-dimensional features output by the feature compression network, where the loss function of the feature compression network is a function obtained based on the high-dimensional neighbor relationship of the sample objects and the low-dimensional neighbor relationship of the sample objects; the neighbor graph construction unit is used to establish a low-dimensional neighbor graph using the reference low-dimensional features; the feature acquisition unit is used to acquire the target feature of the target object; and the nearest neighbor search unit is configured to perform a nearest neighbor search based on the low-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
  • the terminal may include, but is not limited to, a processor 70 and a memory 71.
  • FIG. 7 is only an example of a terminal and does not constitute a limitation on the terminal; the terminal may include more or fewer components than those shown in the figure, combine certain components, or use different components; for example, the terminal may also include input and output devices, network access devices, buses, and so on.
  • the so-called processor 70 can be a central processing unit (Central Processing Unit, CPU), and can also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the memory 71 may be an internal storage unit of the terminal, such as a hard disk or internal memory of the terminal.
  • the memory 71 can also be an external storage device of the terminal, such as a plug-in hard disk equipped on the terminal, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash memory card (Flash Card) etc.
  • the memory 71 may also include both an internal storage unit of the terminal and an external storage device.
  • the memory 71 is used to store the computer program and other programs and data required by the terminal.
  • the memory 71 can also be used to temporarily store data that has been output or will be output.
  • the disclosed device/terminal and method may be implemented in other ways.
  • the device/terminal embodiments described above are only illustrative.
  • the division of the modules or units is only a logical function division.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • if the integrated module/unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present application can also be completed by instructing relevant hardware through computer programs.
  • the computer program can be stored in a computer-readable storage medium, and when the computer program is executed by the processor, the steps in the above-mentioned method embodiments can be realized.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file or some intermediate form.
  • the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, and a read-only memory (Read-Only Memory, ROM) , random access memory (Random Access Memory, RAM), electric carrier signal, telecommunication signal and software distribution medium, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application is applicable to the technical field of computers, and provides a nearest neighbor search method and apparatus, a terminal, and a storage medium. The nearest neighbor search method specifically comprises: inputting a reference high-dimensional feature of a reference object into a feature compression network to obtain a reference low-dimensional feature output by the feature compression network, a loss function of the feature compression network being a function obtained on the basis of a high-dimensional neighbor relationship of a sample object and a low-dimensional neighbor relationship of the sample object; establishing a low-dimensional neighbor graph by using the reference low-dimensional feature; obtaining a target feature of a target object; and performing nearest neighbor search on the basis of the low-dimensional neighbor graph and the target feature to obtain a reference object nearest to the target object. According to the embodiments of the present application, the construction efficiency of a neighbor graph can be improved while the nearest neighbor search precision is guaranteed.

Description

A nearest neighbor search method, device, terminal and storage medium
This application claims priority to the Chinese patent application with application number 202111227715.7, entitled "A nearest neighbor search method, device, terminal and storage medium", filed with the China Patent Office on October 21, 2021, the entire content of which is incorporated herein by reference.
Technical Field
The present application belongs to the field of computer technology, and in particular relates to a nearest neighbor search method, device, terminal and storage medium.
Background
The main goal of an approximate nearest neighbor search algorithm is to retrieve, under a given similarity measure, the data feature vectors most similar to a given query object from a database containing a large number of data feature vectors. Approximate nearest neighbor search is the basis of information retrieval and is widely used in various search engines and recommendation systems. How to implement approximate nearest neighbor search quickly and accurately under limited hardware cost has long been a research hotspot in the field of information retrieval.
Approximate nearest neighbor search algorithms based on index graphs iteratively approach the query object along the edges of a pre-built relative neighbor graph (RNG). Such an algorithm only needs to compute the similarity between the feature vector of the query object and the data feature vectors along the approach route, which significantly improves retrieval speed. This type of method is the most widely used approximate nearest neighbor search algorithm in recent years and has been deployed in many practical scenarios.
However, to guarantee reliability, such methods require a high-precision RNG to be built in advance. When processing a database with hundreds of millions of feature vectors, building the RNG can take days or even weeks with more than thirty threads. The excessive time required to build the RNG severely limits the applicability of index-graph-based approximate nearest neighbor search algorithms.
Summary
Embodiments of the present application provide a nearest neighbor search method, device, terminal and storage medium, which can improve the efficiency of RNG construction while ensuring the accuracy of nearest neighbor search.
A first aspect of the embodiments of the present application provides a nearest neighbor search method, including:
inputting the reference high-dimensional feature of a reference object into a feature compression network to obtain a reference low-dimensional feature output by the feature compression network, the loss function of the feature compression network being a function obtained based on the high-dimensional neighbor relationship of sample objects and the low-dimensional neighbor relationship of the sample objects;
establishing a low-dimensional neighbor graph by using the reference low-dimensional features;
acquiring the target feature of a target object; and
performing a nearest neighbor search based on the low-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
In some embodiments of the present application, the high-dimensional neighbor relationship is the high-dimensional Euclidean distance between the sample high-dimensional features associated with every two of the multiple sample objects, and the low-dimensional neighbor relationship is the low-dimensional Euclidean distance between the sample low-dimensional features associated with every two of the multiple sample objects; the loss function of the feature compression network is a function based on the error value between each high-dimensional Euclidean distance and its corresponding low-dimensional Euclidean distance, and on the weight value associated with that high-dimensional Euclidean distance, where the value of the weight is related to the magnitude of the high-dimensional Euclidean distance associated with it.
In some embodiments of the present application, the feature compression network includes a compression module, a projection module and a global optimization module, and the compression module includes a first linear mapping module, a second linear mapping module and a feature compression module; inputting the reference high-dimensional feature of the reference object into the feature compression network to obtain the reference low-dimensional feature output by the feature compression network includes: inputting the reference high-dimensional feature into the feature compression module, the first linear mapping module and the projection module to obtain the first feature output by the feature compression module, the second feature output by the first linear mapping module, and at least one third feature output by the projection module, where the dimensions of the first feature, the second feature and each third feature are the same as the dimension of the reference low-dimensional feature; inputting the first feature, the second feature and the at least one third feature into the global optimization module to obtain a fourth feature and at least one fifth feature output by the global optimization module; and inputting the fourth feature and the at least one fifth feature into the compression module to obtain the reference low-dimensional feature output by the compression module.
In some embodiments of the present application, the global optimization module includes at least one encoder; inputting the first feature, the second feature and the at least one third feature into the global optimization module to obtain the fourth feature and the at least one fifth feature output by the global optimization module includes: forming the first feature and the at least one third feature into a first vector, and inputting the first vector into the first of the at least one encoder to obtain a second vector output by the first encoder based on the multi-attention-head mechanism and the linear mapping layer, where the second vector includes a sixth feature corresponding to the first feature and at least one seventh feature corresponding one-to-one to the at least one third feature; adding the sixth feature corresponding to the first feature in the second vector to the second feature to obtain an eighth feature; replacing the sixth feature in the second vector with the eighth feature to obtain a third vector; and inputting the third vector into the second of the at least one encoder, and so on, until the target vector output by the last of the at least one encoder is obtained, the target vector including the fourth feature and the at least one fifth feature.
In some embodiments of the present application, the calculation process of the output value of the multi-attention-head mechanism includes: determining the first input value, the second input value and the third input value of the multi-attention-head mechanism based on the first vector; performing mapping processing on the first input value, the second input value and the third input value respectively, to obtain a first mapping value corresponding to the first input value, a second mapping value corresponding to the second input value, and a third mapping value corresponding to the third input value, where the dimensions of the first mapping value and the second mapping value are lower than the dimension of the third mapping value; using the first mapping value and the second mapping value to calculate combination parameters for the third mapping value; and using the third mapping value and the combination parameters to calculate the output value of the multi-attention-head mechanism.
In some embodiments of the present application, performing the nearest neighbor search based on the low-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object includes: replacing each reference low-dimensional feature in the low-dimensional neighbor graph with its corresponding reference high-dimensional feature to obtain the high-dimensional neighbor graph corresponding to the low-dimensional neighbor graph; and performing the nearest neighbor search based on the high-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
A second aspect of the embodiments of the present application provides a nearest neighbor search device, including:
a feature compression unit, configured to input the reference high-dimensional feature of a reference object into a feature compression network to obtain a reference low-dimensional feature output by the feature compression network, the loss function of the feature compression network being a function obtained based on the high-dimensional neighbor relationship of sample objects and the low-dimensional neighbor relationship of the sample objects;
a neighbor graph construction unit, configured to establish a low-dimensional neighbor graph by using the reference low-dimensional features;
a feature acquisition unit, configured to acquire the target feature of a target object; and
a nearest neighbor search unit, configured to perform a nearest neighbor search based on the low-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
A third aspect of the embodiments of the present application provides a terminal, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the above method when executing the computer program.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements the steps of the above method.
A fifth aspect of the embodiments of the present application provides a computer program product which, when run on a terminal, causes the terminal to implement the steps of the method.
In the embodiments of the present application, the reference high-dimensional feature of a reference object is input into a feature compression network to obtain a reference low-dimensional feature output by the feature compression network, a low-dimensional neighbor graph is established using the reference low-dimensional features, the target feature of a target object is then acquired, and a nearest neighbor search is performed based on the low-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object. On the one hand, since the loss function of the feature compression network is a function obtained based on the high-dimensional neighbor relationship of sample objects and the low-dimensional neighbor relationship of the sample objects, using the trained feature compression network to reduce the dimensionality of the reference high-dimensional features avoids the loss of neighbor relationship information between features caused by directly applying a dimensionality reduction algorithm, thereby improving search accuracy. On the other hand, building the low-dimensional neighbor graph from the dimension-reduced reference low-dimensional features reduces the time consumed in constructing the RNG and improves RNG construction efficiency.
附图说明Description of drawings
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。In order to more clearly illustrate the technical solutions in the embodiments of the present application, the accompanying drawings that need to be used in the descriptions of the embodiments or the prior art will be briefly introduced below. Obviously, the accompanying drawings in the following description are only for the present application For some embodiments, those of ordinary skill in the art can also obtain other drawings based on these drawings without paying creative efforts.
图1是本申请实施例提供的一种最近邻搜索方法的实现流程示意图;FIG. 1 is a schematic diagram of the implementation flow of a nearest neighbor search method provided by an embodiment of the present application;
图2是本申请实施例提供的最近邻搜索示意图;Fig. 2 is a schematic diagram of the nearest neighbor search provided by the embodiment of the present application;
图3是本申请实施例提供的特征压缩网络的结构示意图;Fig. 3 is a schematic structural diagram of a feature compression network provided by an embodiment of the present application;
图4是本申请实施例提供的步骤S101的具体实现流程示意图;FIG. 4 is a schematic diagram of a specific implementation flow of step S101 provided by the embodiment of the present application;
图5是本申请实施例提供的编码器的结构示意图;FIG. 5 is a schematic structural diagram of an encoder provided in an embodiment of the present application;
图6是本申请实施例提供的一种最近邻搜索装置的结构示意图;FIG. 6 is a schematic structural diagram of a nearest neighbor search device provided in an embodiment of the present application;
图7是本申请实施例提供的终端的结构示意图。FIG. 7 is a schematic structural diagram of a terminal provided by an embodiment of the present application.
具体实施方式Detailed ways
为了使本申请的目的、技术方案及优点更加清楚明白，以下结合附图及实施例，对本申请进行进一步详细说明。应当理解，此处所描述的具体实施例仅仅用以解释本申请，并不用于限定本申请。基于本申请的实施例，本领域技术人员在没有做出创造性劳动的前提下所获得的所有其他实施例，都属于本申请保护范围。In order to make the purpose, technical solutions, and advantages of the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it. Based on the embodiments of the present application, all other embodiments obtained by those skilled in the art without creative effort shall fall within the protection scope of the present application.
为了保障可靠性，基于索引图的近似最近邻搜索算法，需要预先构建高精度的RNG，在处理具有数亿特征向量数据的数据库时，三十多个线程构建RNG也需要花费几天甚至几周时间。构建RNG花费时间过长的问题严重限制了基于索引图的近似最近邻搜索算法的适用范围。To ensure reliability, graph-based approximate nearest neighbor search algorithms need to pre-build a high-precision RNG. When processing a database containing hundreds of millions of feature vectors, building the RNG can take days or even weeks even with more than thirty threads. This excessive construction time severely limits the applicability of graph-based approximate nearest neighbor search algorithms.
因此，本申请提出了一种最近邻搜索方法，首先基于样本对象的高维近邻关系和样本对象的低维近邻关系得到的损失函数训练得到特征压缩网络，并利用训练好的特征压缩网络对参考高维特征进行降维，能够避免直接通过降维算法进行降维导致的特征之间相邻关系信息丢失的问题，进而提高搜索精度，同时基于降维后的参考低维特征建立低维近邻图，能够降低构建RNG的时间消耗，提高RNG构建效率。Therefore, the present application proposes a nearest neighbor search method. First, a feature compression network is trained with a loss function derived from the high-dimensional neighbor relationships and the low-dimensional neighbor relationships of the sample objects, and the trained network is used to reduce the dimensionality of the reference high-dimensional features. This avoids the loss of neighbor-relationship information between features that occurs when dimensionality is reduced directly by a dimensionality reduction algorithm, thereby improving search accuracy; meanwhile, building the low-dimensional neighbor graph from the reduced reference low-dimensional features reduces the time consumed in constructing the RNG and improves RNG construction efficiency.
为了说明本申请的技术方案,下面通过具体实施例来进行说明。In order to illustrate the technical solution of the present application, specific examples are used below to illustrate.
图1示出了本申请实施例提供的一种最近邻搜索方法的实现流程示意图，该方法可以应用于终端上，该终端可以为电脑、手机、可穿戴设备、车载设备、增强现实/虚拟现实设备、机顶盒、服务器、卫星无线设备等智能设备，可适用于需在保障最近邻搜索精度的同时，提高RNG的构建效率的情形。Figure 1 shows a schematic flowchart of a nearest neighbor search method provided by an embodiment of the present application. The method can be applied to a terminal, which may be a smart device such as a computer, mobile phone, wearable device, in-vehicle device, augmented reality/virtual reality device, set-top box, server, or satellite wireless device, and is applicable to situations where RNG construction efficiency needs to be improved while ensuring nearest neighbor search accuracy.
具体的,上述最近邻搜索方法可以包括以下步骤S101至步骤S104。Specifically, the above nearest neighbor search method may include the following steps S101 to S104.
步骤S101,将参考对象的参考高维特征输入特征压缩网络中,得到由特征压缩网络输出的参考低维特征。Step S101, input the reference high-dimensional features of the reference object into the feature compression network to obtain the reference low-dimensional features output by the feature compression network.
其中,参考对象是指数据库中用于构建RNG的对象,参考对象的类型可以根据实际情况进行调整,一般可以为图像等。Wherein, the reference object refers to the object used to construct the RNG in the database, and the type of the reference object can be adjusted according to the actual situation, and generally can be an image or the like.
在本申请的实施方式中，终端可以通过特征提取算法提取参考对象的参考高维特征 x ∈ R^{d_in}，并通过训练好的特征压缩网络，将参考高维特征压缩为参考低维向量 f(x) ∈ R^{d_out}。In the embodiments of the present application, the terminal can extract the reference high-dimensional feature x ∈ R^{d_in} of the reference object through a feature extraction algorithm, and compress it into the reference low-dimensional vector f(x) ∈ R^{d_out} through the trained feature compression network.
具体的,特征提取算法和特征压缩网络的网络结构可以根据实际情况进行设置。而特征压缩网络的损失函数为基于样本对象的高维近邻关系和样本对象的低维近邻关系得到的函数。Specifically, the network structure of the feature extraction algorithm and the feature compression network can be set according to the actual situation. The loss function of the feature compression network is a function based on the high-dimensional neighbor relationship of the sample object and the low-dimensional neighbor relationship of the sample object.
其中，样本对象是用于训练特征压缩网络的对象。高维近邻关系是指每两个样本对象分别关联的样本高维特征之间的近邻关系，低维近邻关系是指每两个样本对象分别关联的样本低维特征之间的近邻关系。The sample objects are the objects used to train the feature compression network. The high-dimensional neighbor relationship refers to the neighbor relationship between the sample high-dimensional features respectively associated with each pair of sample objects, and the low-dimensional neighbor relationship refers to the neighbor relationship between the sample low-dimensional features respectively associated with each pair of sample objects.
也就是说,本申请可以利用与样本对象的近邻关系相关的损失函数,对待训练的特征压缩网络进行训练,直至特征压缩网络收敛,得到训练好的特征压缩网络。That is to say, the present application can use the loss function related to the neighbor relationship of the sample object to train the feature compression network to be trained until the feature compression network converges to obtain a trained feature compression network.
需要说明的是,本申请对模型训练所使用的算法不进行限制,例如可以采用梯度下降算法实现。It should be noted that this application does not limit the algorithm used for model training, for example, it can be implemented by using a gradient descent algorithm.
常规的损失函数一般是基于样本高维特征与样本低维特征之间的误差建立的函数，而在本申请的实施方式中，基于高维近邻关系和低维近邻关系构建损失函数，相较于通过常规的降维算法进行降维，或者通过利用常规的损失函数训练得到的特征压缩网络进行降维，可以使样本低维特征保持一定的近邻关系信息，又由于最近邻搜索算法的搜索过程是利用特征间的近邻关系实现的，因此，采用本申请提供的方法提高了压缩过程中近邻关系的完整性，进而提高了搜索的精度。A conventional loss function is generally built on the error between the sample high-dimensional features and the sample low-dimensional features. In the embodiments of the present application, the loss function is instead constructed from the high-dimensional and low-dimensional neighbor relationships. Compared with dimensionality reduction through a conventional algorithm, or through a feature compression network trained with a conventional loss function, this allows the sample low-dimensional features to retain neighbor-relationship information. Since the search process of a nearest neighbor search algorithm relies on the neighbor relationships between features, the method provided by the present application preserves the integrity of the neighbor relationships during compression and thereby improves search accuracy.
需要说明的是，本申请对特征压缩的比例 d_in/d_out 不进行限制，实际应用中特征压缩的比例 d_in/d_out 可以为2、4、8等。It should be noted that the present application does not limit the feature compression ratio d_in/d_out; in practical applications, the ratio d_in/d_out can be 2, 4, 8, and so on.
步骤S102,利用参考低维特征建立低维近邻图。Step S102, using the reference low-dimensional features to establish a low-dimensional neighbor graph.
在本申请的实施方式中,可以采用HNSW(Hierarchical Navigable Small World)算法、NSG(Navigating Spreading-out Graph)算法等用于构建RNG的算法,利用参考低维特征建立低维近邻图。低维近邻图中记录有参考低维特征之间的近邻关系。In the implementation of the present application, algorithms such as HNSW (Hierarchical Navigable Small World) algorithm and NSG (Navigating Spreading-out Graph) algorithm can be used to construct RNG, and low-dimensional neighbor graphs can be established by using reference low-dimensional features. The low-dimensional neighbor graph records the neighbor relationship between the reference low-dimensional features.
由于构建RNG的过程中需要计算特征之间的相似度，则在建立低维近邻图的过程中，只需计算参考低维特征之间的相似度，而不需要计算参考高维特征之间的相似度，因此相似度计算的速度更快，能够降低构建RNG的时间消耗，提高RNG的构建效率。Since constructing an RNG requires computing the similarity between features, only the similarity between the reference low-dimensional features needs to be computed when building the low-dimensional neighbor graph, rather than the similarity between the reference high-dimensional features. The similarity computation is therefore faster, which reduces the time consumed in constructing the RNG and improves RNG construction efficiency.
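As an illustrative sketch only (not part of the original disclosure; the dimensions, random data, and brute-force construction are all assumptions), the effect of step S102 can be imitated by building a simple k-nearest-neighbor graph over the compressed features — the point being that the pairwise distance computation runs over d_out-dimensional vectors rather than d_in-dimensional ones:

```python
import numpy as np

def knn_graph(feats: np.ndarray, k: int) -> np.ndarray:
    """Return an (n, k) array of neighbor indices under Euclidean distance."""
    # Squared pairwise distances via ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = (feats ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * feats @ feats.T
    np.fill_diagonal(d2, np.inf)          # exclude self-loops
    return np.argsort(d2, axis=1)[:, :k]  # k nearest neighbors per node

rng = np.random.default_rng(0)
low_dim = rng.standard_normal((100, 32)).astype(np.float32)  # stand-in for f(x)
graph = knn_graph(low_dim, k=10)
```

Real systems would use an RNG-style construction such as HNSW or NSG rather than this O(n²) brute force; the sketch only shows where the dimensionality reduction saves distance computations.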
步骤S103,获取目标对象的目标特征。Step S103, acquiring target features of the target object.
具体的,目标对象是指待查询对象,其类型与参考对象和样本对象的类型相同。上述目标特征可以指目标对象的目标高维特征或目标低维特征。Specifically, the target object refers to the object to be queried, and its type is the same as that of the reference object and the sample object. The above-mentioned target features may refer to target high-dimensional features or target low-dimensional features of the target object.
步骤S104,基于低维近邻图与目标特征进行最近邻搜索,得到与目标对象距离最近的参考对象。In step S104, the nearest neighbor search is performed based on the low-dimensional neighbor graph and the target feature to obtain the reference object with the closest distance to the target object.
在本申请的一些实施方式中，终端可以对目标对象进行特征提取，得到目标对象的目标高维特征，然后，将目标对象的目标高维特征输入前述特征压缩网络中，得到由前述特征压缩网络输出的目标低维特征，并将得到的目标低维特征作为目标对象的目标特征。利用低维近邻图中记录的参考低维特征之间的近邻关系进行最近邻搜索，可以搜索出低维近邻图中与目标特征欧式距离最近的一个或多个参考低维特征，并将与搜索出的一个或多个参考低维特征分别关联的参考对象作为与目标对象距离最近的参考对象。In some embodiments of the present application, the terminal can perform feature extraction on the target object to obtain its target high-dimensional feature, input that feature into the aforementioned feature compression network to obtain the target low-dimensional feature output by the network, and use the target low-dimensional feature as the target feature of the target object. Using the neighbor relationships between the reference low-dimensional features recorded in the low-dimensional neighbor graph, the nearest neighbor search can find the one or more reference low-dimensional features in the graph with the smallest Euclidean distance to the target feature, and the reference objects respectively associated with those features are taken as the reference objects closest to the target object.
在本申请的另一些实施方式中，如图2所示，终端还可以将低维近邻图的每个参考低维特征替换为与其对应的参考高维特征，得到低维近邻图对应的高维近邻图，并基于高维近邻图与目标特征进行最近邻搜索，得到与目标对象距离最近的参考对象。In other embodiments of the present application, as shown in FIG. 2, the terminal may also replace each reference low-dimensional feature in the low-dimensional neighbor graph with its corresponding reference high-dimensional feature to obtain the high-dimensional neighbor graph corresponding to the low-dimensional neighbor graph, and perform the nearest neighbor search based on the high-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
具体的，终端可以对目标对象进行特征提取，得到目标对象的目标高维特征，并将得到的目标高维特征作为目标对象的目标特征。利用高维近邻图中记录的参考高维特征之间的近邻关系进行最近邻搜索，可以搜索出高维近邻图中与目标特征欧式距离最近的一个或多个参考高维特征，并将与搜索出的一个或多个参考高维特征分别关联的参考对象作为与目标对象距离最近的参考对象。Specifically, the terminal can perform feature extraction on the target object to obtain its target high-dimensional feature and use it as the target feature. Using the neighbor relationships between the reference high-dimensional features recorded in the high-dimensional neighbor graph, the nearest neighbor search can find the one or more reference high-dimensional features in the graph with the smallest Euclidean distance to the target feature, and the reference objects respectively associated with those features are taken as the reference objects closest to the target object.
在本申请的实施方式中，通过将低维近邻图的每个参考低维特征替换为与其对应的参考高维特征，得到低维近邻图对应的高维近邻图，并使用高维近邻图进行最近邻搜索，使得终端在进行最近邻搜索时可以使用参考高维特征和目标高维特征之间的欧氏距离，与使用参考低维特征和目标低维特征之间的欧氏距离相比，能够提高搜索的精度。In the embodiments of the present application, by replacing each reference low-dimensional feature of the low-dimensional neighbor graph with its corresponding reference high-dimensional feature, the high-dimensional neighbor graph corresponding to the low-dimensional neighbor graph is obtained and used for the nearest neighbor search. The terminal can then use the Euclidean distance between the reference high-dimensional features and the target high-dimensional feature, which improves search accuracy compared with using the Euclidean distance between the reference low-dimensional features and the target low-dimensional feature.
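A minimal sketch of this idea (assumptions: toy random data, a complete graph standing in for a real RNG, and a simple best-first walk standing in for the actual search routine): the same graph topology can be traversed with either feature set, and here it is traversed with the high-dimensional features, as in the replacement described above.

```python
import numpy as np

def greedy_search(graph: np.ndarray, feats: np.ndarray, query: np.ndarray, start: int = 0):
    """Best-first walk on the neighbor graph: repeatedly hop to the neighbor
    closest to the query; stop when no neighbor improves the distance."""
    cur = start
    cur_d = float(np.linalg.norm(feats[cur] - query))
    while True:
        cand = graph[cur]
        d = np.linalg.norm(feats[cand] - query, axis=1)
        j = int(np.argmin(d))
        if d[j] >= cur_d:
            return cur, cur_d
        cur, cur_d = int(cand[j]), float(d[j])

# Toy data: the graph is built once, then traversed with either feature set.
rng = np.random.default_rng(1)
high = rng.standard_normal((50, 128)).astype(np.float32)   # reference high-dim features
low = high[:, :16].copy()                                  # stand-in for compressed f(x)
# complete graph for the sketch (real systems keep only k edges per node)
graph = np.array([[j for j in range(50) if j != i] for i in range(50)])
q = high[7] + 0.01                                          # query near reference object 7
node, dist = greedy_search(graph, high, q)                  # high-dim distances (as above)
node_low, _ = greedy_search(graph, low, q[:16])             # same walk on low-dim features
```

The design point the sketch illustrates: the graph edges come from the cheap low-dimensional construction, while the distances used during search can come from the original high-dimensional features.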
以目标对象为目标图像为例，当需要进行目标图像的场景识别时，可以提取目标图像的目标特征，再在高维近邻图中搜索出与目标特征欧氏距离最近的一个或多个参考高维特征，并将搜索出的一个或多个参考高维特征分别关联的参考图像作为与目标图像距离最近的参考图像，进而可以将目标图像所属的场景确定为与目标图像距离最近的参考图像所属的场景。Taking a target image as the target object as an example, when scene recognition of the target image is required, the target feature of the target image can be extracted, the one or more reference high-dimensional features with the smallest Euclidean distance to the target feature can be found in the high-dimensional neighbor graph, the reference images respectively associated with those features are taken as the reference images closest to the target image, and the scene to which the target image belongs can then be determined as the scene of the closest reference image.
本申请的实施方式中，通过将参考对象的参考高维特征输入特征压缩网络中，得到由特征压缩网络输出的参考低维特征，并利用参考低维特征建立低维近邻图，然后，获取目标对象的目标特征，并基于低维近邻图与目标特征进行最近邻搜索，得到与目标对象距离最近的参考对象，一方面，由于特征压缩网络的损失函数为基于样本对象的高维近邻关系和所述样本对象的低维近邻关系得到的函数，利用训练好的特征压缩网络对参考高维特征进行降维，能够避免直接通过降维算法进行降维导致的特征之间相邻关系信息丢失的问题，进而提高搜索精度，另一方面，基于降维后的参考低维特征建立低维近邻图，能够降低构建RNG的时间消耗，提高RNG构建效率。In the embodiments of the present application, the reference high-dimensional feature of a reference object is input into the feature compression network to obtain the reference low-dimensional feature output by the network, a low-dimensional neighbor graph is built from the reference low-dimensional features, and then the target feature of the target object is acquired and a nearest neighbor search is performed based on the low-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object. On the one hand, since the loss function of the feature compression network is derived from the high-dimensional neighbor relationships and the low-dimensional neighbor relationships of the sample objects, using the trained feature compression network to reduce the dimensionality of the reference high-dimensional features avoids the loss of neighbor-relationship information between features that occurs when dimensionality is reduced directly by a dimensionality reduction algorithm, thereby improving search accuracy. On the other hand, building the low-dimensional neighbor graph from the reduced reference low-dimensional features reduces the time consumed in constructing the RNG and improves RNG construction efficiency.
在本申请的一些实施方式中,终端可以构建图3所示的特征压缩网络。具体的,特征压缩网络中可以包括压缩模块31、投影模块32以及全局优化模块33。压缩模块31具体包括第一线性映射模块311、第二线性映射模块312以及特征压缩模块313。In some embodiments of the present application, the terminal may construct the feature compression network shown in FIG. 3 . Specifically, the feature compression network may include a compression module 31 , a projection module 32 and a global optimization module 33 . The compression module 31 specifically includes a first linear mapping module 311 , a second linear mapping module 312 and a feature compression module 313 .
相应的,上述步骤S101可以具体包括以下步骤S401至步骤S402。Correspondingly, the above step S101 may specifically include the following steps S401 to S402.
步骤S401，将参考高维特征输入至特征压缩模块、第一线性映射模块和投影模块，得到由特征压缩模块输出的第一特征、由第一线性映射模块输出的第二特征，以及由投影模块输出的至少一个第三特征。Step S401: input the reference high-dimensional feature into the feature compression module, the first linear mapping module, and the projection module to obtain the first feature output by the feature compression module, the second feature output by the first linear mapping module, and at least one third feature output by the projection module.
具体的，特征压缩模块313可以包含一个线性映射函数 f(x) = W_c·x（参数 W_c ∈ R^{d_out×d_in}）、一个激活函数Hardswish，以及一个批量归一化（Batch Normalization，BN）层。终端将参考高维特征 x ∈ R^{d_in} 输入至特征压缩模块313之后可以得到维度为 d_out 的第一特征。Specifically, the feature compression module 313 may include a linear mapping function f(x) = W_c·x (with parameter W_c ∈ R^{d_out×d_in}), a Hardswish activation function, and a batch normalization (BN) layer. After the terminal inputs the reference high-dimensional feature x ∈ R^{d_in} into the feature compression module 313, the first feature of dimension d_out is obtained.
第一线性映射模块311可以包含一个线性映射函数 cp(x)，其映射参数同样属于 R^{d_out×d_in}。同样的，终端将参考高维特征 x 输入至第一线性映射模块311之后可以得到维度为 d_out 的第二特征cp(x)。The first linear mapping module 311 may include a linear mapping function cp(x) whose mapping parameter likewise lies in R^{d_out×d_in}. Similarly, after the terminal inputs the reference high-dimensional feature x into the first linear mapping module 311, the second feature cp(x) of dimension d_out is obtained.
投影模块32包含n个压缩投影函数。将参考高维特征 x 输入投影模块，可以得到一个序列的第三特征 {p_1(x), p_2(x), …, p_n(x)}。其中，第三特征 p_i(x) = w_i·x，维度为 d_out，参数 w_i ∈ R^{d_out×d_in} 可以采用稀疏随机投影的方式进行初始化。The projection module 32 contains n compressed projection functions. Inputting the reference high-dimensional feature x into the projection module yields a sequence of third features {p_1(x), p_2(x), …, p_n(x)}, where each third feature p_i(x) = w_i·x has dimension d_out, and the parameters w_i ∈ R^{d_out×d_in} can be initialized by sparse random projection.
可见，第一特征、第二特征和每个第三特征的维度与参考低维特征的维度相同，均为 d_out。It can be seen that the first feature, the second feature, and each third feature have the same dimension as the reference low-dimensional feature, namely d_out.
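To make the three parallel branches concrete, here is a hedged NumPy sketch (not the patented implementation: the weight scales, the 10% sparsity of the random projections, and the inference-style BN without learned scale/shift are all assumptions) of how a batch of reference high-dimensional features could produce the first, second, and third features, all of dimension d_out:

```python
import numpy as np

def hardswish(x):
    # Hardswish(x) = x * clip(x + 3, 0, 6) / 6
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0

def batch_norm(x, eps=1e-5):
    # inference-style normalization over the batch; learned scale/shift omitted
    return (x - x.mean(0)) / np.sqrt(x.var(0) + eps)

d_in, d_out, n_proj = 128, 32, 4          # assumed sizes; compression ratio d_in/d_out = 4
rng = np.random.default_rng(0)
W_c = rng.standard_normal((d_out, d_in)) * 0.05   # feature compression module parameter
W_cp = rng.standard_normal((d_out, d_in)) * 0.05  # first linear mapping module parameter
# sparse random initialization for the n compressed projection functions w_i
W_p = rng.standard_normal((n_proj, d_out, d_in)) * (rng.random((n_proj, d_out, d_in)) < 0.1)

x = rng.standard_normal((8, d_in))                # a batch of high-dimensional features
first = batch_norm(hardswish(x @ W_c.T))          # first feature: linear map + Hardswish + BN
second = x @ W_cp.T                               # second feature cp(x)
thirds = [x @ W_p[i].T for i in range(n_proj)]    # third features p_i(x) = w_i x
```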
步骤S402,将第一特征、第二特征和至少一个第三特征输入至全局优化模块,得到由全局优化模块输出的第四特征和至少一个第五特征。Step S402, inputting the first feature, the second feature and at least one third feature to the global optimization module to obtain a fourth feature and at least one fifth feature output by the global optimization module.
具体的，全局优化模块33可以包括至少一个编码器。上述终端可以合并特征压缩模块313输出的第一特征，以及投影模块32输出的至少一个第三特征，得到第一向量。接着，将第一向量输入至少一个编码器中的第一个编码器331，得到由第一个编码器331基于多注意力头机制和线性映射层输出的第二向量。其中，第二向量包括与第一特征对应的第六特征，以及与至少一个第三特征 {p_1(x), …, p_n(x)} 一一对应的至少一个第七特征。Specifically, the global optimization module 33 may include at least one encoder. The terminal may merge the first feature output by the feature compression module 313 and the at least one third feature output by the projection module 32 to obtain a first vector. Next, the first vector is input to the first encoder 331 of the at least one encoder to obtain a second vector output by the first encoder 331 based on the multi-head attention mechanism and a linear mapping layer. The second vector includes a sixth feature corresponding to the first feature, and at least one seventh feature in one-to-one correspondence with the at least one third feature {p_1(x), …, p_n(x)}.
然后，将第二向量中与第一特征对应的第六特征与第一线性映射模块311输出的第二特征cp(x)相加，得到第八特征，并将第二向量中的第六特征替换为第八特征，得到第三向量。再将第三向量输入至至少一个编码器中的第二个编码器，以此类推，直至得到由至少一个编码器的最后一个编码器33n输出的目标向量。其中，目标向量包括第四特征和至少一个第五特征。Then, the sixth feature in the second vector corresponding to the first feature is added to the second feature cp(x) output by the first linear mapping module 311 to obtain an eighth feature, and the sixth feature in the second vector is replaced with the eighth feature to obtain a third vector. The third vector is then input to the second encoder of the at least one encoder, and so on, until the target vector output by the last encoder 33n of the at least one encoder is obtained. The target vector includes a fourth feature and at least one fifth feature.
图5示出了本申请单个编码器的结构。单个编码器可以包括多头注意力机制模块以及编码器映射模块。Fig. 5 shows the structure of a single encoder of the present application. A single encoder can include a multi-head attention module as well as an encoder mapping module.
在本申请的一些实施方式中，上述多头注意力机制模块采用多注意力头机制。终端计算多注意力头机制的输出值的过程可以具体包括：基于第一向量确定多注意力头机制的第一输入值Q、第二输入值K和第三输入值V；对第一输入值Q、第二输入值K和第三输入值V分别进行映射处理，得到第一输入值Q对应的第一映射值、第二输入值K对应的第二映射值和第三输入值V对应的第三映射值；利用第一映射值和第二映射值，计算第三映射值的合并参数；利用第三映射值和合并参数计算多注意力头机制的输出值head_i(Q,K,V)。In some embodiments of the present application, the multi-head attention module adopts a multi-head attention mechanism. The process by which the terminal computes the output value of the multi-head attention mechanism may include: determining, based on the first vector, a first input value Q, a second input value K, and a third input value V; mapping Q, K, and V respectively to obtain a first mapped value corresponding to Q, a second mapped value corresponding to K, and a third mapped value corresponding to V; computing a merging parameter for the third mapped value from the first and second mapped values; and computing the output value head_i(Q, K, V) of the multi-head attention mechanism from the third mapped value and the merging parameter.
其中,第一映射值的维度和第二映射值的维度均低于第三映射值的维度。Wherein, the dimension of the first mapping value and the dimension of the second mapping value are both lower than the dimension of the third mapping value.
具体的，上述对第一输入值Q的映射处理可以采用线性映射的方式，得到第一映射值 Q·W^Q，其中映射参数 W^Q 的输出维度由膨胀系数e和预设的注意力头数量h_n决定。同样的，对第二输入值K的映射处理也可以采用线性映射的方式，得到第二映射值 K·W^K；对第三输入值V的映射处理可以采用线性映射的方式，得到第三映射值 V·W^V。可见，经过线性映射之后，第一映射值的维度和第二映射值的维度均低于第三映射值的维度。Specifically, the mapping of the first input value Q may be a linear mapping yielding the first mapped value Q·W^Q, where the output dimension of the mapping parameter W^Q is determined by an expansion coefficient e and the preset number of attention heads h_n. Similarly, the mapping of the second input value K may be a linear mapping yielding the second mapped value K·W^K, and the mapping of the third input value V may be a linear mapping yielding the third mapped value V·W^V. It can be seen that, after the linear mapping, the dimensions of the first and second mapped values are both lower than the dimension of the third mapped value.
在一些实施方式中，终端可以将第一映射值和第二映射值的乘积除以预设的特征维度值的开方值，并将得到的商 (Q·W^Q)(K·W^K)^T / √d 输入至softmax函数中，进而将softmax函数的输出值作为上述第三映射值的合并参数。In some embodiments, the terminal may divide the product of the first mapped value and the second mapped value by the square root of a preset feature dimension value d, input the resulting quotient (Q·W^Q)(K·W^K)^T / √d into a softmax function, and take the output of the softmax function as the merging parameter for the third mapped value.
也即，多注意力头机制的输出值 head_i(Q,K,V) = softmax((Q·W^Q)(K·W^K)^T / √d)·(V·W^V)。That is, the output value of the multi-head attention mechanism is head_i(Q,K,V) = softmax((Q·W^Q)(K·W^K)^T / √d)·(V·W^V).
此时终端便可以将每个注意力头的输出通过映射函数合并为一个向量。At this point, the terminal can merge the outputs of the attention heads into a single vector through a mapping function.
编码器映射模块可以包含Linear_BN层和Linear_ABN层，其中，Linear_BN层包含一个线性映射层和一个BN层，Linear_ABN层依次包含一个线性映射层、一个激活函数及一个BN层。编码器映射模块用于将多注意力头机制的输出值映射为编码器输出的特征。The encoder mapping module may include a Linear_BN layer and a Linear_ABN layer, where the Linear_BN layer contains a linear mapping layer and a BN layer, and the Linear_ABN layer contains, in sequence, a linear mapping layer, an activation function, and a BN layer. The encoder mapping module is used to map the output value of the multi-head attention mechanism to the features output by the encoder.
常规的多注意力头机制一般要求Q、K、V的维度均相同，而在本申请的实施方式中，由于Q和K仅用于提供如何合并V的信息，因此，终端可以对Q和K进行降维，并利用降维后的Q、降维后的K以及原始维度的V计算多注意力头机制的输出值，使得在训练时softmax值的计算量降低，提高了编码器的计算速度，进而提高了模型的训练效率。A conventional multi-head attention mechanism generally requires Q, K, and V to have the same dimension. In the embodiments of the present application, since Q and K are only used to provide the information on how to merge V, the terminal can reduce the dimensionality of Q and K and compute the output value of the multi-head attention mechanism from the reduced Q, the reduced K, and V at its original dimension. This reduces the amount of computation for the softmax values during training, increases the computation speed of the encoder, and thereby improves the training efficiency of the model.
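The asymmetric projection described above (Q and K reduced, V kept at full width) can be sketched as a single attention head in NumPy; the sizes below are assumptions for illustration, not the patented encoder:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def head(Q, K, V, Wq, Wk, Wv, d_scale):
    """One attention head where Q/K are projected to a lower dimension than V."""
    Qs, Ks, Vs = Q @ Wq, K @ Wk, V @ Wv        # Qs/Ks: (n, d_qk), Vs: (n, d_v)
    A = softmax(Qs @ Ks.T / np.sqrt(d_scale))  # merging parameter from low-dim Q/K only
    return A @ Vs                              # V keeps the larger dimension

n, d, d_qk, d_v = 5, 32, 8, 32                 # assumed sizes, with d_qk < d_v
rng = np.random.default_rng(0)
X = rng.standard_normal((n, d))                # token sequence (the "first vector")
Wq, Wk = rng.standard_normal((d, d_qk)), rng.standard_normal((d, d_qk))
Wv = rng.standard_normal((d, d_v))
out = head(X, X, X, Wq, Wk, Wv, d_scale=d_qk)
```

The softmax matrix `A` is n×n regardless of the projection widths, but the matmuls feeding it shrink from n·n·d_v to n·n·d_qk work, which is the saving the paragraph above describes.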
步骤S403,将第四特征和至少一个第五特征输入至压缩模块,得到由压缩模块输出的参考低维特征。Step S403, input the fourth feature and at least one fifth feature into the compression module to obtain reference low-dimensional features output by the compression module.
具体的，终端可以将第四特征和至少一个第五特征相加，并将得到的特征输入到压缩模块31的第二线性映射模块312中，得到参考低维特征 f(x) ∈ R^{d_out}。其中，第二线性映射模块312可以包含一个线性映射函数及其可学习的映射参数。Specifically, the terminal may add the fourth feature and the at least one fifth feature, and input the resulting feature into the second linear mapping module 312 of the compression module 31 to obtain the reference low-dimensional feature f(x) ∈ R^{d_out}. The second linear mapping module 312 may include a linear mapping function with learnable mapping parameters.
在本申请的实施方式中，通过将编码器输出的第六特征与第一线性映射模块输出的第二特征相加，得到第八特征，再将第八特征作为下一个编码器的输入值，使得编码器迭代的过程中每个编码器的输入值不会过分偏离实际的特征值，提高了特征压缩网络的收敛速度。In the embodiments of the present application, the eighth feature is obtained by adding the sixth feature output by the encoder to the second feature output by the first linear mapping module, and the eighth feature is then used as the input value of the next encoder, so that during encoder iteration the input value of each encoder does not deviate excessively from the actual feature values, which improves the convergence speed of the feature compression network.
在完成对特征压缩网络的构建之后,终端可以利用基于样本对象的高维近邻关系和样本对象的低维近邻关系得到的损失函数,对特征压缩网络进行训练。After completing the construction of the feature compression network, the terminal can use the loss function obtained based on the high-dimensional neighbor relationship of the sample object and the low-dimensional neighbor relationship of the sample object to train the feature compression network.
其中，上述高维近邻关系可以为多个样本对象中每两个样本对象分别关联的样本高维特征之间的高维欧氏距离，上述低维近邻关系为多个样本对象中每两个样本对象分别关联的样本低维特征之间的低维欧氏距离。The high-dimensional neighbor relationship may be the high-dimensional Euclidean distance between the sample high-dimensional features respectively associated with each pair of the multiple sample objects, and the low-dimensional neighbor relationship is the low-dimensional Euclidean distance between the sample low-dimensional features respectively associated with each pair of the multiple sample objects.
相应的,特征压缩网络的损失函数为基于高维欧氏距离和与高维欧氏距离对应的低维欧氏距离之间的误差值,以及与高维欧氏距离关联的权重值得到的函数。Correspondingly, the loss function of the feature compression network is a function based on the error value between the high-dimensional Euclidean distance and the low-dimensional Euclidean distance corresponding to the high-dimensional Euclidean distance, and the weight value associated with the high-dimensional Euclidean distance .
在本申请的一些实施方式中,权重值的取值和与其关联的高维欧氏距离的大小相关。In some embodiments of the present application, the value of the weight value is related to the magnitude of the associated high-dimensional Euclidean distance.
具体的，终端在计算上述损失函数的损失值loss的过程可以具体包括：计算高维欧氏距离 ||x_i−x_j||_2 和与其对应的低维欧氏距离 ||f(x_i)−f(x_j)||_2 之间的误差值 e_ij；计算与高维欧氏距离关联的权重值 ω_ij，并利用每个权重值对每个误差值进行加权相加，得到累加值 Σ_{i=1}^m Σ_{j=1}^m ω_ij·e_ij；将累加值和样本对象的总数量的平方值 m² 相除，得到损失函数的损失值loss。也即，损失函数的损失值 loss = (1/m²)·Σ_{i=1}^m Σ_{j=1}^m ω_ij·e_ij。Specifically, the process of computing the loss value of the above loss function may include: computing the error value e_ij between the high-dimensional Euclidean distance ||x_i−x_j||_2 and the corresponding low-dimensional Euclidean distance ||f(x_i)−f(x_j)||_2; computing the weight value ω_ij associated with each high-dimensional Euclidean distance, and weighting and summing the error values to obtain the accumulated value Σ_{i=1}^m Σ_{j=1}^m ω_ij·e_ij; and dividing the accumulated value by m², the square of the total number of sample objects, to obtain the loss value, i.e. loss = (1/m²)·Σ_{i=1}^m Σ_{j=1}^m ω_ij·e_ij.
其中，m表示样本对象的总数量，||x_i−x_j||_2表示高维欧氏距离，x_i表示与第i个样本对象关联的样本高维特征，x_j表示与第j个样本对象关联的样本高维特征，||f(x_i)−f(x_j)||_2表示低维欧氏距离，f(x_i)表示与第i个样本对象关联的样本低维特征，f(x_j)表示与第j个样本对象关联的样本低维特征，ω_ij表示与高维欧氏距离关联的权重值。Here, m denotes the total number of sample objects, ||x_i−x_j||_2 denotes the high-dimensional Euclidean distance, x_i and x_j denote the sample high-dimensional features associated with the i-th and j-th sample objects, ||f(x_i)−f(x_j)||_2 denotes the low-dimensional Euclidean distance, f(x_i) and f(x_j) denote the sample low-dimensional features associated with the i-th and j-th sample objects, and ω_ij denotes the weight value associated with the high-dimensional Euclidean distance.
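A hedged NumPy sketch of this loss follows. Two assumptions are made beyond the text: the per-pair error e_ij is taken as the squared difference of the two distances (the disclosure only specifies a weighted error between them), and the weight rule uses the clipping ω_ij = min(α, max(β, −ln(d_ij/boundary))) described in the embodiments, with the default α = 2 and β = 0.01:

```python
import numpy as np

def compression_loss(x, fx, alpha=2.0, beta=0.01):
    """Neighbor-preserving loss: weighted error between high-dim and low-dim
    pairwise Euclidean distances; squared-error form is an assumption."""
    m = x.shape[0]
    d_hi = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)    # ||x_i - x_j||
    d_lo = np.linalg.norm(fx[:, None, :] - fx[None, :, :], axis=-1)  # ||f(x_i) - f(x_j)||
    boundary = d_hi.mean()                                           # mean high-dim distance
    with np.errstate(divide="ignore"):                               # d_ii = 0 -> weight clips to alpha
        w = np.minimum(alpha, np.maximum(beta, -np.log(d_hi / boundary)))
    return (w * (d_hi - d_lo) ** 2).sum() / m**2

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 64))                     # sample high-dim features
loss_perfect = compression_loss(x, x.copy())          # distances preserved exactly
loss_bad = compression_loss(x, rng.standard_normal((16, 8)))  # random "compression"
```

A compression that preserves all pairwise distances drives the loss to zero, while an arbitrary mapping does not, which is what gradient-based training of the feature compression network exploits.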
In some embodiments of the present application, the process by which the terminal calculates the above weight value ω_ij may include: obtaining a first hyperparameter α and a second hyperparameter β, and calculating the average value boundary of the high-dimensional Euclidean distances; then calculating the negative of the natural logarithm of the quotient of the high-dimensional Euclidean distance and the average value, −ln(d_ij / boundary); and determining the maximum of this value and the second hyperparameter, and taking the minimum of the first hyperparameter and that maximum as the weight value associated with the high-dimensional Euclidean distance.
That is, the weight value associated with the high-dimensional Euclidean distance is

ω_ij = min(α, max(−ln(d_ij / boundary), β))

where d_ij = ||x_i - x_j||_2 denotes the high-dimensional Euclidean distance and boundary denotes the average of the high-dimensional Euclidean distances.
Here, the first hyperparameter α is greater than the second hyperparameter β. The specific values of α and β can be set according to the actual situation; in practical applications, α may be set to 2 and β may be set to 0.01.
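Under the same caveat (an illustrative sketch, not the patent's reference implementation), the weight formula with the example values α = 2 and β = 0.01 can be sketched as:

```python
import numpy as np

def neighbor_weights(d_high, alpha=2.0, beta=0.01):
    """omega_ij = min(alpha, max(-ln(d_ij / boundary), beta)),
    where boundary is the mean of the high-dimensional distances."""
    boundary = np.mean(d_high)
    with np.errstate(divide="ignore"):       # d_ij = 0 gives -ln(0) = +inf
        neg_log = -np.log(d_high / boundary)
    # min(alpha, max(x, beta)) is exactly a clip to [beta, alpha]
    return np.clip(neg_log, beta, alpha)
```

The clip realizes the nested min/max of the formula: zero or very small distances saturate at α, and distances well above the average saturate at β.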
From the above formula, it can be seen that when the high-dimensional Euclidean distance d_ij is small, the associated weight value ω_ij will be α or −ln(d_ij / boundary), and in either case the weight value ω_ij is greater than β; when the high-dimensional Euclidean distance d_ij is large, the associated weight value ω_ij will be β. Consequently, in the loss function, the weights corresponding to pairs of sample high-dimensional features whose high-dimensional Euclidean distance is small are higher.
In the embodiments of the present application, training the feature compression network with the above weight formula and loss function makes the correspondence between high-dimensional and low-dimensional neighbor relationships more accurate for features whose high-dimensional Euclidean distance is small; that is, the smaller the high-dimensional Euclidean distance between two reference high-dimensional features, the more complete the neighbor relationship information between the two reference low-dimensional features obtained after inputting them into the trained feature compression network. Since the purpose of the nearest neighbor search algorithm is precisely to find the features closest to the target feature, this approach makes the adjacency information between nearby features more complete, thereby further improving search accuracy.
In other embodiments, the terminal may also quantize the low-dimensional features output by the feature compression network into low-dimensional integer vectors by scalar quantization, and use the quantized low-dimensional integer vectors as the reference low-dimensional features to construct the RNG.
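A minimal sketch of the scalar-quantization step follows. The exact quantizer is not specified in the text, so per-dimension uniform quantization to 8-bit integer codes is assumed here, and all names are illustrative:

```python
import numpy as np

def scalar_quantize(low_feats, n_bits=8):
    """Uniformly quantize float features to integer codes per dimension.
    Returns the integer vectors plus the (mins, scale) needed to dequantize."""
    lo = low_feats.min(axis=0)
    hi = low_feats.max(axis=0)
    # one quantization step per dimension; guard constant dimensions
    scale = np.where(hi > lo, (hi - lo) / (2**n_bits - 1), 1.0)
    codes = np.round((low_feats - lo) / scale).astype(np.int32)
    return codes, lo, scale
```

The integer codes can then stand in for the float features when building the neighbor graph, trading a bounded quantization error for a smaller memory footprint.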
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of action combinations; however, those skilled in the art should appreciate that the present application is not limited by the described order of actions, because according to the present application, certain steps may be performed in other orders.
FIG. 6 is a schematic structural diagram of a nearest neighbor search apparatus 600 provided in an embodiment of the present application; the nearest neighbor search apparatus 600 is configured on a terminal.
Specifically, the nearest neighbor search apparatus 600 may include:
a feature compression unit 601, configured to input a reference high-dimensional feature of a reference object into a feature compression network to obtain a reference low-dimensional feature output by the feature compression network, where the loss function of the feature compression network is a function obtained based on high-dimensional neighbor relationships of sample objects and low-dimensional neighbor relationships of the sample objects;

a neighbor graph construction unit 602, configured to construct a low-dimensional neighbor graph using the reference low-dimensional features;

a feature acquisition unit 603, configured to acquire a target feature of a target object; and

a nearest neighbor search unit 604, configured to perform a nearest neighbor search based on the low-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
In some embodiments of the present application, the high-dimensional neighbor relationship is the high-dimensional Euclidean distance between the sample high-dimensional features respectively associated with every two of the multiple sample objects, and the low-dimensional neighbor relationship is the low-dimensional Euclidean distance between the sample low-dimensional features respectively associated with every two of the multiple sample objects. The loss function of the feature compression network is a function obtained based on the error values between the high-dimensional Euclidean distances and the corresponding low-dimensional Euclidean distances, and the weight values associated with the high-dimensional Euclidean distances, where the value of each weight is related to the magnitude of its associated high-dimensional Euclidean distance.
In some embodiments of the present application, the calculation of the loss value of the loss function includes: calculating the error value between each high-dimensional Euclidean distance and the corresponding low-dimensional Euclidean distance; calculating the weight value associated with each high-dimensional Euclidean distance, and using each weight value to weight and sum the error values to obtain an accumulated value; and dividing the accumulated value by the square of the total number of sample objects to obtain the loss value of the loss function.
In some embodiments of the present application, the calculation of the weight value includes: obtaining a first hyperparameter and a second hyperparameter, where the first hyperparameter is greater than the second hyperparameter; calculating the average value of the high-dimensional Euclidean distances; calculating the negative of the natural logarithm of the quotient of the high-dimensional Euclidean distance and the average value; and determining the maximum of that negative value and the second hyperparameter, and taking the minimum of the first hyperparameter and that maximum as the weight value associated with the high-dimensional Euclidean distance.
In some embodiments of the present application, the feature compression network includes a compression module, a projection module, and a global optimization module, and the compression module includes a first linear mapping module, a second linear mapping module, and a feature compression module. The feature compression unit 601 may be specifically configured to: input the reference high-dimensional feature into the feature compression module, the first linear mapping module, and the projection module to obtain a first feature output by the feature compression module, a second feature output by the first linear mapping module, and at least one third feature output by the projection module, where the dimensions of the first feature, the second feature, and each third feature are the same as the dimension of the reference low-dimensional feature; input the first feature, the second feature, and the at least one third feature into the global optimization module to obtain a fourth feature and at least one fifth feature output by the global optimization module; and input the fourth feature and the at least one fifth feature into the compression module to obtain the reference low-dimensional feature output by the compression module.
In some embodiments of the present application, the global optimization module includes at least one encoder. The feature compression unit 601 may be specifically configured to: compose the first feature and the at least one third feature into a first vector, and input the first vector into the first encoder of the at least one encoder to obtain a second vector output by the first encoder based on a multi-attention-head mechanism and a linear mapping layer, where the second vector includes a sixth feature corresponding to the first feature and at least one seventh feature in one-to-one correspondence with the at least one third feature; add the sixth feature corresponding to the first feature in the second vector to the second feature to obtain an eighth feature; replace the sixth feature in the second vector with the eighth feature to obtain a third vector; and input the third vector into the second encoder of the at least one encoder, and so on, until a target vector output by the last encoder of the at least one encoder is obtained, the target vector including the fourth feature and the at least one fifth feature.
In some embodiments of the present application, the calculation of the output value of the multi-attention-head mechanism includes: determining a first input value, a second input value, and a third input value of the multi-attention-head mechanism based on the first vector; mapping the first input value, the second input value, and the third input value respectively to obtain a first mapped value corresponding to the first input value, a second mapped value corresponding to the second input value, and a third mapped value corresponding to the third input value, where the dimensions of the first mapped value and the second mapped value are both lower than the dimension of the third mapped value; calculating a merging parameter for the third mapped value using the first mapped value and the second mapped value; and calculating the output value of the multi-attention-head mechanism using the third mapped value and the merging parameter.
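As a hedged, single-head illustration of the mechanism just described (the patent gives no exact formulas here, so the projection shapes, the softmax merging step, and all names are assumptions of this sketch): the first and second input values are mapped to a lower dimension than the third, a merging parameter is computed from the first two mapped values, and the output combines it with the third mapped value:

```python
import numpy as np

def reduced_dim_attention(x, w_q, w_k, w_v):
    """x: (n, d) first vector; w_q, w_k: (d, d_small) low-dimensional maps;
    w_v: (d, d_model). Q and K live in a lower dimension than V."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v          # first/second/third mapped values
    scores = q @ k.T / np.sqrt(q.shape[-1])      # merging parameters from Q and K
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)     # row-wise softmax
    return attn @ v                              # combine with third mapped value
```

Mapping Q and K to a lower dimension than V reduces the cost of forming the merging parameters, which is consistent with the stated constraint that the first and second mapped values have lower dimensions than the third.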
In some embodiments of the present application, the nearest neighbor search unit 604 may be further specifically configured to: replace each reference low-dimensional feature in the low-dimensional neighbor graph with its corresponding reference high-dimensional feature to obtain the high-dimensional neighbor graph corresponding to the low-dimensional neighbor graph; and perform a nearest neighbor search based on the high-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
It should be noted that, for convenience and brevity of description, for the specific working process of the nearest neighbor search apparatus 600, reference may be made to the corresponding processes of the methods described in FIG. 1 to FIG. 5, which will not be repeated here.
FIG. 7 is a schematic diagram of a terminal provided in an embodiment of the present application. The terminal 7 may include a processor 70, a memory 71, and a computer program 72 stored in the memory 71 and executable on the processor 70, for example a nearest neighbor search program. When the processor 70 executes the computer program 72, the steps in the foregoing embodiments of the nearest neighbor search method are implemented, for example steps S101 to S104 shown in FIG. 1. Alternatively, when the processor 70 executes the computer program 72, the functions of the modules/units in the foregoing apparatus embodiments are implemented, for example the feature compression unit 601, the neighbor graph construction unit 602, the feature acquisition unit 603, and the nearest neighbor search unit 604 shown in FIG. 6.
The computer program may be divided into one or more modules/units, which are stored in the memory 71 and executed by the processor 70 to complete the present application. The one or more modules/units may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program in the terminal.
For example, the computer program may be divided into a feature compression unit, a neighbor graph construction unit, a feature acquisition unit, and a nearest neighbor search unit.
The specific functions of the units are as follows: the feature compression unit is configured to input a reference high-dimensional feature of a reference object into a feature compression network to obtain a reference low-dimensional feature output by the feature compression network, where the loss function of the feature compression network is a function obtained based on high-dimensional neighbor relationships of sample objects and low-dimensional neighbor relationships of the sample objects; the neighbor graph construction unit is configured to construct a low-dimensional neighbor graph using the reference low-dimensional feature; the feature acquisition unit is configured to acquire a target feature of a target object; and the nearest neighbor search unit is configured to perform a nearest neighbor search based on the low-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
The terminal may include, but is not limited to, the processor 70 and the memory 71. Those skilled in the art will understand that FIG. 7 is merely an example of a terminal and does not constitute a limitation on the terminal, which may include more or fewer components than shown, or combine certain components, or include different components; for example, the terminal may also include input/output devices, network access devices, buses, and so on.
The processor 70 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 71 may be an internal storage unit of the terminal, such as a hard disk or memory of the terminal. The memory 71 may also be an external storage device of the terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the terminal. Further, the memory 71 may include both an internal storage unit of the terminal and an external storage device. The memory 71 is used to store the computer program and other programs and data required by the terminal, and may also be used to temporarily store data that has been output or is to be output.
Those skilled in the art will clearly understand that, for convenience and brevity of description, only the division of the above functional units and modules is used for illustration. In practical applications, the above functions may be assigned to different functional units and modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing them from one another and are not used to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which will not be repeated here.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not detailed or recorded in a certain embodiment, reference may be made to the relevant descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementations should not be regarded as exceeding the scope of the present application.
In the embodiments provided in this application, it should be understood that the disclosed apparatus/terminal and method may be implemented in other ways. For example, the apparatus/terminal embodiments described above are merely illustrative; the division of the modules or units is only a logical functional division, and there may be other division methods in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present application may also be completed by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, the computer program implements the steps of the foregoing method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments, or make equivalent replacements for some of the technical features therein; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all be included within the protection scope of the present application.

Claims (11)

  1. A nearest neighbor search method, comprising:

    inputting a reference high-dimensional feature of a reference object into a feature compression network to obtain a reference low-dimensional feature output by the feature compression network, wherein a loss function of the feature compression network is a function obtained based on high-dimensional neighbor relationships of sample objects and low-dimensional neighbor relationships of the sample objects;

    constructing a low-dimensional neighbor graph using the reference low-dimensional feature;

    acquiring a target feature of a target object; and

    performing a nearest neighbor search based on the low-dimensional neighbor graph and the target feature to obtain a reference object closest to the target object.
  2. The nearest neighbor search method according to claim 1, wherein the high-dimensional neighbor relationship is a high-dimensional Euclidean distance between sample high-dimensional features respectively associated with every two of the multiple sample objects, and the low-dimensional neighbor relationship is a low-dimensional Euclidean distance between sample low-dimensional features respectively associated with every two of the multiple sample objects; and

    the loss function of the feature compression network is a function obtained based on error values between the high-dimensional Euclidean distances and the corresponding low-dimensional Euclidean distances, and weight values associated with the high-dimensional Euclidean distances, wherein the value of each weight is related to the magnitude of its associated high-dimensional Euclidean distance.
  3. The nearest neighbor search method according to claim 2, wherein the calculation of the loss value of the loss function comprises:

    calculating the error value between the high-dimensional Euclidean distance and the low-dimensional Euclidean distance corresponding to the high-dimensional Euclidean distance;

    calculating the weight value associated with the high-dimensional Euclidean distance, and using each weight value to weight and sum the error values to obtain an accumulated value; and

    dividing the accumulated value by the square of the total number of the sample objects to obtain the loss value of the loss function.
  4. The nearest neighbor search method according to claim 2 or 3, wherein the calculation of the weight value comprises:

    obtaining a first hyperparameter and a second hyperparameter, wherein the first hyperparameter is greater than the second hyperparameter;

    calculating an average value of the high-dimensional Euclidean distances;

    calculating the negative of the natural logarithm of the quotient of the high-dimensional Euclidean distance and the average value; and

    determining the maximum of the negative value and the second hyperparameter, and taking the minimum of the first hyperparameter and the maximum as the weight value associated with the high-dimensional Euclidean distance.
  5. The nearest neighbor search method according to any one of claims 1 to 3, wherein the feature compression network comprises a compression module, a projection module, and a global optimization module, and the compression module comprises a first linear mapping module, a second linear mapping module, and a feature compression module;
    the inputting of the reference high-dimensional feature of the reference object into the feature compression network to obtain the reference low-dimensional feature output by the feature compression network comprises:
    inputting the reference high-dimensional feature into the feature compression module, the first linear mapping module, and the projection module to obtain a first feature output by the feature compression module, a second feature output by the first linear mapping module, and at least one third feature output by the projection module, wherein the first feature, the second feature, and each of the third features have the same dimensionality as the reference low-dimensional feature;
    inputting the first feature, the second feature, and the at least one third feature into the global optimization module to obtain a fourth feature and at least one fifth feature output by the global optimization module; and
    inputting the fourth feature and the at least one fifth feature into the compression module to obtain the reference low-dimensional feature output by the compression module.
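The module wiring in claim 5 is a three-stage pipeline. The sketch below shows only the data flow; every module is a hypothetical callable, since the claim does not fix the modules' internals:

```python
def compress(x, feature_compression, linear_map_1, projections,
             global_optimize, compression):
    """Data flow of the claimed feature compression network.
    All five module arguments are assumed stand-in callables."""
    f1 = feature_compression(x)              # first feature
    f2 = linear_map_1(x)                     # second feature
    f3s = [p(x) for p in projections]        # third feature(s)
    f4, f5s = global_optimize(f1, f2, f3s)   # fourth and fifth feature(s)
    return compression(f4, f5s)              # reference low-dimensional feature
```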
  6. The nearest neighbor search method according to claim 5, wherein the global optimization module comprises at least one encoder;
    the inputting of the first feature, the second feature, and the at least one third feature into the global optimization module to obtain the fourth feature and the at least one fifth feature output by the global optimization module comprises:
    combining the first feature and the at least one third feature into a first vector, and inputting the first vector into a first encoder of the at least one encoder to obtain a second vector output by the first encoder based on a multi-head attention mechanism and a linear mapping layer, wherein the second vector comprises a sixth feature corresponding to the first feature and at least one seventh feature in one-to-one correspondence with the at least one third feature;
    adding the sixth feature corresponding to the first feature in the second vector to the second feature to obtain an eighth feature;
    replacing the sixth feature in the second vector with the eighth feature to obtain a third vector; and
    inputting the third vector into a second encoder of the at least one encoder, and so on, until a target vector output by the last encoder of the at least one encoder is obtained, the target vector comprising the fourth feature and the at least one fifth feature.
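The encoder chain of claim 6 can be sketched as follows. Each encoder is assumed to be a callable that maps a list of features to a list of the same length; only the first slot receives the residual addition of the second feature:

```python
def global_optimize(f1, f2, f3_list, encoders):
    """Sketch of the claimed global optimization module.
    encoders: assumed list of callables, each mapping a feature list
    to a feature list of the same length."""
    vec = [f1] + f3_list           # first vector
    vec = encoders[0](vec)         # second vector: [sixth, seventh...]
    vec[0] = vec[0] + f2           # eighth feature replaces the sixth
    for enc in encoders[1:]:       # remaining encoders, in order
        vec = enc(vec)
    return vec[0], vec[1:]         # fourth feature, fifth feature(s)
```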
  7. The nearest neighbor search method according to claim 6, wherein the calculation of the output value of the multi-head attention mechanism comprises:
    determining a first input value, a second input value, and a third input value of the multi-head attention mechanism based on the first vector;
    performing mapping on the first input value, the second input value, and the third input value respectively to obtain a first mapped value corresponding to the first input value, a second mapped value corresponding to the second input value, and a third mapped value corresponding to the third input value, wherein the dimensionalities of the first mapped value and the second mapped value are both lower than the dimensionality of the third mapped value;
    computing a merging parameter for the third mapped value using the first mapped value and the second mapped value; and
    computing the output value of the multi-head attention mechanism using the third mapped value and the merging parameter.
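One common realization of this pattern is scaled dot-product attention in which the query and key projections (first and second mapped values) are lower-dimensional than the value projection (third mapped value), and the softmax scores act as the merging parameters. A single-head sketch under these assumptions, with hypothetical weight matrices `Wq`, `Wk`, `Wv`:

```python
import numpy as np

def attention_head(x, Wq, Wk, Wv):
    """One head: Q and K are mapped to a lower dimension than V;
    softmax(QK^T / sqrt(dk)) supplies the merging parameters for V."""
    Q, K, V = x @ Wq, x @ Wk, x @ Wv          # first/second/third mapped values
    dk = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(dk)            # merging parameters (pre-softmax)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                        # output value: merged rows of V
```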
  8. The nearest neighbor search method according to any one of claims 1 to 3, wherein performing the nearest neighbor search based on the low-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object comprises:
    replacing each reference low-dimensional feature in the low-dimensional neighbor graph with its corresponding reference high-dimensional feature to obtain a high-dimensional neighbor graph corresponding to the low-dimensional neighbor graph; and
    performing a nearest neighbor search based on the high-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
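A common way to search such a neighbor graph is greedy descent: start at some node and repeatedly move to the neighbor closest to the query until no neighbor improves. The claim only specifies that the search runs on the high-dimensional graph, so the strategy below is an illustrative assumption:

```python
import numpy as np

def greedy_search(graph, feats, query, start=0):
    """Greedy nearest-neighbor descent on a neighbor graph.
    graph: node -> list of neighbor node ids; feats: node -> feature vector.
    A sketch of one common search strategy, not necessarily the claimed one."""
    cur = start
    while True:
        best, best_d = cur, np.linalg.norm(feats[cur] - query)
        for nb in graph[cur]:                      # examine the neighborhood
            d = np.linalg.norm(feats[nb] - query)
            if d < best_d:
                best, best_d = nb, d
        if best == cur:                            # local minimum reached
            return cur
        cur = best
```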
  9. A nearest neighbor search apparatus, comprising:
    a feature compression unit, configured to input a reference high-dimensional feature of a reference object into a feature compression network to obtain a reference low-dimensional feature output by the feature compression network, wherein the loss function of the feature compression network is a function obtained based on a high-dimensional neighbor relationship of a sample object and a low-dimensional neighbor relationship of the sample object;
    a neighbor graph construction unit, configured to build a low-dimensional neighbor graph using the reference low-dimensional features;
    a feature acquisition unit, configured to acquire a target feature of a target object; and
    a nearest neighbor search unit, configured to perform a nearest neighbor search based on the low-dimensional neighbor graph and the target feature to obtain the reference object closest to the target object.
  10. A terminal, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 8.
  11. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
PCT/CN2022/099850 2021-10-21 2022-06-20 Nearest neighbor search method and apparatus, terminal, and storage medium WO2023065696A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111227715.7A CN113868291A (en) 2021-10-21 2021-10-21 Nearest neighbor searching method, device, terminal and storage medium
CN202111227715.7 2021-10-21

Publications (1)

Publication Number Publication Date
WO2023065696A1 true WO2023065696A1 (en) 2023-04-27

Family

ID=78997008

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/099850 WO2023065696A1 (en) 2021-10-21 2022-06-20 Nearest neighbor search method and apparatus, terminal, and storage medium

Country Status (2)

Country Link
CN (1) CN113868291A (en)
WO (1) WO2023065696A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113868291A (en) * 2021-10-21 2021-12-31 深圳云天励飞技术股份有限公司 Nearest neighbor searching method, device, terminal and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107316057A (en) * 2017-06-07 2017-11-03 哈尔滨工程大学 Based on the nuclear power unit method for diagnosing faults being locally linear embedding into K nearest neighbor classifiers
CN109558899A (en) * 2018-11-13 2019-04-02 中国石油天然气股份有限公司 Method of Data with Adding Windows and device
WO2019219198A1 (en) * 2018-05-17 2019-11-21 Huawei Technologies Co., Ltd. Device and method for clustering of input-data
CN110717519A (en) * 2019-09-09 2020-01-21 深圳大学 Training, feature extraction and classification method, device and storage medium
CN112200133A (en) * 2020-10-28 2021-01-08 支付宝(杭州)信息技术有限公司 Privacy-protecting face recognition method and device
CN113868291A (en) * 2021-10-21 2021-12-31 深圳云天励飞技术股份有限公司 Nearest neighbor searching method, device, terminal and storage medium

Also Published As

Publication number Publication date
CN113868291A (en) 2021-12-31

Similar Documents

Publication Publication Date Title
CN111027575B (en) Semi-supervised semantic segmentation method for self-attention confrontation learning
CN107256262B (en) Image retrieval method based on object detection
Snavely et al. Skeletal graphs for efficient structure from motion
Zhu et al. Exploring consistent preferences: discrete hashing with pair-exemplar for scalable landmark search
JP5926291B2 (en) Method and apparatus for identifying similar images
US8892542B2 (en) Contextual weighting and efficient re-ranking for vocabulary tree based image retrieval
Wei et al. Projected residual vector quantization for ANN search
WO2023065697A1 (en) Product quantization search method and apparatus, and terminal and storage medium
Zhu et al. Learning compact visual representation with canonical views for robust mobile landmark search
US10839006B2 (en) Mobile visual search using deep variant coding
CN110458175B (en) Unmanned aerial vehicle image matching pair selection method and system based on vocabulary tree retrieval
CN115630236B (en) Global quick retrieval and positioning method, storage medium and equipment for passive remote sensing image
WO2023065696A1 (en) Nearest neighbor search method and apparatus, terminal, and storage medium
CN112948601B (en) Cross-modal hash retrieval method based on controlled semantic embedding
EP3115908A1 (en) Method and apparatus for multimedia content indexing and retrieval based on product quantization
US11281645B2 (en) Data management system, data management method, and computer program product
CN115565177A (en) Character recognition model training method, character recognition device, character recognition equipment and medium
WO2022007596A1 (en) Image retrieval system, method and apparatus
CN107133348B (en) Approximate searching method based on semantic consistency in large-scale picture set
WO2021051562A1 (en) Facial feature point positioning method and apparatus, computing device, and storage medium
CN114780781B (en) Product quantification method based on fuzzy clustering and asymmetric distance calculation
CN114691918B (en) Radar image retrieval method and device based on artificial intelligence and electronic equipment
CN115457638A (en) Model training method, data retrieval method, device, equipment and storage medium
CN114897075A (en) Heterogeneous image matching method based on meta-template knowledge base
CN110941730B (en) Retrieval method and device based on human face feature data migration

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22882319

Country of ref document: EP

Kind code of ref document: A1