WO2023065477A1 - Spatial text query method and device - Google Patents

Spatial text query method and device

Publication number
WO2023065477A1
Authority
WO
WIPO (PCT)
Prior art keywords
query
lower triangular
matrix
triangular matrix
vector
Prior art date
Application number
PCT/CN2021/135363
Other languages
English (en)
French (fr)
Inventor
苗银宾
杨玉涛
童秋云
范瑞彬
张开翔
李辉忠
李成博
Original Assignee
深圳前海微众银行股份有限公司
西安电子科技大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司, 西安电子科技大学
Publication of WO2023065477A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/374 Thesaurus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/38 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/387 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location

Definitions

  • the invention relates to the field of financial technology (Fintech), in particular to a method and device for querying spatial text.
  • the selected query range generally has a preset extent and shape; specifically, a preset coding algorithm is used to generate a Gray code for the spatial geographic coordinates;
  • Figure 1 shows an example Gray code diagram of spatial geographic coordinates. Each cell (such as "0011") represents an area, and every object in the spatial text data set is recorded in the Gray code grid of Figure 1 (for example, object P is located in the "0011" area).
  • the query range is determined with the cell as the unit, for example the area represented by "0011, 0010, 0111, 0110"; the query result is then determined within the query range "0011, 0010, 0111, 0110" according to the text keywords of the query request.
  • Embodiments of the present invention provide a spatial text query method and device, which are used to query a query range of any shape, meet the query range actually required by the user, improve the accuracy of the query, and improve the accuracy of determining the query result.
  • the embodiment of the present invention provides a spatial text query method, including:
  • the query request includes a query range and a query keyword set;
  • the query range is a closed area formed by a query curve;
  • for any query curve, polynomial fitting is performed on the query curve, and the coefficients of the power terms of the fitted polynomial are taken as the query range vector; a first lower triangular matrix is obtained from the query range vector and the query keyword vector; the first lower triangular matrix is encrypted with the first encryption matrix to obtain a query sub-trapdoor;
  • based on each query sub-trapdoor and the index of each object in the spatial text data set, the objects that meet the preset conditions are determined as the query result; the index of any object is obtained by encrypting, with the second encryption matrix, the second lower triangular matrix derived from the object's spatial position and keyword set.
  • in the above technical solution, the query range is a closed area formed by query curves, so the query range can have any shape; encoding the query keyword set as a query keyword vector enables multi-dimensional keyword queries and improves the accuracy of keyword matching; the objects that meet the preset conditions are then determined through the fitted curves corresponding to the query curves, which amounts to determining the objects that match the query keyword set and lie within the query range, so the query result falls within the range the user actually requires and query accuracy improves; in addition, the query request and the object information are encrypted with encryption matrices, which protects the security of the query.
  • the query keyword set is encoded to obtain a query keyword vector, including:
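  • a first vector is established according to the keyword dictionary; the keyword dictionary is obtained by taking the union of the keyword sets of all objects in the spatial text data set;
  • if the jth keyword in the keyword dictionary is recorded in the query keyword set, the jth element of the first vector is assigned 1; otherwise, the jth element of the first vector is assigned 0;
  • the first vector, once every element has been assigned, is taken as the query keyword vector.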
  • a first lower triangular matrix is obtained, including:
  • the second lower triangular matrix is obtained by the following methods, including:
  • for any object, the index space vector is determined from the n+1 latitude terms obtained by raising the object's latitude value to the powers 0 through n, together with the object's longitude value; the keyword set of the object is encoded to obtain the index keyword vector;
  • the query range and the query keyword set of the query request are represented by the first lower triangular matrix, and the spatial position and keyword set of each object are represented by the second lower triangular matrix;
  • comparing the first and second lower triangular matrices determines the objects that match the query keyword set and lie within the query range, so the query result falls within the range the user actually requires and query accuracy improves.
  • assigning values to the diagonals of the first random lower triangular matrix according to the query range vector and each dimensional element in the query key vector includes:
  • the query scope and the query keyword set of the query request are expressed through a matrix, so that the amount of calculation is reduced when determining the query result, and the query efficiency is improved.
  • assigning values to the diagonals of the second random lower triangular matrix according to the index space vector and each dimensional element in the index key vector includes:
  • the first lower triangular matrix is encrypted by the first encryption matrix to obtain a query sub-trapdoor, including:
  • the index of any object is obtained after the second lower triangular matrix obtained according to the spatial position of the object and the keyword set is encrypted by the second encryption matrix, including:
  • the second lower triangular matrix obtained according to the spatial position of the object and the keyword set;
  • the index of the object is obtained by encrypting the second lower triangular matrix according to the at least one random invertible square matrix, the third random lower triangular matrix and the fourth random lower triangular matrix.
  • the query request and the information of the object are encrypted through a random matrix to ensure the security of the query.
  • the query curve includes a first query curve and a second query curve; the query sub-trapdoors include a first query sub-trapdoor and a second query sub-trapdoor;
  • determining, based on the query sub-trapdoors and the index of each object in the spatial text data set, the objects that meet the preset conditions as the query result includes:
  • the preset conditions include:
  • the absolute value of the trace of the first result matrix and the absolute value of the trace of the second result matrix are less than a first threshold; and the trace of the first result matrix is greater than a second threshold; the trace of the second result matrix less than the second threshold;
  • the first threshold is used to determine objects that meet the set of query keywords
  • the second threshold is used to determine objects conforming to the query scope.
  • the first threshold guarantees that the object matches the query keyword set, that is, the keyword set of the object contains or equals the query keyword set; the second threshold guarantees that the object lies within the query range; the query result is thus determined from thresholds, which improves the accuracy of the query and of the determined query result.
  • an embodiment of the present invention provides a spatial text query device, including:
  • An acquisition module configured to acquire a query request;
  • the query request includes a query range and a query keyword set;
  • the query range is a closed area formed by a query curve;
  • a processing module configured to encode the set of query keywords to obtain a query keyword vector
  • for any query curve, polynomial fitting is performed on the query curve, and the coefficients of the power terms of the fitted polynomial are taken as the query range vector; a first lower triangular matrix is obtained from the query range vector and the query keyword vector; the first lower triangular matrix is encrypted with the first encryption matrix to obtain a query sub-trapdoor;
  • based on each query sub-trapdoor and the index of each object in the spatial text data set, the objects that meet the preset conditions are determined as the query result; the index of any object is obtained by encrypting, with the second encryption matrix, the second lower triangular matrix derived from the object's spatial position and keyword set.
  • processing module is specifically used for:
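  • a first vector is established according to the keyword dictionary; the keyword dictionary is obtained by taking the union of the keyword sets of all objects in the spatial text data set;
  • if the jth keyword in the keyword dictionary is recorded in the query keyword set, the jth element of the first vector is assigned 1; otherwise, the jth element of the first vector is assigned 0;
  • the first vector, once every element has been assigned, is taken as the query keyword vector.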
  • processing module is specifically used for:
  • for any object, the index space vector is determined from the n+1 latitude terms obtained by raising the object's latitude value to the powers 0 through n, together with the object's longitude value; the keyword set of the object is encoded to obtain the index keyword vector;
  • processing module is specifically used for:
  • the second lower triangular matrix obtained according to the spatial position of the object and the keyword set;
  • the index of the object is obtained by encrypting the second lower triangular matrix according to the at least one random invertible square matrix, the third random lower triangular matrix and the fourth random lower triangular matrix.
  • the query curve includes a first query curve and a second query curve; each of the query sub-trapdoors includes a first query sub-trapdoor and a second query sub-trapdoor; the processing module is specifically used for:
  • an embodiment of the present invention also provides a computer device, including:
  • the processor is used to call the program instructions stored in the memory, and execute the above-mentioned spatial text query method according to the obtained program.
  • an embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are used to cause a computer to execute the above spatial text query method.
  • FIG. 1 is a schematic diagram of a Gray code of spatial geographic coordinates provided by an embodiment of the present invention
  • FIG. 2 is a schematic diagram of a system architecture provided by an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of a spatial text query method provided by an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a query range provided by an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of an application scenario provided by an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a spatial text query device provided by an embodiment of the present invention.
  • the method for determining the query result based on the query range includes the following four stages:
  • the data owner generates a master key msk.
  • the spatial coordinates of object Di are encoded as a Gray code (as shown in Figure 1, the object P is encoded as "0011"); the keyword set of object Di is then encoded into a bitmap using bitmap encoding; finally, an object information vector is generated from the object's Gray code and bitmap;
  • the object information vector is encrypted according to the master key msk.
  • the query vector is encrypted according to the master key msk.
  • objects corresponding to the query keyword set within the query range are determined.
  • the preset coding algorithm can only produce a rectangular query range built from cells (such as "0011") as the basic unit, as shown in Figure 1; it cannot express a query range of arbitrary shape and therefore cannot meet users' actual needs.
  • for a query about the banks in a city, the rectangular query range must contain the city, but besides the city it also covers other regions, such as other cities; the banks within the rectangular query range are therefore not limited to this city and include banks in other cities, so the determined query result (banks) is not specific to this city.
  • the determined query result thus includes objects the user does not need, which lowers the accuracy and precision of the query.
  • the query request includes a query location and a query keyword. For example, if user A initiates a query request at a specific location, that location is the query location of the query request; the query location, or query point, is generally a latitude and longitude coordinate value.
  • the query value is determined according to the preset weight, spatial distance and keyword similarity, and the query result is determined according to the size of the query value.
  • the spatial text corresponding to the maximum query value is used as the query result.
  • the index tree is constructed by the data owner from the plaintext spatial text, and the minimum bounding rectangle is the spatial range of a non-leaf node.
  • the problem with the above method is that the query value depends on the preset weight.
  • the query result is therefore likely to contain nodes whose keywords are only similar to, rather than matching, the keywords in the query request; the determined query result also leaks the ranking of the relevance values of the spatial text data, from which an attacker can easily infer information about each object and may analyze the querying user's daily habits and preferences, which is a potential security risk.
  • FIG. 2 exemplarily shows a system architecture to which this embodiment of the present invention is applicable.
  • the system architecture includes a server 200, and the server 200 may include a processor 210, a communication interface 220 and a memory 230.
  • the communication interface 220 is used for receiving query requests and sending query results.
  • the processor 210 is the control center of the server 200; it connects the parts of the entire server 200 through various interfaces and lines, and performs the functions of the server 200 and processes data by running or executing software programs and/or modules stored in the memory 230 and calling data stored in the memory 230.
  • the processor 210 may include one or more processing units.
  • the memory 230 can be used to store software programs and modules, and the processor 210 executes various functional applications and data processing by running the software programs and modules stored in the memory 230.
  • the memory 230 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, at least one application required by a function, etc.; the data storage area may store data created according to business processing, etc.
  • the memory 230 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage devices.
  • FIG. 3 is only an example, which is not limited in this embodiment of the present invention.
  • FIG. 3 exemplarily shows a schematic flowchart of a spatial text query method provided by an embodiment of the present invention, and the process can be executed by a spatial text query device.
  • the process specifically includes:
  • Step 310 obtaining a query request.
  • the query request includes a query range and a query keyword set;
  • the query range is a closed area formed by a query curve; for example, the query range is an elliptical area formed by two curves, and the query keyword set includes the query keywords "Sichuan" and "hot pot".
  • Step 320 encode the set of query keywords to obtain a query keyword vector.
  • the query keyword vector is obtained by determining whether the keyword recorded in the keyword dictionary exists in the query keyword set.
  • Step 330 for any query curve, perform polynomial fitting on the query curve, and determine coefficients of each power term of the fitted polynomial as a query range vector.
  • the query curve can be represented in a matrix to participate in the calculation.
  • Step 340 based on each query sub-trapdoor and the index of each object in the spatial text dataset, determine the object satisfying the preset condition as the query result.
  • the preset conditions include a first threshold and a second threshold; the first threshold is used to determine the objects that match the query keyword set, and the second threshold is used to determine the objects that lie within the query range.
  • the query range can be an area of any shape, and the query range can be a closed area formed by multiple query curves, and the number of query curves is not limited here;
  • FIG. 4 is an exemplary embodiment of the present invention.
  • the query keyword vector is obtained by establishing the first vector and assigning values to elements of each dimension in the first vector according to the keyword dictionary and the query keyword set.
  • specifically, a first vector is established according to the keyword dictionary; if the jth keyword in the keyword dictionary is recorded in the query keyword set, the jth element of the first vector is assigned 1; if it is not, the jth element of the first vector is assigned 0; the first vector, once every element has been assigned, is taken as the query keyword vector.
  • the keyword dictionary is obtained by taking the union of the keyword sets of all objects in the spatial text data set. For example, a given spatial text data set D contains the information (spatial position and keyword set) of multiple objects; taking the first object as an example, its spatial position coordinates are (x_1, y_1), where x_1 is the latitude value and y_1 the longitude value of the first object, and its keyword set includes "Sichuan", "spicy" and "hot pot".
  • the first vector is {m1, m2, m3, m4};
  • the first keyword in the dictionary W is "Sichuan", and it is determined that the first keyword exists in the query keyword set, so the first element (m1) of the first vector is assigned 1; likewise, the second element (m2) is assigned 0, the third element (m3) is assigned 0, and the fourth element (m4) is assigned 1; the query keyword vector is thus {1, 0, 0, 1}.
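  • The encoding just described can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `encode_keywords` is hypothetical, and the dictionary `W` below is an assumption in which only the first entry ("Sichuan") and the resulting vector {1, 0, 0, 1} come from the text; the middle two entries are placeholders.

```python
def encode_keywords(dictionary, keywords):
    """Return a 0/1 vector with one element per dictionary keyword."""
    present = set(keywords)
    return [1 if word in present else 0 for word in dictionary]

# Hypothetical dictionary; only "Sichuan" in first position is fixed by the text.
W = ["Sichuan", "spicy", "barbecue", "hot pot"]

query_vector = encode_keywords(W, {"Sichuan", "hot pot"})
print(query_vector)  # [1, 0, 0, 1]
```

  The same function would serve for the index keyword vector of an object, since the text applies one encoding method to both.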
  • the query range vector is determined from the polynomial of the fitted curve; for example, the query range vector of the first fitted curve is {a_0, a_1, ..., a_10}, and the query range vector of the second fitted curve is {b_0, b_1, ..., b_10}.
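  • As an illustration of how a query range vector could be produced, the sketch below fits a polynomial to sample points of a curve by ordinary least squares and returns the coefficients ordered from the 0th power upward. The fitting procedure is an assumption (the text does not say which fitting method is used), the function name `fit_polynomial` is hypothetical, and a degree of 2 is used only to keep the example small, whereas the example vectors in the text have 11 coefficients.

```python
from fractions import Fraction

def fit_polynomial(points, degree):
    """Least-squares fit; returns coefficients [a_0, ..., a_degree]."""
    size = degree + 1
    # Normal equations (V^T V) a = V^T y, where V is the Vandermonde matrix.
    rows = []
    for i in range(size):
        row = [sum(Fraction(x) ** (i + j) for x, _ in points) for j in range(size)]
        row.append(sum(Fraction(x) ** i * Fraction(y) for x, y in points))
        rows.append(row)
    # Gauss-Jordan elimination on the augmented matrix.
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = 1 / rows[col][col]
        rows[col] = [value * inv for value in rows[col]]
        for r in range(size):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][size] for i in range(size)]

# Sample points taken from the hypothetical curve y = 1 + 2x - x^2.
sample = [(x, 1 + 2 * x - x * x) for x in range(-2, 3)]
range_vector = fit_polynomial(sample, 2)
print([int(c) for c in range_vector])  # [1, 2, -1]
```

  Exact rational arithmetic is used so the toy result is reproducible; a real fit over latitude and longitude values would more likely use floating point.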
  • the constructed random lower triangular matrix is assigned based on the query range vector and the query key vector, so as to obtain the first lower triangular matrix representing the query range information of the fitting curve and the query key set.
  • the matrices are all square matrices.
  • specifically, the first random lower triangular matrix is determined based on the degree n of the highest power term in the polynomial and the number m of keywords in the keyword dictionary; the diagonal of the first random lower triangular matrix is then assigned according to the elements of the query range vector and the query keyword vector, which gives the first lower triangular matrix.
  • the lower triangular matrix is a matrix whose elements above the diagonal are all 0.
  • for example, an (n+m+3)×(n+m+3) lower triangular matrix E_1 (the first random lower triangular matrix) is randomly generated, and its diagonal is then assigned according to the elements of the query range vector and the query keyword vector.
  • the coefficient of the rth power term in the query range vector is assigned to the element in row r+1, column r+1 of the first random lower triangular matrix, for 0 ≤ r ≤ n; -1 is assigned to the element in row n+2, column n+2; the jth element of the query keyword vector is assigned to the element in row n+2+j, column n+2+j; and the number of keywords in the query keyword set is assigned to the element in the last row and last column.
  • for example, the coefficient of the 0th power term in the query range vector is a_0, so a_0 is assigned to the element in row 1, column 1 of the first random lower triangular matrix E_1;
  • the remaining elements of the query range vector are assigned to the first random lower triangular matrix E_1 in the same way;
  • the number of keywords in the query keyword set (2 for the query keyword set above) is assigned to the element in the last row and last column of the first random lower triangular matrix E_1.
  • in this way the first lower triangular matrix of the first fitted curve is obtained; similarly, the first lower triangular matrix of the second fitted curve is obtained.
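  • The diagonal layout described above can be sketched as follows. This is an illustrative construction under stated assumptions: the function names are hypothetical, the random off-diagonal entries are small integers chosen only for readability, and the keyword count is taken as the sum of the query keyword vector, which assumes every query keyword appears in the dictionary.

```python
import random

def random_lower_triangular(size, rng):
    """Random entries on and below the diagonal, zeros above it."""
    return [[rng.randint(1, 9) if col <= row else 0 for col in range(size)]
            for row in range(size)]

def build_query_matrix(range_vector, keyword_vector, rng):
    n = len(range_vector) - 1          # degree of the fitted polynomial
    m = len(keyword_vector)            # number of dictionary keywords
    E = random_lower_triangular(n + m + 3, rng)
    # Diagonal: a_0..a_n, then -1, then the keyword bits, then the keyword count.
    diagonal = list(range_vector) + [-1] + list(keyword_vector) + [sum(keyword_vector)]
    for i, value in enumerate(diagonal):
        E[i][i] = value
    return E

E = build_query_matrix([1, 2, -1], [1, 0, 0, 1], random.Random(0))
print([E[i][i] for i in range(len(E))])  # [1, 2, -1, -1, 1, 0, 0, 1, 2]
```

  With n = 2 and m = 4 the matrix is 9×9, matching the (n+m+3)×(n+m+3) size given in the text.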
  • a second lower triangular matrix representing the spatial position and keyword set of each object is stored; specifically, the second lower triangular matrix is obtained as follows: the second random lower triangular matrix is determined based on the degree n of the highest power term in the polynomial and the number m of keywords in the keyword dictionary; for any object, the index space vector is determined from the n+1 latitude terms obtained by raising the object's latitude value to the powers 0 through n, together with the object's longitude value; the keyword set of the object is encoded to obtain the index keyword vector; the diagonal of the second random lower triangular matrix is assigned according to the elements of the index space vector and the index keyword vector, which gives the second lower triangular matrix.
  • for example, the keyword set of the i-th object o_i includes the keywords "spicy" and "barbecue"; then, according to the above method for determining the query keyword vector, the index keyword vector of the object is determined to be {0, 0, 0, 0}.
  • the spatial position of object o_i is (x_i, y_i), where x_i is the latitude value and y_i the longitude value of object o_i, and the index space vector is determined from them. After the index space vector and the index keyword vector of object o_i are obtained, the diagonal of the second random lower triangular matrix is assigned; specifically, the rth power of the object's latitude value is assigned to the element in row r+1, column r+1 of the second random lower triangular matrix, for 0 ≤ r ≤ n; the longitude value of the object is assigned to the element in row n+2, column n+2; the jth element of the index keyword vector is assigned to the element in row n+3+j, column n+3+j; and -1 is assigned to the element in the last row and last column.
  • as with the first lower triangular matrix, an (n+m+3)×(n+m+3) lower triangular matrix F_1 (the second random lower triangular matrix) is randomly generated, and its diagonal is then assigned according to the index space vector and index keyword vector of the first object o_1.
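  • A matching sketch for the object side is given below. It is illustrative only: the function name `build_index_matrix` is hypothetical, the coordinates are small integers rather than real latitude and longitude values, and the keyword bits [1, 1, 0, 1] assume the first object's keywords ("Sichuan", "spicy", "hot pot") are encoded against a hypothetical four-word dictionary ("Sichuan", "spicy", "barbecue", "hot pot").

```python
import random

def build_index_matrix(x, y, keyword_bits, n, rng):
    # Diagonal: x^0..x^n, then the longitude y, then the keyword bits, then -1.
    diagonal = [x ** r for r in range(n + 1)] + [y] + list(keyword_bits) + [-1]
    size = len(diagonal)               # equals n + m + 3
    F = [[rng.randint(1, 9) if col < row else 0 for col in range(size)]
         for row in range(size)]
    for i, value in enumerate(diagonal):
        F[i][i] = value
    return F

F = build_index_matrix(x=2, y=3, keyword_bits=[1, 1, 0, 1], n=2, rng=random.Random(7))
print([F[i][i] for i in range(len(F))])  # [1, 2, 4, 3, 1, 1, 0, 1, -1]
```

  The powers of the latitude line up position by position with the polynomial coefficients on the query side, and the longitude lines up with the -1 entry.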
  • the first lower triangular matrix and the second lower triangular matrix need to be encrypted.
  • the random invertible square matrix, the third random lower triangular matrix, the third lower triangular matrix, the fourth random lower triangular matrix and the fourth lower triangular matrix are generated in advance and used as key components to encrypt the information.
  • for the second lower triangular matrix of any object, the matrix is encrypted according to the at least one random invertible square matrix, the third random lower triangular matrix and the fourth lower triangular matrix to obtain the index of the object, for example the index C_i of object o_i.
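  • The exact encryption formulas are not reproduced in this text, so the sketch below does not implement them. It only illustrates, with a swapped-in and much simpler masking scheme, why matching on the trace can survive multiplication by invertible matrices: if a stand-in index is M1·F·M2 and a stand-in trapdoor is M2⁻¹·E·M1⁻¹, the trace of their product equals the trace of F·E. The matrices `E`, `F`, `M1`, `M2` and all function names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def inverse_2x2(m):
    det = Fraction(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

E = [[3, 0], [5, -1]]    # stands in for a first (query) lower triangular matrix
F = [[2, 0], [7, 4]]     # stands in for a second (object) lower triangular matrix
M1 = [[2, 1], [1, 1]]    # hypothetical invertible masking matrices
M2 = [[1, 2], [1, 3]]

index = matmul(matmul(M1, F), M2)
trapdoor = matmul(matmul(inverse_2x2(M2), E), inverse_2x2(M1))

masked = trace(matmul(index, trapdoor))
plain = trace(matmul(F, E))
print(masked, plain)  # 2 2
```

  Note that the value 2 is also the sum of the products of the diagonals (3·2 + (-1)·4), because the product of two lower triangular matrices has the element-wise product of their diagonals on its own diagonal.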
  • step 340 for any object, by multiplying the index of the object by the first query sub-trapdoor R 1 and the second query sub-trapdoor R 2 respectively, it is determined whether the object satisfies a preset condition.
  • a first result matrix is determined based on the index of the object and the first query sub-trapdoor;
  • a second result matrix is determined based on the index of the object and the second query sub-trapdoor; determining the trace of the first result matrix and the trace of the second result matrix, and determining whether the object satisfies a preset condition according to the trace of the first result matrix and the trace of the second result matrix.
  • the trace of the matrix is the sum of the elements on the diagonal of the matrix.
  • the trace of the first result matrix is tr(C_i R_1);
  • the trace of the second result matrix is tr(C_i R_2);
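  • As a worked check of why the diagonal layout matters, consider the matrices before encryption and scaling. Let E be a first lower triangular matrix with diagonal (a_0, ..., a_n, -1, q_1, ..., q_m, K) and F a second lower triangular matrix with diagonal (x^0, ..., x^n, y, k_1, ..., k_m, -1). The symbols q_j, k_j and K are shorthand introduced here for the query keyword bits, the object keyword bits and the number of query keywords. Because the diagonal of a product of lower triangular matrices is the element-wise product of their diagonals:

```latex
\[
\operatorname{tr}(EF) \;=\; \sum_{i} E_{ii}F_{ii}
\;=\; \Bigl(\sum_{r=0}^{n} a_r x^{r} \;-\; y\Bigr)
\;+\; \Bigl(\sum_{j=1}^{m} q_j k_j \;-\; K\Bigr)
\]
```

  The first bracket is positive when the point (x, y) lies below the fitted curve and negative when it lies above it. The second bracket is zero exactly when the object contains every query keyword, assuming all query keywords appear in the dictionary so that K equals the sum of the q_j. The random off-diagonal entries do not affect the trace.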
  • the preset conditions include:
  • the absolute value of the trace of the first result matrix and the absolute value of the trace of the second result matrix are less than a first threshold; and the trace of the first result matrix is greater than a second threshold; the trace of the second result matrix is smaller than a second threshold; wherein, the first threshold is used to determine objects that meet the query keyword set; and the second threshold is used to determine objects that meet the query range.
  • for example, the first threshold is 0.1. In the third lower triangular matrix D_2, the diagonal entries from row n+3, column n+3 to row n+m+3, column n+m+3 are 1, and the elements at these positions in the first lower triangular matrix and the second lower triangular matrix are used to represent keywords;
  • the diagonal entries of D_2 from row 1, column 1 to row n+2, column n+2 are set to 0.001 (a preset positive real number); the elements at these positions in the first lower triangular matrix and the second lower triangular matrix hold the query range and spatial position entries.
  • because of the preset positive real number (0.001), the spatial part contributes only a small amount to the trace of the result matrix. Therefore, if the absolute value of the trace of the first result matrix and the absolute value of the trace of the second result matrix are not both less than the first threshold, K does not equal the corresponding keyword term of the object, so the keywords of the object do not correspond to the query keywords and the object cannot be a query result; otherwise, the object meets the prerequisite for being a query result.
  • the first result matrix corresponds to the first query sub-trapdoor, and the first query sub-trapdoor corresponds to the first query curve (shown by the dotted line in Figure 4), which forms the upper half of the query range; therefore, a trace of the first result matrix greater than 0 (the second threshold) indicates that the object is located below the first query curve; likewise, a trace of the second result matrix less than 0 (the second threshold) indicates that the object is located above the second query curve.
  • the object is determined as the query result.
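  • The threshold test can be sketched as below. The thresholds 0.1 and 0 and the 0.001 scale come from the text; everything else is an assumption made for illustration: the function names, the two example curves, and in particular the plaintext trace model (scaled spatial term plus unscaled keyword term), which stands in for the encrypted computation tr(C_i R) that is not reproduced here.

```python
def poly_eval(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_n*x**n."""
    return sum(a * x ** r for r, a in enumerate(coeffs))

def modelled_trace(coeffs, x, y, query_bits, object_bits, scale=0.001):
    # Assumed plaintext model of the trace of a result matrix.
    spatial = poly_eval(coeffs, x) - y
    keyword = sum(q * k for q, k in zip(query_bits, object_bits)) - sum(query_bits)
    return scale * spatial + keyword

def satisfies_preset_conditions(t1, t2, first_threshold=0.1, second_threshold=0.0):
    keywords_ok = abs(t1) < first_threshold and abs(t2) < first_threshold
    in_range = t1 > second_threshold and t2 < second_threshold
    return keywords_ok and in_range

upper = [4, 0, -1]     # hypothetical first query curve  y = 4 - x^2
lower = [-4, 0, 1]     # hypothetical second query curve y = x^2 - 4
query = [1, 0, 0, 1]   # query keyword vector from the example above

def matches(x, y, object_bits):
    t1 = modelled_trace(upper, x, y, query, object_bits)
    t2 = modelled_trace(lower, x, y, query, object_bits)
    return satisfies_preset_conditions(t1, t2)

print(matches(0, 1, [1, 1, 0, 1]))  # True: inside the range, all query keywords present
print(matches(0, 5, [1, 1, 0, 1]))  # False: above the first query curve
print(matches(0, 1, [0, 1, 1, 0]))  # False: inside the range, query keywords missing
```

  The model only separates the two conditions cleanly while the scaled spatial term stays below the first threshold in absolute value, which is the role the small positive constant plays in the text.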
  • FIG. 5 exemplarily shows a schematic structural diagram of an application scenario, as shown in FIG. 5 , including a data owner 510, a cloud server 520, and a client 530;
  • the data owner 510 determines at least one random invertible square matrix, the third lower triangular matrix, the fourth lower triangular matrix, the third random lower triangular matrix and the fourth random lower triangular matrix, and uses them as key components; the data owner 510 sends the inverse matrix of the at least one random invertible square matrix, the third lower triangular matrix and the fourth lower triangular matrix to the client 530;
  • the data owner 510 determines the second lower triangular matrix for each object in the spatial text data set, encrypts the second lower triangular matrix of each object based on the at least one random invertible square matrix, the third random lower triangular matrix and the fourth random lower triangular matrix to obtain the index of each object, and sends the indexes to the cloud server 520;
  • the client 530 determines the first lower triangular matrices of the first query curve and the second query curve based on the query range and the query keyword set of the query request; it encrypts these first lower triangular matrices according to the inverse matrix of the at least one random invertible square matrix, the third lower triangular matrix and the fourth lower triangular matrix sent by the data owner 510, obtains the query sub-trapdoors of the first query curve and the second query curve, and sends the query sub-trapdoors to the cloud server 520.
  • the cloud server 520 determines the objects satisfying the preset conditions as the query result and returns the query result to the client 530.
  • FIG. 6 exemplarily shows a schematic structural diagram of a spatial text query device provided by an embodiment of the present invention, and the device can execute a flow of a spatial text query method.
  • the device specifically includes:
  • An acquisition module 610 configured to acquire a query request; the query request includes a query range and a query keyword set; the query range is a closed area formed by a query curve;
  • a processing module 620 configured to encode the set of query keywords to obtain a query keyword vector;
  • for any query curve, perform polynomial fitting on the query curve and take the coefficients of each power term of the fitted polynomial as a query range vector; obtain the first lower triangular matrix based on the query range vector and the query keyword vector; and encrypt the first lower triangular matrix with the first encryption matrix to obtain the query sub-trapdoor;
  • based on each query sub-trapdoor and the index of each object in the spatial text dataset, determine the objects that meet the preset condition as the query result; wherein the index of any object is obtained by encrypting, with the second encryption matrix, the second lower triangular matrix derived from the object's spatial position and keyword set.
  • processing module 620 is specifically configured to:
  • the keyword dictionary is obtained by taking the union of the keyword sets of each object in the space text data set;
  • the jth dimension element of the first vector is assigned a value of 0;
  • the first vector after the assignment of each dimension element is determined as the query key vector.
  • processing module 620 is specifically configured to:
  • for any object, the index space vector is determined from the n+1 latitude terms obtained by processing the object's latitude value n+1 times, together with the object's longitude value; the keyword set of the object is encoded to obtain the index keyword vector;
  • processing module 620 is specifically configured to:
  • processing module 620 is specifically configured to:
  • processing module 620 is specifically configured to:
  • the second lower triangular matrix obtained according to the spatial position of the object and the keyword set;
  • the index of the object is obtained by encrypting the second lower triangular matrix according to the at least one random invertible square matrix, the third random lower triangular matrix and the fourth random lower triangular matrix.
  • the query curve includes a first query curve and a second query curve; each of the query sub-trapdoors includes a first query sub-trapdoor and a second query sub-trapdoor; the processing module 620 is specifically configured to:
  • the embodiment of the present invention also provides a computer device, including:
  • the processor is used to call the program instructions stored in the memory, and execute the above-mentioned spatial text query method according to the obtained program.
  • the embodiment of the present invention also provides a computer-readable storage medium, the computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are used to make the computer execute the above-mentioned spatial text query method.
  • the embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, and the instruction means implements the function specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

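The present invention discloses a spatial text query method and apparatus, including: acquiring a query request, where the query request includes a query range and a query keyword set, and the query range is a closed region formed by query curves; encoding the query keyword set to obtain a query keyword vector; performing polynomial fitting on the query curve to determine a query range vector; obtaining a first lower triangular matrix based on the query range vector and the query keyword vector; encrypting the first lower triangular matrix with a first encryption matrix to obtain a query sub-trapdoor; and determining, based on each query sub-trapdoor and the index of each object in a spatial text dataset, the objects that satisfy a preset condition as the query result, where the index of any object is obtained by encrypting, with a second encryption matrix, a second lower triangular matrix derived from the object's spatial position and keyword set. The query range the user actually needs is thereby matched, and query accuracy is improved.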

Description

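Spatial text query method and apparatus
Cross-reference to related applications
This application claims priority to Chinese patent application No. 202111210427.0, entitled "Spatial text query method and apparatus" and filed with the China Patent Office on October 18, 2021, the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of financial technology (Fintech), and in particular to a spatial text query method and apparatus.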
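Background
With the development of computer technology, more and more technologies (for example, blockchain, cloud computing and big data) are being applied in the financial field, and the traditional financial industry is gradually shifting toward Fintech. Big data technology is no exception, but the security and real-time requirements of the financial and payment industries also place higher demands on spatial keyword queries in big data technology.
In the prior art, when a user queries objects within a certain range, the query range that can be selected generally has a preset extent and shape. Specifically, a preset encoding algorithm converts spatial geographic coordinates into Gray codes. FIG. 1 is an exemplary schematic diagram of Gray codes for spatial geographic coordinates. As shown in FIG. 1, each cell (such as "0011") represents a region, and the Gray-code grid in FIG. 1 records the objects in the spatial text dataset (for example, object P is located in region "0011").
Based on these Gray codes, the user specifies the query range in units of cells, for example the region represented by "0011, 0010, 0111, 0110"; the query result is then determined within that query range according to the text keywords of the query request.
However, the prior art cannot query a range of arbitrary shape and therefore cannot match the query range the user actually needs. As a result, the query result includes objects the user does not need, which reduces query accuracy.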
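Summary of the invention
Embodiments of the present invention provide a spatial text query method and apparatus that support query ranges of arbitrary shape, match the query range the user actually needs, and improve both the accuracy of the query and the precision of the determined query result.
In a first aspect, an embodiment of the present invention provides a spatial text query method, including:
acquiring a query request, where the query request includes a query range and a query keyword set, and the query range is a closed region formed by query curves;
encoding the query keyword set to obtain a query keyword vector;
for any query curve, performing polynomial fitting on the query curve and taking the coefficients of each power term of the fitted polynomial as a query range vector; obtaining a first lower triangular matrix based on the query range vector and the query keyword vector; and encrypting the first lower triangular matrix with a first encryption matrix to obtain a query sub-trapdoor;
determining, based on each query sub-trapdoor and the index of each object in a spatial text dataset, the objects that satisfy a preset condition as the query result, where the index of any object is obtained by encrypting, with a second encryption matrix, a second lower triangular matrix derived from the object's spatial position and keyword set.
In the above technical solution, the query range is a closed region formed by query curves; that is, the query range in the present invention can be of any shape. Determining a query keyword vector enables multi-dimensional keyword queries and improves the accuracy of keyword matching. The fitted curves corresponding to the query curves are then used to identify the objects that satisfy the preset condition, which amounts to identifying the objects that correspond to the query keyword set and lie within the query range. The query result therefore falls within the query range the user actually needs, which improves query accuracy. In addition, the information of the query request and of the objects is encrypted with encryption matrices, which ensures query security.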
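Optionally, encoding the query keyword set to obtain a query keyword vector includes:
creating a first vector based on the number m of keywords in a keyword dictionary, where the keyword dictionary is the union of the keyword sets of all objects in the spatial text dataset;
if the j-th keyword in the keyword dictionary is recorded in the query keyword set, setting the j-th element of the first vector to 1;
if the j-th keyword in the keyword dictionary is not recorded in the query keyword set, setting the j-th element of the first vector to 0;
taking the first vector, after every element has been assigned, as the query keyword vector.
In the above technical solution, representing the query keywords as a vector enables multi-dimensional keyword queries and improves query efficiency and accuracy.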
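Optionally, obtaining a first lower triangular matrix based on the query range vector and the query keyword vector includes:
determining a first random lower triangular matrix based on the degree n of the highest power term of the polynomial and the number m of keywords in the keyword dictionary;
assigning values to the diagonal of the first random lower triangular matrix according to the elements of the query range vector and the query keyword vector, to obtain the first lower triangular matrix;
the second lower triangular matrix is obtained as follows:
determining a second random lower triangular matrix based on the degree n of the highest power term of the polynomial and the number m of keywords in the keyword dictionary;
for any object, determining an index space vector from the n+1 latitude terms obtained by processing the object's latitude value n+1 times, together with the object's longitude value; and encoding the object's keyword set to obtain an index keyword vector;
assigning values to the diagonal of the second random lower triangular matrix according to the elements of the index space vector and the index keyword vector, to obtain the second lower triangular matrix.
In the above technical solution, the first lower triangular matrix represents the query range and query keyword set of the query request, and the second lower triangular matrix represents the spatial position and keyword set of an object. The objects that correspond to the query keyword set and lie within the query range can thus be identified from the first and second lower triangular matrices, so the query result is determined within the query range the user actually needs, which improves query accuracy.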
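Optionally, assigning values to the diagonal of the first random lower triangular matrix according to the elements of the query range vector and the query keyword vector includes:
assigning the coefficient of the r-th power term in the query range vector to the element in row r+1, column r+1 of the first random lower triangular matrix, where 0 ≤ r ≤ n;
assigning -1 to the element in row n+2, column n+2 of the first random lower triangular matrix;
assigning the j-th element of the query keyword vector to the element in row n+2+j, column n+2+j of the first random lower triangular matrix;
assigning the number of keywords in the query keyword set to the element in the last row and last column of the first random lower triangular matrix.
In the above technical solution, the query range and the query keyword set of the query request are expressed in a single matrix, which reduces the amount of computation when determining the query result and improves query efficiency.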
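Optionally, assigning values to the diagonal of the second random lower triangular matrix according to the elements of the index space vector and the index keyword vector includes:
assigning the latitude value of the object processed r times to the element in row r+1, column r+1 of the second random lower triangular matrix, where 0 ≤ r ≤ n;
assigning the longitude value of the object to the element in row n+2, column n+2 of the second random lower triangular matrix;
assigning the j-th element of the index keyword vector to the element in row n+2+j, column n+2+j of the second random lower triangular matrix (the source text gives n+3+j here, but for an (n+m+3)×(n+m+3) matrix only n+2+j leaves the last diagonal position free and matches the worked example in the detailed description);
assigning -1 to the element in the last row and last column of the second random lower triangular matrix.
Expressing the spatial position and keyword set of an object in a single matrix avoids repeated computation for each object when determining the query result, which reduces the amount of computation and improves query efficiency.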
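Optionally, encrypting the first lower triangular matrix with a first encryption matrix to obtain a query sub-trapdoor includes:
encrypting the first lower triangular matrix according to the inverse of at least one random invertible square matrix, a third lower triangular matrix whose diagonal takes set values, and a fourth lower triangular matrix whose diagonal values are 1, to obtain the query sub-trapdoor; where the set values are a preset positive real number for the entries from row 1, column 1 through row n+2, column n+2, and 1 for the entries from row n+3, column n+3 through row n+m+3, column n+m+3;
the index of any object being obtained by encrypting, with a second encryption matrix, a second lower triangular matrix derived from the object's spatial position and keyword set includes:
obtaining the second lower triangular matrix from the object's spatial position and keyword set;
encrypting the second lower triangular matrix according to the at least one random invertible square matrix, a third random lower triangular matrix and a fourth random lower triangular matrix, to obtain the index of the object.
In the above technical solution, the information of the query request and of the objects is encrypted with random matrices, which ensures query security.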
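Optionally, the query curves include a first query curve and a second query curve; the query sub-trapdoors include a first query sub-trapdoor and a second query sub-trapdoor; and determining, based on the query sub-trapdoors and the index of each object in the spatial text dataset, the objects that satisfy the preset condition as the query result includes:
for any object, determining a first result matrix based on the object's index and the first query sub-trapdoor, and a second result matrix based on the object's index and the second query sub-trapdoor;
computing the trace of the first result matrix and the trace of the second result matrix, and determining from the two traces whether the object satisfies the preset condition.
Optionally, the preset condition includes:
the absolute values of the trace of the first result matrix and of the trace of the second result matrix are both less than a first threshold; the trace of the first result matrix is greater than a second threshold; and the trace of the second result matrix is less than the second threshold;
the first threshold is used to identify objects that match the query keyword set;
the second threshold is used to identify objects that fall within the query range.
In the above technical solution, the first threshold ensures that an object corresponds to the query keyword set, that is, the object's keyword set contains or equals the query keyword set; the second threshold ensures that the object lies within the query range. The query result is thus determined on the basis of thresholds, which improves the accuracy of the query and the precision of the determined result.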
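In a second aspect, an embodiment of the present invention provides a spatial text query apparatus, including:
an acquisition module, configured to acquire a query request, where the query request includes a query range and a query keyword set, and the query range is a closed region formed by query curves;
a processing module, configured to encode the query keyword set to obtain a query keyword vector;
for any query curve, perform polynomial fitting on the query curve and take the coefficients of each power term of the fitted polynomial as a query range vector; obtain a first lower triangular matrix based on the query range vector and the query keyword vector; and encrypt the first lower triangular matrix with a first encryption matrix to obtain a query sub-trapdoor;
determine, based on each query sub-trapdoor and the index of each object in a spatial text dataset, the objects that satisfy a preset condition as the query result, where the index of any object is obtained by encrypting, with a second encryption matrix, a second lower triangular matrix derived from the object's spatial position and keyword set.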
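Optionally, the processing module is specifically configured to:
create a first vector based on the number m of keywords in a keyword dictionary, where the keyword dictionary is the union of the keyword sets of all objects in the spatial text dataset;
if the j-th keyword in the keyword dictionary is recorded in the query keyword set, set the j-th element of the first vector to 1;
if the j-th keyword in the keyword dictionary is not recorded in the query keyword set, set the j-th element of the first vector to 0;
take the first vector, after every element has been assigned, as the query keyword vector.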
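Optionally, the processing module is specifically configured to:
determine a first random lower triangular matrix based on the degree n of the highest power term of the polynomial and the number m of keywords in the keyword dictionary;
assign values to the diagonal of the first random lower triangular matrix according to the elements of the query range vector and the query keyword vector, to obtain the first lower triangular matrix;
determine a second random lower triangular matrix based on the degree n of the highest power term of the polynomial and the number m of keywords in the keyword dictionary;
for any object, determine an index space vector from the n+1 latitude terms obtained by processing the object's latitude value n+1 times, together with the object's longitude value; and encode the object's keyword set to obtain an index keyword vector;
assign values to the diagonal of the second random lower triangular matrix according to the elements of the index space vector and the index keyword vector, to obtain the second lower triangular matrix.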
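Optionally, the processing module is specifically configured to:
assign the coefficient of the r-th power term in the query range vector to the element in row r+1, column r+1 of the first random lower triangular matrix, where 0 ≤ r ≤ n;
assign -1 to the element in row n+2, column n+2 of the first random lower triangular matrix;
assign the j-th element of the query keyword vector to the element in row n+2+j, column n+2+j of the first random lower triangular matrix;
assign the number of keywords in the query keyword set to the element in the last row and last column of the first random lower triangular matrix.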
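Optionally, the processing module is specifically configured to:
assign the latitude value of the object processed r times to the element in row r+1, column r+1 of the second random lower triangular matrix, where 0 ≤ r ≤ n;
assign the longitude value of the object to the element in row n+2, column n+2 of the second random lower triangular matrix;
assign the j-th element of the index keyword vector to the element in row n+2+j, column n+2+j of the second random lower triangular matrix;
assign -1 to the element in the last row and last column of the second random lower triangular matrix.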
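Optionally, the processing module is specifically configured to:
encrypt the first lower triangular matrix according to the inverse of at least one random invertible square matrix, a third lower triangular matrix whose diagonal takes set values, and a fourth lower triangular matrix whose diagonal values are 1, to obtain the query sub-trapdoor; where the set values are a preset positive real number for the entries from row 1, column 1 through row n+2, column n+2, and 1 for the entries from row n+3, column n+3 through row n+m+3, column n+m+3;
obtain the second lower triangular matrix from the object's spatial position and keyword set;
encrypt the second lower triangular matrix according to the at least one random invertible square matrix, a third random lower triangular matrix and a fourth random lower triangular matrix, to obtain the index of the object.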
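Optionally, the query curves include a first query curve and a second query curve; the query sub-trapdoors include a first query sub-trapdoor and a second query sub-trapdoor; and the processing module is specifically configured to:
for any object, determine a first result matrix based on the object's index and the first query sub-trapdoor, and a second result matrix based on the object's index and the second query sub-trapdoor;
compute the trace of the first result matrix and the trace of the second result matrix, and determine from the two traces whether the object satisfies the preset condition.
In a third aspect, an embodiment of the present invention further provides a computer device, including:
a memory, configured to store program instructions;
a processor, configured to call the program instructions stored in the memory and to execute the above spatial text query method according to the obtained program.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions, where the computer-executable instructions are used to cause a computer to execute the above spatial text query method.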
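Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of Gray codes for spatial geographic coordinates provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a system architecture provided by an embodiment of the present invention;
FIG. 3 is a schematic flowchart of a spatial text query method provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a query range provided by an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an application scenario provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a spatial text query apparatus provided by an embodiment of the present invention.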
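Detailed description
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
In the prior art, the method of determining a query result based on a query range includes the following four stages:
S1. System initialization;
The data owner generates a master key msk.
S2. Data encryption;
The data owner encodes the objects in the spatial text dataset according to a preset encoding scheme to obtain object information vectors. For example, given a spatial text dataset DB = {D1, D2, ..., Dz}, the i-th object is Di = {Dp, Dq}, where Dp is the spatial coordinate of object Di and Dq is its keyword set. A preset encoding algorithm converts the spatial geographic coordinates into the Gray codes shown in FIG. 1, and the spatial coordinate of object Di is then encoded as a Gray code; as shown in FIG. 1, object P is encoded as "0011". The keyword set of object Di is then encoded into a bitmap using bitmap encoding. Finally, the object information vector is generated from the object's Gray code and bitmap;
The object information vector is encrypted with the master key msk.
S3. Trapdoor generation;
After a query request is acquired, the query range of the request is encoded with the preset encoding algorithm to obtain a query code; the query keyword set of the request is encoded with the bitmap encoding algorithm to obtain a query bitmap; and a query vector is determined from the query code and the query bitmap;
The query vector is encrypted with the master key msk.
S4. Query;
The objects that lie within the query range and correspond to the query keyword set are determined from the query vector and the object information vectors.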
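However, in the above query method, the preset encoding algorithm can only define the query range as a rectangle: as shown in FIG. 1, the rectangular query range is built from cells (such as "0011") as the basic unit. A query range of arbitrary shape cannot be expressed, so the query range the user actually needs cannot be matched. For example, suppose a user wants to find out how many banks there are in a certain city. Because the area of most cities is irregular, the rectangular query range under the prior-art scheme must enclose the city, and in doing so it also covers other areas, such as other cities. The banks within the rectangular range are therefore not limited to the target city, so the query result includes objects the user does not need, which lowers both the accuracy of the query and the precision of the result.
In another query mode, the query request includes a query location and query keywords. For example, if user A issues a query request at a specific location, that location is the query location of the request; it is generally a latitude-longitude coordinate, that is, a query point.
For a query-point query, the minimum spatial distance between the query point and each minimum bounding rectangle in an index tree, built in advance from the plaintext spatial text data, must be determined. The keyword similarity between each spatial text and the query request is then determined from the keywords of the spatial texts inside each minimum bounding rectangle and the keywords of the query request. Finally, a query value is computed from preset weights, the spatial distance and the keyword similarity, and the query result is chosen according to the query values; for example, the spatial text with the largest query value (a leaf node of the index tree) is taken as the query result. The index tree is built by the data owner from the plaintext spatial texts, and a minimum bounding rectangle is the spatial extent of a non-leaf node.
However, the problem with this method is that the query value depends on the preset weights. For example, when the weight preset for spatial distance is small and the weight preset for keyword similarity is large, the query result tends to contain nodes whose keywords are similar to those of the query request. The determined query result therefore leaks the ranking of the relevance values of the spatial text data; an attacker can infer information about each object from that ranking and may even deduce the daily habits and preferences of the querying user, which is a security risk.
Therefore, a spatial text query method is urgently needed that matches the query range the user actually needs, so that the query result lies within that range, and that improves the accuracy of the query, the precision of the result, and query security.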
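FIG. 2 exemplarily shows a system architecture to which embodiments of the present invention are applicable. The system architecture includes a server 200, which may include a processor 210, a communication interface 220 and a memory 230.
The communication interface 220 is used to receive query requests and send query results.
The processor 210 is the control center of the server 200. It connects the various parts of the server 200 through various interfaces and lines, and performs the functions of the server 200 and processes data by running or executing the software programs and/or modules stored in the memory 230 and calling the data stored in the memory 230. Optionally, the processor 210 may include one or more processing units.
The memory 230 can be used to store software programs and modules; the processor 210 performs various functional applications and data processing by running the software programs and modules stored in the memory 230. The memory 230 may mainly include a program storage area and a data storage area: the program storage area can store the operating system and the application programs required by at least one function, and the data storage area can store data created during service processing. In addition, the memory 230 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
It should be noted that the structure shown in FIG. 2 above is only an example, and the embodiments of the present invention are not limited to it.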
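Based on the above description, FIG. 3 exemplarily shows a schematic flowchart of a spatial text query method provided by an embodiment of the present invention. The flow can be executed by a spatial text query apparatus.
As shown in FIG. 3, the flow specifically includes:
Step 310: acquire a query request.
In this embodiment of the present invention, the query request includes a query range and a query keyword set. The query range is a closed region formed by query curves; for example, the query range is an elliptical region formed by two curves, and the query keyword set includes the query keywords "Sichuan" and "hot pot".
Step 320: encode the query keyword set to obtain a query keyword vector.
In this embodiment, the query keyword vector is obtained by checking whether each keyword recorded in the keyword dictionary is present in the query keyword set.
Step 330: for any query curve, perform polynomial fitting on the query curve and take the coefficients of each power term of the fitted polynomial as the query range vector.
In this embodiment, fitting the query curve to a polynomial allows the query curve to be represented in a matrix so that it can take part in the computation.
Step 340: based on each query sub-trapdoor and the index of each object in the spatial text dataset, determine the objects that satisfy the preset condition as the query result.
In this embodiment, the preset condition involves a first threshold and a second threshold: the first threshold is used to identify objects that match the query keyword set, and the second threshold is used to identify objects that fall within the query range.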
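In step 310, the query range can be a region of any shape, and it can be a closed region formed by multiple query curves; the number of query curves is not limited here. FIG. 4 is an exemplary schematic diagram of a query range provided by an embodiment of the present invention. As shown in FIG. 4, the query range is a closed region formed by two query curves, namely the first query curve θ1 (dotted line in FIG. 4) and the second query curve θ2 (solid line in FIG. 4). A query curve can be a curve of any shape, which is not specifically limited here.
In step 320, a first vector is created and each of its elements is assigned according to the keyword dictionary and the query keyword set, which yields the query keyword vector.
Specifically, a first vector is created based on the number m of keywords in the keyword dictionary; if the j-th keyword in the keyword dictionary is recorded in the query keyword set, the j-th element of the first vector is set to 1; if the j-th keyword is not recorded in the query keyword set, the j-th element is set to 0; and the first vector, after every element has been assigned, is taken as the query keyword vector.
The keyword dictionary is the union of the keyword sets of all objects in the spatial text dataset. For example, a spatial text dataset D contains the information (spatial position and keyword set) of multiple objects. Taking object 1 as an example, its spatial position is (x1, y1) and its keyword set includes "Sichuan", "spicy" and "hot pot", where x1 is the latitude of object 1 and y1 is its longitude.
The keyword dictionary W is then determined from the keyword sets of all objects in dataset D. For example, suppose D contains two objects: the keyword set of object 1 includes "Sichuan", "spicy" and "hot pot", and the keyword set of object 2 includes "spicy" and "barbecue". The keyword dictionary W is the union of the two keyword sets, namely "Sichuan", "spicy", "barbecue" and "hot pot"; that is, m = 4.
Continuing this example, with m = 4 the first vector is {m1, m2, m3, m4}. Suppose the query keyword set includes "Sichuan" and "hot pot". For j = 1, the first keyword in dictionary W is "Sichuan", which is present in the query keyword set, so the first element (m1) is set to 1. Likewise, the second element (m2) is set to 0, the third (m3) to 0, and the fourth (m4) to 1, so the query keyword vector is {1, 0, 0, 1}.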
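The keyword encoding of step 320 can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: the function name `encode_keywords` is hypothetical, the English strings stand in for the example keywords, and the dictionary order is simply the order in which the example lists the keywords.

```python
def encode_keywords(dictionary, keywords):
    """Return a 0/1 vector with one element per dictionary keyword."""
    wanted = set(keywords)
    return [1 if word in wanted else 0 for word in dictionary]


# Keyword dictionary W from the example (m = 4), in the order given in the text.
W = ["Sichuan", "spicy", "barbecue", "hot pot"]

# Query keyword set from the example.
query_vector = encode_keywords(W, ["Sichuan", "hot pot"])
print(query_vector)  # [1, 0, 0, 1]
```

The same function can encode an object's keyword set into its index keyword vector, since both vectors are defined over the same dictionary.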
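In step 330, when polynomial fitting is performed on the query curve, a higher fitting degree gives higher fitting accuracy at the cost of more computation. Let n denote the degree of the fitted polynomial: the larger n is, the more accurate the fit, and at n = 10 the fitting accuracy exceeds 99%. This embodiment uses n = 10 as an example, but is not limited to n = 10.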
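As a rough illustration of step 330, the sketch below fits a polynomial to sampled curve points by ordinary least squares and returns the coefficients from the constant term upward, which is the order the query range vector uses. The text does not say which fitting procedure is used, so least squares via the normal equations is an assumption here, as are the function name `polyfit` and the sample curve. Degree 2 is used so the result is easy to check; the normal equations become ill-conditioned at higher degrees such as n = 10, where a numerically robust library routine would be preferable.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns [a_0, ..., a_degree]."""
    size = degree + 1
    # Normal equations A c = b with A[r][c] = sum(x^(r+c)), b[r] = sum(y * x^r).
    A = [[sum(x ** (r + c) for x in xs) for c in range(size)] for r in range(size)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(size)]
    # Forward elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, size):
            factor = A[row][col] / A[col][col]
            for k in range(col, size):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * size
    for row in range(size - 1, -1, -1):
        tail = sum(A[row][k] * coeffs[k] for k in range(row + 1, size))
        coeffs[row] = (b[row] - tail) / A[row][row]
    return coeffs


# Hypothetical sample points taken from the curve y = 1 + 2x - 0.5x^2.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [1.0 + 2.0 * x - 0.5 * x * x for x in xs]
query_range_vector = polyfit(xs, ys, 2)  # approximately [1.0, 2.0, -0.5]
```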
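Based on step 310 above, polynomial fitting of the first query curve θ1 and the second query curve θ2 yields the corresponding first fitted curve
Figure PCTCN2021135363-appb-000001
and second fitted curve
Figure PCTCN2021135363-appb-000002
Figure PCTCN2021135363-appb-000003
Figure PCTCN2021135363-appb-000004
After the fitted curves are obtained, the query range vector is determined from the polynomial of each fitted curve, that is, from the coefficients of its power terms from degree 0 to degree 10. For example, the query range vector of the fitted curve
Figure PCTCN2021135363-appb-000005
is {a0, a1, ..., a10}, and the query range vector of the fitted curve
Figure PCTCN2021135363-appb-000006
is {b0, b1, ..., b10}.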
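For any fitted curve, the constructed random lower triangular matrix is assigned values based on the query range vector and the query keyword vector, which yields a first lower triangular matrix representing the query range information of the fitted curve and the query keyword set. Note that all matrices in the embodiments of the present invention are square matrices.
Specifically, a first random lower triangular matrix is determined based on the degree n of the highest power term of the polynomial and the number m of keywords in the keyword dictionary, and its diagonal is assigned values according to the elements of the query range vector and the query keyword vector to obtain the first lower triangular matrix. A lower triangular matrix is a matrix in which all elements above the diagonal are 0.
For example, randomly generate an (n+m+3)×(n+m+3) lower triangular matrix E1 (the first random lower triangular matrix),
Figure PCTCN2021135363-appb-000007
and then assign values to the diagonal of the first random lower triangular matrix according to the elements of the query range vector and the query keyword vector.
Further, the coefficient of the r-th power term in the query range vector is assigned to the element in row r+1, column r+1 of the first random lower triangular matrix, where 0 ≤ r ≤ n; -1 is assigned to the element in row n+2, column n+2; the j-th element of the query keyword vector is assigned to the element in row n+2+j, column n+2+j; and the number of keywords in the query keyword set is assigned to the element in the last row and last column of the first random lower triangular matrix.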
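Taking the first fitted curve
Figure PCTCN2021135363-appb-000008
as an example, when r = 0 the coefficient of the 0-th power term in the query range vector is a0, and a0 is assigned to the element in row r+1, column r+1 (that is, row 1, column 1) of the first random lower triangular matrix E1
Figure PCTCN2021135363-appb-000009
Similarly, each remaining element of the query range vector is assigned to its corresponding diagonal position in the first random lower triangular matrix E1;
-1 is assigned to the element in row n+2, column n+2 of the first random lower triangular matrix E1
Figure PCTCN2021135363-appb-000010
The j-th element of the query keyword vector is assigned to the element in row n+2+j, column n+2+j of the first random lower triangular matrix; for example, when j = 1 the j-th element of the query keyword vector is 1, so "1" is assigned to the element in row n+2+j, column n+2+j of the first random lower triangular matrix E1;
The number of keywords in the query keyword set (2 in the example above) is assigned to the element in the last row and last column of the first random lower triangular matrix E1
Figure PCTCN2021135363-appb-000011
In summary, the first lower triangular matrix of the first fitted curve
Figure PCTCN2021135363-appb-000012
is
Figure PCTCN2021135363-appb-000013
Similarly, the first lower triangular matrix of the second fitted curve
Figure PCTCN2021135363-appb-000014
is
Figure PCTCN2021135363-appb-000015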
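The diagonal layout of the first lower triangular matrix described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the uniform range of the random off-diagonal entries, and the use of `sum(keyword_vector)` as the number of query keywords (valid only when every query keyword appears in the dictionary) are not specified by the text.

```python
import random


def first_lower_triangular(range_vector, keyword_vector, rng=None):
    """Build the (n+m+3) x (n+m+3) first lower triangular matrix."""
    if rng is None:
        rng = random.Random()
    n = len(range_vector) - 1   # degree of the fitted polynomial
    m = len(keyword_vector)     # number of keywords in the dictionary
    size = n + m + 3
    # Random entries on and below the diagonal, zeros above it.
    matrix = [[rng.uniform(-1.0, 1.0) if col <= row else 0.0
               for col in range(size)] for row in range(size)]
    diagonal = (list(range_vector)        # rows 1 .. n+1: coefficients a_0 .. a_n
                + [-1]                    # row n+2
                + list(keyword_vector)    # rows n+3 .. n+m+2: keyword bits
                + [sum(keyword_vector)])  # last row: number of query keywords
    for k, value in enumerate(diagonal):
        matrix[k][k] = value
    return matrix


# Hypothetical degree-2 range vector with the example query keyword vector.
Q = first_lower_triangular([1.0, 2.0, -0.5], [1, 0, 0, 1], random.Random(0))
```

With n = 2 and m = 4 the matrix is 9×9 and its diagonal reads 1.0, 2.0, -0.5, -1, 1, 0, 0, 1, 2.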
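Based on the above description, for any object in the spatial text dataset, a second lower triangular matrix representing the object's spatial position and keyword set is stored. Specifically, the second lower triangular matrix is obtained as follows: a second random lower triangular matrix is determined based on the degree n of the highest power term of the polynomial and the number m of keywords in the keyword dictionary; for any object, an index space vector is determined from the n+1 latitude terms obtained by processing the object's latitude value n+1 times, together with the object's longitude value; the object's keyword set is encoded to obtain an index keyword vector; and the diagonal of the second random lower triangular matrix is assigned values according to the elements of the index space vector and the index keyword vector to obtain the second lower triangular matrix.
For example, if the keyword set of the i-th object o_i includes the keywords "spicy" and "barbecue", then, following the method used above for the query keyword vector, the index keyword vector of the object is {0, 1, 1, 0}.
Suppose the spatial position of object o_i is (x_i, y_i), where x_i is the latitude of object o_i and y_i is its longitude; the index space vector is then
Figure PCTCN2021135363-appb-000016
After the index space vector and index keyword vector of object o_i are obtained, the diagonal of the second random lower triangular matrix is assigned values. Specifically, the latitude value of the object processed r times is assigned to the element in row r+1, column r+1 of the second random lower triangular matrix, where 0 ≤ r ≤ n; the longitude value of the object is assigned to the element in row n+2, column n+2; the j-th element of the index keyword vector is assigned to the element in row n+2+j, column n+2+j; and -1 is assigned to the element in the last row and last column of the second random lower triangular matrix.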
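Following the example used for the first lower triangular matrix, randomly generate an (n+m+3)×(n+m+3) lower triangular matrix F1 (the second random lower triangular matrix),
Figure PCTCN2021135363-appb-000017
and then assign values to the diagonal of the second random lower triangular matrix according to the elements of the index space vector and index keyword vector of object o_i.
When r = 0, the 0-th latitude term in the index space vector is
Figure PCTCN2021135363-appb-000018
Figure PCTCN2021135363-appb-000019
is assigned to the element in row 1, column 1 of the second random lower triangular matrix F1
Figure PCTCN2021135363-appb-000020
Similarly, each remaining element of the index space vector is assigned to its corresponding diagonal position in the second random lower triangular matrix F1;
y_i is assigned to the element in row n+2, column n+2 of the second random lower triangular matrix F1
Figure PCTCN2021135363-appb-000021
The j-th element of the index keyword vector is assigned to the element in row n+2+j, column n+2+j of the second random lower triangular matrix; for example, when j = 1 the j-th element of the index keyword vector is 0, so "0" is assigned to the element in row n+2+j, column n+2+j of the second random lower triangular matrix F1
Figure PCTCN2021135363-appb-000022
-1 is assigned to the element in the last row and last column of the second random lower triangular matrix F1
Figure PCTCN2021135363-appb-000023
In summary, the second lower triangular matrix is
Figure PCTCN2021135363-appb-000024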
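The object-side matrix can be sketched in the same way. Two points here are assumptions rather than statements from the text: "the latitude value processed r times" is read as the r-th power of the latitude (so that it pairs with the coefficient of the r-th power term on the query side), and the keyword bits are placed in rows n+3 to n+m+2 so that they line up with the query matrix. The function name and sample coordinates are hypothetical.

```python
import random


def second_lower_triangular(latitude, longitude, keyword_vector, degree, rng=None):
    """Build the (n+m+3) x (n+m+3) second lower triangular matrix for one object."""
    if rng is None:
        rng = random.Random()
    m = len(keyword_vector)
    size = degree + m + 3
    # Random entries on and below the diagonal, zeros above it.
    matrix = [[rng.uniform(-1.0, 1.0) if col <= row else 0.0
               for col in range(size)] for row in range(size)]
    diagonal = ([latitude ** r for r in range(degree + 1)]  # rows 1 .. n+1: x^0 .. x^n
                + [longitude]                                # row n+2
                + list(keyword_vector)                       # rows n+3 .. n+m+2
                + [-1])                                      # last row
    for k, value in enumerate(diagonal):
        matrix[k][k] = value
    return matrix


# Hypothetical object at latitude 2.0, longitude 3.0 with keywords "spicy" and "barbecue".
P = second_lower_triangular(2.0, 3.0, [0, 1, 1, 0], degree=2, rng=random.Random(1))
```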
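After the first and second lower triangular matrices are obtained, they need to be encrypted to ensure query security.
Specifically, the first lower triangular matrix is encrypted according to the inverse of at least one random invertible square matrix, a third lower triangular matrix whose diagonal takes set values, and a fourth lower triangular matrix whose diagonal values are 1, to obtain the query sub-trapdoor; where the set values are a preset positive real number for the entries from row 1, column 1 through row n+2, column n+2, and 1 for the entries from row n+3, column n+3 through row n+m+3, column n+m+3.
In this embodiment of the present invention, the random invertible square matrices, the third random lower triangular matrix, the third lower triangular matrix, the fourth random lower triangular matrix and the fourth lower triangular matrix are generated in advance and serve as key components for encrypting information.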
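For example, randomly generate two (n+m+3)×(n+m+3) invertible square matrices M1 and M2
Figure PCTCN2021135363-appb-000025
Randomly generate an (n+m+3)×(n+m+3) random lower triangular matrix D1 and take D1 as the third random lower triangular matrix. Starting from D1, set the values from row 1, column 1 through row n+2, column n+2 to 0.001 (the preset positive real number) and the values from row n+3, column n+3 through row n+m+3, column n+m+3 to 1, which gives the third lower triangular matrix D2
Figure PCTCN2021135363-appb-000026
Randomly generate an (n+m+3)×(n+m+3) random lower triangular matrix S1 and take S1 as the fourth random lower triangular matrix. Starting from S1, set the diagonal of S1 to 1, which gives the fourth lower triangular matrix S2
Figure PCTCN2021135363-appb-000027
The inverses of the invertible square matrices M1 and M2, the third lower triangular matrix D2 and the fourth lower triangular matrix S2 are multiplied with the first lower triangular matrix to obtain the query sub-trapdoor; for example, the first query sub-trapdoor of query curve θ1 is
Figure PCTCN2021135363-appb-000028
Figure PCTCN2021135363-appb-000029
and the second query sub-trapdoor of query curve θ2 is
Figure PCTCN2021135363-appb-000030
For the second lower triangular matrix of any object, the second lower triangular matrix is encrypted according to the at least one random invertible square matrix, the third random lower triangular matrix and the fourth random lower triangular matrix to obtain the index of the object; for example, the index of object o_i is
Figure PCTCN2021135363-appb-000031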
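In step 340, for any object, the object's index is multiplied by the first query sub-trapdoor R1 and by the second query sub-trapdoor R2 to determine whether the object satisfies the preset condition.
Specifically, for any object, a first result matrix is determined based on the object's index and the first query sub-trapdoor, and a second result matrix is determined based on the object's index and the second query sub-trapdoor; the traces of the first and second result matrices are computed, and whether the object satisfies the preset condition is determined from the two traces. The trace of a matrix is the sum of the elements on its diagonal.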
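One property that makes the trace a convenient test value is that, for two lower triangular matrices, the trace of their product depends only on their diagonals: it equals the sum of the element-wise products of the two diagonals, regardless of the entries below the diagonal. The small check below illustrates this general linear-algebra fact with arbitrary 3×3 matrices; it is not the patent's encryption procedure, and the helper names and sample values are chosen for illustration only.

```python
def matmul(A, B):
    """Plain square-matrix product."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(A):
    """Sum of the diagonal elements."""
    return sum(A[i][i] for i in range(len(A)))


# Two arbitrary lower triangular matrices.
X = [[2, 0, 0],
     [5, 3, 0],
     [7, 1, 4]]
Y = [[1, 0, 0],
     [9, -2, 0],
     [6, 8, 5]]

product_trace = trace(matmul(X, Y))
diagonal_sum = sum(X[k][k] * Y[k][k] for k in range(3))
print(product_trace, diagonal_sum)  # 16 16
```

The trace is also unchanged by a similarity transform, tr(M·A·M⁻¹) = tr(A), which is the usual reason random invertible matrices can blind such products without changing the value being tested.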
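For example, the first result matrix is C_i R1, and its trace is tr(C_i R1);
Figure PCTCN2021135363-appb-000032
The second result matrix is C_i R2, and its trace is tr(C_i R2);
Figure PCTCN2021135363-appb-000033
Here, K can be the number of query keywords contained in the keyword set of the i-th object; for example, if the keyword set of the i-th object includes "hot pot", "Sichuan" and "spicy" and the query keyword set includes "hot pot" and "Sichuan", then K = 2. K can also be the number of keywords in the keyword set of the i-th object. δ is the number of query keywords in the query keyword set.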
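Further, the preset condition includes:
the absolute values of the trace of the first result matrix and of the trace of the second result matrix are both less than a first threshold; the trace of the first result matrix is greater than a second threshold; and the trace of the second result matrix is less than the second threshold; where the first threshold is used to identify objects that match the query keyword set, and the second threshold is used to identify objects that fall within the query range.
For example, the first threshold is 0.1. In the third lower triangular matrix D2, the values from row n+3, column n+3 through row n+m+3, column n+m+3 are 1, and the elements of the first and second lower triangular matrices in those positions represent keywords;
in the third lower triangular matrix D2, the values from row 1, column 1 through row n+2, column n+2 are set to 0.001 (the preset positive real number), and the elements of the first and second lower triangular matrices in those positions represent the query range.
If K = δ, the trace of the result matrix is small because of the preset positive real number (0.001). Therefore, if the absolute values of the traces of the first and second result matrices are not less than the first threshold, K and δ are not equal and the object's keywords do not correspond to the query keywords, so the object cannot be a query result; otherwise, the object meets the prerequisite for being a query result.
In this embodiment of the present invention, the first result matrix corresponds to the first query sub-trapdoor, and the first query sub-trapdoor corresponds to the first query curve θ1 (dotted line in FIG. 4). Spatially, the first query curve θ1 forms the upper half of the boundary. Therefore, if the trace of the first result matrix is greater than 0 (the second threshold), the object lies below the first query curve θ1; if the trace of the second result matrix is less than 0 (the second threshold), the object lies above the second query curve θ2.
Therefore, for the first and second result matrices of any object, if the absolute values of both traces are less than the first threshold, the trace of the first result matrix is greater than the second threshold, and the trace of the second result matrix is less than the second threshold, then the object's keyword set corresponds to the query keyword set and the object lies within the query range, so the object is determined as a query result.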
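The threshold test can be sketched as below. The function `satisfies_preset_condition` follows the condition stated in the text. The helper `expected_trace` is an assumption: the trace formulas themselves are in figures that are not reproduced here, so the helper reconstructs a plausible plaintext value from the diagonal layout, namely 0.001 × (curve value at the object's latitude minus the object's longitude) plus (K − δ). The two curves and the sample objects are hypothetical.

```python
def expected_trace(coeffs, latitude, longitude, matched, query_count, scale=0.001):
    """Assumed plaintext trace: scale * (f(x) - y) + (K - delta)."""
    curve_value = sum(a * latitude ** r for r, a in enumerate(coeffs))
    return scale * (curve_value - longitude) + (matched - query_count)


def satisfies_preset_condition(trace1, trace2, first_threshold=0.1, second_threshold=0.0):
    """Apply the keyword test (first threshold) and the range test (second threshold)."""
    keywords_match = abs(trace1) < first_threshold and abs(trace2) < first_threshold
    inside_range = trace1 > second_threshold and trace2 < second_threshold
    return keywords_match and inside_range


upper = [4.0, 0.0, -1.0]   # hypothetical upper curve: y = 4 - x^2
lower = [-4.0, 0.0, 1.0]   # hypothetical lower curve: y = x^2 - 4

# Object at (0, 0) containing both query keywords: inside the region.
inside = satisfies_preset_condition(
    expected_trace(upper, 0.0, 0.0, matched=2, query_count=2),
    expected_trace(lower, 0.0, 0.0, matched=2, query_count=2))

# Object at (0, 10): above the upper curve, so the range test fails.
outside = satisfies_preset_condition(
    expected_trace(upper, 0.0, 10.0, matched=2, query_count=2),
    expected_trace(lower, 0.0, 10.0, matched=2, query_count=2))

# Object at (0, 0) containing only one of the two query keywords.
wrong_keywords = satisfies_preset_condition(
    expected_trace(upper, 0.0, 0.0, matched=1, query_count=2),
    expected_trace(lower, 0.0, 0.0, matched=1, query_count=2))
```

Under these assumptions a keyword mismatch shifts the trace by at least 1, far above the first threshold of 0.1, while the spatial term stays small because of the 0.001 factor; this is why a single trace can carry both tests.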
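FIG. 5 exemplarily shows a schematic structural diagram of an application scenario. As shown in FIG. 5, the scenario includes a data owner 510, a cloud server 520 and a client 530;
The data owner 510 generates at least one random invertible square matrix, the third lower triangular matrix, the fourth lower triangular matrix, the third random lower triangular matrix and the fourth random lower triangular matrix, and uses them as key components; the data owner 510 sends the inverse of the at least one random invertible square matrix, the third lower triangular matrix and the fourth lower triangular matrix to the client 530;
The data owner 510 determines the second lower triangular matrix for each object in the spatial text dataset, encrypts each object's second lower triangular matrix based on the at least one random invertible square matrix, the third random lower triangular matrix and the fourth random lower triangular matrix to obtain the index of each object, and sends the indexes to the cloud server 520;
The client 530 determines the first lower triangular matrices of the first query curve and the second query curve based on the query range and query keyword set of the query request, encrypts them with the inverse of the at least one random invertible square matrix, the third lower triangular matrix and the fourth lower triangular matrix sent by the data owner 510 to obtain the query sub-trapdoors of the first and second query curves, and sends these query sub-trapdoors to the cloud server 520.
Based on each query sub-trapdoor and the index of each object in the spatial text dataset, the cloud server 520 determines the objects that satisfy the preset condition as the query result and returns the query result to the client 530.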
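Based on the same technical concept, FIG. 6 exemplarily shows a schematic structural diagram of a spatial text query apparatus provided by an embodiment of the present invention. The apparatus can execute the flow of the spatial text query method.
As shown in FIG. 6, the apparatus specifically includes:
an acquisition module 610, configured to acquire a query request, where the query request includes a query range and a query keyword set, and the query range is a closed region formed by query curves;
a processing module 620, configured to encode the query keyword set to obtain a query keyword vector;
for any query curve, perform polynomial fitting on the query curve and take the coefficients of each power term of the fitted polynomial as a query range vector; obtain a first lower triangular matrix based on the query range vector and the query keyword vector; and encrypt the first lower triangular matrix with a first encryption matrix to obtain a query sub-trapdoor;
determine, based on each query sub-trapdoor and the index of each object in the spatial text dataset, the objects that satisfy a preset condition as the query result, where the index of any object is obtained by encrypting, with a second encryption matrix, a second lower triangular matrix derived from the object's spatial position and keyword set.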
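Optionally, the processing module 620 is specifically configured to:
create a first vector based on the number m of keywords in a keyword dictionary, where the keyword dictionary is the union of the keyword sets of all objects in the spatial text dataset;
if the j-th keyword in the keyword dictionary is recorded in the query keyword set, set the j-th element of the first vector to 1;
if the j-th keyword in the keyword dictionary is not recorded in the query keyword set, set the j-th element of the first vector to 0;
take the first vector, after every element has been assigned, as the query keyword vector.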
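Optionally, the processing module 620 is specifically configured to:
determine a first random lower triangular matrix based on the degree n of the highest power term of the polynomial and the number m of keywords in the keyword dictionary;
assign values to the diagonal of the first random lower triangular matrix according to the elements of the query range vector and the query keyword vector, to obtain the first lower triangular matrix;
determine a second random lower triangular matrix based on the degree n of the highest power term of the polynomial and the number m of keywords in the keyword dictionary;
for any object, determine an index space vector from the n+1 latitude terms obtained by processing the object's latitude value n+1 times, together with the object's longitude value; and encode the object's keyword set to obtain an index keyword vector;
assign values to the diagonal of the second random lower triangular matrix according to the elements of the index space vector and the index keyword vector, to obtain the second lower triangular matrix.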
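Optionally, the processing module 620 is specifically configured to:
assign the coefficient of the r-th power term in the query range vector to the element in row r+1, column r+1 of the first random lower triangular matrix, where 0 ≤ r ≤ n;
assign -1 to the element in row n+2, column n+2 of the first random lower triangular matrix;
assign the j-th element of the query keyword vector to the element in row n+2+j, column n+2+j of the first random lower triangular matrix;
assign the number of keywords in the query keyword set to the element in the last row and last column of the first random lower triangular matrix.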
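Optionally, the processing module 620 is specifically configured to:
assign the latitude value of the object processed r times to the element in row r+1, column r+1 of the second random lower triangular matrix, where 0 ≤ r ≤ n;
assign the longitude value of the object to the element in row n+2, column n+2 of the second random lower triangular matrix;
assign the j-th element of the index keyword vector to the element in row n+2+j, column n+2+j of the second random lower triangular matrix;
assign -1 to the element in the last row and last column of the second random lower triangular matrix.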
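Optionally, the processing module 620 is specifically configured to:
encrypt the first lower triangular matrix according to the inverse of at least one random invertible square matrix, a third lower triangular matrix whose diagonal takes set values, and a fourth lower triangular matrix whose diagonal values are 1, to obtain the query sub-trapdoor; where the set values are a preset positive real number for the entries from row 1, column 1 through row n+2, column n+2, and 1 for the entries from row n+3, column n+3 through row n+m+3, column n+m+3;
obtain the second lower triangular matrix from the object's spatial position and keyword set;
encrypt the second lower triangular matrix according to the at least one random invertible square matrix, a third random lower triangular matrix and a fourth random lower triangular matrix, to obtain the index of the object.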
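Optionally, the query curves include a first query curve and a second query curve; the query sub-trapdoors include a first query sub-trapdoor and a second query sub-trapdoor; and the processing module 620 is specifically configured to:
for any object, determine a first result matrix based on the object's index and the first query sub-trapdoor, and a second result matrix based on the object's index and the second query sub-trapdoor;
compute the trace of the first result matrix and the trace of the second result matrix, and determine from the two traces whether the object satisfies the preset condition.
Based on the same technical concept, an embodiment of the present invention further provides a computer device, including:
a memory, configured to store program instructions;
a processor, configured to call the program instructions stored in the memory and to execute the above spatial text query method according to the obtained program.
Based on the same technical concept, an embodiment of the present invention further provides a computer-readable storage medium storing computer-executable instructions, where the computer-executable instructions are used to cause a computer to execute the above spatial text query method.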
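Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, and the instruction means implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Obviously, those skilled in the art can make various changes and modifications to the present application without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present application and their technical equivalents, the present application is intended to include them as well.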

Claims (10)

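  1. A spatial text query method, characterized by comprising:
    acquiring a query request, where the query request includes a query range and a query keyword set, and the query range is a closed region formed by query curves;
    encoding the query keyword set to obtain a query keyword vector;
    for any query curve, performing polynomial fitting on the query curve and taking the coefficients of each power term of the fitted polynomial as a query range vector; obtaining a first lower triangular matrix based on the query range vector and the query keyword vector; and encrypting the first lower triangular matrix with a first encryption matrix to obtain a query sub-trapdoor;
    determining, based on each query sub-trapdoor and the index of each object in a spatial text dataset, the objects that satisfy a preset condition as the query result, where the index of any object is obtained by encrypting, with a second encryption matrix, a second lower triangular matrix derived from the object's spatial position and keyword set.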
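  2. The method according to claim 1, characterized in that encoding the query keyword set to obtain a query keyword vector comprises:
    creating a first vector based on the number m of keywords in a keyword dictionary, where the keyword dictionary is the union of the keyword sets of all objects in the spatial text dataset;
    if the j-th keyword in the keyword dictionary is recorded in the query keyword set, setting the j-th element of the first vector to 1;
    if the j-th keyword in the keyword dictionary is not recorded in the query keyword set, setting the j-th element of the first vector to 0;
    taking the first vector, after every element has been assigned, as the query keyword vector.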
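  3. The method according to claim 1, characterized in that obtaining a first lower triangular matrix based on the query range vector and the query keyword vector comprises:
    determining a first random lower triangular matrix based on the degree n of the highest power term of the polynomial and the number m of keywords in the keyword dictionary;
    assigning values to the diagonal of the first random lower triangular matrix according to the elements of the query range vector and the query keyword vector, to obtain the first lower triangular matrix;
    the second lower triangular matrix is obtained as follows:
    determining a second random lower triangular matrix based on the degree n of the highest power term of the polynomial and the number m of keywords in the keyword dictionary;
    for any object, determining an index space vector from the n+1 latitude terms obtained by processing the object's latitude value n+1 times, together with the object's longitude value; and encoding the object's keyword set to obtain an index keyword vector;
    assigning values to the diagonal of the second random lower triangular matrix according to the elements of the index space vector and the index keyword vector, to obtain the second lower triangular matrix.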
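  4. The method according to claim 3, characterized in that assigning values to the diagonal of the first random lower triangular matrix according to the elements of the query range vector and the query keyword vector comprises:
    assigning the coefficient of the r-th power term in the query range vector to the element in row r+1, column r+1 of the first random lower triangular matrix, where 0 ≤ r ≤ n;
    assigning -1 to the element in row n+2, column n+2 of the first random lower triangular matrix;
    assigning the j-th element of the query keyword vector to the element in row n+2+j, column n+2+j of the first random lower triangular matrix;
    assigning the number of keywords in the query keyword set to the element in the last row and last column of the first random lower triangular matrix.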
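  5. The method according to claim 3, characterized in that assigning values to the diagonal of the second random lower triangular matrix according to the elements of the index space vector and the index keyword vector comprises:
    assigning the latitude value of the object processed r times to the element in row r+1, column r+1 of the second random lower triangular matrix, where 0 ≤ r ≤ n;
    assigning the longitude value of the object to the element in row n+2, column n+2 of the second random lower triangular matrix (the source text reads "row r+2, column r+2" here);
    assigning the j-th element of the index keyword vector to the element in row n+3+j, column n+3+j of the second random lower triangular matrix;
    assigning -1 to the element in the last row and last column of the second random lower triangular matrix.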
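  6. The method according to claim 1, characterized in that encrypting the first lower triangular matrix with a first encryption matrix to obtain a query sub-trapdoor comprises:
    encrypting the first lower triangular matrix according to the inverse of at least one random invertible square matrix, a third lower triangular matrix whose diagonal takes set values, and a fourth lower triangular matrix whose diagonal values are 1, to obtain the query sub-trapdoor; where the set values are a preset positive real number for the entries from row 1, column 1 through row n+2, column n+2, and 1 for the entries from row n+3, column n+3 through row n+m+3, column n+m+3;
    the index of any object being obtained by encrypting, with a second encryption matrix, a second lower triangular matrix derived from the object's spatial position and keyword set comprises:
    obtaining the second lower triangular matrix from the object's spatial position and keyword set;
    encrypting the second lower triangular matrix according to the at least one random invertible square matrix, a third random lower triangular matrix and a fourth random lower triangular matrix, to obtain the index of the object.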
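  7. The method according to any one of claims 1 to 6, characterized in that the query curves include a first query curve and a second query curve; the query sub-trapdoors include a first query sub-trapdoor and a second query sub-trapdoor; and determining, based on the query sub-trapdoors and the index of each object in the spatial text dataset, the objects that satisfy the preset condition as the query result comprises:
    for any object, determining a first result matrix based on the object's index and the first query sub-trapdoor, and a second result matrix based on the object's index and the second query sub-trapdoor;
    computing the trace of the first result matrix and the trace of the second result matrix, and determining from the two traces whether the object satisfies the preset condition.
  8. The method according to claim 7, characterized in that the preset condition comprises:
    the absolute values of the trace of the first result matrix and of the trace of the second result matrix are both less than a first threshold; the trace of the first result matrix is greater than a second threshold; and the trace of the second result matrix is less than the second threshold;
    the first threshold is used to identify objects that match the query keyword set;
    the second threshold is used to identify objects that fall within the query range.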
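  9. A spatial text query apparatus, characterized by comprising:
    an acquisition module, configured to acquire a query request, where the query request includes a query range and a query keyword set, and the query range is a closed region formed by query curves;
    a processing module, configured to encode the query keyword set to obtain a query keyword vector;
    for any query curve, perform polynomial fitting on the query curve and take the coefficients of each power term of the fitted polynomial as a query range vector; obtain a first lower triangular matrix based on the query range vector and the query keyword vector; and encrypt the first lower triangular matrix with a first encryption matrix to obtain a query sub-trapdoor;
    determine, based on each query sub-trapdoor and the index of each object in a spatial text dataset, the objects that satisfy a preset condition as the query result, where the index of any object is obtained by encrypting, with a second encryption matrix, a second lower triangular matrix derived from the object's spatial position and keyword set.
  10. A computer device, characterized by comprising:
    a memory, configured to store program instructions;
    a processor, configured to call the program instructions stored in the memory and to execute, according to the obtained program, the method according to any one of claims 1 to 8.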
PCT/CN2021/135363 2021-10-18 2021-12-03 一种空间文本的查询方法及装置 WO2023065477A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111210427.0 2021-10-18
CN202111210427.0A CN113987144A (zh) 2021-10-18 2021-10-18 一种空间文本的查询方法及装置

Publications (1)

Publication Number Publication Date
WO2023065477A1 true WO2023065477A1 (zh) 2023-04-27

Family

ID=79739168

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/135363 WO2023065477A1 (zh) 2021-10-18 2021-12-03 一种空间文本的查询方法及装置

Country Status (2)

Country Link
CN (1) CN113987144A (zh)
WO (1) WO2023065477A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116821279A (zh) * 2023-06-06 2023-09-29 哈尔滨理工大学 一种带排斥关键字的空间关键字查询方法和系统

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101673307A (zh) * 2009-10-21 2010-03-17 中国农业大学 空间数据索引方法及系统
EP2709306A1 (en) * 2012-09-14 2014-03-19 Alcatel Lucent Method and system to perform secure boolean search over encrypted documents
CN108710698A (zh) * 2018-05-23 2018-10-26 湖南大学 云环境下基于密文的多关键词模糊查询方法
CN108985094A (zh) * 2018-06-28 2018-12-11 电子科技大学 云环境下实现密文空间数据的访问控制和范围查询方法
CN110222081A (zh) * 2019-06-08 2019-09-10 西安电子科技大学 多用户环境下基于细粒度排序的数据密文查询方法
CN110222012A (zh) * 2019-06-08 2019-09-10 西安电子科技大学 单一用户环境下基于细粒度排序的数据密文查询方法
CN112257455A (zh) * 2020-10-21 2021-01-22 西安电子科技大学 一种语义理解的密文空间关键字检索方法及系统
CN112528064A (zh) * 2020-12-10 2021-03-19 西安电子科技大学 一种隐私保护的加密图像检索方法及系统
CN112948848A (zh) * 2021-02-05 2021-06-11 杭州师范大学 一种基于改进knn的时空数据范围查询方法
CN113177167A (zh) * 2021-04-28 2021-07-27 湖南大学 一种基于云计算隐私保护的空间关键词搜索方法
CN113221140A (zh) * 2021-04-30 2021-08-06 杭州师范大学 一种基于访问控制的密文时空数据查询方法

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116821279A (zh) * 2023-06-06 2023-09-29 哈尔滨理工大学 一种带排斥关键字的空间关键字查询方法和系统
CN116821279B (zh) * 2023-06-06 2024-06-07 哈尔滨理工大学 一种带排斥关键字的空间关键字查询方法和系统

Also Published As

Publication number Publication date
CN113987144A (zh) 2022-01-28

Similar Documents

Publication Publication Date Title
Wang et al. Searchable encryption over feature-rich data
US10089487B2 (en) Masking query data access pattern in encrypted data
US9852306B2 (en) Conjunctive search in encrypted data
US10546021B2 (en) Adjacency structures for executing graph algorithms in a relational database
US20190370599A1 (en) Bounded Error Matching for Large Scale Numeric Datasets
CN109885650B (zh) 一种外包云环境隐私保护密文排序检索方法
Bothe et al. Skyline query processing over encrypted data: An attribute-order-preserving-free approach
CN104731860A (zh) 隐私保护的空间关键字查询方法
US10824739B2 (en) Secure data aggregation in databases using static shifting and shifted bucketization
CN114003744A (zh) 基于卷积神经网络和向量同态加密的图像检索方法及系统
WO2023065477A1 (zh) 一种空间文本的查询方法及装置
Yan et al. Multi-keywords fuzzy search encryption supporting dynamic update in an intelligent edge network
Lam et al. Gpu-based private information retrieval for on-device machine learning inference
CN110390011B (zh) 数据分类的方法和装置
Majhi et al. Challenges in Big Data Cloud Computing And Future Research Prospects: A Review: A Review
CN115310125A (zh) 一种加密数据检索系统、方法、计算机设备及存储介质
Magdy et al. Privacy preserving search index for image databases based on SURF and order preserving encryption
CN109165226B (zh) 一种面向密文大型数据集的可搜索加密方法
CN105335530A (zh) 一种提升大数据块重复数据删除性能的方法
Cong et al. Poster: Panacea---Stateless and Non-Interactive Oblivious RAM
He et al. An efficient ciphertext retrieval scheme based on homomorphic encryption for multiple data owners in hybrid cloud
Zhao et al. Towards efficient Secure Boolean Range Query over encrypted spatial data
Lopes et al. A framework for investigating the performance of sum aggregations over encrypted data warehouses
Zhang et al. Secure speech retrieval method using deep hashing and CKKS fully homomorphic encryption
WO2024207647A1 (zh) 信息检索方法、装置、计算机设备和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21961215

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 18/07/2024)