CN116521816A - Data processing method, retrieval method, device, equipment and storage medium - Google Patents
- Publication number
- CN116521816A (application CN202310465990.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
- G06F16/319—Inverted lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
- G06F16/322—Trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/38—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/383—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The disclosure provides a data processing method, a retrieval method, an apparatus, a device, and a storage medium. It relates to the technical field of data processing, and in particular to the technical fields of databases, information flows, and knowledge graphs. The specific implementation scheme is as follows: in response to receiving a construction instruction for an inverted index table corresponding to a forward index table, determining a target field value of a specified field in the forward index table; determining the content to be indexed corresponding to the target field value based on the forward index table; and constructing the inverted index table according to a preset construction mode, with the target field value as an index key and the content to be indexed as the key value of the index key. The preset construction mode comprises: if the number of field values included in the content to be indexed does not exceed a preset number threshold, storing the content to be indexed in an array structure; otherwise, storing the content to be indexed in a specified tree structure. This scheme can therefore reduce the number of cache failures while still ensuring data query speed.
Description
Technical Field
The present disclosure relates to the field of data processing technologies, in particular to the fields of databases, information flows, and knowledge graph technologies, and more particularly to a data processing method, a retrieval method, an apparatus, a device, and a storage medium.
Background
An index is an important data structure in a database for improving the access speed of data, and thus, improvement of index performance is particularly important for the database.
In the related art, a prefix tree is generally used as a data structure of an index.
Disclosure of Invention
The present disclosure provides a data processing method, a retrieval method, an apparatus, a device, and a storage medium.
According to an aspect of the present disclosure, there is provided a data processing method including:
in response to receiving a construction instruction of an inverted index table corresponding to a forward index table, determining a target field value of a specified field in the forward index table; wherein the specified field is a field other than an index field in the forward index table;
determining the content to be indexed corresponding to the target field value based on the forward index table; wherein, the content to be indexed comprises: a field value of an index field contained in a specified data entry in the forward index table, wherein the specified data entry is a data entry of which the specified field has the target field value;
constructing an inverted index table according to a preset construction mode by taking the target field value as an index key and the content to be indexed as a key value of the index key;
the preset construction mode comprises the following steps: if the number of the field values included in the content to be indexed does not exceed the preset number threshold, storing the content to be indexed in an array structure; otherwise, storing the content to be indexed in a specified tree structure.
According to another aspect of the present disclosure, there is provided a retrieval method including:
in response to receiving a search request, determining a search term indicated by the search request;
determining a key value of a target index key matched with the search term from a designated index table; wherein the specified index table is an inverted index table constructed based on any one of the data processing methods described above;
acquiring, from a forward index table corresponding to the specified index table, a data entry whose index field value matches a field value in the key value, and taking the data entry as an initial search result;
and determining a search result corresponding to the search request based on the initial search result.
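The retrieval steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the inverted index is modeled as a plain dictionary, the forward index table as a list of rows, and the names `retrieve`, `ad_id`, and `buy_word_signature` are hypothetical. The sketch stops at the initial search result and leaves any later refinement of that result to the caller.

```python
# Hypothetical forward index table; "ad_id" plays the role of the index field.
forward_rows = [
    {"ad_id": 1, "buy_word_signature": "A"},
    {"ad_id": 2, "buy_word_signature": "A"},
    {"ad_id": 3, "buy_word_signature": "B"},
]

# Hypothetical inverted index: index key -> key value (index-field values).
inverted = {"A": [1, 2], "B": [3]}


def retrieve(search_term, inverted_index, rows, index_field):
    """Return the forward-table rows whose index field matches the key value."""
    key_value = inverted_index.get(search_term, [])
    rows_by_id = {row[index_field]: row for row in rows}
    # Initial search result: rows whose index-field value appears in the key value.
    return [rows_by_id[v] for v in key_value if v in rows_by_id]


initial_result = retrieve("A", inverted, forward_rows, "ad_id")
```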
According to another aspect of the present disclosure, there is provided a data processing apparatus including:
The first response module is used for determining a target field value of a specified field in the forward index table in response to receiving a construction instruction of the inverted index table corresponding to the forward index table; wherein the specified field is a field other than an index field in the forward index table;
the first determining module is used for determining the content to be indexed corresponding to the target field value based on the forward index table; wherein, the content to be indexed comprises: a field value of an index field contained in a specified data entry in the forward index table, wherein the specified data entry is a data entry of which the specified field has the target field value;
the construction module is used for constructing an inverted index table according to a preset construction mode by taking the target field value as an index key and the content to be indexed as a key value of the index key;
the preset construction mode comprises the following steps: and if the number of the field values included in the content to be indexed does not exceed the preset number threshold, storing the content to be indexed in an array structure, otherwise, storing the content to be indexed in a specified tree structure.
According to another aspect of the present disclosure, there is provided a retrieval device including:
The second response module is used for responding to the received search request and determining a search term indicated by the search request;
the second determining module is used for determining the key value of the target index key matched with the search term from the appointed index table; wherein the specified index table is an inverted index table constructed based on any one of the data processing methods described above;
the acquisition module is used for acquiring a data item, of which the field value of the contained index field is matched with the field value in the key value, from a forward index table corresponding to the specified index table as an initial retrieval result;
and the third determining module is used for determining a search result corresponding to the search request based on the initial search result.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform any one of the data processing methods or the retrieval method described above.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the data processing method according to any one of the above, or the retrieval method.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a data processing method according to any one of the above, or a retrieval method.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flow chart of a data processing method according to the present disclosure;
FIG. 2 is a schematic diagram of a data structure of an index table according to the present disclosure;
FIG. 3 is another flow chart of a data processing method according to the present disclosure;
FIG. 4 is yet another flow chart of a data processing method according to the present disclosure;
FIG. 5 is a flow chart of a retrieval method according to the present disclosure;
FIG. 6 is a schematic diagram of a data processing apparatus according to the present disclosure;
FIG. 7 is a schematic diagram of a search device according to the present disclosure;
FIG. 8 is a block diagram of an electronic device used to implement an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Next, a description will be first given of a data processing method provided in an embodiment of the present disclosure.
It should be noted that, in a specific application, the data processing method provided in the embodiments of the present disclosure may be applied to various electronic devices, for example, a personal computer, a server, and other devices having data processing capabilities. In addition, it is understood that the data processing method provided by the embodiment of the present disclosure may be implemented by software, hardware, or a combination of software and hardware.
The data processing method provided by the embodiment of the disclosure may include the following steps:
in response to receiving a construction instruction of an inverted index table corresponding to a forward index table, determining a target field value of a specified field in the forward index table; wherein the specified field is a field other than an index field in the forward index table;
determining the content to be indexed corresponding to the target field value based on the forward index table; wherein, the content to be indexed comprises: a field value of an index field contained in a specified data entry in the forward index table, wherein the specified data entry is a data entry of which the specified field has the target field value;
constructing an inverted index table according to a preset construction mode by taking the target field value as an index key and the content to be indexed as a key value of the index key;
the preset construction mode comprises the following steps: and if the number of the field values included in the content to be indexed does not exceed the preset number threshold, storing the content to be indexed in an array structure, otherwise, storing the content to be indexed in a specified tree structure.
In the scheme provided by the present disclosure, in response to receiving a construction instruction for an inverted index table corresponding to a forward index table, a target field value of a specified field in the forward index table is first determined. The content to be indexed corresponding to the target field value is then determined from the forward index table, and a storage structure for the content to be indexed is chosen according to the number of field values it includes. The target field value is used as an index key, the content to be indexed is used as the key value of the index key, and the content to be indexed is stored in the chosen storage structure, thereby obtaining the inverted index table. Because an array structure occupies contiguous storage space, storing the content to be indexed in an array structure when its number of field values does not exceed the preset number threshold improves the continuity of data storage and reduces the number of cache failures (that is, cache misses). When the number of field values exceeds the preset number threshold, the content to be indexed is stored in a specified tree structure, which ensures data query speed. This scheme can therefore reduce the number of cache failures while still ensuring data query speed.
The data processing method provided by the embodiment of the present disclosure is described below with reference to the accompanying drawings.
As shown in fig. 1, the data processing method provided by the embodiment of the disclosure may include the following steps:
S101, determining a target field value of a specified field in a forward index table in response to receiving a construction instruction of the inverted index table corresponding to the forward index table; wherein the specified field is a field other than an index field in the forward index table;
in this embodiment, the forward index table may include a plurality of data entries, each data entry is a row of data in the forward index table, each row of data may include an index field and a field value of a plurality of information fields, and the specified field may be any one of the plurality of information fields.
The specified field may be determined in various ways, and the present disclosure does not limit how it is determined. Optionally, in practical applications, a worker may preset the specified field to be used when constructing the inverted index table for the forward index table, and the specified field may be carried in the construction instruction. In response to receiving the construction instruction for the inverted index table corresponding to the forward index table, the specified field may first be parsed from the construction instruction, and the target field value of the specified field may then be determined from the forward index table. For example, suppose the forward index table is an advertisement detail table whose index field is an advertisement id and whose information fields include a buying word signature, an advertisement bid, an advertiser id, and the like. If the specified field is the buying word signature, and the field values of the buying word signature include signature A, signature B, and signature C, then upon receiving the construction instruction, signature A, signature B, or signature C may be determined as the target field value. Here, the so-called buying word signature may be a keyword of an advertisement.
In addition, it should be noted that the construction instruction for the inverted index table corresponding to the forward index table may be triggered periodically, or may be triggered when the electronic device detects that the stored forward index table has changed. Either is reasonable, and the embodiments of the present disclosure do not limit when the construction instruction is triggered.
S102, determining the content to be indexed corresponding to the target field value based on the forward index table; wherein, the content to be indexed comprises: a field value of an index field contained in a specified data entry in the forward index table, wherein the specified data entry is a data entry with the target field value in the specified field;
in this embodiment, the content to be indexed is the index content in the inverted index table to be constructed, which may include one or more field values of the index field. It will be appreciated that, since the field values of the index fields in the forward index table are all unique, and the field values in the information fields may be repeated, the field values of the same information field may correspond to the field values of a plurality of index fields, and thus, the field values in the content to be indexed corresponding to the target field values may have one or more. For example, if the target field value is the signature a, and the field values of the specified fields in the data entry a and the data entry B are both the signature a, the data entry a and the data entry B are the specified data entries, and the field values of the index fields in the data entry a and the data entry B may constitute the content to be indexed.
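Steps S101 and S102 can be illustrated with a short sketch. It assumes a forward index table held as a list of dictionaries; the function name `collect_content_to_index` and the field names `ad_id`, `buy_word_signature`, and `bid` are hypothetical and chosen only to mirror the advertisement example in the text.

```python
from collections import defaultdict

# Hypothetical forward index table: one row (data entry) per advertisement.
# "ad_id" is the index field; the remaining keys are information fields.
forward_table = [
    {"ad_id": 1, "buy_word_signature": "A", "bid": 0.5},
    {"ad_id": 2, "buy_word_signature": "A", "bid": 0.7},
    {"ad_id": 3, "buy_word_signature": "B", "bid": 0.2},
]


def collect_content_to_index(rows, index_field, specified_field):
    """Group index-field values by the value of the specified field.

    Each distinct value of the specified field is a candidate target field
    value; the grouped index-field values are its content to be indexed.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[specified_field]].append(row[index_field])
    return dict(grouped)


content = collect_content_to_index(forward_table, "ad_id", "buy_word_signature")
```

Because index-field values are unique while information-field values may repeat, signature A maps to two index-field values here and signature B maps to one.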
S103, constructing an inverted index table according to a preset construction mode by taking the target field value as an index key and the content to be indexed as a key value of the index key;
the preset construction mode comprises the following steps: if the number of the field values included in the content to be indexed does not exceed the preset number threshold, storing the content to be indexed in an array structure, otherwise, storing the content to be indexed in a specified tree structure.
In this embodiment, after the target field value and the content to be indexed are determined in steps S101 and S102, the inverted index table may be constructed by using the target field value as the index key and the content to be indexed as the key value of the index key. For example, in practical applications, a worker may set the predetermined number threshold based on experience, so that the inverted index table can be constructed according to the preset construction mode: the storage structure for the content to be indexed corresponding to the target field value is determined according to the number of field values that content includes, and the content to be indexed is stored accordingly to obtain the inverted index table.
It can be understood that, because an array structure occupies contiguous storage space, storing the content to be indexed in an array structure when its number of field values does not exceed the predetermined number threshold improves the continuity of data storage and reduces the number of cache failures; when the data volume is small, an array structure also keeps the query speed of the content to be indexed acceptable. For large data volumes, query speed takes priority: if the number of field values included in the content to be indexed exceeds the predetermined number threshold, the content to be indexed is stored in a specified tree structure, which ensures data query speed. Illustratively, the specified tree structure may be a BTree or a variant thereof, such as a B+Tree or a B-Tree, or it may be a prefix tree or the like. Compared with storing the content to be indexed in a tree structure only, this approach can reduce the number of cache failures while still ensuring data query speed.
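The threshold-based choice of storage structure can be sketched as below. This is an illustration under assumptions rather than the disclosed implementation: `THRESHOLD`, `TrieSet`, and `build_inverted_index` are hypothetical names, the standard-library `array` type stands in for the array structure, and a tiny dictionary-based prefix tree over decimal digits stands in for the specified tree structure.

```python
from array import array

THRESHOLD = 4  # hypothetical preset number threshold


class TrieSet:
    """Tiny prefix tree over the decimal digits of non-negative integer ids."""

    def __init__(self, values=()):
        self.root = {}
        for value in values:
            self.add(value)

    def add(self, value):
        node = self.root
        for ch in str(value):
            node = node.setdefault(ch, {})
        node["$"] = True  # end-of-key marker

    def __contains__(self, value):
        node = self.root
        for ch in str(value):
            node = node.get(ch)
            if node is None:
                return False
        return "$" in node


def build_inverted_index(content_by_key, threshold=THRESHOLD):
    """Store short key values in an array, long ones in a tree."""
    index = {}
    for key, values in content_by_key.items():
        if len(values) <= threshold:
            index[key] = array("q", values)  # contiguous 64-bit integers
        else:
            index[key] = TrieSet(values)     # tree storage for long key values
    return index


index = build_inverted_index({"A": [1, 2], "B": list(range(10, 20))})
```

Key "A" has two field values and stays in an array, while key "B" has ten and is stored in the tree.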
Optionally, in one implementation, the specified tree structure is a specified prefix tree, and the specified prefix tree is a prefix tree in which tree nodes use an array structure for data storage.
In this implementation manner, the specified tree structure may be a specified prefix tree. A greater tree height causes more cache failures, and the height of a prefix tree depends only on the key length of the index key, not on the number of index keys. The specified prefix tree therefore performs better than a BTree or its variants with respect to cache failures, so it may be adopted for data storage; that is, the content to be indexed is stored in the tree nodes of the specified prefix tree in an array structure. It can be understood that, because an array structure occupies contiguous storage space, having the tree nodes of the specified prefix tree store data in an array structure further improves the continuity of data storage, so that the number of cache failures can be further reduced while both cache failures and query speed are taken into account.
In addition, storing the content to be indexed corresponding to the target field value with a combination of an array structure and a specified tree structure may be referred to as artRC (adaptive radix tree of RowContainer, that is, an adaptive prefix tree with row storage). FIG. 2 shows a schematic diagram of the data structure of an index table employing artRC, where K is an index key and V is a key value, which may also be referred to as an inverted chain. The data structure used to store the content to be indexed is determined to be an array structure or a specified tree structure according to the length of the key value, that is, the number of field values included in the content to be indexed. In addition, both the array structure and the leaf nodes of the tree structure use a storage structure in RC (RowContainer) form for contiguous storage. The RC form of the storage structure contains three key fields: data (array), used to store contiguous data; valid bits set (valid bit set), used to mark whether data is stored at the corresponding position of data, where 1 indicates yes and 0 indicates no; and cursor (array cursor), used to represent the current position of use, a variable that only increases and never decreases.
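The three fields of the RC form can be sketched as a small class. The class name `RowContainer` follows the abbreviation in the text, but the methods are assumptions made for illustration: in particular, the text does not say how deletion works, and clearing the valid bit while leaving the cursor untouched is one plausible reading of a cursor that never decreases.

```python
class RowContainer:
    """Fixed-capacity container with data, a valid-bit set, and a cursor."""

    def __init__(self, capacity):
        self.data = [0] * capacity  # contiguous slots for stored values
        self.valid_bits = 0         # bit i set means data[i] holds a live value
        self.cursor = 0             # next unused slot; only ever increases

    def append(self, value):
        """Store a value at the cursor; return False when the container is full."""
        if self.cursor >= len(self.data):
            return False
        self.data[self.cursor] = value
        self.valid_bits |= 1 << self.cursor
        self.cursor += 1
        return True

    def remove(self, value):
        """Assumed deletion: clear the valid bit, leave the cursor unchanged."""
        for i in range(self.cursor):
            if (self.valid_bits >> i) & 1 and self.data[i] == value:
                self.valid_bits &= ~(1 << i)
                return True
        return False

    def values(self):
        """Return the live values in insertion order."""
        return [self.data[i] for i in range(self.cursor)
                if (self.valid_bits >> i) & 1]


rc = RowContainer(3)
rc.append(10)
rc.append(20)
rc.append(30)
```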
In the scheme provided by the present disclosure, in response to receiving a construction instruction for an inverted index table corresponding to a forward index table, a target field value of a specified field in the forward index table is first determined. The content to be indexed corresponding to the target field value is then determined from the forward index table, and a storage structure for the content to be indexed is chosen according to the number of field values it includes. The target field value is used as an index key, the content to be indexed is used as the key value of the index key, and the content to be indexed is stored in the chosen storage structure, thereby obtaining the inverted index table. Because an array structure occupies contiguous storage space, storing the content to be indexed in an array structure when its number of field values does not exceed the preset number threshold improves the continuity of data storage and reduces the number of cache failures. When the number of field values exceeds the preset number threshold, the content to be indexed is stored in a specified tree structure, which ensures data query speed. This scheme can therefore reduce the number of cache failures while still ensuring data query speed.
Optionally, in another embodiment of the present disclosure, the array structure is of a plurality of types, and the maximum number of elements stored in the array structures of different types is different;
By way of example, the types of array structures may include RC1, RC7, RC16, and so forth. Array structures of different types store different maximum numbers of elements; for example, RC1 stores at most 1 element and RC7 stores at most 7 elements.
Accordingly, in this embodiment, the manner of storing the content to be indexed in an array structure may include steps A1-A2:
A1, determining, among the plurality of types of array structures, an array structure of a target type that meets a predetermined selection condition; the predetermined selection condition is that, after the content to be indexed is stored, the loss of storage space is smaller than a predetermined capacity threshold;
It can be understood that the field values included in the content to be indexed may differ in data size, so the storage space required to store the content to be indexed varies. When storing the content to be indexed, the array structure of the target type may therefore be determined according to the data size of the content to be indexed, so that after the content is stored in the array structure of the target type, the loss of storage space is less than the predetermined capacity threshold. Here, the loss of storage space is the amount of space that remains unoccupied, that is, the free space left in the storage space after the content to be indexed has been stored.
By way of example, the predetermined capacity threshold may be 8 bytes, 16 bytes, etc., which bounds the memory lost when the content to be indexed is stored. In practical applications, the predetermined capacity threshold may be set by a related technician according to the needs, which is not limited by the embodiments of the present disclosure.
A2, storing the content to be indexed by using the array structure of the target type.
It can be understood that, since the array structure of the target type is one for which the loss of storage space after storing the content to be indexed is less than the predetermined capacity threshold, storing the content to be indexed in the array structure of the target type selects a suitable array structure for that content, so that the storage space of the array structure is used as fully as possible and the memory loss is reduced.
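Steps A1 and A2 can be illustrated with a short selection routine. The capacities for RC1, RC7, and RC16 come from the example in the text; the 8-byte element size, the 16-byte capacity threshold, and the function name `pick_array_type` are assumptions made only for this sketch, and returning `None` when no type qualifies is a choice of the sketch, not something the text specifies.

```python
# Capacities taken from the RC1 / RC7 / RC16 example; other types may exist.
RC_CAPACITIES = {"RC1": 1, "RC7": 7, "RC16": 16}


def pick_array_type(num_values, element_size=8, capacity_threshold=16):
    """Return the smallest array type whose unused space is under the threshold.

    element_size and capacity_threshold are in bytes and are hypothetical.
    Returns None when no type satisfies the selection condition.
    """
    for name, capacity in sorted(RC_CAPACITIES.items(), key=lambda kv: kv[1]):
        if capacity < num_values:
            continue  # content does not fit in this type
        loss = (capacity - num_values) * element_size  # unoccupied bytes
        if loss < capacity_threshold:
            return name
    return None


chosen = pick_array_type(6)
```

With these assumed sizes, six field values fit in RC7 with 8 bytes left unused, which is under the 16-byte threshold, whereas three field values would leave 32 bytes unused in RC7 and so do not qualify.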
Optionally, in an implementation manner, storing the content to be indexed in the array structure of the target type may include steps A21-A23:
A21, detecting whether a memory block for storing the array structure of the target type exists in a preset memory pool;
A22, if such a memory block exists, storing the content to be indexed in the array structure of the target type by using the memory block in the memory pool;
A23, if not, applying for a memory block from the system memory, and storing the content to be indexed in the array structure of the target type by using the applied memory block.
In this implementation, a memory pool may be preset, where a memory block applied in advance may be stored in the memory pool. When the target type array structure is used for storing the content to be indexed, whether a memory block used for storing the target type array structure exists in a preset memory pool or not can be detected, so that if the memory block exists, the memory block can be directly taken out from the memory pool, and the content to be indexed is stored by utilizing the memory block in the target type array structure; if not, the memory block for storing the array structure of the target type can be applied from the system memory, and the content to be indexed is stored by using the applied memory block and the array structure of the target type. In addition, after the stored data content in the memory block is released, the memory pool can also recycle the memory block, so that the memory block can be reused, and the system performance loss caused by continuously applying for or destroying the memory block can be reduced.
Therefore, through the scheme, the memory loss for storing the content to be indexed can be reduced.
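Steps A21 to A23 can be sketched as a small pool keyed by block size. This is only an illustration under stated assumptions: the class name MemoryPool, the methods acquire and release, and the use of a bytearray to stand in for a memory block are all hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict
from typing import Dict, List


class MemoryPool:
    """Hypothetical pool that keeps released blocks grouped by their size."""

    def __init__(self) -> None:
        self._free: Dict[int, List[bytearray]] = defaultdict(list)
        self.system_allocations = 0  # counts requests that reached system memory

    def acquire(self, size: int) -> bytearray:
        blocks = self._free[size]
        if blocks:                      # A21/A22: a suitable block is in the pool
            return blocks.pop()
        self.system_allocations += 1    # A23: fall back to system memory
        return bytearray(size)

    def release(self, block: bytearray) -> None:
        block[:] = bytes(len(block))    # clear the old content
        self._free[len(block)].append(block)  # keep the block for reuse


pool = MemoryPool()
first = pool.acquire(32)    # pool is empty, so this is applied for from the system
pool.release(first)         # the pool reclaims the block
second = pool.acquire(32)   # served from the pool, no new system allocation
```

In this sketch the second request returns the very block that was released, which is the reuse that avoids repeatedly applying for and destroying memory blocks.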
Optionally, in another embodiment of the present disclosure, on the basis of the embodiment shown in fig. 1, as shown in fig. 3, after step S103, in which the target field value is used as an index key and the content to be indexed is used as the key value of the index key, the method further includes:
S104, in response to a specified update operation for a data entry of the forward index table, updating the key value of the index key corresponding to the update operation in the inverted index table;
wherein the specified update operation comprises a deletion or addition operation, and the index key corresponding to the update operation is the field value of the specified field in the data entry indicated by the update operation.
For example, in practical applications, the data log may be read periodically to update the data content in the forward index table. It can be understood that the inverted index table is constructed based on the forward index table, and the content to be indexed in the inverted index table consists of the field values of the index field contained in the specified data entries in the forward index table. Therefore, when a specified update operation, that is, a deletion or addition operation, occurs, the data content in the inverted index table needs to be synchronized with the data content in the forward index table, and the key value of the index key corresponding to the update operation in the inverted index table can be updated at this time. For example, if a data entry whose specified field has the field value signature a and whose index field has the field value index a is deleted from the forward index table, the key value of the index key signature a in the corresponding inverted index table may be updated, that is, index a is deleted from the key value of the index key signature a.
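The synchronization in step S104 can be sketched as follows, assuming the inverted index table is modelled as a dictionary from the field value of the specified field to a set of index field values. The function name apply_update and the operation strings "add" and "delete" are hypothetical.

```python
from typing import Dict, Set


def apply_update(inverted: Dict[str, Set[str]],
                 op: str,
                 specified_value: str,
                 index_value: str) -> None:
    """Mirror one add/delete on a forward-table entry into the inverted table."""
    if op == "add":
        inverted.setdefault(specified_value, set()).add(index_value)
    elif op == "delete":
        values = inverted.get(specified_value)
        if values is not None:
            values.discard(index_value)
            if not values:              # drop the index key once it is empty
                del inverted[specified_value]
    else:
        raise ValueError(f"unsupported update operation: {op}")


inverted = {"signature a": {"index a", "index b"}}
# The entry (signature a, index a) was deleted from the forward index table.
apply_update(inverted, "delete", "signature a", "index a")
```

After the call, the key value of the index key signature a no longer contains index a, which matches the deletion example in the paragraph above.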
Considering that there is a process of updating the key value of the index key corresponding to the update operation in the inverted index table, accordingly, in this embodiment, on the basis of the embodiment shown in fig. 3, as shown in fig. 4, the method further includes:
S401, detecting whether the loss of the storage space corresponding to a target key value in the inverted index table is greater than the predetermined capacity threshold; wherein the target key value is a key value stored in an array structure, and the loss of the storage space corresponding to the target key value is the loss of the storage space of the array structure in which the target key value is stored;
It can be understood that after the forward index table is updated, the corresponding inverted index table is updated as well. Since the number of field values included in the content to be indexed changes at this time, it may also be detected, after the inverted index table is updated, whether the loss of the storage space corresponding to a key value stored in an array structure in the inverted index table is greater than the predetermined capacity threshold. It should be noted that the size and the setting manner of the predetermined capacity threshold may be the same as those in step A1 above, and are not described again here.
S402, if yes, determining an array structure of a specified type, and changing the array structure in which the target key value is currently stored into the array structure of the specified type;
wherein after the target key value is stored in the array structure of the specified type, the loss of the storage space is less than the predetermined capacity threshold.
It can be understood that after the inverted index table is updated, if it is detected that the loss of the storage space corresponding to the target key value in the inverted index table is greater than the predetermined capacity threshold, the array structure in which the target key value is currently stored may be updated, that is, the array structure of the specified type is redetermined, and the array structure in which the target key value is currently stored is changed to the array structure of the specified type, so that after the updated array structure of the specified type stores the target key value, the loss of the storage space is less than the predetermined capacity threshold, thereby reducing the memory loss.
In addition, it may be understood that, since the specified update operation includes a deletion or addition operation, the number of key values in the inverted index table may change when the specified update operation occurs on a data entry in the forward index table. At this time, the tree structure may be optimized, that is, upgraded or downgraded according to the updated number of key values, which includes inserting tree nodes into the tree structure or merging tree nodes. The manner of upgrading or downgrading the tree structure may be the same as in the prior art, and is not limited by the embodiments of the present disclosure.
Therefore, according to this scheme, the type of the array structure in which the target key value is stored can be updated when the inverted index table is updated, so that after the updated array structure stores the target key value, the loss of the storage space is less than the predetermined capacity threshold, and the memory loss is reduced.
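The check in S401 and the change in S402 can be sketched as below. The sketch covers only the case described here, in which the stored values still fit the current array; the function names, the equal element size and the byte-based loss are assumptions made for illustration.

```python
from typing import Sequence


def storage_loss(capacity: int, num_values: int, elem_size: int) -> int:
    """Unoccupied bytes of an array with the given capacity."""
    return (capacity - num_values) * elem_size


def retype_if_needed(current_capacity: int,
                     num_values: int,
                     capacities: Sequence[int],
                     elem_size: int,
                     threshold: int) -> int:
    """Return the capacity the key value should be stored with after an update."""
    if num_values > current_capacity:
        raise ValueError("sketch only covers values that fit the current array")
    # S401: is the loss of the current array greater than the threshold?
    if storage_loss(current_capacity, num_values, elem_size) <= threshold:
        return current_capacity
    # S402: look for a type whose loss is below the threshold.
    candidates = [c for c in capacities
                  if c >= num_values
                  and storage_loss(c, num_values, elem_size) < threshold]
    # Keep the current type if no other type satisfies the condition.
    return min(candidates) if candidates else current_capacity


# Hypothetical case: a 16-element array holds only 5 values after deletions.
new_capacity = retype_if_needed(16, 5, (1, 7, 16, 80, 256), elem_size=4, threshold=16)
```

Under these assumed numbers the 16-element array wastes 44 bytes, so the sketch moves the key value to the 7-element type, which wastes 8 bytes.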
After the inverted index table is constructed according to the scheme provided by the above embodiment, as shown in fig. 5, the embodiment of the disclosure further provides a search method, which includes the following steps:
S501, in response to receiving a search request, determining a search term indicated by the search request;
In this embodiment, in response to receiving a search request, the search term indicated by the search request may be determined first, so that a data search can be performed with the search term. In practical applications, a user may enter a search term in a front-end interface, and the front-end interface may send a search request to the corresponding back-end processor. The search request may carry the search term, so that after receiving the search request the back-end processor determines the search term carried in it as the search term indicated by the search request. Alternatively, the search request may not carry the search term, and the back-end processor obtains the search term corresponding to the search request from the front-end interface after receiving the search request. Either manner is reasonable.
S502, determining a key value of a target index key matched with the search term from a specified index table; wherein the specified index table is an inverted index table constructed based on any one of the data processing methods described above;
In this embodiment, after the search term is determined, the key value of the target index key matched with the search term may be determined from the inverted index table constructed by the method of the above embodiments, that is, the key value of the index key in the specified index table that matches the search term is determined.
For example, if the search term is a, and the key value of the index key a in the designated index table includes a_id and b_id, the key value of the target index key matched with the search term is a_id and b_id.
S503, obtaining, from the forward index table corresponding to the specified index table, the data entries whose index field has a field value matching a field value in the key value, as an initial search result;
In this embodiment, since the key value consists of field values of the index field in the forward index table, and the forward index table stores data entries including the index field and a plurality of information fields, after the key value of the target index key matched with the search term is determined in step S502, the data entries whose index field takes a field value in the key value may be obtained from the forward index table corresponding to the specified index table as the initial search result.
For example, if the key values of the target index key matched with the search term are a_id and b_id, the data entry with the field values of the index field being a_id and b_id in the forward index table corresponding to the specified index table is the initial search result.
S504, determining a search result corresponding to the search request based on the initial search result.
In practical applications, the search request may also carry a search condition, for example, a region range or a time range, so that after the initial search result is obtained, it may be screened according to the search condition to obtain the search result corresponding to the search request. Alternatively, the initial search results may be ranked by heat value from high to low, and a preset number of them selected as the search result corresponding to the search request. Either manner is reasonable.
Therefore, through the scheme, the data containing the specific search word can be quickly searched, and the query speed is improved.
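Steps S501 to S504 can be summarised in a short sketch. It assumes that both tables are modelled as dictionaries and that the search condition is passed as a predicate; the function name search, the field name region and the sample values other than a, a_id and b_id are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Entry = Dict[str, object]


def search(term: str,
           inverted: Dict[str, List[str]],
           forward: Dict[str, Entry],
           condition: Optional[Callable[[Entry], bool]] = None) -> List[Entry]:
    """Look up the term in the inverted table, then fetch and filter entries."""
    ids = inverted.get(term, [])                          # S502: key value of the matched index key
    initial = [forward[i] for i in ids if i in forward]   # S503: initial search result
    if condition is None:                                 # S504: optional screening
        return initial
    return [entry for entry in initial if condition(entry)]


inverted = {"a": ["a_id", "b_id"]}
forward = {
    "a_id": {"id": "a_id", "region": "north"},
    "b_id": {"id": "b_id", "region": "south"},
}
results = search("a", inverted, forward, lambda e: e["region"] == "north")
```

Without a condition the sketch returns both entries; with the region condition only the entry a_id remains.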
For a better understanding of the disclosure embodiments, the following description is provided in connection with a specific example.
In an in-memory database service, there is typically one update thread and a set of search threads. The updating thread is used for periodically reading the data log, and then updating the table and the index; the retrieval thread is used for processing the retrieval request of the user and converting the retrieval request into query operations on the index and the table. The roles of the update thread and the search thread are described below in connection with the construction flow of the inverted index table in the advertisement service, where the construction flow of the inverted index table is as follows:
(1) Creating an advertisement detail table (corresponding to the forward index table) according to the data log, wherein an index field in the advertisement detail table is an advertisement id, other fields are information fields in the advertisement detail table, and the advertisement detail table is shown in table 1:
TABLE 1
Advertisement id | Buying word signature | Advertisement bidding | Advertiser id | Plan id | … |
30001 | 666111 | 100 | 10001 | 20001 | |
30002 | 666112 | 200 | 10001 | 20001 | |
30003 | 666111 | 150 | 10002 | 20002 | |
30004 | 666111 | 120 | 10003 | 20003 | |
… |
(2) Determining a buying word signature from the advertisement detail table as a designated field, and constructing an inverted index table corresponding to the advertisement detail table, wherein the inverted index table is shown in table 2:
TABLE 2
The "buying word signature" is used as an index key in the inverted index table, and the field value of the index field corresponding to the field value of the "buying word signature" in the advertisement detail table is the content to be indexed corresponding to the index key, namely the "advertisement set" content.
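As an illustration of step (2), the sketch below builds the mapping from "buying word signature" to "advertisement set" using the four rows listed in Table 1. The dictionary field names (ad_id, signature and so on) and the function name build_inverted are hypothetical, and a plain list stands in for the array or prefix tree storage described next.

```python
from typing import Dict, List

# The four rows shown in Table 1, with hypothetical field names.
rows = [
    {"ad_id": 30001, "signature": 666111, "bid": 100, "advertiser_id": 10001, "plan_id": 20001},
    {"ad_id": 30002, "signature": 666112, "bid": 200, "advertiser_id": 10001, "plan_id": 20001},
    {"ad_id": 30003, "signature": 666111, "bid": 150, "advertiser_id": 10002, "plan_id": 20002},
    {"ad_id": 30004, "signature": 666111, "bid": 120, "advertiser_id": 10003, "plan_id": 20003},
]


def build_inverted(entries: List[Dict[str, int]],
                   specified_field: str,
                   index_field: str) -> Dict[int, List[int]]:
    """Group index-field values by the value of the specified field."""
    inverted: Dict[int, List[int]] = {}
    for entry in entries:
        inverted.setdefault(entry[specified_field], []).append(entry[index_field])
    return inverted


ad_sets = build_inverted(rows, "signature", "ad_id")
```

For these rows, signature 666111 maps to the advertisement ids 30001, 30003 and 30004, and signature 666112 maps to 30002.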
The field values in the advertisement set are stored in a contiguously stored index data structure. That is, when the number of field values in the advertisement set does not exceed the preset number threshold, the advertisement set content is stored in an array structure; when the number of field values in the advertisement set exceeds the preset number threshold, the advertisement set content is stored in a prefix tree whose tree nodes use the array structure for data storage.
The array structure comprises the types RC1, RC7, RC16, RC80 and RC256, and array structures of different types store different maximum numbers of elements. A storage structure in RC (RowContainer) form contains three key fields: data (array), which stores the contiguous data; valid bit set, which marks whether data is stored at the corresponding position of data, where 1 indicates yes and 0 indicates no; and cursor (array cursor), which indicates the position used so far and only increases, never decreases.
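The three fields of the RC structure can be pictured with the following sketch. Only the field roles (data, valid bit set, cursor) come from the description; the method names, the use of Python lists, and the behaviour of clearing a valid bit on removal are assumptions made for illustration.

```python
from typing import List


class RowContainer:
    """Hypothetical fixed-capacity container with data, valid bits and a cursor."""

    def __init__(self, capacity: int) -> None:
        self.data: List[int] = [0] * capacity   # contiguous storage
        self.valid: List[int] = [0] * capacity  # 1 if the slot holds data, else 0
        self.cursor = 0                          # next unused slot, never decreases

    def append(self, value: int) -> bool:
        if self.cursor >= len(self.data):
            return False                         # container is full
        self.data[self.cursor] = value
        self.valid[self.cursor] = 1
        self.cursor += 1
        return True

    def remove(self, value: int) -> bool:
        for i in range(self.cursor):
            if self.valid[i] and self.data[i] == value:
                self.valid[i] = 0                # clear the bit, keep the cursor
                return True
        return False

    def values(self) -> List[int]:
        return [self.data[i] for i in range(self.cursor) if self.valid[i]]


rc7 = RowContainer(7)
for ad_id in (30001, 30003, 30004):
    rc7.append(ad_id)
rc7.remove(30003)
```

In this sketch a removal only clears the valid bit, so the cursor stays at 3 while the container reports the two remaining values.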
When the array structure is used to store the advertisement set content, the array structure of the target type meeting the predetermined selection condition may be determined first, that is, the array structure for which the memory loss after the advertisement set content is stored does not exceed 8 bytes per entry, and the advertisement set content is then stored in the array structure of the target type.
In addition, in practical applications, the number of array structures of the RC1 type reaches the hundred-million level, so a memory pool may be adopted. Before an array structure of the RC1 type is used, it may be detected whether a memory block for storing an array structure of the RC1 type exists in the memory pool. If so, the content to be indexed is stored in the RC1 array structure by using the memory block in the memory pool; if not, a memory block for storing the RC1 type array structure is applied for from the system memory. After the data stored in a memory block is released, the memory pool can reclaim the memory block so that it can be reused. Therefore, when a memory block is needed later, it can be obtained directly from the memory pool, which reduces the system performance loss caused by repeatedly applying for and destroying memory blocks.
The update thread functions as follows:
After periodically reading the data log, the update thread updates the content in the advertisement detail table according to the latest data log and, in response to the update operation on the advertisement detail table, updates the key value of the index key corresponding to the update operation in the inverted index table. Since the update operation includes a deletion or addition operation, the "advertisement set" content in the inverted index table may change at this time, and the tree structure may be optimized according to the updated "advertisement set" content, including upgrading or downgrading the tree structure; the manner of upgrading or downgrading may be the same as in the prior art, which is not limited in this example. In addition, the type of the array structure can be redetermined according to the number of field values in the advertisement set content to decide whether the type needs to be changed, so that the memory loss of the array structure storing the advertisement set content is kept to no more than 8 bytes per entry.
The retrieval thread functions as follows:
After the user sends a search request, the retrieval thread returns the advertisement set meeting the requirements according to the search term carried in the search request. The specific implementation flow is as follows:
(1) In response to receiving the search request, determining a search term indicated by the search request, e.g., the search term may be a set of field values of a "buy word signature";
(2) Acquiring the key value of an index key matched with the search word from an inverted index table to obtain the content of an advertisement set corresponding to the field value of the buying word signature;
(3) Looking up, in the advertisement detail table corresponding to the inverted index table, the data entries whose index field takes a field value in the advertisement set content, and taking these data entries as the initial search result;
(4) Filtering initial search results which do not accord with the search conditions of the search request to obtain search results corresponding to the search request; the search condition may be a region, a delivery time, or the like.
In this scheme, when the number of field values included in the content to be indexed does not exceed the preset number threshold, the content to be indexed is stored in an array structure, which improves the continuity of data storage and reduces the number of cache misses; when the number of field values exceeds the preset number threshold, the content to be indexed is stored in a specified tree structure, which ensures the query speed of the data. In this way, the number of cache misses is reduced while the query speed is ensured. In addition, the memory loss can be reduced by storing the content to be indexed in an array structure of the target type for which the memory loss does not exceed 8 bytes per entry. By using the memory pool to hold memory blocks for the different types of array structure, a memory block can first be obtained from the memory pool when it is needed, and after the data content stored in the memory block is released, the memory pool can reclaim the memory block so that it can be reused. Therefore, the system performance loss caused by repeatedly applying for and destroying memory blocks can be reduced.
Based on the embodiment of the data processing method, the embodiment of the disclosure further provides a data processing device, as shown in fig. 6, where the device includes:
a first response module 610, configured to determine, in response to receiving a construction instruction for an inverted index table corresponding to a forward index table, a target field value of a specified field in the forward index table; wherein the specified field is a field other than an index field in the forward index table;
a first determining module 620, configured to determine, based on the forward index table, content to be indexed corresponding to the target field value; wherein, the content to be indexed comprises: a field value of an index field contained in a specified data entry in the forward index table, wherein the specified data entry is a data entry of which the specified field has the target field value;
a building module 630, configured to build an inverted index table according to a predetermined building manner by using the target field value as an index key and the content to be indexed as a key value of the index key;
wherein the predetermined construction manner comprises: if the number of field values included in the content to be indexed does not exceed the preset number threshold, storing the content to be indexed in an array structure; otherwise, storing the content to be indexed in a specified tree structure.
Optionally, the specified tree structure is a specified prefix tree, and the specified prefix tree is a prefix tree in which tree nodes adopt an array structure to store data.
Optionally, there are multiple types of array structure, and array structures of different types store different maximum numbers of elements;
storing the content to be indexed in an array structure comprises:
determining an array structure of a target type which meets a preset selection condition in a plurality of types of array structures; the predetermined selection condition is that after the content to be indexed is stored, the loss of the storage space is smaller than a predetermined capacity threshold;
and storing the content to be indexed in an array structure of the target type.
Optionally, the storing the content to be indexed in the array structure of the target type includes:
detecting whether a memory block for storing the array structure of the target type exists in a preset memory pool or not;
if so, storing the content to be indexed in the array structure of the target type by using the memory block in the memory pool;
if not, applying for a memory block from the system memory, and storing the content to be indexed in the array structure of the target type by using the applied memory block.
Optionally, after the inverted index table is constructed according to the predetermined construction manner by taking the target field value as an index key and the content to be indexed as a key value of the index key, the method further includes:
updating key values of index keys corresponding to the updating operation in the inverted index table in response to a specified updating operation for data entries of the forward index table;
the specified updating operation comprises deleting or adding operation, and the index key corresponding to the updating operation is a field value of the specified field in the data item indicated by the updating operation.
Optionally, the method further comprises:
detecting whether the loss of the storage space corresponding to a target key value in the inverted index table is greater than the predetermined capacity threshold; wherein the target key value is a key value stored in an array structure, and the loss of the storage space corresponding to the target key value is the loss of the storage space of the array structure in which the target key value is stored;
if yes, determining an array structure of a specified type, and changing the array structure in which the target key value is currently stored into the array structure of the specified type;
wherein after the target key value is stored in the array structure of the specified type, the loss of the storage space is less than the predetermined capacity threshold.
Based on the above embodiments of the search method, the embodiments of the present disclosure further provide a search device, as shown in fig. 7, where the device includes:
a second response module 710, configured to determine, in response to receiving a search request, a search term indicated by the search request;
a second determining module 720, configured to determine, from the specified index table, a key value of a target index key that matches the search term; wherein the specified index table is an inverted index table constructed based on any one of the data processing methods described above;
an obtaining module 730, configured to obtain, from a forward index table corresponding to the specified index table, a data entry in which a field value of the index field is matched with a field value in the key value, as an initial search result;
and a third determining module 740, configured to determine a search result corresponding to the search request based on the initial search result.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of users' personal information comply with relevant laws and regulations and do not violate public order and good customs.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
An electronic device provided by the present disclosure may include:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data processing method, or the steps of the retrieval method, described above.
The present disclosure provides a computer-readable storage medium having stored therein a computer program which, when executed by a processor, implements the steps of any one of the data processing methods described above, or the steps of the retrieval method described above.
In a further embodiment provided by the present disclosure, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the steps of any of the data processing methods of the above embodiments, or the steps of the retrieval method described above.
Fig. 8 illustrates a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the apparatus 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
Various components in device 800 are connected to I/O interface 805, including: an input unit 806 such as a keyboard, mouse, etc.; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, etc.; and a communication unit 809, such as a network card, modem, wireless communication transceiver, or the like. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 801 performs the respective methods and processes described above, such as a data processing method, or a retrieval method. For example, in some embodiments, the data processing method, or the retrieval method, may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto device 800 via ROM 802 and/or communication unit 809. When a computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the data processing method described above, or the retrieval method, may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the data processing method, or the retrieval method, in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved; no limitation is imposed herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.
Claims (12)
1. A data processing method, comprising:
in response to receiving a construction instruction of an inverted index table corresponding to a forward index table, determining a target field value of a specified field in the forward index table; wherein the specified field is a field other than an index field in the forward index table;
determining the content to be indexed corresponding to the target field value based on the forward index table; wherein, the content to be indexed comprises: a field value of an index field contained in a specified data entry in the forward index table, wherein the specified data entry is a data entry of which the specified field has the target field value;
constructing the inverted index table according to a preset construction mode, with the target field value as an index key and the content to be indexed as the key value of the index key;
wherein the preset construction mode comprises: if the number of field values included in the content to be indexed does not exceed a preset number threshold, storing the content to be indexed in an array structure; otherwise, storing the content to be indexed in a specified tree structure.
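By way of non-limiting illustration, the construction mode of claim 1 could be sketched as follows. The function name `build_inverted_index`, the field names `doc_id` and `city`, and the threshold value are hypothetical, and a Python `set` merely stands in for the specified tree structure, which the claim does not define further.

```python
from collections import defaultdict

def build_inverted_index(forward_table, index_field, specified_field, threshold):
    """Map each value of `specified_field` to the index-field values of the entries holding it."""
    grouped = defaultdict(list)
    for entry in forward_table:
        grouped[entry[specified_field]].append(entry[index_field])

    inverted = {}
    for key, values in grouped.items():
        if len(values) <= threshold:
            # Few field values: store them in a compact sorted array.
            inverted[key] = ("array", sorted(values))
        else:
            # Many field values: a set is only a stand-in for the tree structure.
            inverted[key] = ("tree", set(values))
    return inverted

# Hypothetical forward index table: doc_id is the index field, city the specified field.
forward = [
    {"doc_id": 1, "city": "Beijing"},
    {"doc_id": 2, "city": "Shanghai"},
    {"doc_id": 3, "city": "Beijing"},
    {"doc_id": 4, "city": "Beijing"},
]
index = build_inverted_index(forward, "doc_id", "city", threshold=2)
```

With a threshold of 2, the key value for "Shanghai" stays in an array while the key value for "Beijing" is moved to the larger container.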
2. The method of claim 1, wherein the specified tree structure is a specified prefix tree, the specified prefix tree being a prefix tree whose tree nodes employ array structures for data storage.
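One possible reading of a prefix tree whose nodes store data in arrays is sketched below, under the assumption that the indexed field values are non-negative integers. The class name `ArrayNodeTrie`, the two-level layout, and the 16-bit split are illustrative choices and are not taken from the disclosure.

```python
import bisect

class ArrayNodeTrie:
    """Two-level prefix tree over non-negative integers; each node is a sorted array."""

    def __init__(self, suffix_bits=16):
        self.suffix_bits = suffix_bits
        self.mask = (1 << suffix_bits) - 1
        self.nodes = {}  # high-bit prefix -> sorted list of low-bit suffixes

    def add(self, value):
        prefix, suffix = value >> self.suffix_bits, value & self.mask
        bucket = self.nodes.setdefault(prefix, [])
        pos = bisect.bisect_left(bucket, suffix)
        if pos == len(bucket) or bucket[pos] != suffix:
            bucket.insert(pos, suffix)  # keep the array sorted and duplicate-free

    def __contains__(self, value):
        prefix, suffix = value >> self.suffix_bits, value & self.mask
        bucket = self.nodes.get(prefix)
        if bucket is None:
            return False
        pos = bisect.bisect_left(bucket, suffix)
        return pos < len(bucket) and bucket[pos] == suffix

    def __iter__(self):
        # Yield stored values in ascending order by recombining prefix and suffix.
        for prefix in sorted(self.nodes):
            for suffix in self.nodes[prefix]:
                yield (prefix << self.suffix_bits) | suffix

trie = ArrayNodeTrie()
for doc_id in (70000, 5, 65536, 5):
    trie.add(doc_id)
```

Values that share their high bits share a node, so each node only needs to store the short suffixes.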
3. The method of claim 1 or 2, wherein the array structure has a plurality of types, and array structures of different types store different maximum numbers of elements;
storing the content to be indexed in an array structure comprises:
determining, among the plurality of types of array structures, an array structure of a target type that satisfies a preset selection condition; wherein the preset selection condition is that, after the content to be indexed is stored, the loss of storage space is smaller than a preset capacity threshold;
and storing the content to be indexed in the array structure of the target type.
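The selection condition of claim 3 could, for example, be evaluated as in the following sketch. It assumes that each array type is identified by its maximum element count and that the loss of storage space is measured in unused slots; the capacities, the threshold, and the name `choose_array_capacity` are hypothetical.

```python
ARRAY_CAPACITIES = (4, 8, 16, 32)  # hypothetical array types, by maximum element count
WASTE_THRESHOLD = 8                # hypothetical preset capacity threshold, in unused slots

def choose_array_capacity(n_values, capacities=ARRAY_CAPACITIES, max_waste=WASTE_THRESHOLD):
    """Return the smallest capacity that fits n_values with waste below max_waste, else None."""
    for capacity in sorted(capacities):
        if capacity >= n_values and capacity - n_values < max_waste:
            return capacity
    return None  # no array type satisfies the selection condition
```

Under these assumed values, 9 field values fit the 16-slot type (7 unused slots), whereas 17 field values would leave 15 unused slots in the 32-slot type, so no type qualifies.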
4. The method of claim 3, wherein the storing the content to be indexed in the array structure of the target type comprises:
detecting whether a memory block for storing an array structure of the target type exists in a preset memory pool;
if such a memory block exists, storing the content to be indexed in the array structure of the target type by using the memory block in the memory pool;
and if no such memory block exists, requesting a memory block from system memory, and storing the content to be indexed in the array structure of the target type by using the requested memory block.
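The pool-first allocation of claim 4 might look like the sketch below. Python lists stand in for memory blocks, creating a new list stands in for a request to system memory, and the names `BlockPool` and `store_in_array` are hypothetical.

```python
class BlockPool:
    """Free lists of reusable blocks, keyed by array capacity (the array type)."""

    def __init__(self):
        self.free = {}          # capacity -> list of idle blocks
        self.system_allocs = 0  # number of fall-backs to "system memory"

    def acquire(self, capacity):
        blocks = self.free.get(capacity)
        if blocks:
            return blocks.pop()       # a suitable block exists in the pool: reuse it
        self.system_allocs += 1
        return [None] * capacity      # stand-in for requesting a block from the system

    def release(self, block):
        for i in range(len(block)):
            block[i] = None           # clear the block before returning it to the pool
        self.free.setdefault(len(block), []).append(block)

def store_in_array(pool, capacity, values):
    """Store `values` in a block of the given capacity, preferring pooled blocks."""
    block = pool.acquire(capacity)
    block[:len(values)] = values
    return block

pool = BlockPool()
first = store_in_array(pool, 4, [1, 2, 3])
pool.release(first)
second = store_in_array(pool, 4, [9])  # served from the pool, no new allocation
```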
5. The method of claim 3, wherein after the inverted index table is constructed according to the preset construction mode with the target field value as an index key and the content to be indexed as the key value of the index key, the method further comprises:
in response to a specified update operation on a data entry of the forward index table, updating the key value of the index key corresponding to the update operation in the inverted index table;
wherein the specified update operation comprises a delete operation or an add operation, and the index key corresponding to the update operation is the field value of the specified field in the data entry indicated by the update operation.
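As a simplified illustration of claim 5, the sketch below keeps every key value as a sorted list and applies an add or delete operation to the key value of the affected index key. The function name `apply_update`, the operation strings, and the field names are assumptions made for the example.

```python
import bisect

def apply_update(inverted, operation, entry, index_field, specified_field):
    """Mirror an add/delete on a forward-table entry in the inverted index."""
    if operation not in ("add", "delete"):
        raise ValueError("unsupported operation: " + operation)
    key = entry[specified_field]   # index key corresponding to the update operation
    value = entry[index_field]     # index-field value to add to or remove from the key value
    postings = inverted.setdefault(key, [])
    pos = bisect.bisect_left(postings, value)
    present = pos < len(postings) and postings[pos] == value
    if operation == "add" and not present:
        postings.insert(pos, value)
    elif operation == "delete" and present:
        postings.pop(pos)
    if not postings:
        del inverted[key]          # drop index keys whose key value became empty

inverted = {"Beijing": [1, 3]}
apply_update(inverted, "add", {"doc_id": 2, "city": "Beijing"}, "doc_id", "city")
apply_update(inverted, "delete", {"doc_id": 1, "city": "Beijing"}, "doc_id", "city")
apply_update(inverted, "add", {"doc_id": 7, "city": "Shanghai"}, "doc_id", "city")
```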
6. The method of claim 5, the method further comprising:
detecting whether the loss of storage space corresponding to a target key value in the inverted index table is larger than the preset capacity threshold; wherein the target key value is a key value stored in an array structure, and the loss of storage space corresponding to the target key value is the loss of storage space of the array structure in which the target key value is stored;
if so, determining an array structure of a specified type, and changing the array structure in which the target key value is currently stored to the array structure of the specified type;
wherein, after the target key value is stored in the array structure of the specified type, the loss of storage space is smaller than the preset capacity threshold.
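The check of claim 6 could be sketched as follows, again assuming that array types are identified by capacity and that the loss of storage space is counted in unused slots. The name `retype_if_wasteful` and the default values are hypothetical, and keeping the current type when no better one exists is a design choice of this sketch rather than something the claim specifies.

```python
def retype_if_wasteful(values, capacity, capacities=(4, 8, 16, 32), max_waste=8):
    """Return the capacity the key value should use after an update."""
    n = len(values)
    if capacity - n <= max_waste:
        return capacity                 # loss does not exceed the threshold: keep the type
    for candidate in sorted(capacities):
        if candidate >= n and candidate - n < max_waste:
            return candidate            # specified type whose loss is below the threshold
    return capacity                     # assumption: keep the current type if none qualifies
```

For instance, a key value that shrank to 3 field values inside a 32-slot array wastes 29 slots and would be moved to the 4-slot type, while 10 field values in a 16-slot array waste only 6 slots and stay where they are.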
7. A retrieval method, comprising:
in response to receiving a search request, determining a search term indicated by the search request;
determining a key value of a target index key matched with the search term from a designated index table; wherein the specified index table is an inverted index table constructed based on the method of any one of claims 1-6;
acquiring, from a forward index table corresponding to the specified index table, a data entry whose index field has a field value matching a field value in the key value, and taking the data entry as an initial search result;
and determining a search result corresponding to the search request based on the initial search result.
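The retrieval flow of claim 7 can be illustrated with the minimal sketch below. It assumes the forward index table is held as a dictionary keyed by the index field, and it uses an optional `refine` callback (here a hypothetical sort by a `score` field) to stand for whatever processing turns the initial search result into the final one.

```python
def search(term, inverted, forward_by_id, refine=None):
    """Look up `term`, fetch the matching forward entries, then optionally refine them."""
    key_value = inverted.get(term, [])  # key value of the index key matching the search term
    initial = [forward_by_id[i] for i in key_value if i in forward_by_id]
    return refine(initial) if refine is not None else initial

# Hypothetical forward index table keyed by its index field, and its inverted index.
forward_by_id = {
    1: {"doc_id": 1, "city": "Beijing", "score": 0.4},
    2: {"doc_id": 2, "city": "Shanghai", "score": 0.9},
    3: {"doc_id": 3, "city": "Beijing", "score": 0.7},
}
inverted = {"Beijing": [1, 3], "Shanghai": [2]}

def by_score(rows):
    return sorted(rows, key=lambda row: row["score"], reverse=True)

results = search("Beijing", inverted, forward_by_id, refine=by_score)
```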
8. A data processing apparatus comprising:
the first response module is used for determining, in response to receiving a construction instruction of an inverted index table corresponding to a forward index table, a target field value of a specified field in the forward index table; wherein the specified field is a field other than an index field in the forward index table;
the first determining module is used for determining the content to be indexed corresponding to the target field value based on the forward index table; wherein, the content to be indexed comprises: a field value of an index field contained in a specified data entry in the forward index table, wherein the specified data entry is a data entry of which the specified field has the target field value;
the construction module is used for constructing an inverted index table according to a preset construction mode by taking the target field value as an index key and the content to be indexed as a key value of the index key;
wherein the preset construction mode comprises: if the number of field values included in the content to be indexed does not exceed a preset number threshold, storing the content to be indexed in an array structure; otherwise, storing the content to be indexed in a specified tree structure.
9. A retrieval device, comprising:
the second response module is used for responding to the received search request and determining a search term indicated by the search request;
the second determining module is used for determining the key value of a target index key matched with the search term from a specified index table; wherein the specified index table is an inverted index table constructed based on the method of any one of claims 1-6;
the acquisition module is used for acquiring, from a forward index table corresponding to the specified index table, a data entry whose index field has a field value matching a field value in the key value, as an initial search result;
and the third determining module is used for determining a search result corresponding to the search request based on the initial search result.
10. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
11. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-7.
12. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310465990.5A CN116521816A (en) | 2023-04-26 | 2023-04-26 | Data processing method, retrieval method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116521816A (en) | 2023-08-01 |
Family
ID=87402426
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310465990.5A Pending CN116521816A (en) | 2023-04-26 | 2023-04-26 | Data processing method, retrieval method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116521816A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117519839A (en) * | 2024-01-05 | 2024-02-06 | 恒生电子股份有限公司 | Data loading method and device |
CN117519839B (en) * | 2024-01-05 | 2024-04-16 | 恒生电子股份有限公司 | Data loading method and device |
CN118585528A (en) * | 2024-08-06 | 2024-09-03 | 杭州古珀医疗科技有限公司 | Data query method and device based on dynamic configuration tag inverted index |
CN118585528B (en) * | 2024-08-06 | 2024-10-25 | 杭州古珀医疗科技有限公司 | Data query method and device based on dynamic configuration tag inverted index |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |