CN112328595A - Data searching method, device, equipment and storage medium - Google Patents

Publication number
CN112328595A
CN112328595A CN202011194002.0A CN202011194002A CN112328595A CN 112328595 A CN112328595 A CN 112328595A CN 202011194002 A CN202011194002 A CN 202011194002A CN 112328595 A CN112328595 A CN 112328595A
Authority
CN
China
Prior art keywords
data
target
file
identifier
searching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011194002.0A
Other languages
Chinese (zh)
Inventor
管亚亭
张峻滔
门飞飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Smk Network Technology Co ltd
Original Assignee
Shanghai Smk Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Smk Network Technology Co ltd
Priority to CN202011194002.0A
Publication of CN112328595A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

The application discloses a data searching method, device, equipment and storage medium. The method comprises: receiving a data search instruction that includes a target user identifier and a target search time; searching a preset index set of a database for the target data index corresponding to the target user identifier within the target search time, wherein the target data index includes a target file identifier and the preset index set is obtained by aggregating all tag data corresponding to all user identifiers within a preset historical time period; and acquiring, in the target file corresponding to the target file identifier, the target data corresponding to the data search instruction. The data searching method, device, equipment and storage medium can shorten response time and improve data search efficiency.

Description

Data searching method, device, equipment and storage medium
Technical Field
The application relates to the technical field of internet, in particular to a data searching method, device, equipment and storage medium.
Background
A user portrait is a kind of tag tree that comprehensively depicts the attributes and behavior of a specific user during a specific historical time, and it is often used as the data basis for user analysis, data mining, and the like.
At this stage, all tag data of all users is typically stored in a database in full. To find the tag data of a certain user, the user identifier and the time are therefore usually used as keywords to search through all the tag data in the database. With the development of big data, however, the volume of user tag data stored in the database keeps growing, and searching in this way leads to long response times and low data search efficiency.
Disclosure of Invention
An embodiment of the application aims to provide a data searching method, a data searching device, data searching equipment and a storage medium, so as to solve the technical problems of long response time and low data searching efficiency in the prior art.
The technical scheme of the application is as follows:
In a first aspect, a data searching method is provided, which may include:
receiving a data search instruction, wherein the data search instruction comprises a target user identifier and a target search time;
searching a preset index set of a database for a target data index corresponding to the target user identifier within the target search time, wherein the target data index includes a target file identifier, and the preset index set is obtained by aggregating all tag data corresponding to all user identifiers within a preset historical time period;
and acquiring, in the target file corresponding to the target file identifier, target data corresponding to the data search instruction.
In some embodiments, before receiving the data lookup instruction, the method may further comprise:
acquiring all label data corresponding to all user identifications in a preset historical time period;
and aggregating all the label data according to a preset index aggregation rule to obtain a preset index set of the database.
In some embodiments, aggregating all tag data according to a preset aggregation rule to obtain a preset index set may include:
dividing all the label data to obtain first label data corresponding to each user identifier in each unit time interval;
acquiring a file identifier of a first file where each first label data is located;
according to each file identifier, obtaining a first index corresponding to a file where first tag data corresponding to each user identifier is located in each unit time interval;
and aggregating all the first indexes to obtain a preset index set of the database.
In some embodiments, after the file identifier of the first file in which each first tag data is located is acquired, and before the first index is obtained according to each file identifier, the method may further include:
acquiring a position identifier of each first tag data in the first file, wherein the position identifier comprises at least one of a row identifier and a column identifier;
obtaining, according to each file identifier, the first index corresponding to the file in which the first tag data corresponding to each user identifier is located in each unit time interval may include:
obtaining, according to the file identifier and the position identifier, the first index corresponding to the file in which the first tag data corresponding to each user identifier is located in each unit time interval.
In some embodiments, the target data index may also include a target location identification;
the target user identification and/or the target search time are one or more.
In some embodiments, obtaining target data corresponding to the data search instruction in the target file corresponding to the obtained target file identifier may include:
determining a target file where the target data is located according to the target file identifier;
and acquiring target data corresponding to the data searching instruction in the target file.
In some embodiments, the target data index may also include a target location identification;
acquiring the target data corresponding to the data search instruction in the target file may include:
determining the target position of the target data in the target file according to the target position identification;
and acquiring target data corresponding to the data searching instruction at a target position in the target file.
In some embodiments, the database is HBase.
In a second aspect, a data searching apparatus is provided, which may include:
the receiving module can be used for receiving a data searching instruction, wherein the data searching instruction at least comprises a target user identifier and target searching time;
the searching module can be used for searching a target data index corresponding to the target user identifier in the target searching time in a preset index set of the database; the target data index includes a target file identifier; the preset index set is obtained by aggregating all corresponding label data in a preset historical time period according to all user identifications;
the obtaining module may be configured to obtain, in the target file corresponding to the obtained target file identifier, target data corresponding to the data search instruction.
In a third aspect, data searching equipment is provided, which may include:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the data lookup method as shown in any embodiment of the first aspect.
In a fourth aspect, a storage medium is provided. When instructions in the storage medium are executed by a processor of a data lookup apparatus or a server, they cause the data lookup apparatus or the server to implement the data lookup method shown in any embodiment of the first aspect.
The technical scheme provided by the embodiment of the application at least has the following beneficial effects:
In the embodiments of the application, a data search instruction comprising a target user identifier and a target search time is received, the target data index corresponding to the target user identifier within the target search time is searched for in the preset index set of the database, and the target data is obtained according to that index. Because the data is located through an index, the target data does not need to be searched for among all the tag data in the database. This reduces the amount of data that must be traversed, which shortens the response time and improves data search efficiency.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the description, serve to explain the principles of the application and are not to be construed as limiting the application.
Fig. 1 is a schematic flowchart of a data searching method provided in an embodiment of the present application;
Fig. 2 is a schematic diagram of a process for obtaining a preset index set based on aggregation of all tag data according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a data aggregation variation provided by an embodiment of the present application;
Fig. 4 is a schematic flowchart of a data searching method provided in an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a data searching apparatus according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of data searching equipment according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
Based on the background art, when data is searched in the prior art, the data is searched in all the tag data in the database, so that the response time is long, and the data searching efficiency is low.
Specifically, with the rise of big data technology, the data quantity and the data abundance of the label data of the user are larger and larger. Based on various log data, massive feature labels of the user are calculated and stored, and therefore the user portrait is obtained. The user portrait can provide data basis for various user analyses, such as machine learning, data mining and the like, so that marketing, popularization and the like are facilitated.
At present, when searching for tag data of a specific user at a specific historical time, a user identifier of the specific user and the specific historical time are usually input to a server configured in advance for data query. After receiving the user identifier and the historical time, the server may search all tag data in the database for the tag data corresponding to the user identifier and the historical time. Thus, the data lookup process needs to traverse all the tag data in the database, resulting in a long response time.
Therefore, in order to solve the above technical problem, the present application provides a data search method, device, equipment, and storage medium. A data search instruction including a target user identifier and a target search time is received, the target data index corresponding to the target user identifier within the target search time is searched for in a preset index set of a database, and the target data is obtained according to the target data index. Because the data is located through an index, the target data does not need to be searched for among all the tag data in the database. This reduces the amount of data that must be traversed, which shortens the response time and improves data search efficiency.
First, a data searching method provided in the embodiment of the present application is described in detail below.
Fig. 1 shows a flow diagram of a data searching method provided by an embodiment of the present application, where an execution subject of the data searching method may be a server or a server cluster. As shown in fig. 1, the data search method may include the following steps:
S110, receiving a data searching instruction.
The data search instruction may include a target user identifier and a target search time.
As an example, the target user identifier may be the user identifier of the specific user whose tag data is to be searched, and there may be one or more target user identifiers.
The target search time may be the time of the tag data to be searched. It may be a period of time, such as 00:00-24:00 on October 20, 2020, and there may be one or more target search times.
As a specific example, when performing data search, a worker may input a target user identifier and target search time corresponding to tag data to be searched to a server, and generate a data search instruction including the target user identifier and the target search time. In this way, the server can receive the data search instruction to trigger the data search process according to the data search instruction.
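As an illustration only, a data search instruction of this kind could be modeled as below. This is a minimal sketch: the names `LookupInstruction`, `make_instruction`, `user_ids` and `periods` are hypothetical and do not come from the application, and representing a target search time as a (start, end) pair is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Sequence, Tuple

Period = Tuple[datetime, datetime]  # assumed (start, end) of one target search time


@dataclass(frozen=True)
class LookupInstruction:
    """Hypothetical container for a data search instruction."""
    user_ids: Tuple[str, ...]     # one or more target user identifiers
    periods: Tuple[Period, ...]   # one or more target search times


def make_instruction(user_ids: Sequence[str], periods: Sequence[Period]) -> LookupInstruction:
    """Validate the inputs and build an instruction."""
    if not user_ids or not periods:
        raise ValueError("at least one user identifier and one search time are required")
    for start, end in periods:
        if start >= end:
            raise ValueError("each search time must be a non-empty period")
    return LookupInstruction(tuple(user_ids), tuple(periods))


# One user and one whole-day period.
instruction = make_instruction(
    ["user1"],
    [(datetime(2020, 10, 20, 0, 0), datetime(2020, 10, 21, 0, 0))],
)
```

Using tuples keeps the instruction immutable once it has been generated, which is one possible design choice rather than a requirement of the method.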
S120, searching a preset index set of the database for the target data index corresponding to the target user identifier within the target search time.
The target data index may include a target file identifier.
The database may be HBase. Because data in HBase is stored as key-value pairs, combining indexes with key-value pairs in an HBase database can further improve data search efficiency.
As an example, the preset index set may be obtained by aggregating all tag data corresponding to all user identifiers within a preset historical time period, and it may take the form of a table.
The target data index may be the index of the tag data corresponding to the target user identifier at the target search time; for example, it may include a file identifier.
As a specific example, after receiving a data search instruction including a target user identifier and a target search time, a preset index set in a database may be obtained, where the database may be a database inside a server or a distributed database. Then, a target data index corresponding to the target user identifier within the target search time may be searched in the preset index set.
It can be understood that, when searching for the target data index, all indexes corresponding to the target user identifier may first be found in the preset index set, and the index corresponding to the target search time may then be found among them and determined as the target data index. Alternatively, all indexes corresponding to the target search time may first be found in the preset index set, and the index corresponding to the target user identifier may then be found among them and determined as the target data index.
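The two search orders described above can be sketched as follows. This is only an illustration: the `IndexEntry` layout, the function names and the use of a date string as the unit period are assumptions, not part of the application.

```python
from typing import List, NamedTuple


class IndexEntry(NamedTuple):
    """One row of a hypothetical preset index set."""
    user_id: str
    day: str      # unit period, written here as an ISO date string
    file_id: str


def find_user_then_time(index_set: List[IndexEntry], user_id: str, day: str) -> List[IndexEntry]:
    """Filter by user identifier first, then by search time."""
    by_user = [e for e in index_set if e.user_id == user_id]
    return [e for e in by_user if e.day == day]


def find_time_then_user(index_set: List[IndexEntry], user_id: str, day: str) -> List[IndexEntry]:
    """Filter by search time first, then by user identifier."""
    by_day = [e for e in index_set if e.day == day]
    return [e for e in by_day if e.user_id == user_id]


index_set = [
    IndexEntry("user1", "2020-10-19", "file7"),
    IndexEntry("user1", "2020-10-20", "file9"),
    IndexEntry("user2", "2020-10-20", "file9"),
]
a = find_user_then_time(index_set, "user1", "2020-10-20")
b = find_time_then_user(index_set, "user1", "2020-10-20")
```

Both orders return the same target data index; which one touches fewer entries depends on whether the index set holds more entries per user or per unit period.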
S130, acquiring target data corresponding to the data searching instruction in the target file corresponding to the target file identification.
As an example, the target file may be the file corresponding to the target file identifier, for example a folder or a document.
The target data may be tag data corresponding to the data search instruction, that is, tag data corresponding to the target user identifier at the target search time.
As a specific example, after finding a target data index including a target file identifier corresponding to a target user identifier at a target finding time, target data corresponding to a data finding instruction may be obtained in a target file corresponding to the target data index.
In some embodiments, the specific implementation method of step S130 may be as follows:
determining a target file where the target data is located according to the target file identifier;
and acquiring target data corresponding to the data searching instruction in a target file.
As a specific example, a target file identifier in the target data index may be obtained, and a target file where the target data is located is determined according to the target file identifier, that is, a target file corresponding to the target file identifier is determined. Then, the target data corresponding to the data search instruction, that is, the target data corresponding to the target user identifier at the target search time may be obtained in the target file.
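Steps S120 and S130 can be pictured with plain dictionaries standing in for the preset index set and the files. This is a sketch under stated assumptions: the dictionary layouts, the name `look_up` and the sample identifiers are hypothetical, and a real deployment would read from the database instead.

```python
from typing import Dict, List, Tuple

IndexKey = Tuple[str, str]  # (user identifier, unit period)

# Hypothetical preset index set: (user, day) -> file identifier.
preset_index: Dict[IndexKey, str] = {
    ("user1", "2020-10-20"): "file9",
    ("user2", "2020-10-20"): "file3",
}

# Hypothetical file store: file identifier -> {(user, day): tag data}.
files: Dict[str, Dict[IndexKey, List[str]]] = {
    "file9": {("user1", "2020-10-20"): ["tag_a", "tag_b"]},
    "file3": {("user2", "2020-10-20"): ["tag_c"]},
}


def look_up(user_id: str, day: str) -> List[str]:
    """S120: find the target data index; S130: read the target data from the target file."""
    file_id = preset_index.get((user_id, day))  # index lookup, no scan over all tag data
    if file_id is None:
        return []                               # no index entry for this user and period
    target_file = files[file_id]                # determine the target file
    return target_file.get((user_id, day), [])  # acquire the target data
```

For example, `look_up("user1", "2020-10-20")` consults only one index entry and one file instead of every stored tag record.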
In the embodiments of the application, a data search instruction comprising a target user identifier and a target search time is received, the target data index corresponding to the target user identifier within the target search time is searched for in the preset index set of the database, and the target data is obtained according to that index. Because the data is located through an index, the target data does not need to be searched for among all the tag data in the database. This reduces the amount of data that must be traversed, which shortens the response time and improves data search efficiency.
In some embodiments, a preset index set of the database may be aggregated before performing the data lookup process. Accordingly, before performing the above steps S110 to S130, the following steps may be further performed:
acquiring all label data corresponding to all user identifications in a preset historical time period;
and aggregating all the label data according to a preset index aggregation rule to obtain a preset index set of the database.
As an example, the preset history period may be a preset time period for aggregating data to obtain a preset index set. Since the tag data corresponding to the user is generated every day, the preset historical time period is usually set to one day, that is, the preset index set is obtained by aggregating the newly generated tag data every day. It is to be understood that, each time the preset index set is aggregated, only the corresponding index of the newly generated tag data may be added to the existing index.
The preset aggregation rule may be a preset aggregation method for aggregating all tag data corresponding to all user identifiers in a preset historical time period to obtain a preset index set.
As a specific example, all tag data generated by all user identifiers within a preset history period may be acquired before the data lookup process is performed. This tag data may then be aggregated according to the preset aggregation rule to obtain the preset index set of the database.
Obtaining the preset index set of the database through pre-aggregation has two benefits. On the one hand, it provides the data basis for the data lookup method described above. On the other hand, compared with aggregating the preset index set in real time for every data search, it reduces the number of aggregation passes, so the response time of a data search can be further reduced and data search efficiency improved.
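The incremental nature of the pre-aggregation (adding only the index of newly generated tag data to the existing index) might look like the sketch below. The nested-dictionary layout and the name `add_day` are assumptions made for illustration.

```python
from typing import Dict

DayIndex = Dict[str, str]          # user identifier -> file identifier
PresetIndex = Dict[str, DayIndex]  # unit period (day) -> index of that day


def add_day(preset_index: PresetIndex, day: str, day_index: DayIndex) -> None:
    """Append the index of newly generated tag data; days already indexed stay untouched."""
    if day in preset_index:
        raise ValueError(f"day {day} is already indexed")
    preset_index[day] = dict(day_index)  # copy so later changes to the input do not leak in


# Existing index built on earlier days.
preset_index: PresetIndex = {"2020-10-19": {"user1": "file7"}}

# Daily job: only the new day's index is added.
add_day(preset_index, "2020-10-20", {"user1": "file9", "user2": "file9"})
```

Because earlier days are never recomputed, the cost of each aggregation run depends only on the tag data generated in the newest period.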
In some embodiments, the specific implementation manner of aggregating all the tag data according to the preset aggregation rule to obtain the preset index set may be as follows:
dividing all the label data to obtain first label data corresponding to each user identifier in each unit time interval;
acquiring a file identifier of a first file where each first label data is located;
according to each file identifier, obtaining a first index corresponding to a file where first tag data corresponding to each user identifier is located in each unit time interval;
and aggregating all the first indexes to obtain a preset index set of the database.
As an example, the unit period may be a time unit preset to divide all tag data corresponding to all user identifications within a preset history period, such as one day.
The first tag data may be tag data corresponding to each user identification for each unit period.
The first file may be a file in which the first tag data is located.
As a specific example, after all the tag data corresponding to all the user identifiers in the preset history period is acquired, it may be divided by unit period to obtain the first tag data corresponding to each user identifier in each unit period. The first file in which each first tag data is located may then be determined and its file identifier acquired. According to each file identifier, in combination with each unit period, the first index corresponding to the file in which each user identifier's first tag data is located in each unit period is obtained; the format of the first index may be: user (file identifier). Finally, all the first indexes may be aggregated, and the resulting first index set is the preset index set of the database.
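The aggregation described above (divide by unit period, take the file identifier, form one first index per user and period, then collect the indexes) can be sketched as follows. The `TagRecord` layout and the rule that one user's tag data for one period sits in a single file are assumptions made to keep the example small.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class TagRecord(NamedTuple):
    """Hypothetical raw tag record as stored before aggregation."""
    user_id: str
    day: str      # unit period
    tag: str
    file_id: str  # file in which this tag data is located


def build_preset_index(records: Iterable[TagRecord]) -> Dict[Tuple[str, str], str]:
    """Return {(day, user): file identifier}, i.e. one first index per user and unit period."""
    index: Dict[Tuple[str, str], str] = {}
    for r in records:
        key = (r.day, r.user_id)
        # Assumption: all tag data of one user in one unit period is in one file.
        if index.setdefault(key, r.file_id) != r.file_id:
            raise ValueError(f"tag data for {key} is spread over several files")
    return index


records = [
    TagRecord("user1", "2020-10-20", "tag_a", "file9"),
    TagRecord("user1", "2020-10-20", "tag_b", "file9"),
    TagRecord("user2", "2020-10-20", "tag_c", "file3"),
    TagRecord("user2", "2020-10-20", "tag_d", "file3"),
]
preset_index = build_preset_index(records)
```

Four tag records collapse into two index entries here, one per user for the single day.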
It is to be understood that the preset index set actually consists of the first indexes of the tag data corresponding to each user identifier in each unit period. On this basis, multiple preset index sets may be further aggregated into an annual index set, for example in the chronological order of the periods to which they belong. In this way, it is possible to query directly which files each user appears in during which unit periods, so the target data can be located more precisely.
As a specific example, Fig. 2 is a schematic diagram illustrating a process of obtaining a preset index set based on aggregation of all tag data according to an embodiment of the present application. In Fig. 2, users 1 to M represent different user identifiers, tag N represents the N-th tag data of a user, and D represents the D-th date. After all the tag data corresponding to all the user identifiers in the preset historical time period is obtained, it can be preliminarily divided to obtain the first tag data 210 corresponding to each user identifier in each unit period, that is, on each date; each row in 210 represents the first tag data corresponding to one user identifier on one date. In connection with Fig. 3, at this stage the data magnitude of all tag data is M × N × D.
Then, the first file in which each first tag data is located may be determined and the file identifier of each first file obtained. According to each file identifier, the first index corresponding to the file in which each user identifier's first tag data is located in each unit period may be obtained, and the first indexes are then aggregated to obtain the preset index set of the database, that is, 220 in Fig. 2. Each index set 220 may include the file identifiers corresponding to the first tag data of all user identifiers on one date, that is, the first index of each user identifier's first tag data on that date, for example user 1 (file identifier 1); each row in 220 represents one first index. In connection with Fig. 3, at this stage the data magnitude is M × D.
It is understood that aggregation processing may also be performed on a plurality of preset index sets, so as to obtain the annual index set 230 of the database. That is, the preset index sets of all dates in a year may be aggregated into the same list to obtain an annual index set, where the annual index set may include indexes corresponding to the tag data of each user identifier on each date. In connection with fig. 3, at this stage, the data magnitude of all tag data is M.
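One way to picture the annual index set is a regrouping of the per-date index sets into a single row per user. The sketch below assumes each preset index set maps user identifiers to file identifiers and that dates are ISO strings; the name `build_annual_index` is hypothetical.

```python
from typing import Dict


def build_annual_index(preset_index_sets: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Turn {day: {user: file}} into {user: {day: file}}, with days in chronological order."""
    annual: Dict[str, Dict[str, str]] = {}
    for day in sorted(preset_index_sets):  # ISO date strings sort chronologically
        for user_id, file_id in preset_index_sets[day].items():
            annual.setdefault(user_id, {})[day] = file_id
    return annual


preset_index_sets = {
    "2020-10-20": {"user1": "file9", "user2": "file9"},
    "2020-10-19": {"user1": "file7"},
}
annual_index = build_annual_index(preset_index_sets)
```

The result has one entry per user (magnitude M), and each entry lists the files in which that user appears on each date.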
Therefore, the tag aggregation processing yields a preset index set that provides the data basis for data searching. It also reduces the data magnitude from M × N × D to M × D or M, which effectively reduces the data volume. During a data search, only a small number of preset index sets need to be searched instead of traversing all tag data, so the response time can be further reduced and data search efficiency improved.
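To make the three data magnitudes from the stages above concrete, the following lines compute them for made-up values of M, N and D. The numbers are purely illustrative and do not come from the application.

```python
M = 1_000   # number of user identifiers (illustrative)
N = 50      # tag data items per user per date (illustrative)
D = 365     # number of dates (illustrative)

raw_rows = M * N * D        # all tag data: one item per user, tag and date
preset_index_rows = M * D   # preset index sets: one first index per user and date
annual_index_rows = M       # annual index set: one row per user

print(raw_rows, preset_index_rows, annual_index_rows)  # 18250000 365000 1000
```

With these values, a lookup over the preset index sets touches a factor of N fewer entries than a scan of the raw tag data, and the annual index set is smaller again by a factor of D.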
In some embodiments, the location identifier may be combined with the file identifier as the first index. Correspondingly, after the file identifier of each first tag data is obtained, before the first index corresponding to the file of each user identifier in each unit time interval is obtained according to the file identifier, the following steps may also be performed:
and acquiring the position identification of each first label data in the first file.
The position identifier may include at least one of a row identifier and a column identifier, such as a row number and a column number.
At this time, a specific implementation manner of obtaining the first index corresponding to the file in which each user identifier is located in each unit time interval according to the file identifier may be as follows:
and obtaining a first index corresponding to the file where each user identifier is located in each unit time interval according to the file identifier and the position identifier.
As a specific example, after the file identifier of the first file in which each first tag data is located is obtained, the position identifier of the first tag data within the first file, for example its row number, may also be determined. The first index may then be obtained by combining the file identifier and the position identifier, in which case its format may be: user (file identifier, position identifier), such as user 1 (file 1, row number 10).
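A first index that carries both a file identifier and a row identifier could be built as in the sketch below, shown for a single unit period. The in-memory `files` layout, the 1-based row numbering and the function names are assumptions for illustration.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, List[str]]  # (user identifier, tag data of that user)

# Hypothetical files of one unit period: file identifier -> rows.
files: Dict[str, List[Row]] = {
    "file1": [("user3", ["tag_x"]), ("user1", ["tag_a", "tag_b"])],
    "file2": [("user2", ["tag_c"])],
}


def build_positional_index(files: Dict[str, List[Row]]) -> Dict[str, Tuple[str, int]]:
    """Return {user: (file identifier, row number)}; row numbers are assumed to start at 1."""
    index: Dict[str, Tuple[str, int]] = {}
    for file_id, rows in files.items():
        for row_number, (user_id, _tags) in enumerate(rows, start=1):
            index[user_id] = (file_id, row_number)
    return index


def format_entry(user_id: str, entry: Tuple[str, int]) -> str:
    """Render an entry in the 'user (file identifier, position identifier)' format."""
    file_id, row_number = entry
    return f"{user_id} ({file_id}, row {row_number})"


index = build_positional_index(files)
```

Here `format_entry("user1", index["user1"])` yields `user1 (file1, row 2)`, mirroring the format given in the text.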
In some embodiments, in the case that the first index includes a file identifier and a location identifier, the target data index in step S120 may further include a target location identifier on the basis of including the target file identifier.
In this way, the first index includes not only the file identifier but also the position identifier, so it locates the data more precisely. Accordingly, when data is searched, the target data index may also include the target location identifier, which makes the target data index more precise and thereby further reduces response time and improves data search efficiency.
In some embodiments, in a case that the target data index further includes a target location identifier, the specific implementation method for obtaining the target data corresponding to the data search instruction in the target file may be as follows:
determining the target position of the target data in the target file according to the target position identification;
and acquiring target data corresponding to the data searching instruction at a target position in the target file.
As a specific example, in the case that the target data index further includes a target location identifier, after determining a target file corresponding to the target file identifier, the target location identifier in the target data index may also be obtained, and the target location where the target data is located is determined according to the target location identifier, for example, the line number of the target data in the target file may be determined. Then, at the target position in the target file, target data corresponding to the data search instruction, that is, target data corresponding to the target user identifier at the target search time may be acquired.
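Reading the target data at a known row might be sketched as follows, with an in-memory text stream standing in for the target file. The tab-separated line layout and the name `read_row` are assumptions; the application does not specify a file format.

```python
import io
from itertools import islice
from typing import Optional, TextIO


def read_row(target_file: TextIO, row_number: int) -> Optional[str]:
    """Return the line at a 1-based row number, or None if the file has fewer rows."""
    if row_number < 1:
        raise ValueError("row numbers start at 1")
    # Skip directly to the requested row without inspecting the content of earlier rows.
    line = next(islice(target_file, row_number - 1, row_number), None)
    return None if line is None else line.rstrip("\n")


# Stand-in for a target file; each line holds one user's tag data.
target_file = io.StringIO("user3\ttag_x\nuser1\ttag_a,tag_b\nuser2\ttag_c\n")
target_data = read_row(target_file, 2)  # target location identifier: row 2
```

In this sketch the preceding lines are still read from the stream, but none of them has to be parsed or compared against the user identifier; a storage format with addressable rows would avoid even that.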
Therefore, on the basis of determining the target file where the target data is located, the position of the target data in the target file is further determined, so that the precision of the target data index is higher, the data positioning is more accurate, the response time can be further shortened, and the data searching efficiency is improved.
In order to describe the data searching method provided in the embodiment of the present application more clearly, the data searching method provided in the embodiment of the present application is described below with reference to fig. 4. As shown in fig. 4, the data search method may include the steps of:
S410, acquiring all tag data corresponding to all user identifiers in a preset historical time period.
S420, dividing all the tag data to obtain first tag data corresponding to each user identifier in each unit time period.
S430, acquiring the file identifier and the location identifier of the first file in which each piece of first tag data is located.
S440, obtaining, according to each file identifier, a first index corresponding to the file in which the first tag data corresponding to each user identifier in each unit time period is located.
S450, aggregating all the first indexes to obtain a preset index set of the database.
S460, aggregating a plurality of preset index sets to obtain a year index set of the database.
S470, importing the preset index set into HBase.
The specific implementation method and principle of the above steps are the same as those of the above method embodiments, and for the sake of brevity, detailed description is omitted here.
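Steps S410 to S450 can be sketched as follows. This is only an illustration under stated assumptions: the unit time period is one day, each tag record already carries the file identifier and line number of where it is stored, and the names `TagRecord` and `build_preset_index_set` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class TagRecord(NamedTuple):
    user_id: str
    day: str       # unit time period, e.g. "2020-10-30"
    tag_id: str
    file_id: str   # file identifier of the file holding this record
    line_no: int   # location identifier within that file


def build_preset_index_set(
    records: Iterable[TagRecord],
) -> Dict[Tuple[str, str], List[Tuple[str, int]]]:
    # S420: divide the tag data by user identifier and unit time period.
    first_tag_data = defaultdict(list)
    for rec in records:
        first_tag_data[(rec.user_id, rec.day)].append(rec)

    # S430/S440: derive a first index (file identifier, location identifier)
    # for each user identifier in each unit time period.
    # S450: collect all first indexes into the preset index set.
    preset_index_set = {}
    for key, recs in first_tag_data.items():
        preset_index_set[key] = sorted({(r.file_id, r.line_no) for r in recs})
    return preset_index_set


# S410: hypothetical tag data for a two-day historical period.
records = [
    TagRecord("u1", "2020-10-29", "t1", "f_a", 3),
    TagRecord("u1", "2020-10-29", "t2", "f_a", 3),
    TagRecord("u1", "2020-10-30", "t1", "f_b", 1),
    TagRecord("u2", "2020-10-30", "t9", "f_b", 2),
]
index = build_preset_index_set(records)
print(index[("u1", "2020-10-29")])
```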
As a specific example, taking a unit time period as one day as an example, the aggregation process of the preset index set may be:
1) Aggregating all the tag data of each user into one row, with the user identifier as the primary key, and storing it in the format "user identifier <tag ID>" as a day table, shown as 210 in fig. 2.
2) For each day table, taking the user identifier as the primary key, acquiring the file identifier and the row identifier from the file information of the file in which the tag data corresponding to each user identifier is located, and storing them in the format "user identifier (file identifier, row identifier)" as the day index for that day, shown as 220 in fig. 2. A plurality of day indexes are aggregated to obtain the preset index set.
3) Aggregating the day indexes of all days, with the user identifier as the primary key, into the format "user identifier <date (file number, line number)>". This step generates a year index set, organized in units of years. From it, one can tell in which rows of which files each user appears on which days, so that the target data can be located more precisely.
4) The day indexes may also be synchronized and stored into HBase, with the user identifier as the row and the date as the column, so that the file identifier and the row identifier of the target data can be acquired at the minute level.
The specific implementation method and principle of the above steps are the same as those of the above method embodiments, and for the sake of brevity, detailed description is omitted here.
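Items 3) and 4) above can be illustrated with plain Python dictionaries. This sketch does not use any HBase client; it only mimics the layout of one row per user identifier and one column per date. The column family name `idx`, the cell encoding `file:row`, and both function names are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Location = Tuple[str, int]                    # (file identifier, row identifier)
DayIndex = Dict[str, List[Location]]          # user identifier -> locations


def build_year_index(day_indexes: Dict[str, DayIndex]) -> Dict[str, Dict[str, List[Location]]]:
    """Aggregate day indexes into: user identifier -> {date: [(file, row), ...]}."""
    year_index = defaultdict(dict)
    for date, day_index in sorted(day_indexes.items()):
        for user_id, locations in day_index.items():
            year_index[user_id][date] = list(locations)
    return dict(year_index)


def to_row_column_cells(year_index, family: str = "idx") -> List[Tuple[str, str, str]]:
    """Flatten to (row key, column, value) cells: row = user, column = date."""
    cells = []
    for user_id, by_date in sorted(year_index.items()):
        for date, locations in sorted(by_date.items()):
            value = ";".join(f"{file_id}:{row}" for file_id, row in locations)
            cells.append((user_id, f"{family}:{date}", value))
    return cells


# Hypothetical day indexes for two days.
day_indexes = {
    "2020-10-29": {"u1": [("f_a", 3)]},
    "2020-10-30": {"u1": [("f_b", 1)], "u2": [("f_b", 2)]},
}
year_index = build_year_index(day_indexes)
cells = to_row_column_cells(year_index)
for cell in cells:
    print(cell)
```

With this layout, a query for one user on one date reads a single cell, which is the access pattern the row and column arrangement is meant to support.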
Based on the same inventive concept, the embodiment of the application also provides a data searching device. The data search apparatus is explained below.
Fig. 5 shows a data searching apparatus provided in an embodiment of the present application. As shown in fig. 5, the data searching apparatus 500 may include:
a receiving module 510, configured to receive a data search instruction, where the data search instruction at least includes a target user identifier and a target search time;
the searching module 520 may be configured to search a target data index corresponding to the target user identifier in the target search time in a preset index set of the database; the target data index includes a target file identifier; the preset index set is obtained by aggregating all corresponding label data in a preset historical time period according to all user identifications;
the obtaining module 530 may be configured to obtain, in the target file corresponding to the obtained target file identifier, target data corresponding to the data search instruction.
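The division into a receiving module, a searching module and an obtaining module can be pictured with the following sketch. It keeps the preset index set and the files in memory for brevity; the class names, the record layout, and the `handle` convenience method are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class SearchInstruction:
    target_user_id: str
    target_search_time: str


class DataSearchApparatus:
    def __init__(
        self,
        preset_index_set: Dict[Tuple[str, str], str],   # (user, time) -> file identifier
        files: Dict[str, List[dict]],                   # file identifier -> records
    ) -> None:
        self._index = preset_index_set
        self._files = files

    def receive(self, user_id: str, search_time: str) -> SearchInstruction:
        """Receiving module: accept a data search instruction."""
        return SearchInstruction(user_id, search_time)

    def search(self, instr: SearchInstruction) -> Optional[str]:
        """Searching module: look up the target file identifier in the index."""
        return self._index.get((instr.target_user_id, instr.target_search_time))

    def obtain(self, instr: SearchInstruction, file_id: str) -> List[dict]:
        """Obtaining module: read matching records from the target file only."""
        return [
            rec
            for rec in self._files[file_id]
            if rec["user_id"] == instr.target_user_id
            and rec["time"] == instr.target_search_time
        ]

    def handle(self, user_id: str, search_time: str) -> List[dict]:
        instr = self.receive(user_id, search_time)
        file_id = self.search(instr)
        return [] if file_id is None else self.obtain(instr, file_id)


apparatus = DataSearchApparatus(
    preset_index_set={("u1", "2020-10-30"): "f_b"},
    files={
        "f_b": [
            {"user_id": "u1", "time": "2020-10-30", "tag": "t1"},
            {"user_id": "u2", "time": "2020-10-30", "tag": "t9"},
        ]
    },
)
found = apparatus.handle("u1", "2020-10-30")
print(found)
```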
In some embodiments, the data searching apparatus 500 may further include:
the acquisition module can be used for acquiring all label data corresponding to all user identifications in a preset historical time period;
and the aggregation module can be used for aggregating all the label data according to a preset index aggregation rule to obtain a preset index set of the database.
In some embodiments, the aggregation module may include:
the dividing unit may be configured to divide all tag data to obtain first tag data corresponding to each user identifier in each unit time period;
the first acquisition unit is used for acquiring the file identifier of the first file where each piece of first label data is located;
the first index unit may be configured to obtain, according to each file identifier, a first index corresponding to a file in which the first tag data corresponding to each user identifier is located in each unit time period;
and the aggregation unit can be used for aggregating all the first indexes to obtain a preset index set of the database.
In some embodiments, the aggregation module may further include:
the second obtaining unit may be configured to obtain a location identifier of each first tag data in the first file, where the location identifier includes at least one of a row identifier and a column identifier;
the first indexing unit may be configured to obtain, according to the file identifier and the location identifier, a first index corresponding to a file in which each user identifier is located in each unit time period.
In some embodiments, the target data index further includes a target location identification;
there are one or more target user identifiers and/or one or more target search times.
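When an instruction carries several target user identifiers and/or several target search times, one simple way to resolve it is to look up every combination in the preset index set. The sketch below assumes the index is keyed by (user identifier, search time) and maps to (file identifier, location identifier); the function name `search_many` is hypothetical.

```python
from itertools import product
from typing import Dict, Iterable, Tuple

IndexKey = Tuple[str, str]      # (user identifier, search time)
IndexEntry = Tuple[str, int]    # (file identifier, location identifier)


def search_many(
    preset_index_set: Dict[IndexKey, IndexEntry],
    user_ids: Iterable[str],
    search_times: Iterable[str],
) -> Dict[IndexKey, IndexEntry]:
    """Return the index entries found for every (user, time) combination."""
    hits = {}
    for key in product(user_ids, search_times):
        if key in preset_index_set:
            hits[key] = preset_index_set[key]
    return hits


# Hypothetical preset index set.
preset_index_set = {
    ("u1", "2020-10-29"): ("f_a", 3),
    ("u1", "2020-10-30"): ("f_b", 1),
    ("u2", "2020-10-30"): ("f_b", 2),
}
hits = search_many(preset_index_set, ["u1", "u2"], ["2020-10-30"])
print(hits)
```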
In some embodiments, the obtaining module 530 may include:
the first determining unit may be configured to determine, according to the target file identifier, a target file in which the target data is located;
the third obtaining unit may be configured to obtain target data corresponding to the data search instruction in the target file.
In some embodiments, the target data index further includes a target location identification;
the obtaining module 530 may further include:
the second determining unit may be configured to determine a target location of the target data in the target file according to the target location identifier;
the fourth obtaining unit may be configured to obtain, at a target position in the target file, target data corresponding to the data search instruction.
In some embodiments, the database may be an HBase.
The data searching apparatus may be configured to execute the method provided in the foregoing method embodiment, and the specific implementation principle and technical effect are similar, which are not described herein again for brevity.
Based on the same inventive concept, the present application also provides a data search device, as shown in fig. 6, which may include a processor 601 and a memory 602 storing computer program instructions.
Specifically, the processor 601 may include a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits configured to implement the embodiments of the present invention.
Memory 602 may include mass storage for data or instructions. By way of example, and not limitation, memory 602 may include a Hard Disk Drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 602 may include removable or non-removable (or fixed) media, where appropriate. The memory 602 may be internal or external to the data search device, where appropriate. In a particular embodiment, the memory 602 is a non-volatile solid-state memory.
The memory may include Read Only Memory (ROM), Random Access Memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, electrical, optical, or other physical/tangible memory storage devices. Thus, in general, the memory includes one or more tangible (non-transitory) computer-readable storage media (e.g., memory devices) encoded with software comprising computer-executable instructions and when the software is executed (e.g., by one or more processors), it is operable to perform operations described with reference to the methods according to an aspect of the present disclosure.
The processor 601 realizes any of the data searching methods in the above embodiments by reading and executing computer program instructions stored in the memory 602.
In one example, the data search device may also include a communication interface 603 and a bus 610. As shown in fig. 6, the processor 601, the memory 602, and the communication interface 603 are connected via the bus 610 and communicate with one another through it.
The communication interface 603 is mainly used for implementing communication between modules, apparatuses, units and/or devices in the embodiments of the present application.
Bus 610 includes hardware, software, or both, coupling the components of the data search device to each other. By way of example, and not limitation, a bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or other suitable bus, or a combination of two or more of these. Bus 610 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
In addition, in combination with the data searching method in the foregoing embodiments, the embodiments of the present application may provide a storage medium for implementing the method. The storage medium has computer program instructions stored thereon; the computer program instructions, when executed by a processor, implement any of the data searching methods in the above embodiments.
It is to be understood that the present application is not limited to the particular arrangements and instrumentality described above and shown in the attached drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present application are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications, and additions or change the order between the steps after comprehending the spirit of the present application.
The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic Circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the present application are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include electronic circuits, semiconductor memory devices, ROM, flash memory, Erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.
It should also be noted that the exemplary embodiments mentioned in this application describe some methods or systems based on a series of steps or devices. However, the present application is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
Aspects of the present disclosure are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware for performing the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As described above, only the specific embodiments of the present application are provided, and it can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions should be covered within the scope of the present application.

Claims (11)

1. A method for data retrieval, comprising:
receiving a data searching instruction, wherein the data searching instruction comprises a target user identifier and target searching time;
searching a target data index corresponding to the target user identification in the target searching time in a preset index set of a database; the target data index includes a target file identification; the preset index set is obtained by aggregating all corresponding label data of all user identifications in a preset historical time period;
and acquiring target data corresponding to the data search instruction in the target file corresponding to the target file identifier.
2. The method of claim 1, wherein prior to receiving the data lookup instruction, the method further comprises:
acquiring all label data corresponding to all user identifications in a preset historical time period;
and aggregating all the label data according to a preset index aggregation rule to obtain a preset index set of the database.
3. The method according to claim 2, wherein the aggregating all the tag data according to a preset aggregation rule to obtain a preset index set comprises:
dividing all the label data to obtain first label data corresponding to each user identifier in each unit time interval;
acquiring a file identifier of a first file in which each first label data is located;
obtaining a first index corresponding to a file in which first tag data corresponding to each user identifier is located in each unit time interval according to each file identifier;
and aggregating all the first indexes to obtain a preset index set of the database.
4. The method according to claim 3, wherein after the obtaining of the file identifier of each first tag data, before obtaining the first index corresponding to the file of each user identifier in each unit time interval according to the file identifier, the method further comprises:
acquiring a position identifier of each first label data in the first file, wherein the position identifier comprises at least one of a row identifier and a column identifier;
the obtaining a first index corresponding to a file in which each user identifier is located in each unit time interval according to the file identifier includes:
and obtaining a first index corresponding to the file where each user identifier is located in each unit time interval according to the file identifier and the position identifier.
5. The method of claim 4, wherein the target data index further comprises a target location identification;
the target user identification and/or the target search time are one or more.
6. The method according to claim 1, wherein the obtaining target data corresponding to the data search instruction from the target file corresponding to the obtained target file identifier comprises:
determining a target file where the target data is located according to the target file identifier;
and acquiring target data corresponding to the data searching instruction in the target file.
7. The method of claim 4, wherein the target data index further comprises a target location identification;
the obtaining of the target data corresponding to the data search instruction in the target file includes:
determining the target position of the target data in the target file according to the target position identification;
and acquiring target data corresponding to the data searching instruction at a target position in the target file.
8. The method according to any one of claims 1 to 7, wherein the database is HBase.
9. A data search apparatus, comprising:
the receiving module is used for receiving a data searching instruction, and the data searching instruction at least comprises a target user identifier and target searching time;
the searching module is used for searching a target data index corresponding to the target user identifier in the target searching time in a preset index set of a database; the target data index includes a target file identification; the preset index set is obtained by aggregating all corresponding label data of all user identifications in a preset historical time period;
and the acquisition module is used for acquiring target data corresponding to the data search instruction from the target file corresponding to the acquired target file identifier.
10. A data search apparatus, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the data lookup method of any one of claims 1 to 8.
11. A storage medium, wherein instructions in the storage medium, when executed by a processor of a data search apparatus or an electronic device, cause the data search apparatus or the server to implement the data search method according to any one of claims 1 to 8.
CN202011194002.0A 2020-10-30 2020-10-30 Data searching method, device, equipment and storage medium Pending CN112328595A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011194002.0A CN112328595A (en) 2020-10-30 2020-10-30 Data searching method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011194002.0A CN112328595A (en) 2020-10-30 2020-10-30 Data searching method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112328595A true CN112328595A (en) 2021-02-05

Family

ID=74297608

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011194002.0A Pending CN112328595A (en) 2020-10-30 2020-10-30 Data searching method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112328595A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1828607A (en) * 2006-04-03 2006-09-06 无锡永中科技有限公司 Data search method for tree-type structural file
CN102193917A (en) * 2010-03-01 2011-09-21 中国移动通信集团公司 Method and device for processing and querying data
CN102375852A (en) * 2010-08-24 2012-03-14 中国移动通信集团公司 Method for building data index as well as method and system using data index for inquiring data
CN106126592A (en) * 2016-06-20 2016-11-16 北京小米移动软件有限公司 The processing method and processing device of search data
CN106156164A (en) * 2015-04-15 2016-11-23 腾讯科技(深圳)有限公司 resource information processing method and device
CN107704527A (en) * 2017-09-18 2018-02-16 华为技术有限公司 Date storage method, device and storage medium
US10169349B2 (en) * 2013-09-05 2019-01-01 Smith Seckman Reid, Inc. Library indexing system and method
CN109687991A (en) * 2018-09-07 2019-04-26 平安科技(深圳)有限公司 User behavior recognition method, apparatus, equipment and storage medium
CN111813744A (en) * 2020-07-08 2020-10-23 平安科技(深圳)有限公司 File searching method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20210205