CN108268512B - Label query method and device - Google Patents

Info

Publication number
CN108268512B
Authority
CN
China
Prior art keywords
query
data
tag
label
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611263049.1A
Other languages
Chinese (zh)
Other versions
CN108268512A (en)
Inventor
喻弘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Shanghai Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Shanghai Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201611263049.1A priority Critical patent/CN108268512B/en
Publication of CN108268512A publication Critical patent/CN108268512A/en
Application granted granted Critical
Publication of CN108268512B publication Critical patent/CN108268512B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • G06F16/24534Query rewriting; Transformation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a tag query method and device. The method comprises: receiving a tag query request, generating a tag query plan according to the tag query request, and analyzing and splitting the tag query plan according to the data stored by a plurality of data platforms to obtain a plurality of tag query sub-plans; acquiring, according to pre-stored definition information of tag data, the plurality of data platforms corresponding to the plurality of tag query sub-plans, and generating a plurality of query commands according to the tag query sub-plans and the types of the data platforms; sending the corresponding query commands to the data platforms through the query interfaces corresponding to the data platforms, so that the data platforms execute the corresponding query tasks according to the query commands; and receiving a plurality of query sub-results sent by the data platforms, performing aggregation association calculation on the query sub-results, and obtaining and outputting a query result. The device is used for executing the method. Embodiments of the invention thereby realize cross-platform tag query.

Description

Label query method and device
Technical Field
The embodiment of the invention relates to the technical field of data services, in particular to a label query method and a label query device.
Background
A tag is the result of aggregating and analyzing user information, behavior data, order data and location data (for example a customer's basic attributes, behavior characteristics and service preferences) and then, based on business rules, re-describing the customer's characteristics in a natural-language-like form, for example: high-end white-collar worker, campus customer, Zhou Jie Lun fan, stockholder, or potential 4G terminal customer. Characterizing and using tag information effectively makes it possible to grasp customers' attributes comprehensively and to sense changes in their behavior; it supports daily operations, precision marketing and customer service, and can support the expansion of business models, the exploration of future directions and the development of new markets. A tag library is a system that carries tag management and tag application services; its main functions include tag source data access, tag rule configuration, tag calculation, tag result generation and tag application.
In the prior art, a tag library is generally built according to the requirements of a business application scenario: on a daily or monthly data cycle, business data from a plurality of source systems is synchronously copied to a single target data platform (a data warehouse, a data mart, an application system database, etc.), tag rule configuration and data association calculation are carried out on that target data platform, and the required result tags are generated and applied. Because every tag is computed on that single target platform, tag data held on other platforms can only be used after it has been copied there, which takes up storage and duplicates data.
Therefore, how to realize cross-platform tag query has become a problem to be solved urgently.
Disclosure of Invention
In view of the defects in the prior art, embodiments of the present invention provide a method and an apparatus for querying a tag.
In one aspect, an embodiment of the present invention provides a tag query method, including:
receiving a tag query request, generating a tag query plan according to the tag query request, and analyzing and splitting the tag query plan according to data stored by a plurality of data platforms to obtain a plurality of tag query sub-plans;
acquiring a plurality of data platforms corresponding to the plurality of tag query sub-plans according to definition information of pre-stored tag data, and generating a plurality of query commands according to the plurality of tag query sub-plans and the types of the plurality of data platforms;
sending the corresponding query commands to the plurality of data platforms through the query interfaces corresponding to the plurality of data platforms, so that the data platforms execute the corresponding query tasks according to the query commands;
and receiving a plurality of query sub-results sent by the plurality of data platforms, performing aggregation association calculation on the plurality of query sub-results, and obtaining and outputting a query result.
In another aspect, an embodiment of the present invention provides a tag query apparatus, including:
the query plan generating unit is used for receiving a tag query request, generating a tag query plan according to the tag query request, and analyzing and splitting the tag query plan according to data stored by a plurality of data platforms to obtain a plurality of tag query sub-plans;
the query command generating unit is used for acquiring a plurality of data platforms corresponding to the plurality of tag query sub-plans according to definition information of pre-stored tag data, and generating a plurality of query commands according to the plurality of tag query sub-plans and the types of the plurality of data platforms;
the query command sending unit is used for sending the corresponding query commands to the data platforms through query interfaces corresponding to the data platforms so that the data platforms execute corresponding query tasks according to the query commands;
and the query result processing unit is used for receiving the query sub-results sent by the data platform, performing aggregation association calculation on the plurality of query sub-results, obtaining the query results and outputting the query results.
According to the tag query method and device provided by the embodiment of the invention, a tag query plan is generated from the tag query request sent by the user, the tag query plan is analyzed to generate tag query sub-plans, the data platforms corresponding to the tag query sub-plans are determined according to the pre-stored tag definitions, query commands are generated for the different data platforms, and cross-platform tag query is realized through the query interfaces corresponding to those data platforms.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a tag construction method in the prior art;
FIG. 2 is a schematic flow chart illustrating a tag query method according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart illustrating a tag query method according to another embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a tag query device according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of another tag query device in an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of another tag query device in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 2 is a schematic flow diagram of a tag query method in an embodiment of the present invention, and as shown in fig. 2, the tag query method provided in the embodiment of the present invention includes:
S1, receiving a tag query request, generating a tag query plan according to the tag query request, and analyzing and splitting the tag query plan according to data stored by a plurality of data platforms to obtain a plurality of tag query sub-plans;
Specifically, when a user queries a tag through the tag query device, the tag query device receives a tag query request, generates a tag query plan according to the received tag query request, and analyzes and splits the generated tag query plan according to the data stored in a plurality of data platforms to generate a plurality of tag query sub-plans. That is, the tag query plan is divided into a plurality of tag query sub-plans according to the data content stored by each data platform.
S2, acquiring a plurality of data platforms corresponding to the plurality of tag query sub-plans according to definition information of pre-stored tag data, and generating a plurality of query commands according to the plurality of tag query sub-plans and the types of the plurality of data platforms;
Specifically, after the plurality of tag query sub-plans are obtained, the data platform to which each tag query sub-plan corresponds is determined according to the definition information of the pre-stored tag data. Query commands are then generated according to the tag query sub-plans and the types of their corresponding data platforms, with each tag query sub-plan corresponding to one query command. The data platforms supported in the embodiment of the invention include mainstream relational databases such as DB2, Oracle and Teradata, and big data platforms such as Hadoop, Spark and real-time streaming messages.
S3, sending the corresponding query commands to the data platforms through the query interfaces corresponding to the data platforms, so that the data platforms execute the corresponding query tasks according to the query commands;
Specifically, each obtained query command is sent to its data platform through the query interface corresponding to that data platform, and the data platform executes the corresponding query task after receiving the query command.
S4, receiving a plurality of query sub-results sent by the data platforms, performing aggregation association calculation on the plurality of query sub-results, and obtaining and outputting a query result.
Specifically, a plurality of query commands are sent to a plurality of data platforms; each data platform executes the corresponding query task according to its query command, obtains a query sub-result, and sends the query sub-result to the tag query device. After receiving the query sub-results sent by the data platforms, the tag query device performs aggregation association calculation on them to obtain and output the query result. One specific way to perform the association calculation on the query sub-results is to convert the received query sub-results from the different data platforms into corresponding DataFrames (data frames) and then perform aggregation association calculations such as join, filter, groupBy, orderBy and aggregates to generate the query result, which is then output.
For example: through the tag rule on the user interface of the tag query device, the user sends a tag query request such as "high-end white-collar worker and Zhou Jie Lun fan", and the tag query device generates from this request a tag query plan for customers who are both high-end white-collar workers and Zhou Jie Lun fans. The tag query device learns that the data on high-end white-collar workers and the data on Zhou Jie Lun fans are stored on two different data platforms, so it analyzes the tag query plan and generates two tag query sub-plans: the first queries the customer information of high-end white-collar workers, and the second queries the customer information of Zhou Jie Lun fans. According to the pre-stored definition information of the tag data, it is determined that the data of the first tag query sub-plan is stored on data platform A and the data of the second tag query sub-plan is stored on data platform B. Query commands a and b are then generated according to the respective tag query sub-plans and data platforms. Query command a is sent to data platform A through query interface c of data platform A, and query command b is sent to data platform B through query interface d of data platform B. Data platform A and data platform B execute the tasks of querying the data on high-end white-collar workers and on Zhou Jie Lun fans according to query commands a and b respectively, obtain query sub-results M and N respectively, and send them to the tag query device. The tag query device performs aggregation calculation on query sub-results M and N to generate query result Q, i.e. the data set of customer information for customers who are both high-end white-collar workers and Zhou Jie Lun fans.
According to the tag query method provided by the embodiment of the invention, a tag query plan is generated from the tag query request sent by the user, the tag query plan is analyzed to generate tag query sub-plans, the data platforms corresponding to the tag query sub-plans are determined according to the pre-stored tag definitions, query commands are generated for the different data platforms, and the tag query is carried out through the query interfaces corresponding to those data platforms, so that cross-platform tag query is realized.
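The aggregation of sub-results M and N into Q in the example above can be pictured as an inner join on a shared customer key. The following is only a minimal sketch in plain Python standing in for the DataFrame join named in the description; the `customer_id` key, the column names and the sample rows are assumptions made for illustration and do not come from the source.

```python
# Hypothetical sub-results: M from data platform A, N from data platform B.
# Each row is a dict; "customer_id" is an assumed join key.

def join_sub_results(left, right, key):
    """Inner-join two lists of row dicts on `key`, merging their columns."""
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in right_by_key.get(row[key], []):
            joined.append({**row, **match})
    return joined


sub_result_m = [
    {"customer_id": 1, "segment": "high-end white-collar"},
    {"customer_id": 2, "segment": "high-end white-collar"},
]
sub_result_n = [
    {"customer_id": 2, "fan_of": "Zhou Jie Lun"},
    {"customer_id": 3, "fan_of": "Zhou Jie Lun"},
]

# Q keeps only customers present in both sub-results.
query_result_q = join_sub_results(sub_result_m, sub_result_n, "customer_id")
```

Only customer 2 appears in both M and N, so Q contains a single merged row for that customer.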
On the basis of the above embodiment, the method further includes: obtaining undefined label data in a plurality of data platforms and defining the label data;
obtaining definition information of the tag data, and storing the definition information, wherein the definition information includes: the tag information, corresponding data platform information, table information and field attribute information of the tag data.
Specifically, the tag query device provided in the embodiment of the present invention obtains tag data in the different data platforms that has not yet been defined, defines it, and obtains and stores the definition information of the tag data. The definition information of the tag data includes: tag information of the tag data, corresponding data platform information, table information and field attribute information. The corresponding data platform information is the data source of the tag data, which may be a database or a big data platform, for example a DB2 database or a Hadoop big data platform. The table information includes the association relationships between tables, i.e. which table in the data platform the tag data comes from, or which tables must be associated to compute it. The field attribute information indicates which field in the data platform holds the tag data. As this definition shows, the tag data does not need to be copied to the tag query device: when a query request is received, the corresponding tag data is queried directly on the corresponding data platform through the corresponding query interface, which saves storage resources and avoids data redundancy.
According to the tag query method provided by the embodiment of the invention, because tag data on a plurality of data platforms can be queried, the tag data needs to be newly defined so that the location of the tag data to be queried can be determined quickly. The original tag data does not need to be copied, which saves storage space, avoids data redundancy, and realizes cross-platform tag query.
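One way to picture the stored definition information is as a small registry keyed by tag name. This is a sketch under assumptions: the class name `TagDefinition`, its field names, and the sample platform, table and field values are all hypothetical; the source only states which kinds of information are stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagDefinition:
    tag_name: str        # tag information
    platform: str        # data platform that holds the source data
    platform_type: str   # assumed labels such as "db2" or "hadoop"
    tables: tuple        # source table(s); more than one if association is needed
    field: str           # field that carries the tag value


registry = {}


def define_tag(definition):
    """Store the definition; the tag data itself stays on its platform."""
    registry[definition.tag_name] = definition


def locate(tag_name):
    """Return (platform, platform_type) for a tag without touching its data."""
    d = registry[tag_name]
    return d.platform, d.platform_type


define_tag(TagDefinition("high_end_white_collar", "platform_A", "db2",
                         ("CUST_PROFILE",), "SEGMENT"))
define_tag(TagDefinition("zhou_jie_lun_fan", "platform_B", "hadoop",
                         ("music_pref", "cust_map"), "ARTIST"))
```

The registry holds only metadata, which matches the point made above that the original tag data is not copied.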
On the basis of the above embodiment, the acquiring undefined tag data in different data platforms and defining the tag data includes: and acquiring undefined label data in a plurality of data platforms in real time, and defining the label data.
Specifically, the embodiment of the present invention obtains, in real time, tag data in the plurality of data platforms that has not yet been defined, defines it, and obtains and stores its definition information; data that has already been defined does not need to be redefined. Acquiring and defining tag data in real time improves the effectiveness of data query.
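The rule that already-defined data is not redefined can be sketched as an idempotent registration step. The function name `define_new_tags` and the dictionary shapes below are assumptions for illustration; how undefined tag data is actually discovered on each platform is not specified in the source.

```python
def define_new_tags(discovered, registry):
    """Add definitions only for tags missing from `registry`.

    `discovered` maps tag name -> definition found on the platforms.
    Existing entries are left untouched. Returns the newly defined names.
    """
    newly_defined = []
    for name, definition in discovered.items():
        if name in registry:
            continue  # already defined: no redefinition
        registry[name] = definition
        newly_defined.append(name)
    return newly_defined


known = {"campus_customer": {"platform": "A"}}
found = {
    "campus_customer": {"platform": "B"},     # ignored, already defined
    "potential_4g_user": {"platform": "B"},   # new
}
added = define_new_tags(found, known)
```

Calling the function repeatedly with the same discoveries leaves the registry unchanged after the first call.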
On the basis of the above embodiment, the analyzing and splitting the tag query plan according to the data stored in the plurality of data platforms to obtain a plurality of tag query sub-plans includes: splitting the tag query plan and performing lexical analysis and syntactic analysis on it according to the data stored by the data platforms, so as to parse it into a plurality of tag query sub-plans.
Specifically, after a query request from a user has been received and a tag query plan generated, the tag query plan is split and subjected to lexical analysis and syntactic analysis according to the data content stored in each data platform, and is thereby parsed into a plurality of tag query sub-plans, so that different query commands can then be generated from the tag query sub-plans. The splitting, lexical analysis and syntactic analysis of the tag query plan can be carried out by training a model and simulating natural language; the specific method is not limited by the embodiment of the invention.
According to the tag query method provided by the embodiment of the invention, the tag query plan is split and subjected to lexical and syntactic analysis to generate a plurality of tag query sub-plans, so that tag data query across a plurality of data platforms is supported.
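Since the source leaves the analysis method open, the following is only one possible sketch: a hand-written tokenizer and a parser for a deliberately tiny rule grammar, `tag (AND tag)*`, followed by grouping the tags by the platform that stores them. The grammar, the function names and the tag-to-platform mapping are assumptions, not the patent's method.

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z_]\w*)\s*")


def tokenize(rule):
    """Lexical analysis: split a rule string into identifier tokens."""
    tokens, pos = [], 0
    while pos < len(rule):
        m = TOKEN.match(rule, pos)
        if not m:
            raise ValueError(f"unexpected character at {pos}: {rule[pos]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_conjunction(tokens):
    """Syntactic analysis for the grammar: tag (AND tag)*. Returns tag names."""
    tags, expect_tag = [], True
    for tok in tokens:
        if expect_tag == (tok == "AND"):
            raise ValueError(f"unexpected token {tok!r}")
        if expect_tag:
            tags.append(tok)
        expect_tag = not expect_tag
    if expect_tag:
        raise ValueError("rule is empty or ends with AND")
    return tags


def split_plan(tags, tag_platform):
    """Group tags by storing platform: one sub-plan per platform."""
    sub_plans = {}
    for tag in tags:
        sub_plans.setdefault(tag_platform[tag], []).append(tag)
    return sub_plans


tag_platform = {"high_end_white_collar": "A", "zhou_jie_lun_fan": "B"}
tags = parse_conjunction(tokenize("high_end_white_collar AND zhou_jie_lun_fan"))
sub_plans = split_plan(tags, tag_platform)
```

Here the two tags live on different platforms, so the plan splits into two sub-plans, one for platform A and one for platform B.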
On the basis of the above embodiment, the generating a plurality of query commands according to the plurality of tag query sub-plans and the plurality of data platforms includes: generating the plurality of query commands through an SQL parser according to the plurality of tag query sub-plans and the plurality of data platforms, where the plurality of query commands are query statements corresponding to the plurality of data platforms.
Specifically, different data platforms have different grammars and therefore require different statements. In the embodiment of the invention, according to the tag query sub-plan and its corresponding data platform, an SQL parser generates a query command in the query statement of that data platform. For example: if the data corresponding to the tag query sub-plan is determined to be stored in a DB2 database, the query command can be generated as a DB2 query statement by the DB2 SQL parser; if the data is determined to be stored in an Oracle database, the query command can be generated as an Oracle query statement by the Oracle SQL parser; if the data is determined to be stored in a Teradata database, the query command can be generated as a Teradata query statement by the Teradata SQL parser; and if the data is determined to be stored in a Hadoop big data platform, the query command can be generated as the query statement for the Hadoop platform by the Hadoop SQL parser.
For example, for data stored in a relational database, the query command is sent to the corresponding JDBC Driver through the JDBC API, i.e. the query interface; the database executes the corresponding SQL query task and returns the query result to the query engine in the form of a result set. For data stored in the Hadoop big data platform, the query command can be sent through Spark SQL HiveContext; the Hadoop big data platform executes the query task and returns the query result in the form of a DataFrame.
According to the tag query method provided by the embodiment of the invention, different query statements are generated for different data platforms, and the query commands are sent to the data platforms through different query interfaces, so that cross-platform tag data query is realized without copying the tag data, which avoids data redundancy and improves the timeliness of tag query.
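To illustrate why the same sub-plan needs a different statement on each platform, the sketch below generates a row-limited SELECT in four dialects. It is a simplified stand-in for the SQL parsers described above, not their actual behaviour: the function name, the platform-type labels, the `customer_id` column and the JDBC-style `?` placeholder are assumptions. The row-limiting clauses themselves (`FETCH FIRST n ROWS ONLY` for DB2, `ROWNUM` for Oracle, `TOP n` for Teradata, `LIMIT n` for Hive-style SQL on Hadoop) are standard dialect features.

```python
def build_query_command(platform_type, table, field, limit):
    """Return a dialect-specific SELECT that filters on `field` with a
    `?` placeholder and returns at most `limit` rows."""
    limit = int(limit)
    body = f"FROM {table} WHERE {field} = ?"
    if platform_type == "db2":
        return f"SELECT customer_id {body} FETCH FIRST {limit} ROWS ONLY"
    if platform_type == "oracle":
        return f"SELECT customer_id {body} AND ROWNUM <= {limit}"
    if platform_type == "teradata":
        return f"SELECT TOP {limit} customer_id {body}"
    if platform_type == "hadoop":
        return f"SELECT customer_id {body} LIMIT {limit}"
    raise ValueError(f"unsupported platform type: {platform_type!r}")


commands = {
    kind: build_query_command(kind, "music_pref", "artist", 10)
    for kind in ("db2", "oracle", "teradata", "hadoop")
}
```

Keeping the filter value as a placeholder leaves value binding to whichever query interface sends the command.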
Fig. 3 is a schematic flow chart of another tag query method in the embodiment of the present invention, and as shown in fig. 3, the tag query method in the embodiment of the present invention includes:
R1, sending a query request. The user sends the query request through the user interface of the tag query device.
R2, generating a tag query plan. A tag query plan is generated according to the query request and the pre-stored definition information of the tag data of the plurality of data platforms.
R3, parsing the tag query plan. As shown in FIG. 3, the tag query plan is parsed by different SQL parsers, including a DB2 SQL parser, an Oracle SQL parser, a Teradata SQL parser, and a Hadoop SQL parser.
R4, generating tag query sub-plans. As shown in FIG. 3, the different SQL parsers parse the plan into different tag query sub-plans and generate the corresponding query commands.
R5, executing the query tasks. As shown in FIG. 3, the query commands generated by the different SQL parsers from the different tag query sub-plans are sent to the corresponding data platforms, so that each data platform executes the corresponding query task.
R6, returning query sub-results. After the different data platforms execute their query tasks, they return different query sub-results.
R7, performing aggregation calculation on the query sub-results to generate a query result. An aggregation operation is performed on the different query sub-results to generate the query result, which is then output.
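Steps R1 to R7 can be traced end to end in a small runnable sketch. Two in-memory SQLite databases stand in for data platforms A and B, and Python's `sqlite3` module stands in for the per-platform query interfaces; the table names, column names, sample rows and the `run_tag_query` function are all hypothetical, and a real deployment would use the platform-specific parsers and interfaces described above.

```python
import sqlite3


def make_platform(table, field, rows):
    """Create an in-memory database acting as one data platform."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} (customer_id INTEGER, {field} TEXT)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    return conn


platforms = {
    "A": make_platform("cust_profile", "segment",
                       [(1, "white_collar"), (2, "white_collar"), (3, "campus")]),
    "B": make_platform("music_pref", "artist",
                       [(2, "Zhou Jie Lun"), (3, "Zhou Jie Lun"), (4, "other")]),
}

# Pre-stored definition information: tag -> (platform, table, field, value).
definitions = {
    "high_end_white_collar": ("A", "cust_profile", "segment", "white_collar"),
    "zhou_jie_lun_fan": ("B", "music_pref", "artist", "Zhou Jie Lun"),
}


def run_tag_query(tags):
    """Run one sub-plan per tag and intersect the returned customer ids."""
    sub_results = []
    for tag in tags:
        platform, table, field, value = definitions[tag]      # locate the data
        command = f"SELECT customer_id FROM {table} WHERE {field} = ?"
        rows = platforms[platform].execute(command, (value,)).fetchall()
        sub_results.append({row[0] for row in rows})           # query sub-result
    return sorted(set.intersection(*sub_results))              # aggregation


result = run_tag_query(["high_end_white_collar", "zhou_jie_lun_fan"])
```

With the sample rows above, only customer 2 carries both tags, so `result` is `[2]`; no row is copied from one platform to the other at any point.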
According to the tag query method provided by the embodiment of the invention, the query result data can be applied in various scenarios, such as self-service data retrieval, self-service analysis, tag pushing and tag opening. The method breaks through the limitation of the traditional tag library, which can access only a single data platform, and realizes cross-platform tag query over the massive heterogeneous data of multiple data platforms in a hybrid big data platform architecture. It realizes online, real-time, joint cross-platform tag query in which data is neither migrated nor landed, avoids the storage pressure of migrating and copying data between platforms, and greatly reduces the investment in system resources. Tag data query across multiple platforms is realized, and the speed of tag query is improved.
Fig. 4 is a schematic structural diagram of a tag query device in an embodiment of the present invention. As shown in fig. 4, the tag query device provided in the embodiment of the present invention includes: a query plan generating unit 41, a query command generating unit 42, a query command sending unit 43, and a query result processing unit 44, wherein:
the query plan generating unit 41 is configured to receive a tag query request, generate a tag query plan according to the tag query request, and analyze and split the tag query plan according to data stored in a plurality of data platforms to obtain a plurality of tag query sub-plans; the query command generating unit 42 is configured to obtain a plurality of data platforms corresponding to the plurality of tag query sub-plans according to definition information of pre-stored tag data, and generate a plurality of query commands according to the plurality of tag query sub-plans and types of the plurality of data platforms; the query command sending unit 43 is configured to send the corresponding query command to the multiple data platforms through the query interfaces corresponding to the multiple data platforms, so that the data platforms execute the corresponding query tasks according to the query command; the query result processing unit 44 is configured to receive the query sub-results sent by the data platform, perform aggregation association calculation on the query sub-results, obtain a query result, and output the query result.
Specifically, when a user performs tag query through the tag query device, the query plan generating unit 41 receives a tag query request, generates a tag query plan according to the received tag query request, analyzes and splits the generated tag query plan according to the content of data stored in each data platform, and generates a plurality of tag query sub-plans. The query command generating unit 42 determines which data platform the plurality of tag query sub-plans correspond to according to the definition information of the pre-stored tag data. And generating query commands according to the label query sub-plans and the data platforms corresponding to the label query sub-plans, wherein each label query sub-plan corresponds to one query command. The query command sending unit 43 sends the obtained query command to the data platform through the query interface corresponding to the data platform, and the data platform executes the corresponding query task after receiving the query command. The query command sending unit 43 sends a plurality of query commands to a plurality of data platforms, and each data platform executes a corresponding query task according to the query command to obtain a plurality of query sub-results, and sends the query sub-results to the query result processing unit 44. After receiving the query sub-results sent by the multiple data platforms, the query result processing unit 44 performs aggregate association calculation on the multiple query sub-results, obtains a query result, and outputs the query result.
The specific type of the data platform and the specific manner of aggregation association calculation are the same as those in the above embodiments, and are not described herein again.
The tag query device provided in the embodiment of the present invention generates a tag query plan from the tag query request sent by a user, analyzes the tag query plan to generate tag query sub-plans, determines the data platforms corresponding to the tag query sub-plans according to the pre-stored tag definitions, generates query commands for the different data platforms, and performs the tag query through the query interfaces corresponding to those data platforms, thereby implementing cross-platform tag query.
Fig. 5 is a schematic structural diagram of another tag query device in an embodiment of the present invention, and as shown in fig. 5, on the basis of the above embodiment, the device further includes: the tag data definition unit 51 is configured to obtain undefined tag data in a plurality of data platforms and define the tag data; obtaining definition information of the tag data, and storing the definition information, wherein the definition information includes: the tag information, corresponding data platform information, table information and field attribute information of the tag data.
Specifically, the tag data definition unit 51 acquires undefined tag data in different data platforms, defines the undefined tag data, acquires definition information of the tag data, and stores the definition information. Wherein the definition information of the tag data includes: tag information of the tag data, corresponding data platform information, table information, and field attribute information. Namely, the tag data is newly defined, and the specific position corresponding to each tag data is stored. The specific meaning of the definition information of the tag data is the same as that of the above embodiment, and is not described herein again.
The tag query device provided by the embodiment of the invention supports the tag data query of a plurality of data platforms, so that new definition needs to be performed on the tag data, so as to rapidly judge the position of the tag data to be queried, and the original tag data does not need to be copied, thereby saving the storage space, avoiding data redundancy, and realizing the cross-platform tag query.
On the basis of the above embodiment, the tag data definition unit is specifically configured to acquire undefined tag data in different data platforms in real time and define the tag data.
Specifically, the tag data definition unit acquires undefined tag data in a plurality of data platforms in real time and defines it, then acquires and stores the definition information of the tag data; data that has already been defined does not need to be redefined. Acquiring and defining the tag data in real time improves the effectiveness of data queries.
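The rule that already defined data is not redefined can be sketched as an incremental update. This is an illustration under assumptions: the function name `define_new_tags`, the `(platform, table, field)` location tuple and the sample data are hypothetical, and how tag data is actually discovered on each platform is not specified in the text.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical location of a tag: (platform, table, field).
Location = Tuple[str, str, str]


def define_new_tags(
    definitions: Dict[str, Location],
    discovered: Iterable[Tuple[str, Location]],
) -> List[str]:
    """Add definitions only for tags not defined yet; return the new tag names."""
    newly_defined = []
    for tag, location in discovered:
        if tag in definitions:
            continue  # already defined, so it is not redefined
        definitions[tag] = location
        newly_defined.append(tag)
    return newly_defined


definitions = {"age_group": ("relational_db", "user_profile", "age_group")}
added = define_new_tags(
    definitions,
    [
        ("age_group", ("relational_db", "other_table", "age")),    # skipped
        ("monthly_traffic", ("hadoop", "usage_log", "traffic_mb")),  # added
    ],
)
```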
On the basis of the foregoing embodiment, the query plan generating unit is specifically configured to split the query plan and perform lexical analysis and syntactic analysis on it according to the data stored by the plurality of data platforms, so as to parse it into a plurality of tag query sub-plans.
Specifically, after receiving a user's query request and generating a tag query plan, the query plan generating unit splits the tag query plan and performs lexical analysis and syntactic analysis on it according to the content of the data stored in the plurality of data platforms, parsing the tag query plan into a plurality of tag query sub-plans so that different query commands can then be generated from the tag query sub-plans. The splitting, lexical analysis and syntactic analysis of the tag query plan may be performed by training a model to simulate natural language; the specific method is not limited by the embodiment of the present invention.
The tag query device provided by the embodiment of the invention splits the query plan and performs lexical analysis and syntactic analysis to generate a plurality of tag query sub-plans, thereby supporting tag data queries across a plurality of data platforms.
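The splitting step can be illustrated with a deliberately small, rule-based sketch. It is not the trained-model approach mentioned above: it assumes a hypothetical request syntax of `tag operator value` conditions joined by `AND`, and a hypothetical `TAG_PLATFORM` mapping that stands in for the stored definition information.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical mapping from tag name to the platform that stores it.
TAG_PLATFORM = {"age_group": "relational_db", "monthly_traffic": "hadoop"}

TOKEN = re.compile(
    r"\s*(?:(?P<word>[A-Za-z_]\w*)|(?P<op>>=|<=|=|>|<)|(?P<value>'[^']*'|\d+))"
)

Condition = Tuple[str, str, str]  # (tag, operator, value)


def tokenize(text: str) -> List[Tuple[str, str]]:
    """Lexical analysis: turn the request text into (kind, text) tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


def parse(tokens: List[Tuple[str, str]]) -> List[Condition]:
    """Syntactic analysis: expect `tag op value` triples joined by AND."""
    conditions: List[Condition] = []
    i = 0
    while True:
        triple = tokens[i:i + 3]
        if [kind for kind, _ in triple] != ["word", "op", "value"]:
            raise ValueError("expected: tag operator value")
        conditions.append((triple[0][1], triple[1][1], triple[2][1]))
        i += 3
        if i == len(tokens):
            return conditions
        if tokens[i] != ("word", "AND"):
            raise ValueError("conditions must be joined by AND")
        i += 1


def split_plan(request: str) -> Dict[str, List[Condition]]:
    """Group the parsed conditions into one sub-plan per data platform."""
    sub_plans: Dict[str, List[Condition]] = defaultdict(list)
    for tag, op, value in parse(tokenize(request)):
        # An undefined tag raises KeyError in this sketch.
        sub_plans[TAG_PLATFORM[tag]].append((tag, op, value))
    return dict(sub_plans)


sub_plans = split_plan("age_group = 'young' AND monthly_traffic > 1024")
```

Each key of `sub_plans` names one data platform, so the next step can generate one query command per platform.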
On the basis of the above embodiment, the query command generating unit is specifically configured to generate, through an SQL parser, a plurality of query commands according to the plurality of tag query sub-plans and the plurality of data platforms, where the plurality of query commands include query statements corresponding to the plurality of data platforms.
Specifically, because different data platforms have different grammars and therefore require different statements, the query command generating unit in the embodiment of the present invention generates, through the SQL parser and according to the tag query sub-plans and the corresponding data platforms, query commands containing the query statements for the different data platforms.
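A minimal sketch of producing platform-specific statements follows. It uses plain string building rather than an SQL parser, and it shows only one grammar difference: standard SQL quotes identifiers with double quotes, while Hive quotes them with backticks. The platform names, the `build_query` function and the table and field names are assumptions for illustration; a real system would also bind literal values as parameters rather than interpolating them.

```python
from typing import Callable, Dict, List, Tuple

Condition = Tuple[str, str, str]  # (field, operator, literal)

# Identifier quoting per platform type (platform names are hypothetical).
QUOTERS: Dict[str, Callable[[str], str]] = {
    "relational_db": lambda name: f'"{name}"',  # standard SQL identifier quoting
    "hadoop": lambda name: f"`{name}`",         # Hive-style identifier quoting
}


def build_query(platform: str, table: str, key_field: str,
                conditions: List[Condition]) -> str:
    """Render one sub-plan as a SELECT statement in the platform's dialect."""
    quote = QUOTERS[platform]
    names = [table, key_field] + [field for field, _, _ in conditions]
    for name in names:
        # Reject anything that is not a plain identifier before quoting it.
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")
    where = " AND ".join(
        f"{quote(field)} {op} {literal}" for field, op, literal in conditions
    )
    return f"SELECT {quote(key_field)} FROM {quote(table)} WHERE {where}"


hive_query = build_query("hadoop", "usage_log", "user_id",
                         [("traffic_mb", ">", "1024")])
db_query = build_query("relational_db", "user_profile", "user_id",
                       [("age_group", "=", "'young'")])
```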
For example, for data stored in a database, the query command is sent to the corresponding JDBC Driver through the JDBC API, that is, the query interface; the database executes the corresponding SQL query task and returns the query result to the query engine in the form of a result set. For data stored in the Hadoop big data platform, the query command can be sent through the Spark SQL HiveContext; the Hadoop big data platform executes the query task and returns the query result in the form of a DataFrame.
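The dispatch of each query command through its own query interface can be sketched as a table of callables keyed by platform. In this illustration an in-memory SQLite database stands in for the JDBC-accessed database, and a stub function stands in for the Hadoop platform; neither is the interface named in the text, and all names and sample rows are assumptions.

```python
import sqlite3
from typing import Callable, Dict, List, Tuple

Rows = List[Tuple]


def make_sqlite_interface(conn: sqlite3.Connection) -> Callable[[str], Rows]:
    """Wrap a connection as a query interface returning a list of rows."""
    def run(query: str) -> Rows:
        return conn.execute(query).fetchall()
    return run


# Stand-in for the relational database, filled with sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_profile (user_id INTEGER, age_group TEXT)")
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?)",
    [(1, "young"), (2, "senior"), (3, "young")],
)


def stub_hadoop_interface(query: str) -> Rows:
    """Stand-in for the big data platform; returns fixed sample rows."""
    return [(1,), (2,)]


INTERFACES: Dict[str, Callable[[str], Rows]] = {
    "relational_db": make_sqlite_interface(conn),
    "hadoop": stub_hadoop_interface,
}


def dispatch(commands: Dict[str, str]) -> Dict[str, Rows]:
    """Send each query command through the interface of its own platform."""
    return {platform: INTERFACES[platform](query)
            for platform, query in commands.items()}


sub_results = dispatch({
    "relational_db": "SELECT user_id FROM user_profile "
                     "WHERE age_group = 'young' ORDER BY user_id",
    "hadoop": "SELECT `user_id` FROM `usage_log` WHERE `traffic_mb` > 1024",
})
```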
The device provided by the invention is used for executing the method, and the specific implementation mode of the device is consistent with that of the method, and is not described again here.
The tag query device provided by the embodiment of the invention generates different query statements for different data platforms and sends the query commands to the data platforms through different query interfaces, so that cross-platform tag data query is realized and the tag data does not need to be copied, thereby avoiding data redundancy and improving the timeliness of tag queries.
Fig. 6 is a schematic structural diagram of another tag query device in an embodiment of the present invention. As shown in Fig. 6, the device may include: a processor 61, a memory 62 and a communication bus 63, wherein the processor 61 and the memory 62 communicate with each other via the communication bus 63. The processor 61 may call logic instructions in the memory 62 to perform the following method: receiving a tag query request, generating a tag query plan according to the tag query request, and parsing the tag query plan to obtain a plurality of tag query sub-plans; acquiring a plurality of data platforms corresponding to the plurality of tag query sub-plans according to pre-stored definition information of tag data, and generating a plurality of query commands according to the plurality of tag query sub-plans and the plurality of data platforms; sending the corresponding query commands to the plurality of data platforms through the query interfaces corresponding to the plurality of data platforms, so that the data platforms execute the corresponding query tasks according to the query commands; and receiving a plurality of query sub-results sent by the plurality of data platforms, performing aggregation association calculation on the plurality of query sub-results to obtain a query result, and outputting the query result.
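The final step, aggregation association calculation over the query sub-results, is not specified in detail in the text. One simple reading, assumed here purely for illustration, is that each sub-result is a set of user identifiers and the sub-plans were joined by AND, so the overall result is the intersection of the sets. The function name `aggregate` and the sample data are hypothetical.

```python
from functools import reduce
from typing import Dict, Iterable, List


def aggregate(sub_results: Dict[str, Iterable[int]]) -> List[int]:
    """Intersect the identifier sets returned by the data platforms."""
    id_sets = [set(ids) for ids in sub_results.values()]
    if not id_sets:
        return []
    # Keep only identifiers present in every platform's sub-result.
    return sorted(reduce(set.intersection, id_sets))


result = aggregate({"relational_db": [1, 3], "hadoop": [1, 2]})
```

Other combinations, such as a union for conditions joined by OR or a join on a shared key that carries extra columns, would replace the intersection in this sketch.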
Furthermore, the logic instructions in the memory 62 may be implemented in software functional units and stored in a computer readable storage medium when sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Embodiments of the present invention provide a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions that, when executed by a computer, enable the computer to perform the methods provided by the above method embodiments, for example: receiving a tag query request, generating a tag query plan according to the tag query request, and parsing the tag query plan to obtain a plurality of tag query sub-plans; acquiring a plurality of data platforms corresponding to the plurality of tag query sub-plans according to pre-stored definition information of tag data, and generating a plurality of query commands according to the plurality of tag query sub-plans and the plurality of data platforms; sending the corresponding query commands to the plurality of data platforms through the query interfaces corresponding to the plurality of data platforms, so that the data platforms execute the corresponding query tasks according to the query commands; and receiving a plurality of query sub-results sent by the plurality of data platforms, performing aggregation association calculation on the plurality of query sub-results to obtain a query result, and outputting the query result.
Embodiments of the present invention provide a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the methods provided by the above method embodiments, for example: receiving a tag query request, generating a tag query plan according to the tag query request, and parsing the tag query plan to obtain a plurality of tag query sub-plans; acquiring a plurality of data platforms corresponding to the plurality of tag query sub-plans according to pre-stored definition information of tag data, and generating a plurality of query commands according to the plurality of tag query sub-plans and the plurality of data platforms; sending the corresponding query commands to the plurality of data platforms through the query interfaces corresponding to the plurality of data platforms, so that the data platforms execute the corresponding query tasks according to the query commands; and receiving a plurality of query sub-results sent by the plurality of data platforms, performing aggregation association calculation on the plurality of query sub-results to obtain a query result, and outputting the query result.
The above-described embodiments of the apparatus and system are only schematic, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Claims (8)

1. A tag query method, characterized by comprising the following steps:
receiving a tag query request, generating a tag query plan according to the tag query request, and parsing and splitting the tag query plan according to data stored by a plurality of data platforms to obtain a plurality of tag query sub-plans;
acquiring a plurality of data platforms corresponding to the plurality of tag query sub-plans according to pre-stored definition information of tag data, and generating a plurality of query commands according to the plurality of tag query sub-plans and the types of the plurality of data platforms;
sending the corresponding query commands to the plurality of data platforms through the query interfaces corresponding to the plurality of data platforms, so that the data platforms execute the corresponding query tasks according to the query commands;
receiving a plurality of query sub-results sent by the plurality of data platforms, performing aggregation association calculation on the plurality of query sub-results to obtain a query result, and outputting the query result;
wherein the method further comprises: obtaining undefined tag data in the plurality of data platforms and defining the tag data;
obtaining definition information of the tag data, and storing the definition information, wherein the definition information includes: tag information of the tag data, corresponding data platform information, table information and field attribute information.
2. The method of claim 1, wherein obtaining undefined tag data in the plurality of data platforms and defining the tag data comprises: acquiring undefined tag data in the plurality of data platforms in real time, and defining the tag data.
3. The method of claim 1, wherein parsing and splitting the tag query plan according to data stored by the plurality of data platforms to obtain a plurality of tag query sub-plans comprises: splitting the query plan and performing lexical analysis and syntactic analysis on it according to the data stored by the plurality of data platforms, so as to parse it into the plurality of tag query sub-plans.
4. The method of claim 1, wherein generating a plurality of query commands according to the plurality of tag query sub-plans and the types of the plurality of data platforms comprises: generating the plurality of query commands through an SQL parser according to the plurality of tag query sub-plans and the types of the plurality of data platforms, wherein the plurality of query commands are query statements corresponding to the plurality of data platforms.
5. A tag query device, comprising:
a query plan generating unit, configured to receive a tag query request, generate a tag query plan according to the tag query request, and parse and split the tag query plan according to data stored by a plurality of data platforms to obtain a plurality of tag query sub-plans;
a query command generating unit, configured to acquire a plurality of data platforms corresponding to the plurality of tag query sub-plans according to pre-stored definition information of tag data, and generate a plurality of query commands according to the plurality of tag query sub-plans and the types of the plurality of data platforms;
a query command sending unit, configured to send the corresponding query commands to the data platforms through the query interfaces corresponding to the data platforms, so that the data platforms execute the corresponding query tasks according to the query commands;
a query result processing unit, configured to receive the query sub-results sent by the data platforms, perform aggregation association calculation on the plurality of query sub-results to obtain a query result, and output the query result;
a tag data definition unit, configured to acquire undefined tag data in the data platforms and define the tag data;
and to obtain definition information of the tag data and store the definition information, wherein the definition information includes: tag information of the tag data, corresponding data platform information, table information and field attribute information.
6. The device according to claim 5, wherein the tag data definition unit is specifically configured to: acquire undefined tag data in the plurality of data platforms in real time, and define the tag data.
7. The device according to claim 5, wherein the query plan generating unit is specifically configured to: split the query plan and perform lexical analysis and syntactic analysis on it according to the data stored by the plurality of data platforms, so as to parse it into the plurality of tag query sub-plans.
8. The device according to any of claims 5-7, wherein the query command generating unit is specifically configured to generate the plurality of query commands through an SQL parser according to the plurality of tag query sub-plans and the types of the plurality of data platforms, wherein the plurality of query commands are query statements corresponding to the plurality of data platforms.
CN201611263049.1A 2016-12-30 2016-12-30 Label query method and device Active CN108268512B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611263049.1A CN108268512B (en) 2016-12-30 2016-12-30 Label query method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611263049.1A CN108268512B (en) 2016-12-30 2016-12-30 Label query method and device

Publications (2)

Publication Number Publication Date
CN108268512A CN108268512A (en) 2018-07-10
CN108268512B true CN108268512B (en) 2020-07-31

Family

ID=62753892

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611263049.1A Active CN108268512B (en) 2016-12-30 2016-12-30 Label query method and device

Country Status (1)

Country Link
CN (1) CN108268512B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871393A (en) * 2019-03-05 2019-06-11 云南电网有限责任公司信息中心 A kind of access method based on label system
CN110737706A (en) * 2019-09-06 2020-01-31 平安城市建设科技(深圳)有限公司 Data management method, device, equipment and computer readable storage medium
CN111930708B (en) * 2020-07-14 2023-07-11 上海德拓信息技术股份有限公司 Ceph object storage-based object tag expansion system and method
CN113901083B (en) * 2021-09-14 2023-05-12 北京柏睿数据技术股份有限公司 Heterogeneous data source operation resource analysis positioning method and equipment based on multiple resolvers

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7933900B2 (en) * 2005-10-23 2011-04-26 Google Inc. Search over structured data
US9501474B2 (en) * 2008-07-16 2016-11-22 Oracle International Corporation Enhanced use of tags when storing relationship information of enterprise objects
US20140025626A1 (en) * 2012-04-19 2014-01-23 Avalon Consulting, LLC Method of using search engine facet indexes to enable search-enhanced business intelligence analysis
CN103136364B (en) * 2013-03-14 2016-08-24 曙光信息产业(北京)有限公司 Clustered database system and data query processing method thereof
CN103123652A (en) * 2013-03-14 2013-05-29 曙光信息产业(北京)有限公司 Data query method and cluster database system
CN103425780B (en) * 2013-08-19 2016-08-17 曙光信息产业股份有限公司 The querying method of a kind of data and device
CN104572676B (en) * 2013-10-16 2017-11-17 中国银联股份有限公司 A kind of inter-library paging query method for multiple database table
CN104899225B (en) * 2014-03-07 2018-10-16 北京四达时代软件技术股份有限公司 Object Relation Mapping method, apparatus and processor
CN105677681A (en) * 2014-11-21 2016-06-15 北京神州泰岳软件股份有限公司 Data search method and device based on multiple databases
CN105159920A (en) * 2015-07-28 2015-12-16 卡斯柯信号有限公司 Attribute tag based database access method

Also Published As

Publication number Publication date
CN108268512A (en) 2018-07-10

Similar Documents

Publication Publication Date Title
CN108268512B (en) Label query method and device
CN110908997B (en) Data blood relationship construction method and device, server and readable storage medium
US11670288B1 (en) Generating predicted follow-on requests to a natural language request received by a natural language processing system
US8943052B2 (en) System and method for data modeling
WO2016082468A1 (en) Data graphing method, device and database server
US20130173664A1 (en) Mapping non-relational database objects into a relational database model
AU2017268630A1 (en) Method, device, server and storage apparatus of reviewing SQL
CN108073625B (en) System and method for metadata information management
US10102246B2 (en) Natural language consumer segmentation
WO2011092203A1 (en) System and method for building a cloud aware massive data analytics solution background
CN107291770B (en) Mass data query method and device in distributed system
CN114356971A (en) Data processing method, device and system
CN110955646A (en) Data storage and query method, device, equipment and medium
US20210358500A1 (en) Platform selection for performing requested actions in audio-based computing environments
US20150120697A1 (en) System and method for analysis of a database proxy
CN111177244A (en) Data association analysis method for multiple heterogeneous databases
CN111198898A (en) Big data query method and big data query device
CN113806429A (en) Canvas type log analysis method based on large data stream processing framework
CN110716955A (en) Method and system for quickly responding to data query request
EP3480693A1 (en) Distributed computing framework and distributed computing method
CN113962597A (en) Data analysis method and device, electronic equipment and storage medium
CN113609100A (en) Data storage method, data query method, data storage device, data query device and electronic equipment
CN106980617B (en) Method and system for operating database based on JSON statement
CN113407807A (en) Query optimization method and device for search engine and electronic equipment
CN106843822B (en) Execution code generation method and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant